CN114140390A - Crack detection method and device based on semi-supervised semantic segmentation - Google Patents


Info

Publication number
CN114140390A
CN114140390A
Authority
CN
China
Prior art keywords
image
crack
loss
model
semi
Prior art date
Legal status
Pending
Application number
CN202111291405.1A
Other languages
Chinese (zh)
Inventor
蔡长青 (Cai Changqing)
刘爽 (Liu Shuang)
Current Assignee
Guangzhou University
Original Assignee
Guangzhou University
Priority date
Filing date
Publication date
Application filed by Guangzhou University filed Critical Guangzhou University
Priority to CN202111291405.1A priority Critical patent/CN114140390A/en
Publication of CN114140390A publication Critical patent/CN114140390A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30132Masonry; Concrete

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a crack detection method and device based on semi-supervised semantic segmentation. The method mainly comprises the following steps: acquiring an image of a crack to be detected; inputting the crack image into a student model and a teacher model for training, and updating the weights of the student model through gradient descent of the loss function obtained in training; updating the weights of the teacher model through an exponential moving average of the student-model weights; and evaluating the accuracy of the trained model. The student model and the teacher model share the same network structure, with EfficientNet as the encoder and UNet as the decoder, so multi-scale crack feature information can be extracted efficiently and the loss of image information is reduced. The invention also adopts semi-supervised learning, thereby reducing the annotation workload. Experiments show that the invention reduces the workload of data annotation while maintaining high detection precision.

Description

Crack detection method and device based on semi-supervised semantic segmentation
Technical Field
The invention relates to the field of image detection, in particular to a surface crack detection method and device based on a semi-supervised semantic segmentation network.
Background
With the development of China's economy and society, much infrastructure has deteriorated to varying degrees due to overloaded use, and surface cracks are an obvious sign of this deterioration. The detection of surface cracks is therefore crucial to ensuring the safety and serviceability of civil infrastructure.
In recent years, automatic detection methods have gradually replaced traditional manual inspection because of their high efficiency and objective results. Among automatic detection methods, semantic segmentation algorithms based on deep learning perform well in crack detection. However, the currently common fully supervised segmentation methods require a large amount of manually labeled data, which is time-consuming and hinders their popularization and application in the field of image detection.
Disclosure of Invention
In view of this, the present invention provides a crack detection method and apparatus based on semi-supervised semantic segmentation.
The invention provides a crack detection method based on semi-supervised semantic segmentation, which comprises the following steps:
acquiring a crack image, wherein the crack image comprises a first crack image with an annotation and a second crack image without the annotation;
inputting the first crack image into a student model, and inputting the second crack image into both the student model and a teacher model to perform crack feature extraction and feature segmentation; the student model completes semi-supervised training through a supervised loss and an unsupervised loss, wherein the supervised loss is obtained by calculating the dice loss and cross entropy loss of the annotated first crack image, and the unsupervised loss is obtained by performing consistency regularization on the second crack image;
updating the weights of the student model through gradient descent of the supervised and unsupervised losses;
updating the weights of the teacher model through an exponential moving average of the student-model weights;
and adjusting the outputs of the student model and the teacher model according to their respective weights so that the two outputs are consistent, thereby obtaining a crack feature prediction result.
Further, the structures of the student model and the teacher model are codec (encoder-decoder) structures, each comprising an encoder and a decoder; the encoder comprises a CNN-based image feature extraction network; the decoder comprises a U-Net-based image feature segmentation network; the first crack image or the second crack image is input into the encoder as an input image, and feature extraction and segmentation are performed by the codec to obtain an output image.
Further, the CNN-based image feature extraction network comprises an EfficientNet network;
the EfficientNet network comprises 1 convolution layer and 23 mobile inverted bottleneck convolution (MBConv) modules;
each mobile inverted bottleneck convolution module comprises 4 convolution layers of 1×1, 1 depthwise separable convolution layer and 1 global average pooling layer;
the working stage of the EfficientNet network comprises the following steps:
performing convolution processing on an input image to obtain a first-stage image;
performing mobile inverted bottleneck convolution processing twice on the first-stage image to obtain a second-stage image;
performing mobile inverted bottleneck convolution processing three times on the second-stage image to obtain a third-stage image;
performing mobile inverted bottleneck convolution processing three times on the third-stage image to obtain a fourth-stage image;
performing mobile inverted bottleneck convolution processing four times on the fourth-stage image to obtain a fifth-stage image;
performing mobile inverted bottleneck convolution processing four times on the fifth-stage image to obtain a sixth-stage image;
performing mobile inverted bottleneck convolution processing five times on the sixth-stage image to obtain a seventh-stage image;
performing mobile inverted bottleneck convolution processing twice on the seventh-stage image to obtain an eighth-stage image;
and outputting the second-stage image, the third-stage image, the fifth-stage image, the sixth-stage image and the eighth-stage image to the decoder.
Further, the U-Net-based image feature segmentation network comprises 5 upsampling layers and 1 convolutional layer; the feature maps received from the encoder are upsampled layer by layer and then passed through the convolutional layer to obtain a feature image, which is output as the output image; the feature image has the same size as the input image.
Further, the supervised loss formula is
L_sup = L_dice + L_cross-entropy
wherein L_dice is the dice loss, L_cross-entropy is the cross entropy loss, p_i is the prediction of the input data, y is the true value, and N is the number of data.
Further, the unsupervised loss formula is L_unsup = E_{x,η,η′}[‖f(x, θ′, η′) − f(x, θ, η)‖²], where f(x, θ′, η′) is the prediction of the teacher model, f(x, θ, η) is the prediction of the student model, x represents the input data, θ′ and θ represent the teacher and student weights respectively, η′ and η represent noise, and E represents expectation.
Further, the total loss function of the student model and the teacher model is L_total = L_sup + ω(t)·L_unsup, wherein L_sup is the supervised loss, L_unsup is the unsupervised loss, and ω(t) is a Gaussian ramp-up function with the formula
ω(t) = 0.1·exp(−5·(1 − t/t_max)²)
where t is the current training step and t_max is the total number of training steps.
Further, the crack detection method based on semi-supervised semantic segmentation further comprises the following steps: the accuracy of the model was evaluated using F1-Score.
Further, the training step of the crack detection method based on semi-supervised semantic segmentation comprises the following steps:
acquiring annotated data X1 and unannotated data X2, wherein an annotation label is Y1;
inputting X1 and X2 into a student model to obtain an input prediction result P1 of X1 and an input prediction result P2 of X2;
calculating the supervised loss function Lsup(Y1, P1) for X1;
inputting X2 into the teacher model to obtain an input prediction result P3 of X2;
calculating the unsupervised loss function Lunsup(P2, P3) for the unannotated data X2;
calculating the gradient descent of the total loss function Lsup(Y1, P1) + Lunsup(P2, P3);
updating the weight of the student model according to the gradient descent result;
the weights of the teacher model are updated by an exponential moving average.
The invention also discloses a computer device, which comprises a processor and a memory;
the memory is used for storing programs;
the processor executes the program to realize a crack detection method based on semi-supervised semantic segmentation.
The invention has the following beneficial effects: the invention comprises a student model and a teacher model with identical network structures, so that multi-scale crack feature information can be extracted efficiently and the loss of image information is reduced. Meanwhile, the invention adopts semi-supervised learning, thereby reducing the annotation workload. Experiments show that the invention reduces the workload of data annotation while maintaining high detection precision.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on them without creative effort.
FIG. 1 is a network structure of a student model and a teacher model in a crack detection method and device based on semi-supervised semantic segmentation;
FIG. 2 is a network structure of a codec in a crack detection method and apparatus based on semi-supervised semantic segmentation;
FIG. 3 is the network structure of the mobile inverted bottleneck convolution (MBConv) module in the crack detection method and device based on semi-supervised semantic segmentation.
Reference numerals: in fig. 2, Conv, 3×3 represents a convolution module with a convolution kernel size of 3×3; MBConv1, 3×3 represents a mobile inverted bottleneck convolution module with an expansion ratio of 1 and a convolution kernel size of 3×3; MBConv6, 3×3 represents a mobile inverted bottleneck convolution module with an expansion ratio of 6 and a convolution kernel size of 3×3; MBConv6, 5×5 represents a mobile inverted bottleneck convolution module with an expansion ratio of 6 and a convolution kernel size of 5×5; Up-Conv2D represents an upsampling module; Concat represents the concatenation operation; Conv2D represents a two-dimensional convolution module; and the numbers in (H, W, C) represent the height, width and number of channels of the image at that step.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The embodiment provides a crack detection method and device based on semi-supervised semantic segmentation, and the overall architecture is shown in fig. 1 and comprises two parts, namely a student model and a teacher model. The student model and the teacher model have the same structure, and the specific structure thereof will be described later.
Inputting the image with the crack annotation into a student model, and inputting the image without the crack annotation into the student model and a teacher model for training;
the loss function of the student model comprises two parts of supervised loss and unsupervised loss, wherein the supervised loss comprises the combination of dice loss and cross entropy loss; unsupervised loss is similar to regularized loss, and a specific loss value is obtained after multiplication by a weight. The loss function of the teacher model is an unsupervised loss, and a specific formula of the loss function is provided later.
The weights of the student model are updated through gradient descent of the loss function, the weights of the teacher model are updated as an exponential moving average of the student-model weights, and finally F1-score is used to evaluate the accuracy of model prediction.
In this embodiment, noise is added to the input data; the noise has the same matrix form as the input data, with values between -0.2 and 0.2. Through such training, the model of this embodiment acquires a certain anti-noise capability, improving its robustness.
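The perturbation step described above can be sketched in plain Python. The 2-D list representation and the fixed seed are illustrative choices, and since the text does not specify the noise distribution, uniform noise over [-0.2, 0.2] is assumed here:

```python
import random

def add_input_noise(image, low=-0.2, high=0.2, seed=0):
    """Add element-wise noise in [low, high] to a 2-D image (list of rows),
    producing a perturbed image with the same matrix form as the input."""
    rng = random.Random(seed)
    return [[pixel + rng.uniform(low, high) for pixel in row] for row in image]

image = [[0.5, 0.7], [0.1, 0.9]]
noisy = add_input_noise(image)  # same shape; each entry shifted by at most 0.2
```

In training, the student and teacher would each receive an independently perturbed copy of the same image, which is what makes the consistency loss meaningful.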
The weights of the teacher model are obtained by Exponential Moving Average (EMA) of the student models. Thus, the teacher model collects information after each training step and gives the model a better intermediate representation. The parameter update formula is as follows:
θ′_t = α·θ′_{t−1} + (1 − α)·θ_t
where α is a smoothing-coefficient hyperparameter (α = 0.99 during the ramp-up stage and α = 0.999 in the remaining stages), θ_t is the weight of the student model at training step t, and θ′_t is the weight of the teacher model at training step t.
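A minimal sketch of this EMA update in plain Python, with weights flattened into a list; the ramp-up boundary of 1000 steps is an assumed hyperparameter, not specified in the text:

```python
def ema_update(teacher_w, student_w, step, ramp_up_steps=1000):
    """theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t,
    with alpha = 0.99 during ramp-up and 0.999 afterwards."""
    alpha = 0.99 if step < ramp_up_steps else 0.999
    return [alpha * tw + (1.0 - alpha) * sw
            for tw, sw in zip(teacher_w, student_w)]

teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, step=10)  # ramp-up stage, alpha = 0.99
```

Note that no gradient flows through this update: the teacher only averages the student's trajectory, which is why it collects information after every training step.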
The embodiment completes the training process by the following method:
acquiring annotated data X1 and unannotated data X2, wherein an annotation label is Y1;
inputting X1 and X2 into a student model to obtain an input prediction result P1 of X1 and an input prediction result P2 of X2;
calculating the supervised loss function Lsup(Y1, P1) for X1;
inputting X2 into the teacher model to obtain an input prediction result P3 of X2;
calculating the unsupervised loss function Lunsup(P2, P3) for the unannotated data X2;
calculating the gradient descent of the total loss function Lsup(Y1, P1) + Lunsup(P2, P3);
updating the weight of the student model according to the gradient descent result;
the weights of the teacher model are updated by an exponential moving average.
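The training procedure listed above can be condensed into a single step. In this sketch the networks are stand-in callables and a squared-error placeholder replaces the real dice-plus-cross-entropy supervised term, so only the data flow is faithful to the described method:

```python
def training_step(x1, y1, x2, student, teacher, omega):
    """One semi-supervised step: supervised loss on annotated X1 with labels Y1,
    consistency (MSE) loss between student and teacher on unannotated X2."""
    p1 = student(x1)   # prediction P1 for annotated data X1
    p2 = student(x2)   # student prediction P2 for unannotated data X2
    p3 = teacher(x2)   # teacher prediction P3 for unannotated data X2
    l_sup = sum((p - y) ** 2 for p, y in zip(p1, y1)) / len(y1)    # placeholder for dice + CE
    l_unsup = sum((a - b) ** 2 for a, b in zip(p2, p3)) / len(p2)  # MSE consistency
    return l_sup + omega * l_unsup  # total loss whose gradient updates the student

loss = training_step(
    x1=[0.2, 0.4], y1=[0.0, 1.0], x2=[0.6, 0.8],
    student=lambda xs: [0.5 * x for x in xs],
    teacher=lambda xs: [0.5 * x + 0.1 for x in xs],
    omega=0.1,
)
```

Only the student is updated from this loss; the teacher weights are then refreshed by the exponential moving average.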
This embodiment introduces the codec structure used by the student model and the teacher model. The function of the encoder is to progressively downsample the input image through a convolutional neural network to extract image features. Most CNNs used as encoders come from classification networks such as VGG and ResNet, so weights pre-trained on large datasets can be borrowed and better results obtained through transfer learning. In order to jointly adjust the height, width and number of channels of the image, this embodiment adopts EfficientNet as its encoder. The function of the decoder is to restore the spatial resolution of the extracted features; this embodiment uses UNet as the decoder to perform upsampling and skip-concatenation operations. The structure of the codec is shown in fig. 2.

In this embodiment, EfficientNet uses the Mobile Inverted Bottleneck Convolution (MBConv) module from MobileNetV2 as the basic building block of the model. The module also incorporates the squeeze-and-excitation method of SENet to optimize the network structure. The structure of MBConv is shown in fig. 3. MBConv first increases the number of feature-map channels through a 1×1 convolutional layer, then applies a 3×3 or 5×5 convolution to the expanded channels, followed by the squeeze-and-excitation operation, so that the network can better model channel dependencies and acquire global information. Finally, a second 1×1 convolutional layer projects the number of channels back to the initial number. The specific working process is as follows:
performing convolution processing on an input image to obtain a first-stage image;
performing mobile inverted bottleneck convolution processing twice on the first-stage image to obtain a second-stage image;
performing mobile inverted bottleneck convolution processing three times on the second-stage image to obtain a third-stage image;
performing mobile inverted bottleneck convolution processing three times on the third-stage image to obtain a fourth-stage image;
performing mobile inverted bottleneck convolution processing four times on the fourth-stage image to obtain a fifth-stage image;
performing mobile inverted bottleneck convolution processing four times on the fifth-stage image to obtain a sixth-stage image;
performing mobile inverted bottleneck convolution processing five times on the sixth-stage image to obtain a seventh-stage image;
performing mobile inverted bottleneck convolution processing twice on the seventh-stage image to obtain an eighth-stage image;
and outputting the second-stage image, the third-stage image, the fifth-stage image, the sixth-stage image and the eighth-stage image to a decoder.
The decoder part of this embodiment fuses the feature map obtained by each upsampling operation with the encoder feature map of the same scale, so that the final feature map is restored to the size of the input image. At the first level of the decoder, the skip connection is extended with two 3×3 convolutional layers, each followed by batch normalization and ReLU activation. In the remaining levels, the decoder blocks are residual blocks. In the feature extraction section, five scales of feature maps from EfficientNet are selected in total. In the upsampling part, the last feature map of the encoder is 2× upsampled and then fused with the feature map of the same scale from the feature extraction part. This process is repeated for the feature maps of all five scales in turn, and the size of the final feature map is restored to that of the input image. Compared with the original UNet model, this model has a deeper compression path and contains richer feature information.
This embodiment describes a crack detection method and device based on semi-supervised semantic segmentation. The semi-supervised learning method mainly adds a term related to the unannotated data to the loss function, using the unannotated data to enhance the model's generalization to unknown data. The model employs a consistency-regularization method, whose main idea is that for a given input, even a slightly perturbed version should yield a prediction consistent with that of the original data.
For annotated data, a supervised loss is used, which is a combination of dice loss and cross entropy loss.
The formula of the supervised loss is
L_sup = L_dice + L_cross-entropy
wherein L_dice is the dice loss, L_cross-entropy is the cross entropy loss, p_i is the prediction of the input data, y_i is the true value, and N is the number of data.
The dice loss formula is
L_dice = 1 − (2·Σ_i p_i·y_i) / (Σ_i p_i + Σ_i y_i)
The cross entropy loss formula is
L_cross-entropy = −(1/N)·Σ_i [ y_i·log(p_i) + (1 − y_i)·log(1 − p_i) ]
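The dice and cross entropy losses can be sketched in plain Python over flattened per-pixel probabilities. Combining them by simple addition is an assumption; the text only states that the supervised loss combines the two:

```python
import math

def dice_loss(p, y):
    """Dice loss: 1 - 2 * sum(p_i * y_i) / (sum(p_i) + sum(y_i))."""
    intersection = sum(pi * yi for pi, yi in zip(p, y))
    return 1.0 - 2.0 * intersection / (sum(p) + sum(y))

def cross_entropy_loss(p, y):
    """Binary cross entropy: -(1/N) * sum(y_i*log p_i + (1-y_i)*log(1-p_i))."""
    n = len(p)
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for pi, yi in zip(p, y)) / n

def supervised_loss(p, y):
    """Supervised loss taken as the sum of the two terms (assumed combination)."""
    return dice_loss(p, y) + cross_entropy_loss(p, y)

p = [0.9, 0.1, 0.8]  # predicted crack probability per pixel
y = [1.0, 0.0, 1.0]  # annotated ground truth
```

The dice term handles the strong class imbalance of thin cracks against large backgrounds, while the cross entropy term provides stable per-pixel gradients.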
For unannotated data, the MSE (Mean Squared Error) loss function is used to define the expected distance between the teacher-model and student-model predictions. The unsupervised loss formula is L_unsup = E_{x,η,η′}[‖f(x, θ′, η′) − f(x, θ, η)‖²], where f(x, θ′, η′) is the prediction of the teacher model, f(x, θ, η) is the prediction of the student model, x represents the input data, θ′ and θ represent the teacher and student weights respectively, η′ and η represent noise, and E represents expectation.
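A minimal sketch of this consistency term, treating both predictions as flattened lists; approximating the expectation by a per-batch mean is an assumption:

```python
def consistency_loss(student_pred, teacher_pred):
    """Mean squared error between teacher and student predictions,
    approximating E[||f(x, theta', eta') - f(x, theta, eta)||^2]."""
    n = len(student_pred)
    return sum((t - s) ** 2 for s, t in zip(student_pred, teacher_pred)) / n

same = consistency_loss([0.2, 0.4], [0.2, 0.4])  # identical predictions -> 0
far = consistency_loss([0.0, 0.0], [1.0, 1.0])   # maximally different -> 1
```

Because no labels appear in this term, it can be evaluated on the unannotated second crack images alone.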
Since the student model is trained with only limited annotated data during the initial training phase, its performance is poor and its predictions are unreliable. Therefore, the unsupervised cost is multiplied by a weight ω(t). ω(t) is a widely used time-dependent Gaussian ramp-up function that controls the balance between the supervised loss and the unsupervised consistency loss; it is defined as follows:
ω(t) = 0.1·exp(−5·(1 − t/t_max)²)
where 0.1 is the regularization weight, t is the current training step, and t_max is the total number of training steps.
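This Gaussian ramp-up can be written directly from its definition (the step counts in the example are illustrative):

```python
import math

def ramp_up_weight(t, t_max, w_max=0.1):
    """Gaussian ramp-up: omega(t) = w_max * exp(-5 * (1 - t / t_max)**2)."""
    return w_max * math.exp(-5.0 * (1.0 - t / t_max) ** 2)

start = ramp_up_weight(0, 100)   # ~0.00067: consistency term barely counts early on
end = ramp_up_weight(100, 100)   # 0.1: full regularization weight at the end
```

Early in training the unreliable teacher therefore contributes almost nothing, and its influence grows smoothly toward the regularization weight.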
The proposed semi-supervised semantic segmentation model for crack images learns from annotated and unannotated data through the following total loss function: L_total = L_sup + ω(t)·L_unsup, wherein L_sup is the supervised loss, L_unsup is the unsupervised loss, and ω(t) is the Gaussian ramp-up function.
The experimental data for this embodiment are as follows: when only 60% of the annotated data was used, this embodiment achieved an F1-score of 0.6540 on the Concrete Crack dataset and an F1-score of 0.8321 on the Crack500 dataset. The results show that this embodiment greatly reduces the labeling workload while maintaining high precision.
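The F1-score used for evaluation can be sketched as follows for per-pixel crack masks; the 0.5 binarization threshold is an assumption, as the embodiment does not state one:

```python
def f1_score(pred, truth, threshold=0.5):
    """Pixel-wise F1 = 2 * precision * recall / (precision + recall)."""
    binary = [1 if p >= threshold else 0 for p in pred]
    tp = sum(1 for b, t in zip(binary, truth) if b == 1 and t == 1)
    fp = sum(1 for b, t in zip(binary, truth) if b == 1 and t == 0)
    fn = sum(1 for b, t in zip(binary, truth) if b == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = f1_score([0.9, 0.2, 0.7, 0.4], [1, 0, 0, 1])  # tp=1, fp=1, fn=1
```

F1 is preferred over plain accuracy here because crack pixels are a tiny minority class, so accuracy would be dominated by the background.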
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A crack detection method based on semi-supervised semantic segmentation, characterized by comprising the following steps:
acquiring a crack image, wherein the crack image comprises a first crack image with an annotation and a second crack image without the annotation;
inputting the first crack image into a student model, and inputting the second crack image into both the student model and a teacher model to perform crack feature extraction and feature segmentation; the student model completes semi-supervised training through a supervised loss and an unsupervised loss, wherein the supervised loss is obtained by calculating the dice loss and cross entropy loss of the annotated first crack image, and the unsupervised loss is obtained by performing consistency regularization on the second crack image;
updating the weights of the student model through gradient descent of the supervised and unsupervised losses;
updating the weights of the teacher model through an exponential moving average of the student-model weights;
and adjusting the outputs of the student model and the teacher model according to their respective weights so that the two outputs are consistent, thereby obtaining a crack feature prediction result.
2. The crack detection method based on semi-supervised semantic segmentation as recited in claim 1, wherein the structures of the student model and the teacher model are codec (encoder-decoder) structures, each comprising an encoder and a decoder; the encoder comprises a CNN-based image feature extraction network; the decoder comprises a U-Net-based image feature segmentation network; and the first crack image or the second crack image is input into the encoder as an input image, with feature extraction and segmentation performed by the codec to obtain an output image.
3. The crack detection method based on semi-supervised semantic segmentation as recited in claim 2, wherein the CNN-based image feature extraction network comprises an EfficientNet network;
the EfficientNet network comprises 1 convolution layer and 23 mobile inverted bottleneck convolution (MBConv) modules;
each mobile inverted bottleneck convolution module comprises 4 convolution layers of 1×1, 1 depthwise separable convolution layer and 1 global average pooling layer;
the working stage of the EfficientNet network comprises the following steps:
performing convolution processing on an input image to obtain a first-stage image;
performing mobile inverted bottleneck convolution processing twice on the first-stage image to obtain a second-stage image;
performing mobile inverted bottleneck convolution processing three times on the second-stage image to obtain a third-stage image;
performing mobile inverted bottleneck convolution processing three times on the third-stage image to obtain a fourth-stage image;
performing mobile inverted bottleneck convolution processing four times on the fourth-stage image to obtain a fifth-stage image;
performing mobile inverted bottleneck convolution processing four times on the fifth-stage image to obtain a sixth-stage image;
performing mobile inverted bottleneck convolution processing five times on the sixth-stage image to obtain a seventh-stage image;
performing mobile inverted bottleneck convolution processing twice on the seventh-stage image to obtain an eighth-stage image;
and outputting the second-stage image, the third-stage image, the fifth-stage image, the sixth-stage image and the eighth-stage image to a decoder.
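The stage schedule above can be summarized as a repetition table; note that the per-stage MBConv repetitions (2+3+3+4+4+5+2) sum exactly to the 23 modules stated in the claim. The stage labels below are illustrative names, not from the patent:

```python
# MBConv repetitions per stage as listed in the claim; stage 1 is the
# initial plain convolution and contains no MBConv module.
MBCONV_REPEATS = {"stage2": 2, "stage3": 3, "stage4": 3, "stage5": 4,
                  "stage6": 4, "stage7": 5, "stage8": 2}
total_mbconv = sum(MBCONV_REPEATS.values())  # 23, matching the claim

# Stages whose outputs are forwarded to the decoder, per the claim:
SKIP_STAGES = ["stage2", "stage3", "stage5", "stage6", "stage8"]
```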
4. The crack detection method based on semi-supervised semantic segmentation according to claim 2, wherein the U-Net-based image feature segmentation network comprises 5 upsampling layers and 1 convolutional layer; the feature maps output by the encoder are upsampled layer by layer and then passed to the convolutional layer to produce a feature image, which is output as the output image; and the feature image has the same size as the input image.
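Assuming each of the five upsampling layers doubles spatial resolution (the claim states only the layer counts, so the factor of 2 is an assumption), the decoder restores an encoder feature map downsampled by 2⁵ back to the input size:

```python
def decoder_output_size(encoded_size, n_upsample=5):
    # each upsampling layer doubles the spatial resolution (assumed factor)
    size = encoded_size
    for _ in range(n_upsample):
        size *= 2
    return size

# a 7x7 encoder feature map is restored to 224x224, matching a 224x224 input
```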
5. The crack detection method based on semi-supervised semantic segmentation according to claim 1, wherein the supervised loss is formulated as

L_sup = L_dice + L_cross-entropy,

wherein L_dice is the Dice loss,

L_dice = 1 − (2 Σᵢ pᵢyᵢ + ε) / (Σᵢ pᵢ + Σᵢ yᵢ + ε),

L_cross-entropy is the cross-entropy loss,

L_cross-entropy = −(1/N) Σᵢ [yᵢ log pᵢ + (1 − yᵢ) log(1 − pᵢ)],

ε is a smoothing parameter in the loss function, pᵢ is the prediction for input i, yᵢ is the true value, and N is the number of data samples.
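A minimal numpy sketch of the supervised loss, assuming L_sup is the sum of the Dice loss and the cross-entropy loss and assuming illustrative values for the smoothing parameter:

```python
import numpy as np

def dice_loss(p, y, eps=1.0):
    # Dice loss; eps is an assumed smoothing parameter
    inter = np.sum(p * y)
    return 1.0 - (2.0 * inter + eps) / (np.sum(p) + np.sum(y) + eps)

def cross_entropy_loss(p, y, clip=1e-7):
    p = np.clip(p, clip, 1.0 - clip)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def supervised_loss(p, y):
    return dice_loss(p, y) + cross_entropy_loss(p, y)

p = np.array([1.0, 1.0, 0.0, 0.0])  # pixel-wise crack probabilities
y = np.array([1.0, 1.0, 0.0, 0.0])  # ground-truth mask
# a perfect prediction gives a supervised loss near zero
```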
6. The crack detection method based on semi-supervised semantic segmentation according to claim 1, wherein the unsupervised loss is formulated as L_unsup = E_{x,η′,η}[||f(x, θ′, η′) − f(x, θ, η)||²], where f(x, θ′, η′) is the prediction of the teacher model, f(x, θ, η) is the prediction of the student model, x denotes the input data, θ and θ′ denote the weights of the student model and the teacher model respectively, η and η′ denote noise, and E denotes the expectation.
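For a batch of predictions, the consistency term of claim 6 reduces to a mean squared error between teacher and student outputs; a minimal sketch with illustrative values:

```python
import numpy as np

def consistency_loss(teacher_pred, student_pred):
    # empirical estimate of E[||f(x, theta', eta') - f(x, theta, eta)||^2]
    return np.mean((teacher_pred - student_pred) ** 2)

t_pred = np.array([0.8, 0.2])  # teacher prediction (illustrative values)
s_pred = np.array([0.6, 0.4])  # student prediction
# loss = mean of (0.2^2, 0.2^2) = 0.04
```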
7. The crack detection method based on semi-supervised semantic segmentation according to claim 1, wherein the overall loss function of the student model and the teacher model is L_total = L_sup + ω(t)·L_unsup, where L_sup is the supervised loss, L_unsup is the unsupervised loss, and ω(t) is a Gaussian warm-up function of the form

ω(t) = exp(−5(1 − t/t_max)²),

where t is the current training step and t_max is the total number of training steps.
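A sketch of the Gaussian warm-up ω(t), assuming the standard mean-teacher ramp-up form exp(−5(1 − t/t_max)²) since the original formula appears only as an image in the patent:

```python
import math

def gaussian_warmup(t, t_max):
    # omega(t) = exp(-5 * (1 - t/t_max)^2); t is clamped so the weight
    # stays at 1.0 once t reaches t_max
    return math.exp(-5.0 * (1.0 - min(t, t_max) / t_max) ** 2)

# the unsupervised weight grows from exp(-5) ~ 0.007 at t=0 to 1.0 at t=t_max
```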
8. The crack detection method based on semi-supervised semantic segmentation according to claim 1, further comprising: evaluating the accuracy of the model using the F1-score.
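The F1-score of claim 8 is the harmonic mean of precision and recall over pixel-level predictions; a minimal sketch with illustrative counts:

```python
def f1_score(tp, fp, fn):
    # tp/fp/fn: pixel-level true positives, false positives, false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

# e.g. tp=8, fp=2, fn=2 gives precision = recall = 0.8, so F1 = 0.8
```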
9. The semi-supervised semantic segmentation based crack detection method according to claim 1, wherein the training of the semi-supervised based crack detection semantic segmentation network comprises:
acquiring annotated data X1 and unannotated data X2, wherein an annotation label is Y1;
inputting X1 and X2 into a student model to obtain an input prediction result P1 of X1 and an input prediction result P2 of X2;
calculating the supervised loss function Lsup(Y1, P1) for X1;
inputting X2 into the teacher model to obtain an input prediction result P3 of X2;
calculating the unsupervised loss function Lunsup(P2, P3) for the unannotated data X2;
calculating the gradient of the total loss function Lsup(Y1, P1) + Lunsup(P2, P3) and performing gradient descent;
updating the weights of the student model according to the gradient descent result;
and updating the weights of the teacher model by an exponential moving average.
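The training procedure of claim 9 can be illustrated end-to-end with scalar stand-ins for the two networks; the learning rate, EMA decay, consistency weight, and data values below are all illustrative assumptions, and real models would be the encoder-decoder networks of claim 2:

```python
# Toy one-parameter "models": prediction = theta * x.
theta_s, theta_t = 0.0, 0.0       # student / teacher weights
lr, alpha, omega = 0.2, 0.9, 0.1  # learning rate, EMA decay, ramp weight (assumed)

x1, y1 = 1.0, 2.0  # annotated sample X1 with label Y1
x2 = 3.0           # unannotated sample X2

for step in range(200):
    p1 = theta_s * x1  # student prediction on X1
    p2 = theta_s * x2  # student prediction on X2
    p3 = theta_t * x2  # teacher prediction on X2
    # gradient of L_total = (p1 - y1)^2 + omega * (p2 - p3)^2 w.r.t. theta_s
    grad = 2.0 * (p1 - y1) * x1 + omega * 2.0 * (p2 - p3) * x2
    theta_s -= lr * grad                                # gradient descent
    theta_t = alpha * theta_t + (1 - alpha) * theta_s   # EMA teacher update

# both weights converge toward y1 / x1 = 2.0
```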
10. A computer device comprising a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method according to any one of claims 1 to 9.
CN202111291405.1A 2021-11-02 2021-11-02 Crack detection method and device based on semi-supervised semantic segmentation Pending CN114140390A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111291405.1A CN114140390A (en) 2021-11-02 2021-11-02 Crack detection method and device based on semi-supervised semantic segmentation


Publications (1)

Publication Number Publication Date
CN114140390A true CN114140390A (en) 2022-03-04

Family

ID=80392175


Country Status (1)

Country Link
CN (1) CN114140390A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114708436A (en) * 2022-06-02 2022-07-05 深圳比特微电子科技有限公司 Training method of semantic segmentation model, semantic segmentation method, semantic segmentation device and semantic segmentation medium
CN114742917A (en) * 2022-04-25 2022-07-12 桂林电子科技大学 CT image segmentation method based on convolutional neural network
CN114897909A (en) * 2022-07-15 2022-08-12 四川大学 Crankshaft surface crack monitoring method and system based on unsupervised learning
CN116091773A (en) * 2023-02-02 2023-05-09 北京百度网讯科技有限公司 Training method of image segmentation model, image segmentation method and device
CN117830638A (en) * 2024-03-04 2024-04-05 厦门大学 Omnidirectional supervision semantic segmentation method based on prompt text

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210319547A1 (en) * 2020-04-08 2021-10-14 Zhejiang University Method and apparatus for identifying concrete crack based on video semantic segmentation technology


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WENJUN WANG 等: "Semi-supervised semantic segmentation network for surface crack detection", AUTOMATION IN CONSTRUCTION, no. 2021, pages 1 - 10 *


Similar Documents

Publication Publication Date Title
CN114140390A (en) Crack detection method and device based on semi-supervised semantic segmentation
CN107679618B (en) Static strategy fixed-point training method and device
CN111832501B (en) Remote sensing image text intelligent description method for satellite on-orbit application
CN111465946A (en) Neural network architecture search using hierarchical representations
CN108765506B (en) Layer-by-layer network binarization-based compression method
CN112905801B (en) Stroke prediction method, system, equipment and storage medium based on event map
CA3108752A1 (en) Neural network circuit device, neural network processing method, and neural network execution program
CN117475038B (en) Image generation method, device, equipment and computer readable storage medium
CN109523016B (en) Multi-valued quantization depth neural network compression method and system for embedded system
CN111382759A (en) Pixel level classification method, device, equipment and storage medium
Wang et al. Real-time topology optimization based on deep learning for moving morphable components
WO2023087597A1 (en) Image processing method and system, device, and medium
CN114676837A (en) Evolutionary quantum neural network architecture searching method based on quantum simulator
Rashid et al. Revealing the predictive power of neural operators for strain evolution in digital composites
KR20200023695A (en) Learning system to reduce computation volume
CN111753995A (en) Local interpretable method based on gradient lifting tree
Vemparala et al. L2pf-learning to prune faster
CN115376195A (en) Method for training multi-scale network model and method for detecting key points of human face
CN115544307A (en) Directed graph data feature extraction and expression method and system based on incidence matrix
US20210256389A1 (en) Method and system for training a neural network
CN114595641A (en) Method and system for solving combined optimization problem
WO2022198233A1 (en) Efficient compression of activation functions
CN114202669A (en) Neural network searching method for medical image segmentation
CN112990289A (en) Data processing method, device, equipment and medium based on multi-task prediction model
Zhang et al. Vision transformer with convolutions architecture search

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination