CN116309536A - Pavement crack detection method and storage medium - Google Patents

Pavement crack detection method and storage medium

Info

Publication number
CN116309536A
CN116309536A
Authority
CN
China
Prior art keywords
detected
image
prediction
layer
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310441866.5A
Other languages
Chinese (zh)
Inventor
曹霆
胡劲元
李军怀
王怀军
王宇航
田程
张欣荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN202310441866.5A priority Critical patent/CN116309536A/en
Publication of CN116309536A publication Critical patent/CN116309536A/en
Pending legal-status Critical Current

Classifications

    • G06T 7/0004 - Image analysis; inspection of images, e.g. flaw detection; industrial image inspection
    • G06N 3/08 - Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/25 - Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/28 - Image preprocessing; quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V 10/32 - Image preprocessing; normalisation of the pattern dimensions
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/764 - Recognition using pattern recognition or machine learning; classification, e.g. of video objects
    • G06V 10/806 - Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 - Recognition using pattern recognition or machine learning; using neural networks
    • G06T 2207/20081 - Indexing scheme for image analysis or image enhancement; training; learning
    • G06T 2207/20084 - Indexing scheme for image analysis or image enhancement; artificial neural networks [ANN]
    • G06T 2207/20104 - Indexing scheme for image analysis or image enhancement; interactive definition of region of interest [ROI]
    • G06T 2207/30108 - Indexing scheme for image analysis or image enhancement; industrial image inspection
    • Y02T 10/40 - Climate change mitigation technologies related to transportation; internal combustion engine [ICE] based vehicles; engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a pavement crack detection method and a storage medium, relating to the technical field of target detection and comprising the following steps: collecting images to be detected; inputting the images to be detected into an MSF-Transformer model and outputting prediction frames and category labels; and detecting cracks in the images to be detected according to the prediction frames and category labels. According to the invention, the acquired two-dimensional images are input into the MSF algorithm model to obtain feature maps fused across different scales, which alleviates the difficulty that local texture characteristics and global texture relations are not captured accurately enough when detecting slender and tiny targets. At the same time, a Transformer multi-layer encoder-decoder structure combined with position encoding removes prior-knowledge constraints such as Anchors and the non-maximum-suppression post-processing step, so end-to-end target detection is realized and the target detection algorithm is greatly simplified.

Description

Pavement crack detection method and storage medium
Technical Field
The invention relates to the technical field of target detection, in particular to a pavement crack detection method and a storage medium.
Background
The detection of road surface cracks has been one of the important research contents of road traffic safety. In recent years, research on deep learning and target detection has prompted the intelligent development of crack detection methods. The crack detection is used as the basis of road health assessment and road surface maintenance measures, becomes the research focus in the fields of roads, bridges, tunnels and the like, and has important research value and wide application prospect.
Crack detection marks the cracks on the pavement with appropriate detection frames and displays the confidence and corresponding category of each crack. Crack types fall into three categories: transverse cracks, longitudinal cracks and repaired cracks. A basic crack annotation contains six features: the X and Y coordinates, width and height of the detection frame, the confidence, and the detected class. With the advent of large-scale datasets, the reduction of computer hardware cost and the improvement of GPU parallel computing capability, deep learning has gradually taken a dominant position in the fields of target detection and crack detection. Target detection based on the DETR network has been studied and applied by many researchers; the model needs no prior-knowledge constraints such as Anchors and abandons the non-maximum-suppression post-processing step, so the whole network model realizes end-to-end target detection and greatly simplifies the target detection algorithm.
The classical DETR model mainly comprises four parts: a CNN Backbone, a Transformer Encoder, a Transformer Decoder and a final prediction layer FFN. In general, the network adopted in the Backbone part of DETR is well suited to feature extraction for large-size targets. In actual engineering, however, road cracks are affected by different acquisition devices, acquisition distances, crack sizes, and noise such as illumination and shadow; if the features of road cracks are extracted with the plain DETR model, local texture features of the cracks are easily lost, which affects both the training of the subsequent network model and the crack detection results.
Disclosure of Invention
The invention provides a pavement crack detection method and a storage medium, which use a deep-learning-based MSF algorithm and Transformer model to train for target detection and to detect images to be detected, thereby greatly simplifying the target detection algorithm.
The invention provides a pavement crack detection method, which comprises the following steps:
collecting an image to be detected;
inputting the image to be detected into an MSF-Transformer model, and outputting an optimal prediction frame, a category label and a confidence coefficient;
detecting cracks in the image to be detected according to the optimal prediction frame, the category label and the confidence coefficient;
inputting the image to be detected into an MSF-Transformer model, and outputting a prediction frame, a category label and a confidence coefficient, which comprises the following steps:
performing multi-scale feature extraction on the image to be detected based on the MSF model to obtain a fusion feature map;
constructing position codes with the same dimension according to the fusion feature map;
encoding and decoding the fusion feature map and the position code based on a Transformer model to obtain a decoding result;
and predicting the decoding result based on the prediction layer FFN to obtain a prediction frame, a category label and a confidence coefficient.
Preferably, the image to be detected needs to be preprocessed before multi-scale feature extraction is performed on the image to be detected through the MSF model.
Preferably, the preprocessing comprises the following steps:
the size of the image to be detected is processed to be 200×200DPI;
graying the size-processed image to be detected, adding Gaussian noise, and then carrying out median filtering;
and labeling the image to be detected after median filtering by using a picture labeling tool according to three categories of transverse cracks, longitudinal cracks and repaired cracks.
Preferably, the multi-scale feature extraction is performed on the image to be detected based on the MSF model to obtain a fusion feature map, which comprises the following steps:
inputting the preprocessed image to be detected into a CBL module consisting of convolution, normalization and activation functions;
sequentially inputting the images to be detected passing through the CBL module into a plurality of convolution layers and residual structures to obtain a plurality of feature maps of different sizes;
upsampling the plurality of different sized feature maps into a plurality of same sized feature maps;
and stacking, fusing and corresponding convolution are carried out on the feature images with the same size, so that a fused feature image is obtained.
Preferably, the position coding is constructed by:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))

PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))

wherein PE represents the position code, pos represents the position of the current pixel in the input feature map, d_model represents the dimension of the pixel, and i indexes the encoding dimensions of the fusion feature map; even dimensions use sin and odd dimensions use cos.
Preferably, the Transformer model comprises a multi-layer encoder and a multi-layer decoder, each encoder layer comprising a multi-head attention layer and a feed-forward connection layer, and each decoder layer comprising a masked multi-head attention layer, a multi-head attention layer and a feed-forward connection layer.
Preferably, the decoding result is predicted based on a prediction layer FFN to obtain an optimal prediction frame, which includes the following steps:
inputting the multi-layer decoding result into a prediction layer FFN after parameter sharing to obtain a plurality of prediction frames,
constructing a prediction frame set according to the multiple prediction frames, and constructing a true value set;
and carrying out bipartite graph matching through a Hungary algorithm, matching the prediction frame set with the truth value set, and carrying out optimal selection on a plurality of prediction frames to obtain an optimal prediction frame.
Preferably, the Hungarian algorithm is as follows:

σ̂ = argmin_{σ∈S_N} Σ_i L_match(y_i, ŷ_σ(i))

L_match(y_i, ŷ_σ(i)) = -1[c_i≠∅]·p̂_σ(i)(c_i) + 1[c_i≠∅]·L_box(b_i, b̂_σ(i))

where σ̂ is the optimal allocation result, σ(i) represents the assigned index, L_match(y_i, ŷ_σ(i)) is the pairwise matching cost between the road surface crack truth element y_i and the prediction with index σ(i), c_i is the target class label and b_i is the box vector, p̂_σ(i)(c_i) is the predicted probability of class c_i, ŷ_σ(i) is the prediction-set element (prediction frame), y_i represents the i-th truth element, L_box represents the bounding-box loss, and N represents the number of elements in the prediction set.
Preferably, the Transformer model is trained with a loss function, which is as follows:

L_Hungarian(y, ŷ) = Σ_i [ -log p̂_σ̂(i)(c_i) + 1[c_i≠∅]·L_box(b_i, b̂_σ̂(i)) ]

L_box(b_i, b̂_σ(i)) = λ_iou·L_iou(b_i, b̂_σ(i)) + λ_L1·||b_i - b̂_σ(i)||_1

where y represents the truth value set of the detection objects, ŷ represents the prediction frame set, log p̂_σ̂(i)(c_i) is the log-probability of class c_i, L_Hungarian is the Hungarian loss, L_iou is the generalized IoU loss, and λ_iou and λ_L1 are hyper-parameters.
A computer-readable storage medium storing computer instructions for causing the computer to execute the road surface crack detection method.
Compared with the prior art, the invention has the beneficial effects that:
according to the invention, the acquired two-dimensional images are input into an MSF algorithm model to obtain feature graphs fused by different sizes, so that the difficulty that local texture characteristics and global texture relations are not accurate enough when the target detection field aims at slender and tiny feature target detection is properly relieved, meanwhile, a transducer multi-layer encoder-decoder structure is adopted, and the post-processing steps of prior knowledge constraint such as Anchor and non-maximum suppression are omitted by combining with position coding, so that end-to-end target detection is realized, and a target detection algorithm is greatly simplified.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block flow diagram of a pavement crack detection method of the present invention;
fig. 2 is a model training effect diagram of a pavement crack detection method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the invention provides a pavement crack detection method, which comprises the following steps:
the first step: and acquiring road surface images by using an acquisition vehicle to obtain a plurality of images to be detected.
Preprocessing a plurality of images to be detected, including the following steps:
(1) The size of the plurality of images to be detected is processed to 200×200DPI.
(2) Graying the plurality of size-processed images to be detected, adding Gaussian noise, and then carrying out median filtering;
(3) Labeling the multiple images to be detected after median filtering according to the three categories of transverse cracks, longitudinal cracks and repaired cracks with the LabelImg picture labeling tool, and randomly splitting them into a training set and a validation set at a ratio of 0.85:0.15. The MSF-Transformer model is trained on the training set and validated on the validation set. A code sketch of these preprocessing steps is given after the list.
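The following is a minimal sketch of the graying, noise-addition and median-filtering steps above, assuming OpenCV and NumPy; the noise level and filter kernel size are illustrative assumptions, as the patent does not specify them.

```python
import cv2
import numpy as np

def preprocess(image_path, noise_sigma=10.0):
    """Grayscale -> additive Gaussian noise -> median filtering."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Add Gaussian noise to the grayscale image (robustness step described above)
    noise = np.random.normal(0.0, noise_sigma, gray.shape).astype(np.float32)
    noisy = np.clip(gray.astype(np.float32) + noise, 0, 255).astype(np.uint8)

    # Median filtering to suppress the injected noise before labeling
    return cv2.medianBlur(noisy, 3)
```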
The second step: input the images to be detected into the trained MSF-Transformer model and output the optimal prediction frames, category labels and confidence coefficients. This comprises the following steps:
Multi-scale feature extraction is performed on the images to be detected based on the MSF (Multi-Scale Fusion) model to obtain a fusion feature map (a code sketch of these steps is given after the list).
(1) The preprocessed image to be detected is input to a CBL module consisting of convolution, normalization and activation functions (Conv, BatchNormalization and LeakyReLU) to prepare for feature extraction.
(2) The image to be detected passing through the CBL module is input into a plurality of convolution layers and residual structures to obtain feature maps at 1/8, 1/16 and 1/32 of the original image size, completing the feature extraction at these three scales.
(3) The feature maps of different sizes are unified into feature maps of the same size by an upsampling method similar to a Feature Pyramid Network (FPN).
(4) The feature maps of the same size are stacked, fused and convolved accordingly to obtain the fusion feature map.
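As a rough illustration of steps (1) to (4), the sketch below builds a CBL block and fuses the 1/8, 1/16 and 1/32 feature maps by upsampling and concatenation, assuming PyTorch; the channel sizes and the choice of nearest-neighbour upsampling are assumptions, since the patent does not give them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CBL(nn.Module):
    """Conv + BatchNorm + LeakyReLU block, as described for the MSF model."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class MSFFusion(nn.Module):
    """Takes 1/8, 1/16 and 1/32 feature maps, upsamples them to a common
    resolution, then concatenates and fuses them with a convolution.
    Channel sizes are illustrative assumptions, not taken from the patent."""
    def __init__(self, channels=(128, 256, 512), c_out=256):
        super().__init__()
        self.fuse = CBL(sum(channels), c_out, k=1)

    def forward(self, f8, f16, f32):
        size = f8.shape[-2:]                      # fuse at the 1/8 resolution
        f16 = F.interpolate(f16, size=size, mode="nearest")
        f32 = F.interpolate(f32, size=size, mode="nearest")
        return self.fuse(torch.cat([f8, f16, f32], dim=1))
```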
Position encodings of the same dimension are constructed from the fusion feature map.
The position encoding is constructed by:

PE(pos, 2i) = sin(pos / 10000^(2i/d_model))

PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))

wherein PE represents the position code, pos represents the position of the current pixel in the input feature map, d_model represents the dimension of the pixel, and i indexes the encoding dimensions of the fusion feature map; even dimensions use sin and odd dimensions use cos.
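A possible implementation of the sinusoidal position encoding defined by the formulas above, assuming the fusion feature map is flattened into a sequence of pixel positions (an interpretation, since the patent does not spell out the flattening) and that d_model is even:

```python
import torch

def sinusoidal_position_encoding(num_positions, d_model):
    """Sin on even encoding dimensions, cos on odd ones, as in the formulas above."""
    pos = torch.arange(num_positions, dtype=torch.float32).unsqueeze(1)   # (P, 1)
    dim = torch.arange(0, d_model, 2, dtype=torch.float32)                # 2i = 0, 2, 4, ...
    div = 10000.0 ** (dim / d_model)                                      # 10000^(2i/d_model)
    pe = torch.zeros(num_positions, d_model)
    pe[:, 0::2] = torch.sin(pos / div)   # even dimensions
    pe[:, 1::2] = torch.cos(pos / div)   # odd dimensions
    return pe                            # (num_positions, d_model), added to the flattened feature map
```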
The fusion feature map and the position code are encoded and decoded based on the Transformer model to obtain a decoding result.
(1) The fusion feature maps and the position codes are input into the multi-layer encoder to obtain an encoding result. Each encoder layer consists of a multi-head attention layer and a feed-forward connection layer.
(2) The encoding result is input into the multi-layer decoder to obtain a decoding result. Each decoder layer consists of a masked multi-head attention layer, a multi-head attention layer and a feed-forward connection layer. Except for the first decoder layer, each decoder layer takes the output of the previous decoder layer and the output of the multi-layer encoder as inputs.
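A minimal encoder-decoder sketch in the DETR style described above, using the standard PyTorch Transformer layers; the layer count, head count and number of object queries are assumed values, and the position encoding is added only once at the encoder input rather than at every attention layer, which is a simplification.

```python
import torch
import torch.nn as nn

class CrackTransformer(nn.Module):
    """Encoder-decoder with learned object queries (DETR-style sketch)."""
    def __init__(self, d_model=256, nhead=8, num_layers=6, num_queries=100):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.query_embed = nn.Embedding(num_queries, d_model)

    def forward(self, fused_features, pos_encoding):
        # fused_features: (B, HW, d_model) flattened fusion feature map
        memory = self.encoder(fused_features + pos_encoding)
        queries = self.query_embed.weight.unsqueeze(0).expand(fused_features.size(0), -1, -1)
        return self.decoder(queries, memory)      # (B, num_queries, d_model)
```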
The decoding result is predicted based on the prediction layer FFN to obtain the optimal prediction frame, category label and confidence coefficient.
(1) The output of each decoder layer is fed, with shared parameters, into the prediction layer FFN for prediction, and the loss function is calculated at each layer to realize deep supervision. The FFN of the present invention is a 3-layer perceptron with ReLU activation and a hidden layer, so that the normalized center coordinates, height and width of the prediction frame are regressed and the predicted class label is obtained through a softmax activation.
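A sketch of the prediction layer FFN described in step (1): a 3-layer perceptron with ReLU for the normalized box and a linear layer with softmax for the class label. The hidden width and the convention of one extra background class are assumptions following the DETR design.

```python
import torch
import torch.nn as nn

class PredictionFFN(nn.Module):
    """Box regression branch (3-layer MLP) plus classification branch."""
    def __init__(self, d_model=256, hidden=256, num_classes=4):  # 3 crack classes + background
        super().__init__()
        self.box_mlp = nn.Sequential(
            nn.Linear(d_model, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),
        )
        self.class_head = nn.Linear(d_model, num_classes)

    def forward(self, decoder_out):
        boxes = self.box_mlp(decoder_out).sigmoid()             # normalized cx, cy, w, h
        class_prob = self.class_head(decoder_out).softmax(-1)   # per-query class scores
        return boxes, class_prob
```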
(2) The above operation yields a fixed-size set of prediction frames, which clearly exceeds the number of prediction frames actually required, so an optimal selection among this considerable number of prediction frames is needed. The invention constructs a prediction frame set from the multiple prediction frames and constructs a truth value set. The number of Ground Truth (truth value) elements is padded to equal the number of prediction frames, with an additional special class label representing the background class, so that the predicted values and the truth values become sets with the same number of elements. Bipartite graph matching is then performed with the Hungarian algorithm, putting the elements of the prediction set and the truth set in one-to-one correspondence so as to minimize the matching loss.
The Hungarian algorithm is as follows:

σ̂ = argmin_{σ∈S_N} Σ_i L_match(y_i, ŷ_σ(i))

L_match(y_i, ŷ_σ(i)) = -1[c_i≠∅]·p̂_σ(i)(c_i) + 1[c_i≠∅]·L_box(b_i, b̂_σ(i))

where σ̂ is the optimal allocation result, σ(i) represents the assigned index, L_match(y_i, ŷ_σ(i)) is the pairwise matching cost between the road surface crack truth element y_i and the prediction with index σ(i), c_i is the target class label and b_i is the box vector, p̂_σ(i)(c_i) is the predicted probability of class c_i, ŷ_σ(i) is the prediction-set element (prediction frame), y_i represents the i-th truth element, L_box represents the bounding-box loss, and N represents the number of elements in the prediction set.
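The bipartite matching above can be computed with the Hungarian algorithm as implemented in SciPy's linear_sum_assignment; the sketch below uses only the class-probability term and an L1 box term in the matching cost, with an assumed weight, rather than the full L_box defined above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_prob, pred_boxes, gt_labels, gt_boxes, l1_weight=5.0):
    """Match N predictions to M ground-truth cracks by minimizing -p(c_i) + L1 box cost.
    pred_prob: (N, C) class probabilities, pred_boxes: (N, 4)
    gt_labels: (M,) class indices,         gt_boxes:   (M, 4)"""
    class_cost = -pred_prob[:, gt_labels]                                   # (N, M)
    box_cost = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)
    cost = class_cost + l1_weight * box_cost
    pred_idx, gt_idx = linear_sum_assignment(cost)
    return pred_idx, gt_idx   # matched prediction / ground-truth index pairs
```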
(3) The loss function is calculated according to the obtained correspondence between the Ground Truth (truth values) and the predicted target frames:

L_Hungarian(y, ŷ) = Σ_i [ -log p̂_σ̂(i)(c_i) + 1[c_i≠∅]·L_box(b_i, b̂_σ̂(i)) ]

L_box(b_i, b̂_σ(i)) = λ_iou·L_iou(b_i, b̂_σ(i)) + λ_L1·||b_i - b̂_σ(i)||_1

where y represents the truth value set of the detection objects, ŷ represents the prediction frame set, log p̂_σ̂(i)(c_i) is the log-probability of class c_i, L_Hungarian is the Hungarian loss, L_iou is the generalized IoU loss, and λ_iou and λ_L1 are hyper-parameters, normalized by the number of objects in the batch.
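A sketch of the Hungarian loss above for a single image, assuming PyTorch and a torchvision version that provides ops.generalized_box_iou; the λ values, the xyxy box format and the background-class index are assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import generalized_box_iou

def hungarian_loss(class_prob, pred_boxes, gt_labels, gt_boxes,
                   pred_idx, gt_idx, num_classes=4, lambda_iou=2.0, lambda_l1=5.0):
    """Cross-entropy over all queries (unmatched -> background) plus L1 and
    generalized IoU losses on the matched pairs. gt_labels must be long tensors."""
    pred_idx = torch.as_tensor(pred_idx, dtype=torch.long)
    gt_idx = torch.as_tensor(gt_idx, dtype=torch.long)

    # Classification target: background everywhere, matched queries get their label
    target = torch.full((class_prob.size(0),), num_classes - 1, dtype=torch.long)
    target[pred_idx] = gt_labels[gt_idx]
    cls_loss = F.nll_loss(class_prob.clamp_min(1e-8).log(), target)

    # Box losses only on the matched pairs (boxes here in xyxy format for GIoU)
    matched_pred, matched_gt = pred_boxes[pred_idx], gt_boxes[gt_idx]
    l1 = F.l1_loss(matched_pred, matched_gt)
    giou = (1.0 - torch.diag(generalized_box_iou(matched_pred, matched_gt))).mean()
    return cls_loss + lambda_iou * giou + lambda_l1 * l1
```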
The third step: judge the cracks in the image to be detected according to the optimal prediction frames, category labels and confidence. The cracks in the image are selected by the prediction frames, the crack types are classified by the class labels, and the confidence of each crack is displayed.
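A possible post-processing step for the third step: keep only the queries whose predicted class is not the background class and whose confidence exceeds a threshold. The threshold value and background index are assumptions, since the patent only states that the prediction frame, category and confidence are displayed.

```python
import torch

def select_detections(boxes, class_prob, background_idx=3, conf_threshold=0.7):
    """boxes: (num_queries, 4), class_prob: (num_queries, num_classes)."""
    conf, labels = class_prob.max(dim=-1)
    keep = (labels != background_idx) & (conf > conf_threshold)
    return boxes[keep], labels[keep], conf[keep]
```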
Referring to fig. 2, after the model is trained for 500 epochs, the training-set and validation-set losses oscillate and descend well and the gap between the two is not particularly large, which demonstrates that the model has good applicability and sufficient capacity for the pavement crack detection task.
The embodiment of the invention also provides a computer readable storage medium, and the computer readable storage medium is stored with computer executable instructions which can execute the pavement crack detection method.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. The pavement crack detection method is characterized by comprising the following steps of:
collecting an image to be detected;
inputting the image to be detected into an MSF-Transformer model, and outputting an optimal prediction frame, a category label and a confidence coefficient;
judging cracks in the image to be detected according to the optimal prediction frame, the class label and the confidence coefficient;
inputting the image to be detected into an MSF-Transformer model, and outputting a prediction frame, a category label and a confidence coefficient, which comprises the following steps:
performing multi-scale feature extraction on the image to be detected based on the MSF model to obtain a fusion feature map;
constructing position codes with the same dimension according to the fusion feature map;
encoding and decoding the fusion feature map and the position code based on a Transformer model to obtain a decoding result;
and predicting the decoding result based on the prediction layer FFN to obtain a prediction frame, a category label and a confidence coefficient.
2. The method for detecting pavement cracks according to claim 1, wherein the image to be detected is preprocessed before multi-scale feature extraction is performed on the image to be detected by using an MSF model.
3. The pavement crack detection method as set forth in claim 2, wherein the preprocessing comprises the following steps:
the size of the image to be detected is processed to be 200×200DPI;
graying the size-processed image to be detected, adding Gaussian noise, and then carrying out median filtering;
and labeling the image to be detected after median filtering by using a picture labeling tool according to three categories of transverse cracks, longitudinal cracks and repaired cracks.
4. The pavement crack detection method as set forth in claim 3, wherein the MSF model-based multi-scale feature extraction is performed on the image to be detected to obtain a fusion feature map, and the method comprises the following steps:
inputting the preprocessed image to be detected into a CBL module consisting of convolution, normalization and activation functions;
sequentially inputting the images to be detected passing through the CBL module into a plurality of convolution layers and residual structures to obtain a plurality of feature maps of different sizes;
upsampling the plurality of different sized feature maps into a plurality of same sized feature maps;
and stacking, fusing and corresponding convolution are carried out on the feature images with the same size, so that a fused feature image is obtained.
5. The pavement crack detection method as set forth in claim 4, wherein the position code is constructed by:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))

PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))

wherein PE represents the position code, pos represents the position of the current pixel in the input feature map, d_model represents the dimension of the pixel, and i indexes the encoding dimensions of the fusion feature map; even dimensions use sin and odd dimensions use cos.
6. The pavement crack detection method of claim 5, wherein the Transformer model comprises a multi-layer encoder and a multi-layer decoder, each encoder layer comprising a multi-head attention layer and a feed-forward connection layer, and each decoder layer comprising a masked multi-head attention layer, a multi-head attention layer and a feed-forward connection layer.
7. The method for detecting a pavement crack according to claim 6, wherein the decoding result is predicted based on a prediction layer FFN to obtain an optimal prediction frame, comprising the steps of:
inputting the multi-layer decoding result into a prediction layer FFN after parameter sharing to obtain a plurality of prediction frames,
constructing a prediction frame set according to the multiple prediction frames, and constructing a true value set;
and carrying out bipartite graph matching through a Hungary algorithm, matching the prediction frame set with the truth value set, and carrying out optimal selection on a plurality of prediction frames to obtain an optimal prediction frame.
8. The pavement crack detection method as set forth in claim 1, characterized in that the Hungarian algorithm is as follows:

σ̂ = argmin_{σ∈S_N} Σ_i L_match(y_i, ŷ_σ(i))

L_match(y_i, ŷ_σ(i)) = -1[c_i≠∅]·p̂_σ(i)(c_i) + 1[c_i≠∅]·L_box(b_i, b̂_σ(i))

where σ̂ is the optimal allocation result, σ(i) represents the assigned index, L_match(y_i, ŷ_σ(i)) is the pairwise matching cost between the road surface crack truth element y_i and the prediction with index σ(i), c_i is the target class label and b_i is the box vector, p̂_σ(i)(c_i) is the predicted probability of class c_i, ŷ_σ(i) is the prediction-set element (prediction frame), y_i represents the i-th truth element, L_box represents the bounding-box loss, and N represents the number of elements in the prediction set.
9. The method of claim 8, wherein the Transformer model is trained with a loss function, the loss function being as follows:

L_Hungarian(y, ŷ) = Σ_i [ -log p̂_σ̂(i)(c_i) + 1[c_i≠∅]·L_box(b_i, b̂_σ̂(i)) ]

L_box(b_i, b̂_σ(i)) = λ_iou·L_iou(b_i, b̂_σ(i)) + λ_L1·||b_i - b̂_σ(i)||_1

where y represents the truth value set of the detection objects, ŷ represents the prediction frame set, log p̂_σ̂(i)(c_i) is the log-probability of class c_i, L_Hungarian is the Hungarian loss, L_iou is the generalized IoU loss, and λ_iou and λ_L1 are hyper-parameters.
10. A computer-readable storage medium storing computer instructions for causing the computer to perform the pavement crack detection method of any one of claims 1-9.
CN202310441866.5A 2023-04-23 2023-04-23 Pavement crack detection method and storage medium Pending CN116309536A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310441866.5A CN116309536A (en) 2023-04-23 2023-04-23 Pavement crack detection method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310441866.5A CN116309536A (en) 2023-04-23 2023-04-23 Pavement crack detection method and storage medium

Publications (1)

Publication Number Publication Date
CN116309536A true CN116309536A (en) 2023-06-23

Family

ID=86828962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310441866.5A Pending CN116309536A (en) 2023-04-23 2023-04-23 Pavement crack detection method and storage medium

Country Status (1)

Country Link
CN (1) CN116309536A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116596930A (en) * 2023-07-18 2023-08-15 吉林大学 Semi-supervised multitasking real image crack detection system and method
CN116596930B (en) * 2023-07-18 2023-09-22 吉林大学 Semi-supervised multitasking real image crack detection system and method
CN117437580A (en) * 2023-12-20 2024-01-23 广东省人民医院 Digestive tract tumor recognition method, digestive tract tumor recognition system and digestive tract tumor recognition medium
CN117437580B (en) * 2023-12-20 2024-03-22 广东省人民医院 Digestive tract tumor recognition method, digestive tract tumor recognition system and digestive tract tumor recognition medium
CN117975036A (en) * 2024-01-10 2024-05-03 广州恒沙云科技有限公司 Small target detection method and system based on detection converter

Similar Documents

Publication Publication Date Title
CN110136170B (en) Remote sensing image building change detection method based on convolutional neural network
CN110705457B (en) Remote sensing image building change detection method
CN111091555B (en) Brake shoe breaking target detection method
CN116309536A (en) Pavement crack detection method and storage medium
CN110263706B (en) Method for detecting and identifying dynamic target of vehicle-mounted video in haze weather
CN110889449A (en) Edge-enhanced multi-scale remote sensing image building semantic feature extraction method
CN112347859A (en) Optical remote sensing image saliency target detection method
CN110853057B (en) Aerial image segmentation method based on global and multi-scale full-convolution network
CN112488025B (en) Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion
CN112287983B (en) Remote sensing image target extraction system and method based on deep learning
CN112991364A (en) Road scene semantic segmentation method based on convolution neural network cross-modal fusion
CN116797787B (en) Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN110490205A (en) Road scene semantic segmentation method based on the empty convolutional neural networks of Complete Disability difference
CN115482491A (en) Bridge defect identification method and system based on transformer
CN114581770A (en) TransUnnet-based automatic extraction processing method for remote sensing image building
CN117557775B (en) Substation power equipment detection method and system based on infrared and visible light fusion
CN115861756A (en) Earth background small target identification method based on cascade combination network
CN116309348A (en) Lunar south pole impact pit detection method based on improved TransUnet network
CN116310916A (en) Semantic segmentation method and system for high-resolution remote sensing city image
CN116778346B (en) Pipeline identification method and system based on improved self-attention mechanism
CN117314938B (en) Image segmentation method and device based on multi-scale feature fusion decoding
CN113313077A (en) Salient object detection method based on multi-strategy and cross feature fusion
Jia et al. OccupancyDETR: Making semantic scene completion as straightforward as object detection
CN113887470B (en) High-resolution remote sensing image ground object extraction method based on multitask attention mechanism
Li et al. Infrared Small Target Detection Algorithm Based on ISTD-CenterNet.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination