CN116309536A - Pavement crack detection method and storage medium
- Publication number
- CN116309536A (application CN202310441866.5A)
- Authority
- CN
- China
- Prior art keywords
- detected
- image
- prediction
- layer
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/32—Normalisation of the pattern dimensions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Quality & Reliability (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a pavement crack detection method and a storage medium in the technical field of target detection. The method comprises the following steps: collecting images to be detected; inputting the images to be detected into an MSF-Transformer model, which outputs prediction frames and category labels; and detecting cracks in the images to be detected according to the prediction frames and category labels. The acquired two-dimensional images are input into the MSF algorithm model to obtain a feature map fused from feature maps of different sizes, which alleviates the inaccurate capture of local texture features and global texture relations when detecting slender, tiny targets. In addition, a multi-layer Transformer encoder-decoder structure combined with position coding removes the need for prior-knowledge constraints such as anchors and for post-processing steps such as non-maximum suppression, so that end-to-end target detection is achieved and the target detection algorithm is greatly simplified.
Description
Technical Field
The invention relates to the technical field of target detection, in particular to a pavement crack detection method and a storage medium.
Background
The detection of pavement cracks has long been an important research topic in road traffic safety. In recent years, research on deep learning and target detection has driven the development of intelligent crack detection methods. As the basis of road health assessment and pavement maintenance, crack detection has become a research focus for roads, bridges, tunnels and similar structures, and has significant research value and broad application prospects.
Crack detection marks the cracks on the pavement with suitable detection frames and displays the confidence and category of each crack. Cracks fall into three categories: transverse cracks, longitudinal cracks and repaired cracks. Each detected crack is described by six values: the X and Y coordinates, the width and height of the detection frame, the confidence, and the detected category. With the advent of large-scale data sets, falling computer hardware costs and improved GPU parallel computing capability, deep learning has come to dominate target detection and crack detection. Target detection based on the DETR network has been studied and used by many researchers. The model requires no prior knowledge or constraints such as anchors and dispenses with non-maximum suppression as a post-processing step, so the whole network performs end-to-end target detection, which greatly simplifies the target detection algorithm.
The classical DETR model consists of four parts: a CNN backbone, a Transformer encoder, a Transformer decoder, and a final prediction layer FFN. In general, the network used in the backbone of DETR is well suited to extracting and processing the features of large targets. In actual engineering, however, images of road cracks are affected by differences in acquisition equipment, acquisition distance and crack size, and by noise such as illumination and shadow. If the DETR model is used to extract the features of road cracks, local texture features of the cracks are easily lost, which degrades both the training of the subsequent network model and the crack detection results.
Disclosure of Invention
The invention provides a pavement crack detection method and a storage medium that use a deep-learning-based MSF algorithm and Transformer model to perform target detection training and to detect the image to be detected, thereby greatly simplifying the target detection algorithm.
The invention provides a pavement crack detection method, which comprises the following steps:
collecting an image to be detected;
inputting the image to be detected into an MSF-Transformer model, and outputting an optimal prediction frame, a category label and a confidence coefficient;
detecting cracks in the image to be detected according to the optimal prediction frame, the category label and the confidence coefficient;
wherein inputting the image to be detected into the MSF-Transformer model and outputting a prediction frame, a category label and a confidence coefficient comprises the following steps:
performing multi-scale feature extraction on the image to be detected based on the MSF model to obtain a fusion feature map;
constructing position codes with the same dimensions as the fusion feature map;
encoding and decoding the fusion feature map and the position codes based on a Transformer model to obtain a decoding result;
and predicting the decoding result based on the prediction layer FFN to obtain a prediction frame, a category label and a confidence coefficient.
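As an illustration of the final prediction step, the following minimal Python sketch assumes the prediction layer described in the detailed description: three linear layers with ReLU activations that output a normalized frame, and a softmax over class scores. The function names, the separate class-scoring layer, the sigmoid used to normalize the frame, and all parameters are assumptions made for illustration, not details confirmed by the patent.

```python
import math

def linear(x, weight, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(decoder_output, box_layers, class_layer):
    """Return (frame, label, confidence) for one decoder output vector.

    box_layers: three (weight, bias) pairs (hypothetical parameters).
    class_layer: one (weight, bias) pair scoring every class plus background.
    """
    hidden = decoder_output
    for weight, bias in box_layers[:-1]:
        hidden = [max(0.0, v) for v in linear(hidden, weight, bias)]  # ReLU
    raw_box = linear(hidden, *box_layers[-1])
    # Sigmoid keeps (cx, cy, w, h) in [0, 1]; this normalizer is an assumption.
    box = [1.0 / (1.0 + math.exp(-v)) for v in raw_box]
    probs = softmax(linear(decoder_output, *class_layer))
    label = max(range(len(probs)), key=probs.__getitem__)
    return box, label, probs[label]
```

The confidence returned here is simply the softmax probability of the most likely class, which is one plausible reading of the "confidence coefficient" in the text.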
Preferably, the image to be detected is preprocessed before multi-scale feature extraction is performed on it by the MSF model.
Preferably, the preprocessing comprises the following steps:
resizing the image to be detected to 200×200 DPI;
graying the resized image to be detected, adding Gaussian noise, and then applying median filtering;
and labeling the median-filtered image to be detected with a picture labeling tool according to three categories: transverse cracks, longitudinal cracks and repaired cracks.
Preferably, performing multi-scale feature extraction on the image to be detected based on the MSF model to obtain a fusion feature map comprises the following steps:
inputting the preprocessed image to be detected into a CBL module consisting of convolution, normalization and activation functions;
sequentially passing the output of the CBL module through a plurality of convolution layers and residual structures to obtain a plurality of feature maps of different sizes;
upsampling the plurality of feature maps of different sizes into a plurality of feature maps of the same size;
and stacking, fusing and convolving the feature maps of the same size to obtain the fusion feature map.
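Preferably, the position coding is constructed by:
$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
wherein PE represents the position code, pos represents the position of the current pixel in the input feature map, $d_{model}$ represents the dimension of each pixel, and i indexes the dimensions of the fused feature map at each position, where even-indexed dimensions (2i) use sin and odd-indexed dimensions (2i+1) use cos.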
Preferably, the Transformer model comprises a multi-layer encoder and a multi-layer decoder, each encoder layer comprising a multi-head attention layer and a feed-forward connection layer, and each decoder layer comprising a masked multi-head attention layer, a multi-head attention layer and a feed-forward connection layer.
Preferably, predicting the decoding result based on the prediction layer FFN to obtain an optimal prediction frame comprises the following steps:
inputting the multi-layer decoding results into the parameter-shared prediction layer FFN to obtain a plurality of prediction frames;
constructing a prediction frame set from the plurality of prediction frames, and constructing a truth set;
and performing bipartite graph matching with the Hungarian algorithm to match the prediction frame set with the truth set, thereby selecting the optimal prediction frame from the plurality of prediction frames.
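Preferably, the Hungarian algorithm is as follows:
$$\hat{\sigma} = \underset{\sigma \in \mathfrak{S}_N}{\arg\min} \sum_{i=1}^{N} L_{match}\left(y_i, \hat{y}_{\sigma(i)}\right)$$
$$L_{match}\left(y_i, \hat{y}_{\sigma(i)}\right) = -\mathbb{1}_{\{c_i \neq \varnothing\}}\, \hat{p}_{\sigma(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}\left(b_i, \hat{b}_{\sigma(i)}\right)$$
where $\hat{\sigma}$ is the optimal assignment, $\sigma(i)$ is the index of the prediction assigned to the i-th truth element, $L_{match}(y_i, \hat{y}_{\sigma(i)})$ is the pairwise matching cost between the pavement crack truth element $y_i$ and the prediction with index $\sigma(i)$, $c_i$ is the target class label, $b_i$ is a vector describing the truth frame, $\hat{p}_{\sigma(i)}(c_i)$ is the predicted probability of class $c_i$, $\hat{b}_{\sigma(i)}$ is the prediction frame, $y_i$ is the i-th truth element, $\hat{y}_{\sigma(i)}$ is the prediction set element representing a prediction frame, $L_{box}$ is the frame loss, N is the number of elements in the prediction set, $\mathfrak{S}_N$ is the set of permutations of N elements, and $\varnothing$ denotes the background class.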
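To make the matching step concrete, the following Python sketch finds the minimum-cost one-to-one assignment between truth elements and predictions. It is a brute-force search over permutations, not the Hungarian algorithm itself; the Hungarian algorithm finds the same minimum in polynomial time. The data layout, the function names, and the simplified cost (negative class probability plus an L1 frame distance, with the IoU term left out) are assumptions made for illustration.

```python
from itertools import permutations

def match_cost(truth, pred):
    """Pairwise cost between one truth element and one prediction."""
    if truth is None:  # padded background slot contributes no cost
        return 0.0
    label, box = truth
    probs, pred_box = pred
    l1 = sum(abs(a - b) for a, b in zip(box, pred_box))
    return -probs[label] + l1

def best_assignment(truths, preds):
    """Return sigma, where sigma[i] is the prediction index matched to truth i.

    The truth list is padded with None (background) up to len(preds),
    so both sets have the same number of elements before matching.
    """
    padded = list(truths) + [None] * (len(preds) - len(truths))
    best = min(
        permutations(range(len(preds))),
        key=lambda sigma: sum(match_cost(t, preds[j]) for t, j in zip(padded, sigma)),
    )
    return list(best)
```

For realistic numbers of prediction frames the factorial search is impractical; `scipy.optimize.linear_sum_assignment` solves the same assignment problem efficiently on a cost matrix.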
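Preferably, the Transformer model is trained by a loss function, which is as follows:
$$L_{Hungarian}(y, \hat{y}) = \sum_{i=1}^{N}\left[-\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}\left(b_i, \hat{b}_{\hat{\sigma}(i)}\right)\right]$$
where the frame loss $L_{box}$ is:
$$L_{box}\left(b_i, \hat{b}_{\sigma(i)}\right) = \lambda_{iou}\, L_{iou}\left(b_i, \hat{b}_{\sigma(i)}\right) + \lambda_{L1} \left\lVert b_i - \hat{b}_{\sigma(i)} \right\rVert_1$$
where y represents the truth set of the detection objects, $\hat{y}$ represents the prediction frame set, $\log \hat{p}_{\hat{\sigma}(i)}(c_i)$ is the log probability of $c_i$, $L_{Hungarian}$ is the Hungarian loss, $L_{iou}$ is the generalized IoU loss, and $\lambda_{iou}$ and $\lambda_{L1}$ are hyper-parameters.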
The invention also provides a computer-readable storage medium storing computer instructions for causing a computer to execute the pavement crack detection method.
Compared with the prior art, the invention has the beneficial effects that:
The acquired two-dimensional images are input into the MSF algorithm model to obtain a feature map fused from feature maps of different sizes, which alleviates the inaccurate capture of local texture features and global texture relations when detecting slender, tiny targets. In addition, a multi-layer Transformer encoder-decoder structure combined with position coding removes the need for prior-knowledge constraints such as anchors and for post-processing steps such as non-maximum suppression, so that end-to-end target detection is achieved and the target detection algorithm is greatly simplified.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a block flow diagram of the pavement crack detection method of the present invention;
FIG. 2 is a model training effect diagram of the pavement crack detection method of the present invention.
Detailed Description
The embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
Referring to fig. 1, the invention provides a pavement crack detection method, which comprises the following steps:
Step 1: Road surface images are acquired with an acquisition vehicle to obtain a plurality of images to be detected.
The images to be detected are preprocessed as follows:
(1) The images to be detected are resized to 200×200 DPI.
(2) The resized images are grayed, Gaussian noise is added, and median filtering is then applied.
(3) The median-filtered images are labeled with the LabelImg picture labeling tool according to three categories (transverse cracks, longitudinal cracks and repaired cracks) and are randomly divided into a training set and a validation set at a ratio of 0.85:0.15. The MSF-Transformer model is trained on the training set and validated on the validation set.
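The preprocessing and data split above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: images are nested lists of (R, G, B) tuples, the grayscale weights are the common BT.601 coefficients, and the noise level, the 3×3 filter window, the random seed, and all function names are hypothetical choices that the patent does not specify. Resizing and manual labeling are omitted.

```python
import random
import statistics

def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to grayscale floats (BT.601 weights, assumed)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb_image]

def add_gaussian_noise(gray, sigma, rng):
    """Add zero-mean Gaussian noise and clip to [0, 255]; sigma is an assumed value."""
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row] for row in gray]

def median_filter_3x3(gray):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = statistics.median(window)
    return out

def split_dataset(items, train_ratio=0.85, seed=0):
    """Shuffle and split items into training and validation subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

A typical call chain would be `median_filter_3x3(add_gaussian_noise(to_gray(image), 5.0, random.Random(0)))`, followed by `split_dataset(labeled_images)` once labeling is done.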
Step 2: The images to be detected are input into the trained MSF-Transformer model, which outputs an optimal prediction frame, a category label and a confidence coefficient. This comprises the following:
Multi-scale feature extraction is performed on the image to be detected based on the MSF (multi-scale fusion) model to obtain a fusion feature map.
(1) The preprocessed image to be detected is input into a CBL module consisting of convolution, normalization and activation functions (Conv, BatchNormalization and LeakyReLU), which prepares it for feature extraction.
(2) The output of the CBL module is passed through a plurality of convolution layers and residual structures to obtain feature maps at 1/8, 1/16 and 1/32 of the original image size, completing feature extraction at three sizes.
(3) The feature maps of different sizes are brought to the same size by an up-sampling method similar to that of a feature pyramid network (FPN).
(4) The feature maps of the same size are stacked, fused and convolved to obtain the fusion feature map.
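Steps (3) and (4) can be sketched as follows. The example assumes single-channel square feature maps whose sizes divide evenly, nearest-neighbour up-sampling, and a 1×1 convolution modelled as a per-pixel weighted sum over the stacked maps. The patent does not state the interpolation method or the convolution parameters, so these choices and the function names are illustrative only.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour up-sampling of a 2D map by an integer factor."""
    return [[fmap[y // factor][x // factor]
             for x in range(len(fmap[0]) * factor)]
            for y in range(len(fmap) * factor)]

def fuse(feature_maps, weights):
    """Bring all maps to the largest size, stack them, and mix the stack.

    The mixing step is a 1x1 convolution: each output pixel is a weighted
    sum of the pixels at the same location in every stacked map.
    """
    target = max(len(f) for f in feature_maps)
    stacked = [upsample_nearest(f, target // len(f)) for f in feature_maps]
    width = len(stacked[0][0])
    return [[sum(wt * ch[y][x] for wt, ch in zip(weights, stacked))
             for x in range(width)]
            for y in range(target)]
```

For example, maps of size 4×4, 2×2 and 1×1 stand in for the 1/8, 1/16 and 1/32 scales, and `fuse([f8, f16, f32], [1.0, 1.0, 1.0])` returns a single 4×4 fused map.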
Position codes with the same dimensions as the fusion feature map are then constructed.
The position code is constructed by:
$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
wherein PE represents the position code, pos represents the position of the current pixel in the input feature map, $d_{model}$ represents the dimension of each pixel, and i indexes the dimensions of the fused feature map at each position, where even-indexed dimensions (2i) use sin and odd-indexed dimensions (2i+1) use cos.
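The position code can be sketched in Python as below. The sketch assumes the standard sinusoidal form used by Transformer models, with base 10000, and treats pos as the index of a pixel in the flattened feature map; both are assumptions, since the text only states that even dimensions use sin and odd dimensions use cos.

```python
import math

def positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal position codes."""
    pe = [[0.0] * d_model for _ in range(num_positions)]
    for pos in range(num_positions):
        for k in range(d_model):
            # Dimensions 2i and 2i+1 share the same frequency.
            angle = pos / (10000 ** ((2 * (k // 2)) / d_model))
            pe[pos][k] = math.sin(angle) if k % 2 == 0 else math.cos(angle)
    return pe
```

Because the table has one row per pixel and d_model columns, it has the same shape as the flattened fusion feature map and can be added to it element by element.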
The fusion feature map and the position codes are encoded and decoded by the Transformer model to obtain a decoding result.
(1) The fusion feature maps and the position codes are input into the multi-layer encoder to obtain an encoding result. Each encoder layer consists of a multi-head attention layer and a feed-forward connection layer.
(2) The encoding result is input into the multi-layer decoder to obtain a decoding result. Each decoder layer consists of a masked multi-head attention layer, a multi-head attention layer and a feed-forward connection layer. Every decoder layer except the first takes the output of the preceding decoder layer and the output of the multi-layer encoder as inputs.
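The attention layers named above are built on scaled dot-product attention. The following sketch shows a single attention head in plain Python with an optional causal mask, as a simplified stand-in for the multi-head and masked multi-head layers; the learned projections, the multiple heads, and the exact masking scheme used by the invention are not given in the text and are left out.

```python
import math

def softmax(scores):
    """Numerically stable softmax; masked (-inf) scores get weight 0."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values, causal_mask=False):
    """Single-head scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    output = []
    for qi, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        if causal_mask:
            # Query qi may only attend to keys at positions <= qi.
            scores = [s if ki <= qi else float("-inf") for ki, s in enumerate(scores)]
        weights = softmax(scores)
        output.append([sum(w * v[j] for w, v in zip(weights, values))
                       for j in range(len(values[0]))])
    return output
```

In an encoder layer the queries, keys and values all come from the feature map plus position codes; in the second attention layer of a decoder layer the keys and values come from the encoder output.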
The decoding result is predicted by the prediction layer FFN to obtain an optimal prediction frame, a category label and a confidence coefficient.
(1) The output of each decoder layer is input to the prediction layer FFN, whose parameters are shared across layers, and a loss is calculated for each output to provide deep supervision. The FFN of the invention consists of three linear layers with ReLU activation functions and a hidden layer; it outputs the normalized center coordinates, height and width of the prediction frame, and the predicted class label is obtained with a softmax activation.
(2) The above operation yields a prediction frame set of fixed size, which clearly exceeds the number of prediction frames actually needed, so the optimal frames must be selected from a considerable number of prediction frames. The invention constructs a prediction frame set from the plurality of prediction frames and constructs a truth set. The ground truth set is padded to the same number of elements as the prediction frame set, with an additional special class label representing the background class, so that the predictions and the ground truth form sets with the same number of elements. Bipartite graph matching is then performed with the Hungarian algorithm, putting the elements of the prediction set and the truth set in one-to-one correspondence so as to minimize the matching loss:
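The Hungarian algorithm is as follows:
$$\hat{\sigma} = \underset{\sigma \in \mathfrak{S}_N}{\arg\min} \sum_{i=1}^{N} L_{match}\left(y_i, \hat{y}_{\sigma(i)}\right)$$
$$L_{match}\left(y_i, \hat{y}_{\sigma(i)}\right) = -\mathbb{1}_{\{c_i \neq \varnothing\}}\, \hat{p}_{\sigma(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}\left(b_i, \hat{b}_{\sigma(i)}\right)$$
where $\hat{\sigma}$ is the optimal assignment, $\sigma(i)$ is the index of the prediction assigned to the i-th truth element, $L_{match}(y_i, \hat{y}_{\sigma(i)})$ is the pairwise matching cost between the pavement crack truth element $y_i$ and the prediction with index $\sigma(i)$, $c_i$ is the target class label, $b_i$ is a vector describing the truth frame, $\hat{p}_{\sigma(i)}(c_i)$ is the predicted probability of class $c_i$, $\hat{b}_{\sigma(i)}$ is the prediction frame, $y_i$ is the i-th truth element, $\hat{y}_{\sigma(i)}$ is the prediction set element representing a prediction frame, $L_{box}$ is the frame loss, N is the number of elements in the prediction set, $\mathfrak{S}_N$ is the set of permutations of N elements, and $\varnothing$ denotes the background class.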
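(3) The loss function is calculated from the correspondence obtained between the ground truth and the predicted target frames:
$$L_{Hungarian}(y, \hat{y}) = \sum_{i=1}^{N}\left[-\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}\left(b_i, \hat{b}_{\hat{\sigma}(i)}\right)\right]$$
where the frame loss $L_{box}$ is:
$$L_{box}\left(b_i, \hat{b}_{\sigma(i)}\right) = \lambda_{iou}\, L_{iou}\left(b_i, \hat{b}_{\sigma(i)}\right) + \lambda_{L1} \left\lVert b_i - \hat{b}_{\sigma(i)} \right\rVert_1$$
where y represents the truth set of the detection objects, $\hat{y}$ represents the prediction frame set, $\log \hat{p}_{\hat{\sigma}(i)}(c_i)$ is the log probability of $c_i$, $L_{Hungarian}$ is the Hungarian loss, $L_{iou}$ is the generalized IoU loss, and $\lambda_{iou}$ and $\lambda_{L1}$ are hyper-parameters; the losses are normalized by the number of objects in the batch.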
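The loss computation can be illustrated with the Python sketch below, which combines a generalized IoU term and an L1 term into a frame loss and sums it with the negative log probability over matched pairs. It assumes frames given as normalized (cx, cy, w, h) with positive area, a truth list already padded with background entries, and placeholder values for the two weights; the patent does not give the weight values, and the normalization by the number of objects is left out for brevity.

```python
import math

def to_corners(box):
    """Convert (cx, cy, w, h) to (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def giou_loss(box_a, box_b):
    """Generalized IoU loss, 1 - GIoU, for two frames with positive area."""
    ax0, ay0, ax1, ay1 = to_corners(box_a)
    bx0, by0, bx1, by1 = to_corners(box_b)
    inter = (max(0.0, min(ax1, bx1) - max(ax0, bx0))
             * max(0.0, min(ay1, by1) - max(ay0, by0)))
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    hull = (max(ax1, bx1) - min(ax0, bx0)) * (max(ay1, by1) - min(ay0, by0))
    return 1.0 - (inter / union - (hull - union) / hull)

def box_loss(truth_box, pred_box, lambda_iou=2.0, lambda_l1=5.0):
    """Weighted sum of GIoU loss and L1 distance; the weights are placeholders."""
    l1 = sum(abs(a - b) for a, b in zip(truth_box, pred_box))
    return lambda_iou * giou_loss(truth_box, pred_box) + lambda_l1 * l1

def hungarian_loss(truths, preds, sigma, background):
    """Sum -log p(c_i) plus the frame loss (non-background only) over matched pairs."""
    total = 0.0
    for (label, box), j in zip(truths, sigma):
        probs, pred_box = preds[j]
        total += -math.log(probs[label])
        if label != background:
            total += box_loss(box, pred_box)
    return total
```

Here `sigma` is the assignment produced by the matching step, so `sigma[i]` names the prediction paired with truth element i.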
Step 3: The cracks in the image to be detected are identified according to the optimal prediction frame, the category label and the confidence coefficient. The prediction frame encloses each crack in the image, the category label classifies its type, and the confidence of the crack is displayed.
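The third step can be sketched as a small reporting function. The class order, the confidence threshold of 0.5, the dictionary layout, and the function name are all assumptions made for illustration; the patent only states that the frame, the category and the confidence are shown for each crack.

```python
# Assumed class order; the last entry is the special background label.
CLASS_NAMES = ["transverse crack", "longitudinal crack", "repaired crack", "background"]

def report_cracks(predictions, image_width, image_height, min_confidence=0.5):
    """Keep confident non-background predictions and convert frames to pixels.

    predictions: iterable of (frame, label, confidence), where frame is a
    normalized (cx, cy, w, h) tuple and label indexes CLASS_NAMES.
    """
    results = []
    for box, label, confidence in predictions:
        if CLASS_NAMES[label] == "background" or confidence < min_confidence:
            continue
        cx, cy, w, h = box
        results.append({
            "class": CLASS_NAMES[label],
            "confidence": confidence,
            "box_pixels": (
                round((cx - w / 2) * image_width),
                round((cy - h / 2) * image_height),
                round((cx + w / 2) * image_width),
                round((cy + h / 2) * image_height),
            ),
        })
    return results
```

Each returned entry carries what is needed to draw the frame on the image and annotate it with the crack type and confidence.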
Referring to FIG. 2, after the model has been trained for 500 epochs, the curves for the training set and the validation set both oscillate while decreasing steadily, and the gap between them remains small, which indicates that the model is well suited to the pavement crack detection task and supports it adequately.
The embodiment of the invention also provides a computer-readable storage medium storing computer-executable instructions that, when executed, perform the pavement crack detection method.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (10)
1. A pavement crack detection method, characterized by comprising the following steps:
collecting an image to be detected;
inputting the image to be detected into an MSF-Transformer model, and outputting an optimal prediction frame, a category label and a confidence coefficient;
identifying cracks in the image to be detected according to the optimal prediction frame, the category label and the confidence coefficient;
wherein inputting the image to be detected into the MSF-Transformer model and outputting a prediction frame, a category label and a confidence coefficient comprises the following steps:
performing multi-scale feature extraction on the image to be detected based on the MSF model to obtain a fusion feature map;
constructing position codes with the same dimensions as the fusion feature map;
encoding and decoding the fusion feature map and the position codes based on a Transformer model to obtain a decoding result;
and predicting the decoding result based on the prediction layer FFN to obtain a prediction frame, a category label and a confidence coefficient.
2. The pavement crack detection method according to claim 1, wherein the image to be detected is preprocessed before multi-scale feature extraction is performed on it by the MSF model.
3. The pavement crack detection method as set forth in claim 2, wherein the preprocessing comprises the following steps:
resizing the image to be detected to 200×200 DPI;
graying the resized image to be detected, adding Gaussian noise, and then applying median filtering;
and labeling the median-filtered image to be detected with a picture labeling tool according to three categories: transverse cracks, longitudinal cracks and repaired cracks.
4. The pavement crack detection method as set forth in claim 3, wherein performing multi-scale feature extraction on the image to be detected based on the MSF model to obtain a fusion feature map comprises the following steps:
inputting the preprocessed image to be detected into a CBL module consisting of convolution, normalization and activation functions;
sequentially passing the output of the CBL module through a plurality of convolution layers and residual structures to obtain a plurality of feature maps of different sizes;
upsampling the plurality of feature maps of different sizes into a plurality of feature maps of the same size;
and stacking, fusing and convolving the feature maps of the same size to obtain the fusion feature map.
5. The pavement crack detection method as set forth in claim 4, wherein the position code is constructed by:
$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
wherein PE represents the position code, pos represents the position of the current pixel in the input feature map, $d_{model}$ represents the dimension of each pixel, and i indexes the dimensions of the fused feature map at each position, where even-indexed dimensions (2i) use sin and odd-indexed dimensions (2i+1) use cos.
6. The pavement crack detection method of claim 5, wherein the Transformer model comprises a multi-layer encoder and a multi-layer decoder, each encoder layer comprising a multi-head attention layer and a feed-forward connection layer, and each decoder layer comprising a masked multi-head attention layer, a multi-head attention layer and a feed-forward connection layer.
7. The pavement crack detection method according to claim 6, wherein predicting the decoding result based on the prediction layer FFN to obtain an optimal prediction frame comprises the following steps:
inputting the multi-layer decoding results into the parameter-shared prediction layer FFN to obtain a plurality of prediction frames;
constructing a prediction frame set from the plurality of prediction frames, and constructing a truth set;
and performing bipartite graph matching with the Hungarian algorithm to match the prediction frame set with the truth set, thereby selecting the optimal prediction frame from the plurality of prediction frames.
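8. The pavement crack detection method as set forth in claim 1, characterized in that the Hungarian algorithm is as follows:
$$\hat{\sigma} = \underset{\sigma \in \mathfrak{S}_N}{\arg\min} \sum_{i=1}^{N} L_{match}\left(y_i, \hat{y}_{\sigma(i)}\right)$$
$$L_{match}\left(y_i, \hat{y}_{\sigma(i)}\right) = -\mathbb{1}_{\{c_i \neq \varnothing\}}\, \hat{p}_{\sigma(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}\left(b_i, \hat{b}_{\sigma(i)}\right)$$
where $\hat{\sigma}$ is the optimal assignment, $\sigma(i)$ is the index of the prediction assigned to the i-th truth element, $L_{match}(y_i, \hat{y}_{\sigma(i)})$ is the pairwise matching cost between the pavement crack truth element $y_i$ and the prediction with index $\sigma(i)$, $c_i$ is the target class label, $b_i$ is a vector describing the truth frame, $\hat{p}_{\sigma(i)}(c_i)$ is the predicted probability of class $c_i$, $\hat{b}_{\sigma(i)}$ is the prediction frame, $y_i$ is the i-th truth element, $\hat{y}_{\sigma(i)}$ is the prediction set element representing a prediction frame, $L_{box}$ is the frame loss, N is the number of elements in the prediction set, $\mathfrak{S}_N$ is the set of permutations of N elements, and $\varnothing$ denotes the background class.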
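9. The method of claim 8, wherein the Transformer model is trained by a loss function, the loss function being as follows:
$$L_{Hungarian}(y, \hat{y}) = \sum_{i=1}^{N}\left[-\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}}\, L_{box}\left(b_i, \hat{b}_{\hat{\sigma}(i)}\right)\right]$$
$$L_{box}\left(b_i, \hat{b}_{\sigma(i)}\right) = \lambda_{iou}\, L_{iou}\left(b_i, \hat{b}_{\sigma(i)}\right) + \lambda_{L1} \left\lVert b_i - \hat{b}_{\sigma(i)} \right\rVert_1$$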
10. A computer-readable storage medium storing computer instructions for causing the computer to perform the pavement crack detection method of any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310441866.5A CN116309536A (en) | 2023-04-23 | 2023-04-23 | Pavement crack detection method and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310441866.5A CN116309536A (en) | 2023-04-23 | 2023-04-23 | Pavement crack detection method and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116309536A (en) | 2023-06-23 |
Family
ID=86828962
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310441866.5A Pending CN116309536A (en) | 2023-04-23 | 2023-04-23 | Pavement crack detection method and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116309536A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116596930A (en) * | 2023-07-18 | 2023-08-15 | 吉林大学 | Semi-supervised multitasking real image crack detection system and method |
CN116596930B (en) * | 2023-07-18 | 2023-09-22 | 吉林大学 | Semi-supervised multitasking real image crack detection system and method |
CN117437580A (en) * | 2023-12-20 | 2024-01-23 | 广东省人民医院 | Digestive tract tumor recognition method, digestive tract tumor recognition system and digestive tract tumor recognition medium |
CN117437580B (en) * | 2023-12-20 | 2024-03-22 | 广东省人民医院 | Digestive tract tumor recognition method, digestive tract tumor recognition system and digestive tract tumor recognition medium |
CN117975036A (en) * | 2024-01-10 | 2024-05-03 | 广州恒沙云科技有限公司 | Small target detection method and system based on detection converter |
Similar Documents
Publication | Title |
---|---|
CN110136170B (en) | Remote sensing image building change detection method based on convolutional neural network |
CN110705457B (en) | Remote sensing image building change detection method |
CN111091555B (en) | Brake shoe breaking target detection method |
CN116309536A (en) | Pavement crack detection method and storage medium |
CN110263706B (en) | Method for detecting and identifying dynamic target of vehicle-mounted video in haze weather |
CN110889449A (en) | Edge-enhanced multi-scale remote sensing image building semantic feature extraction method |
CN112347859A (en) | Optical remote sensing image saliency target detection method |
CN110853057B (en) | Aerial image segmentation method based on global and multi-scale full-convolution network |
CN112488025B (en) | Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion |
CN112287983B (en) | Remote sensing image target extraction system and method based on deep learning |
CN112991364A (en) | Road scene semantic segmentation method based on convolution neural network cross-modal fusion |
CN116797787B (en) | Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network |
CN110490205A (en) | Road scene semantic segmentation method based on fully residual dilated convolutional neural network |
CN115482491A (en) | Bridge defect identification method and system based on transformer |
CN114581770A (en) | TransUnet-based automatic extraction processing method for remote sensing image building |
CN117557775B (en) | Substation power equipment detection method and system based on infrared and visible light fusion |
CN115861756A (en) | Earth background small target identification method based on cascade combination network |
CN116309348A (en) | Lunar south pole impact pit detection method based on improved TransUnet network |
CN116310916A (en) | Semantic segmentation method and system for high-resolution remote sensing city image |
CN116778346B (en) | Pipeline identification method and system based on improved self-attention mechanism |
CN117314938B (en) | Image segmentation method and device based on multi-scale feature fusion decoding |
CN113313077A (en) | Salient object detection method based on multi-strategy and cross feature fusion |
Jia et al. | OccupancyDETR: Making semantic scene completion as straightforward as object detection |
CN113887470B (en) | High-resolution remote sensing image ground object extraction method based on multitask attention mechanism |
Li et al. | Infrared Small Target Detection Algorithm Based on ISTD-CenterNet. |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |