CN114612306A - Deep learning super-resolution method for crack detection - Google Patents

Deep learning super-resolution method for crack detection

Info

Publication number
CN114612306A
Authority
CN
China
Prior art keywords
resolution
crack
image
super
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210250155.5A
Other languages
Chinese (zh)
Inventor
刘鹏宇
刘天禹
陈善继
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-03-15
Filing date
2022-03-15
Publication date
2022-06-10
Application filed by Beijing University of Technology
Priority to CN202210250155.5A
Publication of CN114612306A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a deep learning super-resolution method for crack detection and belongs to the technical field of image super-resolution. The method comprises the following steps: constructing a crack image data set for super-resolution network training; constructing a crack-oriented super-resolution network; training the crack-oriented super-resolution network; and performing super-resolution amplification of the crack image. The method makes full use of the advantages of deep learning in the field of image super-resolution. Based on the characteristics of crack images, it designs a lightweight residual module comprising an attention mechanism and a depthwise separable convolution, and constructs the super-resolution network with a post-upsampling structure. This addresses the difficult and inaccurate mapping from low-resolution crack images to high-resolution images, performs super-resolution amplification of crack images with low computing-resource occupation, retains the texture information of the crack, and improves the visual experience.

Description

Deep learning super-resolution method for crack detection
Technical Field
The invention relates to the technical field of image super-resolution, in particular to a deep learning super-resolution method for crack detection.
Background
At present, crack detection is mainly performed either by manual visual inspection or with instruments such as acoustic emission instruments and laser scanners. The former is limited by the subjective judgment of the inspector and is not universally applicable. The latter relies on costly instruments, which is not conducive to large-scale deployment. The rapid development of deep learning image processing brings new opportunities for crack detection: intelligently analyzing crack images can improve the efficiency of detection work and the accuracy of detection results, reduce the workload of inspectors as much as possible, and lower the detection cost. Deep learning image processing is mostly based on convolutional neural networks, which extract features through successive convolutional layers and obtain the expected result through mapping layers. Many experiments have shown that this approach has clear advantages over traditional methods and performs well in identifying concrete cracks, road cracks and rock stratum cracks.
The detection network commonly used for crack detection is a semantic segmentation network, which classifies the pixels of an image one by one. More pixels mean richer content for the network to learn from, so such a network places higher requirements on the resolution of the input image. In the authoritative data sets of the crack detection field, Concrete Crack Images for Classification and Crack-detection, the image resolution is about 224 × 224, whereas segmentation networks require input resolutions of 480 × 480, 640 × 640 or higher. A low-resolution image that does not meet the input requirement therefore needs to be amplified into a high-resolution image that does (low resolution and high resolution are relative: 400 × 400 is high-resolution compared with 200 × 200 and low-resolution compared with 600 × 600). The commonly used amplification method is bicubic interpolation of the low-resolution image. Interpolation is simple and fast, requires no additional module, and is widely used to enlarge images on mobile phones and computers. However, an interpolated image is only a coarse high-resolution image: the resolution is increased, but the texture information of objects in the image is lost. The most common symptom is blurred object edges after amplification, which degrades the visual experience and hinders subsequent identification and segmentation.
With the development of deep learning super-resolution technology, image amplification with a super-resolution network has become a popular research direction. Such a network fits a mapping from low-resolution images to high-resolution images by learning from a large amount of data, which improves image resolution and brings a better visual experience, making it an important image processing technology. However, existing deep learning super-resolution techniques mainly target real-world scene categories such as landscapes, animals, plants, people, food, buildings and vehicles. The mapping learned by existing networks works well in those settings, but crack images processed with it come out distorted and blurred, which indicates that the previously learned mapping is not suitable for crack images. In addition, existing super-resolution techniques occupy considerable computing resources and are not suitable as a preprocessing module for crack detection. In view of these problems, the invention provides a crack detection-oriented deep learning super-resolution method that addresses the difficult and inaccurate mapping from low-resolution crack images to high-resolution images, performs super-resolution amplification of crack images while occupying few resources, retains the texture information of the crack, and improves the visual experience.
Disclosure of Invention
The technical problems mainly addressed by the method are that existing deep learning super-resolution techniques perform poorly on crack images, that low-resolution crack images are difficult to map to high-resolution images, and that computing-resource occupation is high. To this end, a crack detection-oriented deep learning super-resolution method is constructed, which improves the resolution of a crack image while preserving crack texture information and reducing the computing resources required. To achieve this purpose, the invention adopts the following technical scheme:
A crack detection-oriented deep learning super-resolution method comprises the following steps:
Step 1: Construct a crack image data set for network training.
The quality of the data set in deep learning is crucial to the super-resolution result. An original crack image data set is therefore built from open-source image data on the internet and image data acquired in the field, and the crack image set used for training and supervision is then built through data augmentation, cropping and downsampling.
Step 2: Construct a crack-oriented super-resolution network.
A crack-oriented lightweight super-resolution network is constructed based on a post-upsampling super-resolution network structure. This structure learns from the low-resolution image end to end and adds a learnable upsampling layer at the end to fit the high-resolution image, which greatly reduces the occupation of computing resources. The designed network is divided into three modules: Head, Body and Tail.
The Head module consists of two ordinary convolution layers. It increases the channel dimension of the input low-resolution image and performs preliminary texture information extraction.
The Body module consists of 16 repeatedly stacked Blocks and 1 convolution layer and performs refined texture feature extraction. Each Block is divided into a front part, a middle part and a back part. The front part uses an ordinary convolution layer to extract the information output by the previous layer. The middle part is a lightweight residual structure containing an attention mechanism and a depthwise separable convolution, whose input and output are fused through a skip connection. The back part uses an ordinary convolution layer to aggregate the texture information from the middle part. Finally, the input and output of the Body module are fused through a skip connection, which improves the interaction between texture information.
The Tail module consists of two ordinary convolution layers and a sub-pixel convolution layer. It upsamples the feature map output by the Body to amplify the low-resolution image.
Step 3: Train the crack-oriented super-resolution network.
The constructed crack super-resolution data set is input into the designed network. An L1 loss function and an Adam optimizer are selected, the network is trained to fit the mapping from low-resolution crack images to high-resolution crack images, and the best-fitting model is saved after training.
Step 4: Super-resolution amplification of the crack image.
The trained model file is used to map a low-resolution crack image to an amplified high-resolution crack image. The image is saved and can be used for subsequent crack detection.
Compared with the prior art, the invention has the following advantages:
1. The invention discloses a deep learning super-resolution method designed around the characteristics of crack images. It effectively addresses the distortion and blur of crack images amplified by existing methods, yields better visual quality and crack texture information, and can be used for image preprocessing in crack detection and identification tasks.
2. A lightweight residual module comprising an attention mechanism and a depthwise separable convolution is designed, and a lightweight super-resolution network is constructed with a post-upsampling structure, which effectively reduces the occupation of computing resources while still allowing high-precision super-resolution.
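The resource argument in advantage 2 can be illustrated with simple arithmetic. The following sketch is not part of the patent text. It counts the parameters of an ordinary 3 × 3 convolution against a depthwise 3 × 3 convolution at 128 channels, and the multiply-accumulate operations of one 64-channel 3 × 3 convolution evaluated at 96 × 96 (post-upsampling) against 192 × 192 (after a 2× pre-upsampling). The helper names `conv_params` and `conv_macs` are hypothetical, bias terms are assumed to be present, and the channel counts and image sizes are those given in the detailed description.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weights plus biases of a 2-D convolution (a bias per output channel is assumed)."""
    return (c_in // groups) * c_out * kernel * kernel + c_out


def conv_macs(height, width, c_in, c_out, kernel, groups=1):
    """Multiply-accumulates of a stride-1, same-padded 2-D convolution."""
    return height * width * (c_in // groups) * c_out * kernel * kernel


# Parameter count: ordinary 3x3 convolution vs. depthwise 3x3 convolution.
standard_3x3 = conv_params(128, 128, 3)                # every kernel sees all channels
depthwise_3x3 = conv_params(128, 128, 3, groups=128)   # one single-channel kernel per channel

# Compute cost: the same layer at low resolution vs. at 2x the resolution.
lr_macs = conv_macs(96, 96, 64, 64, 3)      # features stay at the input size
hr_macs = conv_macs(192, 192, 64, 64, 3)    # features already enlarged

print(standard_3x3, depthwise_3x3)   # 147584 1280
print(hr_macs // lr_macs)            # 4
```

Under these assumptions the depthwise layer needs roughly one hundredth of the parameters, and every layer that runs before the upsampling step costs a quarter of what it would cost on the enlarged image.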
Drawings
FIG. 1 is a schematic overall flow chart of the crack detection-oriented deep learning super-resolution method in the invention.
FIG. 2 is a schematic flow chart of constructing the crack image data set for training in the present invention.
FIG. 3 is a structural diagram of the crack detection-oriented super-resolution network in the present invention.
Detailed Description
The invention mainly realizes super-resolution of crack images. The specific method adopted is described in detail below with reference to the accompanying drawings.
Specifically, the flow of the crack detection-oriented deep learning super-resolution method is shown in FIG. 1 and includes the following steps. S1: constructing a crack image data set for network training. S2: constructing a crack-oriented super-resolution network. S3: training the crack-oriented super-resolution network. S4: super-resolution amplification of the crack image.
(1) For S1, a crack image data set for network training is constructed.
The flow is shown in FIG. 2. Some crack images are obtained from public resources, and a camera is used to photograph different types of cracks in different scenes as a supplement, which increases the diversity of crack types. The data are then randomly rotated and light-supplemented to improve generalization. Next, the processed images are cropped to obtain the supervision images used as ground truth. Finally, the cropped images are downsampled and noise is added to generate the low-resolution images for training.
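The pair-generation part of this flow (rotate, crop, downsample, add noise) can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the text does not specify the rotation angles, the downsampling filter or the noise model, so the sketch assumes rotations by multiples of 90 degrees, block averaging, and Gaussian noise with a hypothetical `noise_std`. Light supplementation is omitted. The function name `make_training_pair` is also hypothetical. The 192 and 96 pixel sizes match the training sizes given for S3.

```python
import numpy as np


def make_training_pair(image, hr_size=192, scale=2, noise_std=2.0, rng=None):
    """Return (low-resolution input, high-resolution supervision patch).

    image: uint8 array of shape (H, W, C) with H, W >= hr_size.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Augmentation (assumed form): random rotation by a multiple of 90 degrees.
    image = np.rot90(image, k=int(rng.integers(0, 4)))

    # Random crop: this patch is the ground-truth supervision image.
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - hr_size + 1))
    left = int(rng.integers(0, w - hr_size + 1))
    hr = image[top:top + hr_size, left:left + hr_size].astype(np.float32)

    # Downsample (assumed filter): average every scale x scale block of pixels.
    lr_size = hr_size // scale
    lr = hr.reshape(lr_size, scale, lr_size, scale, -1).mean(axis=(1, 3))

    # Degrade (assumed noise model): additive Gaussian noise, clipped to [0, 255].
    lr = np.clip(lr + rng.normal(0.0, noise_std, lr.shape), 0.0, 255.0)
    return lr.astype(np.float32), hr
```

Calling `make_training_pair(img)` on a color image of at least 192 × 192 pixels yields a 96 × 96 input and its 192 × 192 supervision patch.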
(2) For S2, a crack-oriented super-resolution network is constructed.
The overall architecture of the network is shown on the left side of FIG. 3. The designed network is divided into three modules, Head, Body and Tail, in which relu denotes the activation function.
The Head module consists of two ordinary convolution layers with a kernel size of 3. It expands an input image with a resolution of 96 × 96 and 3 channels to 64 channels and performs initial feature extraction.
The Body module consists of 16 repeatedly stacked Blocks and 1 ordinary convolution layer, with a skip connection between the input and output of the Body module, and performs refined texture feature extraction. The specific structure of a Block is shown on the right side of FIG. 3. Each Block is divided into a front part, a middle part and a back part. The front part is an ordinary convolution with a kernel size of 3 that extracts the information output by the previous layer.
The middle part is a lightweight residual structure, that is, a skip connection between its input and output fuses feature information. The residual structure first increases the number of channels to 128 through an ordinary convolution with a kernel size of 1 and then extracts information through a depthwise separable convolution with a kernel size of 3. In this convolution each kernel has 1 channel and the number of kernels equals the number of feature-map channels, so that each kernel processes exactly one feature-map channel. After the depthwise separable convolution, a channel split divides the feature map into 2 feature maps of the same size with 64 channels each, which are sent to two branches: channel attention (left) and spatial attention (right). Channel attention applies average pooling to each channel of the feature map to obtain a one-dimensional vector. This vector passes through 2 ordinary convolution layers with a kernel size of 1 to produce an output vector that expresses a weight for each channel of the feature map, giving larger weights to more important channels. Finally, the output vector is normalized and multiplied by the input feature map to obtain a new feature map.
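The channel attention branch described here can be sketched in NumPy as below. Several details are assumptions rather than statements from the text: the normalization is taken to be a sigmoid, a ReLU is placed between the two 1 × 1 convolutions, and the width of the intermediate vector (the first dimension of `w1`) is left to the caller. On a pooled vector, a 1 × 1 convolution reduces to a matrix product, which is how it is written in this sketch. The function name `channel_attention` is hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, b1, w2, b2):
    """Reweight the channels of a feature map.

    x:  feature map of shape (C, H, W)
    w1: (C_mid, C) weights of the first 1x1 convolution, b1: (C_mid,)
    w2: (C, C_mid) weights of the second 1x1 convolution, b2: (C,)
    """
    pooled = x.mean(axis=(1, 2))                 # average pooling per channel -> (C,)
    hidden = np.maximum(w1 @ pooled + b1, 0.0)   # first 1x1 conv, ReLU (assumed)
    weights = sigmoid(w2 @ hidden + b2)          # second 1x1 conv, sigmoid (assumed)
    return x * weights[:, None, None]            # scale each channel by its weight
```

With a sigmoid, every channel weight lies between 0 and 1, so the branch can only attenuate channels relative to one another. That is one common reading of "normalized", not the only possible one.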
Spatial attention focuses on the spatial information of the feature map. At each pixel position, the mean over all channels of the feature map is taken to obtain a single-channel feature map, which then passes through an ordinary convolution layer with a kernel size of 7 to produce an output feature map. Finally, the output feature map is normalized and multiplied by the input feature map to obtain a new feature map. Together, the two attention mechanisms assign weights to the channels and pixels of the feature map according to its regions of interest, which improves feature extraction. The feature maps produced by the two attention branches are then concatenated along the channel dimension, and an ordinary convolution with a kernel size of 1 reduces the number of channels back to that of the feature map input to the residual structure.
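A NumPy sketch of the spatial attention branch follows. As before, the normalization is assumed to be a sigmoid and zero padding is assumed so that the 7 × 7 convolution preserves the spatial size. Neither detail is stated in the text, and the names `conv2d_same` and `spatial_attention` are hypothetical.

```python
import numpy as np


def conv2d_same(img, kernel):
    """Single-channel 2-D convolution (cross-correlation) with zero padding."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img, dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel tap.
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out


def spatial_attention(x, kernel, bias=0.0):
    """Reweight the pixels of a feature map x of shape (C, H, W).

    kernel: (7, 7) weights of the single-channel convolution.
    """
    mean_map = x.mean(axis=0)                         # mean over channels -> (H, W)
    logits = conv2d_same(mean_map, kernel) + bias     # 7x7 convolution
    mask = 1.0 / (1.0 + np.exp(-logits))              # sigmoid (assumed normalization)
    return x * mask[None, :, :]                       # same mask applied to every channel
```

The mask has one value per pixel and is shared across channels, which complements the per-channel weights of the other branch.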
The back part is an ordinary convolution with a kernel size of 3 that aggregates the output of the middle residual structure. Finally, the input and output of the Block are fused through a skip connection, which improves the interaction between texture information.
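The data flow of one Block (front convolution, residual middle part with split, two branches and concatenation, back convolution, outer skip connection) can be summarized by the wiring sketch below. It only shows how the pieces connect: the 3 × 3 convolutions, the depthwise convolution and the two attention branches are passed in as callables, activations are omitted because their placement is not specified in the text, and all function and parameter names are hypothetical.

```python
import numpy as np


def pointwise(x, w):
    """1x1 convolution without bias: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)


def middle_part(x, w_expand, w_reduce, depthwise, channel_branch, spatial_branch):
    """Lightweight residual structure in the middle of a Block."""
    y = pointwise(x, w_expand)              # 64 -> 128 channels
    y = depthwise(y)                        # 3x3 depthwise convolution (callable)
    a, b = np.split(y, 2, axis=0)           # channel split into two 64-channel maps
    y = np.concatenate([channel_branch(a), spatial_branch(b)], axis=0)
    y = pointwise(y, w_reduce)              # 128 -> 64 channels
    return x + y                            # skip connection of the residual structure


def block(x, front, middle, back):
    """Front conv -> middle residual structure -> back conv, plus Block-level skip."""
    return x + back(middle(front(x)))
```

The Body would then chain 16 such Blocks, apply one more convolution, and add the Body input to the result, as the description states.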
The Tail module consists of two ordinary convolution layers with a kernel size of 3 and a sub-pixel convolution layer. It upsamples the feature map output by the Body to amplify the low-resolution image. Sub-pixel convolution is a common upsampling module in image super-resolution and enlarges the image while preserving its original information. A network with this structure has a low computational cost and high precision and is well suited to super-resolution amplification of crack images.
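The rearrangement step at the heart of sub-pixel convolution can be written in a few lines of NumPy. A preceding convolution produces `scale * scale` times as many channels as the output needs, and the extra channels are then interleaved into a larger spatial grid. The channel ordering used below follows the convention common in deep learning frameworks and is an assumption, since the text does not specify it. The function name `pixel_shuffle` is hypothetical.

```python
import numpy as np


def pixel_shuffle(x, scale):
    """Rearrange (C * scale**2, H, W) into (C, H * scale, W * scale).

    Output pixel (c, h*scale + i, w*scale + j) is taken from input
    channel c*scale**2 + i*scale + j at position (h, w).
    """
    c2, h, w = x.shape
    c = c2 // (scale * scale)
    x = x.reshape(c, scale, scale, h, w)    # indices: (c, i, j, h, w)
    x = x.transpose(0, 3, 1, 4, 2)          # indices: (c, h, i, w, j)
    return x.reshape(c, h * scale, w * scale)


# A 12-channel 96 x 96 feature map becomes a 3-channel 192 x 192 image.
features = np.zeros((12, 96, 96), dtype=np.float32)
print(pixel_shuffle(features, 2).shape)     # (3, 192, 192)
```

No value is interpolated or discarded in this step, which is consistent with the statement that the image is enlarged while its information is kept.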
(3) For S3, the crack-oriented super-resolution network is trained.
The constructed crack super-resolution data set is input into the designed network. An L1 loss function and an Adam optimizer are selected, the number of training rounds (epochs) is set to 200, the initial learning rate is set to 0.0005, the input low-resolution training image size is set to 96 × 96, and the supervision image size is set to 192 × 192. The loss between the network output for a training image and the corresponding supervision image is computed. An adaptive learning rate is used to balance training speed against training precision, and the optimizer continuously optimizes the network to fit the mapping from low-resolution images to high-resolution images. After training is finished, the best-fitting model is saved.
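To make the roles of the L1 loss and the Adam optimizer concrete, the sketch below implements both in NumPy and applies them to a toy problem. The toy "network" is a single learnable gain and the data are random, so this is not the patent's training procedure. The Adam hyperparameters other than the 0.0005 learning rate (`beta1`, `beta2`, `eps`) are the commonly used defaults and are assumptions, and the 200 iterations in the loop are individual update steps chosen for illustration, not the 200 training rounds of the text.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error between the network output and the supervision image."""
    return float(np.abs(pred - target).mean())


def adam_step(param, grad, m, v, t, lr=5e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step index."""
    m = beta1 * m + (1.0 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad    # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                 # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v


# Toy stand-in for the network: prediction = gain * x, with target = 2 * x.
rng = np.random.default_rng(0)
x = rng.uniform(0.1, 1.0, size=(96, 96))
target = 2.0 * x
gain, m, v = 1.0, 0.0, 0.0

initial_loss = l1_loss(gain * x, target)
for step in range(1, 201):
    residual = gain * x - target
    grad = float((np.sign(residual) * x).mean())   # derivative of the L1 loss w.r.t. gain
    gain, m, v = adam_step(gain, grad, m, v, step)
final_loss = l1_loss(gain * x, target)

print(initial_loss > final_loss)   # True
```

Because Adam divides by the running gradient magnitude, each update moves the parameter by about one learning rate regardless of the gradient's scale. This is the adaptive behavior the paragraph above relies on to balance speed and precision.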
(4) For S4, super-resolution amplification of the crack image is performed.
The trained model file is used to map a low-resolution crack image to a high-resolution crack image. Compared with an image amplified by traditional interpolation, the high-resolution image produced by the network has better texture detail and visual quality. The image is finally saved and can be used for subsequent crack detection.
The above embodiments are merely illustrative of the technical solutions of the present invention, and are not restrictive. Those skilled in the art will understand that: the above embodiments do not limit the present invention in any way, and all similar technical solutions obtained by means of equivalent replacement or equivalent transformation belong to the protection scope of the present invention.

Claims (2)

1. A crack detection-oriented deep learning super-resolution method, characterized by comprising the following steps:
S1: constructing a crack image data set for network training;
S2: constructing a crack-oriented super-resolution network;
S3: training the crack-oriented super-resolution network;
for S2, a crack-oriented super-resolution network is constructed; the designed network is divided into three modules, Head, Body and Tail, in which relu denotes the activation function; the Head module consists of two ordinary convolution layers with a kernel size of 3, expands an input image with a resolution of 96 × 96 and 3 channels to 64 channels, and performs initial feature extraction; the Body module consists of 16 repeatedly stacked Blocks and 1 ordinary convolution layer, with a skip connection between the input and output of the Body module, and extracts refined texture features; each Block is divided into a front part, a middle part and a back part; the front part is an ordinary convolution with a kernel size of 3 that extracts the information output by the previous layer; the middle part is a lightweight residual structure, that is, a skip connection between its input and output fuses feature information; the residual structure first increases the number of channels to 128 through an ordinary convolution with a kernel size of 1 and then extracts information through a depthwise separable convolution with a kernel size of 3, in which each kernel has 1 channel and the number of kernels equals the number of feature-map channels; after the depthwise separable convolution, a channel split divides the feature map into 2 feature maps of the same size with 64 channels each, which are sent to two branches, channel attention and spatial attention, respectively; channel attention applies average pooling to each channel of the feature map to obtain a one-dimensional vector; an output vector is obtained through 2 ordinary convolution layers with a kernel size of 1, this vector expressing a weight for each channel of the feature map and giving larger weights to more important channels; finally, the output vector is normalized and multiplied by the input feature map to obtain a new feature map; spatial attention focuses on the spatial information of the feature map, and an average pooling operation over all channels of the feature map yields a single-channel feature map; an output feature map is then obtained through an ordinary convolution layer with a kernel size of 7; finally, the output feature map is normalized and multiplied by the input feature map to obtain a new feature map; the feature maps produced by the two attention mechanisms are then concatenated along the channel dimension, and an ordinary convolution with a kernel size of 1 reduces the number of channels back to that of the feature map input to the residual structure; the back part is an ordinary convolution with a kernel size of 3 that aggregates the output of the middle residual structure; finally, the input and output of the Block are fused through a skip connection, which improves the interaction between texture information; the Tail module consists of two ordinary convolution layers with a kernel size of 3 and a sub-pixel convolution layer and upsamples the feature map output by the Body;
for S3, the crack-oriented super-resolution network is trained; the constructed crack super-resolution data set is input into the designed network, an L1 loss function and an Adam optimizer are selected, the number of training rounds is set to 200, the initial learning rate is set to 0.0005, the input low-resolution training image size is set to 96 × 96, and the supervision image size is set to 192 × 192; the loss between the network output for a training image and the supervision image is computed, the fitting relation is optimized, and the best-fitting model is saved after training is finished; the trained model file is used to map a low-resolution crack image to a high-resolution crack image.
2. The crack detection-oriented deep learning super-resolution method according to claim 1, characterized in that:
for S1, a crack image data set for network training is constructed; some crack images are obtained from public resources, and a camera is used to photograph different types of cracks in different scenes as a supplement, which increases the diversity of crack types; the data are randomly rotated and light-supplemented to improve generalization; the processed images are then cropped to obtain the supervision images used as ground truth; the cropped images are then downsampled and noise is added to generate the low-resolution images for training.
Priority Applications (1)

Application Number: CN202210250155.5A
Priority Date: 2022-03-15
Filing Date: 2022-03-15
Title: Deep learning super-resolution method for crack detection
Legal Status: Pending
Publication: CN114612306A (en)

Publications (1)

Publication Number: CN114612306A
Publication Date: 2022-06-10

Family

ID=81863277

Country Status (1)

Country: CN (1)
Link: CN114612306A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115601242A (en) * 2022-12-13 2023-01-13 电子科技大学(Cn) Lightweight image super-resolution reconstruction method suitable for hardware deployment
CN115601242B (en) * 2022-12-13 2023-04-18 电子科技大学 Lightweight image super-resolution reconstruction method suitable for hardware deployment
CN116029943A (en) * 2023-03-28 2023-04-28 国科天成科技股份有限公司 Infrared image super-resolution enhancement method based on deep learning
CN116883246A (en) * 2023-09-06 2023-10-13 感跃医疗科技(成都)有限公司 Super-resolution method for CBCT image
CN116883246B (en) * 2023-09-06 2023-11-14 感跃医疗科技(成都)有限公司 Super-resolution method for CBCT image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination