CN114359654A - YOLOv4 concrete apparent disease detection method based on position relevance feature fusion - Google Patents

YOLOv4 concrete apparent disease detection method based on position relevance feature fusion

Info

Publication number
CN114359654A
Authority
CN
China
Prior art keywords
yolov4
disease
feature
fusion
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111478855.1A
Other languages
Chinese (zh)
Inventor
苏祖强
赵成
韩延
王诚诚
王鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202111478855.1A priority Critical patent/CN114359654A/en
Publication of CN114359654A publication Critical patent/CN114359654A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of computer vision, and particularly relates to a YOLOv4 concrete apparent disease detection method based on position relevance feature fusion. The method performs multi-scale fusion on each of the three layers of features output by the path aggregation network of YOLOv4 and carries out multi-scale adaptive feature fusion through a position relevance attention module, thereby constructing a YOLOv4 model based on position relevance feature fusion; the position and category of the diseases in collected disease images are marked with a marking tool, and the model is trained with the disease images and the marked disease information; concrete apparent disease images detected in real time are input into the trained model, which outputs images marking the disease types and positions after detection. According to the invention, a feature fusion module based on position relevance is added behind the original path aggregation network of YOLOv4, so that the effect of YOLOv4 feature fusion is enhanced and the detection precision of the target is improved.

Description

YOLOv4 concrete apparent disease detection method based on position relevance feature fusion
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method for detecting the apparent diseases of YOLOv4 concrete based on position relevance feature fusion.
Background
Worldwide, an increasing number of building structures are ageing, and most of them use concrete as a building material. Concrete is subjected over long periods to the combined action of various forces and, in extremely severe environments, to external attack such as chloride and sulfate corrosion, so that various diseases such as cracks, holes, honeycombs, pitted surfaces and exposed ribs inevitably occur. Civil engineering works composed of concrete structures, such as bridges and dams, place extremely high requirements on the integrity of the concrete structure; during long-term service, if they are not inspected and maintained in time, collapse accidents will eventually occur over time, causing irreparable losses. At present, concrete apparent disease detection relies mainly on manual inspection, but manual detection suffers from subjective detection results, high labor intensity and a low degree of automation. As the service life of bridges, dams and other structures increases, more and more of them need to be inspected, and manual disease detection can no longer meet the requirements of practical engineering applications, so a more efficient and more intelligent concrete apparent disease detection method is urgently needed.
Concrete apparent disease detection is essentially a target detection problem. With the rapid development of computer vision technology and artificial intelligence theory in recent years, researchers have combined computer vision techniques with artificial intelligence algorithms to realize concrete apparent disease detection. Deep learning is the latest research result in the field of artificial intelligence; compared with traditional machine learning methods, deep learning can realize adaptive feature extraction, and because its network structure is deeper, it can extract richer and more abstract features. At present, deep-learning-based target detection algorithms fall mainly into two categories. The first category comprises two-stage detection algorithms such as R-CNN and Faster R-CNN, which first obtain the regions to be detected through candidate-region generation algorithms and then send these regions to a convolutional neural network for category judgment, thereby detecting and localizing the target. The second category comprises single-stage target detection algorithms, such as SSD and YOLO, which achieve end-to-end target detection and localization. Because the on-site detection of concrete apparent diseases has requirements on both speed and precision, the single-stage approach is more suitable for this task. YOLOv4 is the fourth-generation algorithm of the YOLO series; it extracts and fuses image features at different scales through a CSPDarknet53 network and detects targets of different sizes on feature maps of different scales, achieving a good balance between detection speed and detection precision. However, in an actual concrete disease detection task the field environment is complex and the acquired images contain complex background information; when the single-stage detection algorithm YOLOv4 is used for detection, the feature information of the disease cannot be sufficiently extracted and missed detections easily occur.
Disclosure of Invention
In order to enrich the feature information extracted in the detection process of the YOLOv4 and further improve the detection precision, the invention provides a position relevance feature fusion-based YOLOv4 concrete apparent disease detection method, which comprises the following steps:
collecting concrete apparent disease images used for training a model, and constructing a YOLOv4 model based on position relevance feature fusion;
marking the position and the category of the disease by using a marking tool for the collected disease image;
training a model of YOLOv4 based on location relevance feature fusion according to the acquired disease image and the disease category and location information acquired by the labeling software;
detecting concrete apparent disease images in real time through a trained model based on position relevance feature fusion, classifying and positioning the detected diseases, and outputting images for marking disease types and positions after detection;
the position relevance feature fusion-based YOLOv4 model respectively performs multi-scale fusion on three layers of features output by a path aggregation network of YOLOv4, and performs feature multi-scale adaptive fusion through a position relevance attention-based module, specifically comprising:
the position relevance Attention module embeds position information into channel weights by using Coordinate Attention channel Attention from channel dimensions, and screens out channels sensitive to the position information during fusion;
the position relevance attention module performs spatially adaptive weight adjustment of the feature map by using Spatial Attention in the spatial dimension, so that more attention is paid to the position of the target during detection.
Further, the channel weight obtaining process includes:
two 1D global pooling operations are adopted, and each channel is aggregated along the horizontal coordinate and the vertical coordinate respectively to obtain two direction-aware features f_h and f_w;
Performing concatenate operation on the two obtained feature graphs, and then performing convolution operation to further extract feature information;
separating, from the extracted feature information, a feature map f'_h along the vertical direction and a feature map f'_w along the horizontal direction, and using an activation function to obtain the channel weights M_h and M_w for the vertical and horizontal directions:
M_h = σ(F_h(f'_h))
M_w = σ(F_w(f'_w))
wherein σ is the Sigmoid activation function, and F_h and F_w are two 1x1 convolution kernels used to adjust the vertical feature map f'_h and the horizontal feature map f'_w so that the output channel dimension is the same as the original number of input and output channels.
Further, the spatially adaptive weight adjustment of the feature map using Spatial Attention comprises:
generating two different features F_avg^s and F_max^s using maximum pooling and average pooling along the channel dimension, and connecting the two features;
extracting information from the two connected features by convolution calculation to obtain a spatial attention feature;
obtaining the spatial weight M_s by applying an activation function to the spatial attention feature, expressed as:
M_s = ρ(f^(7×7)([F_avg^s; F_max^s]))
wherein ρ(·) is the sigmoid activation function, f^(7×7) represents the convolution operation using a 7x7 convolution kernel, and F' is the input feature map from which F_avg^s and F_max^s are pooled.
Further, the training process of the position relevance feature fusion-based model of YOLOv4 includes:
dividing an input image into S × S squares;
predicting n bounding boxes in each square, and generating confidence of a detection target for the bounding boxes of each square;
for each boundary box, predicting the conditional probability of a certain class of detection target, and multiplying the conditional probability by the confidence coefficient to obtain the confidence coefficient of each boundary box for each specific class;
and calculating the difference between the output result and the labeling result by adopting the loss function of YOLOv4, and carrying out back-propagation training of the model through this difference.
Further, the confidence is expressed as:
Confidence = Pr(object) × IOU_pred^truth
wherein Confidence represents the confidence level, Pr(object) represents the probability that the bounding box contains the object to be detected, and IOU_pred^truth indicates the overlap ratio between the predicted bounding box and the labeled bounding box.
Further, the loss function of YOLOv4 is expressed as:
L = L_reg + L_conf + L_class
wherein L_reg is the frame regression loss, L_conf is the confidence loss, and L_class is the classification loss.
Further, the frame regression loss L_reg is expressed as:
L_reg = 1 - IOU(A, B) + ρ²(A, B)/m² + αv
v = (4/π²)·(arctan(w_gt/h_gt) - arctan(w/h))²
α = v/((1 - IOU(A, B)) + v)
wherein IOU(A, B) represents the intersection-over-union of the real box A and the predicted box B; ρ²(A, B) represents the squared Euclidean distance between the center points of the prediction box B and the real box A; m represents the diagonal distance of the region covering the real box A and the prediction box B; w and h denote the width and height of the prediction box, and w_gt and h_gt represent the true width and height; α is a balance coefficient, and v is a parameter for keeping the aspect ratio of the prediction box consistent.
Further, the confidence loss L_conf is expressed as:
L_conf = -Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^obj [C_i^j·log(Ĉ_i^j) + (1 - C_i^j)·log(1 - Ĉ_i^j)] - λ_noobj·Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^noobj [C_i^j·log(Ĉ_i^j) + (1 - C_i^j)·log(1 - Ĉ_i^j)]
wherein λ_noobj is the weight of the intersection-over-union error, s² is the number of grid cells, and B is the number of predicted bounding boxes per cell; I_ij^obj indicates whether a detection target exists in the j-th bounding box of the i-th cell, being 1 if it does and 0 if it does not; Ĉ_i^j is the predicted confidence, and C_i^j is the actual confidence.
Further, the classification loss L_class is expressed as:
L_class = -Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^obj Σ_{c∈classes} [P_i^j(c)·log(P̂_i^j(c)) + (1 - P_i^j(c))·log(1 - P̂_i^j(c))]
wherein s² is the number of divided cells; I_ij^obj indicates whether a detection target exists in the j-th bounding box of the i-th cell, being 1 if it does and 0 if it does not; P_i^j is the actual probability of the category to which the object in the cell belongs, and P̂_i^j is the predicted probability.
According to the invention, a position relevance-based feature fusion module is added behind the original Path Aggregation Network (PANet) of YOLOv4, so that the effect of YOLOv4 feature fusion is enhanced and the detection precision of the target is improved.
Drawings
FIG. 1 is a flow chart of a method for detecting apparent diseases of YOLOv4 concrete based on location relevance feature fusion, according to the invention;
FIG. 2 is a training flowchart of the model of YOLOv4 based on location relevance feature fusion according to the present invention;
FIG. 3 is a block diagram of a location correlation feature fusion module according to the present invention;
FIG. 4 is a diagram of a YOLOv4 network structure based on location relevance feature fusion, which is adopted by the present invention;
FIG. 5 is a diagram of the detection effect of the present invention and the prior art, wherein (a) is an SSD detection effect diagram, (b) is a Faster RCNN detection effect diagram, (c) is an original YOLOv4 detection effect diagram, and (d) is a YOLOv4 model detection effect diagram based on location relevance feature fusion, namely the present invention;
fig. 6 is a diagram of the detection effect of the present invention and the prior art, wherein (a) is an SSD detection effect diagram, (b) is a Faster RCNN detection effect diagram, (c) is an original YOLOv4 detection effect diagram, and (d) is a detection effect diagram of the YOLOv4 model based on location relevance feature fusion, namely the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a position relevance feature fusion-based YOLOv4 concrete apparent disease detection method, which comprises the following steps:
collecting concrete apparent disease images used for training a model, and constructing a YOLOv4 model based on position relevance feature fusion;
marking the position and the category of the disease by using a marking tool for the collected disease image;
training a model of YOLOv4 based on location relevance feature fusion according to the acquired disease image and the disease category and location information acquired by the labeling software;
detecting concrete apparent disease images in real time through a trained model based on position relevance feature fusion, classifying and positioning the detected diseases, and outputting images for marking disease types and positions after detection;
the position relevance feature fusion-based YOLOv4 model respectively performs multi-scale fusion on three layers of features output by a path aggregation network of YOLOv4, and performs feature multi-scale adaptive fusion through a position relevance attention-based module, specifically comprising:
the position relevance Attention module embeds position information into channel weights by using Coordinate Attention channel Attention from channel dimensions, and screens out channels sensitive to the position information during fusion;
the position relevance Attention module performs Spatial adaptive weight adjustment on the feature map by using Spatial Attention space Attention from a space dimension, so that the position of the target is more concerned during detection.
In this embodiment, as shown in fig. 1, the method for detecting the apparent diseases of the YOLOv4 concrete based on location relevance feature fusion includes the following steps:
Step 1, acquiring apparent disease images of concrete buildings.
Step 2, finely marking the disease areas in the images by manual labeling with a marking tool, recording the position and category information of the diseases in each image.
Step 3, training the YOLOv4 model based on position relevance feature fusion according to the collected disease images and the disease category and position information obtained through the labeling software.
The position relevance feature fusion-based YOLOv4 model adopted by the invention performs multi-scale fusion on each of the three layers of features output by the Path Aggregation Network (PANet), performs multi-scale adaptive feature fusion through a position relevance attention module, and then performs output prediction; the improved overall network structure is shown in FIG. 4. The position relevance attention module first uses Coordinate Attention (CA) channel attention in the channel dimension to embed position information into the channel weights and to screen out channels sensitive to position information during fusion, thereby preventing the loss of position information during feature-map dimension transformation from degrading the feature fusion effect. It then uses Spatial Attention (SA) in the spatial dimension to carry out spatially adaptive weight adjustment on the feature map, so that more attention is paid to the position of the target during detection, ultimately improving the detection precision of the target.
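To make the described data flow easier to follow, a minimal PyTorch-style sketch is given below. It is illustrative only and not the patented implementation: the class name MultiScaleFusion, the channel counts, the nearest-neighbour resizing and the 1x1/3x3 fusion convolutions are all assumptions; the attention stage is left pluggable so that the CA and SA sketches appearing later in this description can be substituted for nn.Identity().

```python
# Illustrative sketch (assumptions, not the patent's code): fuse the three PANet
# output levels at one target scale, then hand the result to an attention stage.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    def __init__(self, channels_per_level=(128, 256, 512), out_channels=256,
                 attention=None):
        super().__init__()
        # 1x1 convolutions bring every level to a common channel count.
        self.align = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in channels_per_level)
        self.fuse = nn.Conv2d(out_channels * len(channels_per_level),
                              out_channels, kernel_size=3, padding=1)
        self.attention = attention if attention is not None else nn.Identity()

    def forward(self, features, target_index):
        # features: the three PANet outputs (e.g. 52x52, 26x26, 13x13 for a 416x416 input)
        target_size = features[target_index].shape[-2:]
        aligned = [F.interpolate(conv(f), size=target_size, mode='nearest')
                   for conv, f in zip(self.align, features)]
        fused = self.fuse(torch.cat(aligned, dim=1))   # multi-scale fusion
        return self.attention(fused)                   # position-relevance attention stage

# Example: fuse toward the middle 26x26 scale.
p3, p4, p5 = (torch.randn(1, 128, 52, 52), torch.randn(1, 256, 26, 26),
              torch.randn(1, 512, 13, 13))
print(MultiScaleFusion()([p3, p4, p5], target_index=1).shape)  # (1, 256, 26, 26)
```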
The location correlation attention module structure is shown in fig. 3, and the module includes a channel attention module and a space attention module, namely:
1) channel attention module
The module adopts two 1D global pooling operations that aggregate each channel along the horizontal and vertical coordinates respectively, obtaining two direction-aware features f_h and f_w. The two feature maps are then concatenated, and a convolution operation is carried out to further extract feature information, from which a feature map f'_h along the vertical direction and a feature map f'_w along the horizontal direction are separated. Finally, an activation function is used to obtain the channel weights M_h and M_w for the vertical and horizontal directions, output as follows:
M_h = σ(F_h(f'_h))
M_w = σ(F_w(f'_w))
wherein σ is the Sigmoid activation function, and F_h and F_w are two 1x1 convolution kernels used to adjust the channel dimension of the output to be the same as the original number of input and output channels.
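The channel-weight computation just described (axis-wise pooling, concatenation and shared convolution, splitting, then the 1x1 convolutions F_h and F_w followed by a Sigmoid) can be sketched as follows. This is an illustrative reimplementation under those formulas, not the patented code, and the intermediate channel reduction ratio is an assumed hyperparameter.

```python
# Illustrative Coordinate Attention sketch following M_h = σ(F_h(f'_h)) and
# M_w = σ(F_w(f'_w)); the reduction ratio is an assumption.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # aggregate along width  -> f_h
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # aggregate along height -> f_w
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)  # F_h
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)  # F_w

    def forward(self, x):
        n, c, h, w = x.shape
        f_h = self.pool_h(x)                          # (n, c, h, 1)
        f_w = self.pool_w(x).permute(0, 1, 3, 2)      # (n, c, w, 1)
        y = self.act(self.conv1(torch.cat([f_h, f_w], dim=2)))  # concatenate + convolve
        f_h2, f_w2 = torch.split(y, [h, w], dim=2)    # separate the two direction maps
        m_h = torch.sigmoid(self.conv_h(f_h2))                      # M_h, shape (n, c, h, 1)
        m_w = torch.sigmoid(self.conv_w(f_w2.permute(0, 1, 3, 2)))  # M_w, shape (n, c, 1, w)
        return x * m_h * m_w   # position-aware channel re-weighting
```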
2) Space attention module
The module uses maximum pooling (Max pooling) and average pooling (Avg pooling) along the channel dimension to generate two different features F_avg^s and F_max^s, then connects the two features and further extracts information through a convolution calculation to obtain a spatial attention feature U_s; an activation function is then applied to obtain the spatial weight M_s, output as follows:
M_s = ρ(f^(7×7)([F_avg^s; F_max^s]))
wherein ρ(·) is the sigmoid activation function and f^(7×7) represents the convolution operation using a 7x7 convolution kernel.
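A corresponding sketch of the spatial attention module (channel-wise average and max pooling, concatenation, a 7x7 convolution and a sigmoid) is given below; it follows the CBAM-style formulation implied by the description and is not the patented code.

```python
# Illustrative spatial attention sketch following M_s = ρ(f7x7([F_avg^s; F_max^s])).
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        f_avg = torch.mean(x, dim=1, keepdim=True)    # average pooling over channels
        f_max, _ = torch.max(x, dim=1, keepdim=True)  # max pooling over channels
        u = torch.cat([f_avg, f_max], dim=1)          # connect the two features
        m_s = torch.sigmoid(self.conv(u))             # spatial weight M_s
        return x * m_s                                # spatially re-weighted feature map
```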
In this embodiment, the acquired concrete apparent disease images are normalized and then reduced or enlarged to a size of 416 × 416 to obtain the processed images. The data are randomly shuffled, after which 80% of the disease images are assigned to a training set and 20% to a test set for training and testing the model.
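As a hedged illustration of this preprocessing (normalization, resizing to 416 × 416, random shuffling, 80/20 split), a small sketch follows; the directory layout, file format and use of OpenCV are assumptions.

```python
# Sketch of the described preprocessing; paths and libraries are assumed.
import glob
import random
import cv2
import numpy as np

def load_and_preprocess(path, size=416):
    img = cv2.imread(path)                    # BGR image
    img = cv2.resize(img, (size, size))       # reduce or enlarge to 416x416
    return img.astype(np.float32) / 255.0     # normalization to [0, 1]

paths = sorted(glob.glob("disease_images/*.jpg"))  # assumed directory layout
random.seed(0)
random.shuffle(paths)                              # random data shuffling
split = int(0.8 * len(paths))                      # 80% training / 20% test
train_set = [load_and_preprocess(p) for p in paths[:split]]
test_set = [load_and_preprocess(p) for p in paths[split:]]
```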
When the position and category of the diseases in the collected images are marked with a marking tool, the tool is used to annotate each target area in the image with a rectangular box, yielding the center-point coordinates, width, height and category of each annotated rectangle.
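For illustration, such rectangle annotations (center-point coordinates, width, height, category) are often converted to the normalized form used by YOLO-style training; the following small sketch shows one possible conversion, with the field layout assumed.

```python
# Sketch: convert an absolute-pixel rectangle (center x, center y, width, height,
# class id) into normalized YOLO-style label values.  The layout is an assumption.
def to_yolo_label(cx, cy, bw, bh, class_id, img_w, img_h):
    return (class_id, cx / img_w, cy / img_h, bw / img_w, bh / img_h)

# Example: a 200x80 px crack box centered at (520, 310) in a 1920x1080 image.
print(to_yolo_label(520, 310, 200, 80, class_id=1, img_w=1920, img_h=1080))
```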
Referring to fig. 2, the training flow of the position relevance feature fusion-based YOLOv4 model of the present invention includes:
A1, dividing the input image into S × S squares by the model;
A2, predicting n bounding boxes in each square, and generating for each bounding box a confidence that it contains a detection target, expressed as:
Confidence = Pr(object) × IOU_pred^truth
wherein Confidence represents the confidence level, Pr(object) represents the probability that the bounding box contains the object to be detected, and IOU_pred^truth represents the overlap ratio between the predicted bounding box and the labeled bounding box;
A3, for each bounding box, predicting the conditional probability Pr(class_i | object) of it containing a certain class of detection target, where Pr(class_i | object) represents the probability that the bounding box contains the i-th class of detection target;
A4, multiplying the Confidence obtained in step A2 by the conditional probability Pr(class_i | object) obtained in step A3 to obtain the confidence of each bounding box for each specific class;
and A5, calculating by adopting a loss function of YOLOv4 to obtain a positioning frame of each detection target, wherein the loss function is used for calculating the direct difference between the output result of the model and the labeling result. The YOLOv4 loss function mainly comprises three parts, namely border regression loss, confidence regression loss and classification loss, and is specifically represented by the following formula:
L=Lreg+Lconf+Lclass
In the above formula, L_reg is the border regression (coordinate) loss, with the specific formula:
L_reg = 1 - IOU(A, B) + ρ²(A, B)/m² + αv
v = (4/π²)·(arctan(w_gt/h_gt) - arctan(w/h))²
α = v/((1 - IOU(A, B)) + v)
wherein IOU(A, B) represents the intersection-over-union of the real box and the predicted box; ρ²(A, B) represents the squared Euclidean distance between the center points of the prediction box and the real box; m represents the diagonal distance of the region covering the real box and the prediction box; w and h denote the width and height of the prediction box, and w_gt and h_gt represent the true width and height; α is a balance coefficient, and v is a parameter for keeping the aspect ratio of the prediction box consistent.
L_conf is the confidence loss, with the specific formula:
L_conf = -Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^obj [C_i^j·log(Ĉ_i^j) + (1 - C_i^j)·log(1 - Ĉ_i^j)] - λ_noobj·Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^noobj [C_i^j·log(Ĉ_i^j) + (1 - C_i^j)·log(1 - Ĉ_i^j)]
wherein λ_noobj is the weight of the intersection-over-union error and is set to 0.5; I_ij^obj indicates whether a detection target exists in the j-th bounding box of the i-th cell; Ĉ_i^j is the predicted confidence, and C_i^j is the actual confidence.
L_class is the classification loss, for which a binary cross-entropy loss function is adopted, with the specific formula:
L_class = -Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^obj Σ_{c∈classes} [P_i^j(c)·log(P̂_i^j(c)) + (1 - P_i^j(c))·log(1 - P̂_i^j(c))]
wherein P_i^j is the actual probability of the category to which the object in the cell belongs, and P̂_i^j is the predicted probability.
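The following is the illustrative sketch of the loss terms referred to in step A5. It is a simplification under stated assumptions: it evaluates a single matched box pair rather than the full S × S × B grid, uses a CIoU-style regression term consistent with the definitions above, and uses binary cross-entropy for the confidence and classification terms; the tensor layouts and function names are assumptions, not the patented code.

```python
# Simplified, illustrative loss sketch for one matched box; not the full grid loss.
import math
import torch

def ciou_loss(pred, target):
    """pred, target: (x1, y1, x2, y2) tensors for a single box each."""
    ix1, iy1 = torch.max(pred[0], target[0]), torch.max(pred[1], target[1])
    ix2, iy2 = torch.min(pred[2], target[2]), torch.min(pred[3], target[3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_p + area_t - inter + 1e-9)          # IOU(A, B)
    cp, ct = (pred[:2] + pred[2:]) / 2, (target[:2] + target[2:]) / 2
    rho2 = ((cp - ct) ** 2).sum()                           # squared center distance
    ex1, ey1 = torch.min(pred[0], target[0]), torch.min(pred[1], target[1])
    ex2, ey2 = torch.max(pred[2], target[2]), torch.max(pred[3], target[3])
    m2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9         # enclosing-box diagonal squared
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wgt, hgt = target[2] - target[0], target[3] - target[1]
    v = (4 / math.pi ** 2) * (torch.atan(wgt / hgt) - torch.atan(w / h)) ** 2
    alpha = v / (1 - iou + v + 1e-9)                        # balance coefficient
    return 1 - iou + rho2 / m2 + alpha * v

bce = torch.nn.BCELoss()    # inputs must already be probabilities in [0, 1]

def total_loss(pred_box, true_box, pred_conf, true_conf, pred_cls, true_cls):
    l_reg = ciou_loss(pred_box, true_box)   # border regression loss
    l_conf = bce(pred_conf, true_conf)      # confidence loss (binary cross-entropy)
    l_class = bce(pred_cls, true_cls)       # classification loss (binary cross-entropy)
    return l_reg + l_conf + l_class
```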
For the input images: the acquired concrete apparent disease images are first normalized, then reduced or enlarged to a size of 416x416 to obtain the processed images; the data are randomly shuffled, and 80% of the disease images are divided into a training set and 20% into a test set for training and testing the model.
Step 4, detecting the concrete apparent disease images in real time according to the trained model and the concrete apparent disease images to be detected acquired on site, classifying and locating the detected diseases, and outputting images marking the disease category and position after detection.
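Step 4 can be illustrated with a small inference sketch; the checkpoint name, the assumption that the model returns (x1, y1, x2, y2, confidence, class) rows, and the OpenCV drawing calls are all illustrative assumptions about a typical PyTorch deployment, not the patented system.

```python
# Sketch of the on-site detection step: load a trained model, run one image,
# and draw the predicted disease class and box.  Names and I/O are assumptions.
import cv2
import torch

CLASS_NAMES = ["exfoliation", "crack", "hole", "honeycomb", "exposed rib"]

model = torch.load("l_yolov4_trained.pt", map_location="cpu")  # assumed checkpoint
model.eval()

img = cv2.imread("site_image.jpg")
vis = cv2.resize(img, (416, 416))                              # draw on the 416x416 view
inp = torch.from_numpy(vis.astype("float32") / 255.0).permute(2, 0, 1).unsqueeze(0)

with torch.no_grad():
    detections = model(inp)   # assumed to return rows of (x1, y1, x2, y2, conf, cls)

for x1, y1, x2, y2, conf, cls in detections.tolist():
    cv2.rectangle(vis, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)
    cv2.putText(vis, f"{CLASS_NAMES[int(cls)]} {conf:.2f}", (int(x1), int(y1) - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
cv2.imwrite("site_image_detected.jpg", vis)
```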
To verify the effectiveness of the method, an experiment was carried out on common concrete apparent diseases. A dataset of 1751 concrete disease images collected on site was established, with main disease types including exfoliation, cracks, holes, honeycombs and exposed ribs, and the disease positions and category information were labeled using image labeling software. The training and test datasets were divided at a ratio of 8:2; details of the dataset are shown in Table 1 below:
TABLE 1 Concrete apparent disease data set
Disease category    Training data labels    Test data labels    Total
Exfoliation         644                     145                 789
Crack               794                     157                 951
Hole                600                     160                 760
Honeycomb           481                     115                 596
Exposed rib         562                     161                 723
To verify the effect of the proposed position relevance feature fusion-based YOLOv4 model, training and testing were carried out on the established dataset, and the performance of the improved model was evaluated using the mean average precision (mAP), recall and precision commonly used as evaluation indexes for target detection. Corresponding ablation experiments were performed in five groups: the original YOLOv4 model; SA-YOLOv4 (Spatial Attention-YOLOv4), with only the SA spatial attention module added after the PANet output; CA-YOLOv4 (Coordinate Attention-YOLOv4), with only the CA channel attention module added; M-YOLOv4 (Multiscale-YOLOv4), with only multi-scale fusion added; and L-YOLOv4 (Location correlation fusion-YOLOv4), the YOLOv4 model based on position relevance feature fusion. The experimental results are shown in Table 2 below.
TABLE 2 results of the experiment
Network model mAP recall precision
YOLOv4 75.34% 57.14% 86.67%
SA-YOLOv4 75.91% 58.42% 86.76%
CA-YOLOv4 76.06% 58.19% 86.84%
M-YOLOv4 76.43% 58.65% 87.15%
L-YOLOv4 77.16% 59.34% 87.35%
As can be seen from Table 2, adding multi-scale fusion, the CA channel attention module or the SA spatial attention module behind the path aggregation network PANet each improves the detection effect of the original YOLOv4 model. The position relevance feature fusion-based YOLOv4 model, which combines all three ideas, shows the largest improvement: compared with the original YOLOv4, its mean average precision mAP is higher by 1.82%, its recall is higher by 2.2% and its precision is higher by 0.68%, which shows that the position relevance feature fusion-based YOLOv4 effectively improves the detection accuracy and reduces the missed detection rate.
Meanwhile, the position relevance feature fusion-based YOLOv4 was compared with the classic target detection algorithms SSD and Faster RCNN in terms of mean average precision (mAP) and detection speed, with the results shown in Table 3 below.
TABLE 3 Experimental results
(Table 3, comparing SSD, Faster RCNN, the original YOLOv4 and L-YOLOv4 in terms of mAP and detection speed, is provided as an image in the original document.)
As can be seen from Table 3, on the self-constructed concrete apparent disease dataset used in this experiment, the mAP of the original YOLOv4 reached 75.34%. The mAP of the position relevance feature fusion-based YOLOv4 is 1.82% higher than that of the original YOLOv4, while the mAP values of SSD and Faster RCNN on this dataset are lower by 5.66% and 2.15% respectively. The position relevance feature fusion-based YOLOv4 achieves a good balance of precision and speed in concrete apparent disease detection, which shows that with a position relevance-based feature fusion module added behind the Path Aggregation Network (PANet) of the original YOLOv4, the network extracts richer feature information through multi-scale adaptive feature fusion and improves the detection precision.
In Figs. 5-6, (a), (b), (c) and (d) respectively show the SSD detection results, the Faster RCNN detection results, the original YOLOv4 detection results and the detection results of the YOLOv4 model based on position relevance feature fusion. As can be seen from Figs. 5-6, SSD, Faster RCNN and the original YOLOv4 are not ideal on the disease images collected on site, exhibiting missed and false detections; L-YOLOv4 performs position-relevance-based multi-scale feature fusion after the path aggregation network, makes full use of feature information from different scales and enriches the context information of the target, so that target detection is more accurate and the problems of missed and false detection in the concrete apparent disease detection task are effectively alleviated.
Aiming at the concrete apparent disease detection scenario, the invention provides a YOLOv4 concrete apparent disease detection method based on position relevance feature fusion; a position relevance-based feature fusion module is added behind the Path Aggregation Network (PANet) of the original YOLOv4, so that the effect of YOLOv4 feature fusion is enhanced and the target detection effect is improved.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A position relevance feature fusion-based YOLOv4 concrete apparent disease detection method is characterized by comprising the following steps:
collecting concrete apparent disease images used for training a model, and constructing a YOLOv4 model based on position relevance feature fusion;
marking the position and the category of the disease by using a marking tool for the collected disease image;
training a model of YOLOv4 based on location relevance feature fusion according to the acquired disease image and the disease category and location information acquired by the labeling software;
detecting concrete apparent disease images in real time through a trained model based on position relevance feature fusion, classifying and positioning the detected diseases, and outputting images for marking disease types and positions after detection;
the position relevance feature fusion-based YOLOv4 model respectively performs multi-scale fusion on three layers of features output by a path aggregation network of YOLOv4, and performs feature multi-scale adaptive fusion through a position relevance attention-based module, specifically comprising:
the position relevance Attention module embeds position information into channel weights by using Coordinate Attention channel Attention from channel dimensions, and screens out channels sensitive to the position information during fusion;
the position relevance attention module performs spatially adaptive weight adjustment of the feature map by using Spatial Attention in the spatial dimension, so that more attention is paid to the position of the target during detection.
2. The method for detecting the apparent diseases of the YOLOv4 concrete based on the location correlation feature fusion as claimed in claim 1, wherein the channel weight obtaining process comprises:
two 1D global pooling operations are adopted, and each channel is aggregated along the horizontal coordinate and the vertical coordinate respectively to obtain two direction-aware features f_h and f_w;
Performing concatenate operation on the two obtained feature graphs, and then performing convolution operation to further extract feature information;
separating, from the extracted feature information, a feature map f'_h along the vertical direction and a feature map f'_w along the horizontal direction, and using an activation function to obtain the channel weights M_h and M_w for the vertical and horizontal directions:
M_h = σ(F_h(f'_h))
M_w = σ(F_w(f'_w))
wherein σ is the Sigmoid activation function, and F_h and F_w are two 1x1 convolution kernels used to adjust the vertical feature map f'_h and the horizontal feature map f'_w so that the output channel dimension is the same as the original number of input and output channels.
3. The method for detecting the apparent disease of the YOLOv4 concrete based on the location-related feature fusion as claimed in claim 1, wherein the spatially adaptive weight adjustment of the feature map using the Spatial Attention space comprises:
generating two different features F_avg^s and F_max^s using maximum pooling and average pooling along the channel dimension, and connecting the two features;
extracting information from the two connected features by convolution calculation to obtain a spatial attention feature;
obtaining the spatial weight M_s by applying an activation function to the spatial attention feature, expressed as:
M_s = ρ(f^(7×7)([F_avg^s; F_max^s]))
wherein ρ(·) is the sigmoid activation function, f^(7×7) represents the convolution operation using a 7x7 convolution kernel, and F' is the input feature map from which F_avg^s and F_max^s are pooled.
4. The method for detecting the apparent concrete diseases based on the position relevance feature fusion YOLOv4 as claimed in claim 1, wherein the training process of the model based on the position relevance feature fusion YOLOv4 comprises:
dividing an input image into S × S squares;
predicting n bounding boxes in each square, and generating confidence of a detection target for the bounding boxes of each square;
for each boundary box, predicting the conditional probability of a certain class of detection target, and multiplying the conditional probability by the confidence coefficient to obtain the confidence coefficient of each boundary box for each specific class;
and calculating the difference between the output result and the labeling result by adopting the loss function of YOLOv4, and carrying out back-propagation training of the model through this difference.
5. The method for detecting the apparent diseases of the YOLOv4 concrete based on the location relevance feature fusion as claimed in claim 4, wherein the confidence coefficient is expressed as:
Confidence = Pr(object) × IOU_pred^truth
wherein Confidence represents the confidence level, Pr(object) represents the probability that the bounding box contains the object to be detected, and IOU_pred^truth indicates the overlap ratio between the predicted bounding box and the labeled bounding box.
6. The method for detecting the apparent concrete diseases based on the position relevance feature fusion of YOLOv4 as claimed in claim 1, wherein the loss function of YOLOv4 is expressed as:
L = L_reg + L_conf + L_class
wherein L_reg is the frame regression loss, L_conf is the confidence loss, and L_class is the classification loss.
7. The method for detecting the apparent diseases of the YOLOv4 concrete based on the location correlation feature fusion as claimed in claim 6, wherein the frame regression loss L_reg is expressed as:
L_reg = 1 - IOU(A, B) + ρ²(A, B)/m² + αv
v = (4/π²)·(arctan(w_gt/h_gt) - arctan(w/h))²
α = v/((1 - IOU(A, B)) + v)
wherein IOU(A, B) represents the intersection-over-union of the real box A and the predicted box B; ρ²(A, B) represents the squared Euclidean distance between the center points of the prediction box B and the real box A; m represents the diagonal distance of the region covering the real box A and the prediction box B; w and h denote the width and height of the prediction box, and w_gt and h_gt represent the true width and height; α is a balance coefficient, and v is a parameter for keeping the aspect ratio of the prediction box consistent.
8. The method for detecting the apparent diseases of the YOLOv4 concrete based on the fusion of the position relevance features of claim 6, wherein the confidence loss L_conf is expressed as:
L_conf = -Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^obj [C_i^j·log(Ĉ_i^j) + (1 - C_i^j)·log(1 - Ĉ_i^j)] - λ_noobj·Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^noobj [C_i^j·log(Ĉ_i^j) + (1 - C_i^j)·log(1 - Ĉ_i^j)]
wherein λ_noobj is the weight of the intersection-over-union error, s² is the number of grid cells, and B is the number of predicted bounding boxes per cell; I_ij^obj indicates whether a detection target exists in the j-th bounding box of the i-th cell, being 1 if it does and 0 if it does not; Ĉ_i^j is the predicted confidence, and C_i^j is the actual confidence.
9. The method for detecting the apparent diseases of the YOLOv4 concrete based on the fusion of the position correlation characteristics according to claim 6, wherein the classification loss L_class is expressed as:
L_class = -Σ_{i=0}^{s²} Σ_{j=0}^{B} I_ij^obj Σ_{c∈classes} [P_i^j(c)·log(P̂_i^j(c)) + (1 - P_i^j(c))·log(1 - P̂_i^j(c))]
wherein s² is the number of divided cells; I_ij^obj indicates whether a detection target exists in the j-th bounding box of the i-th cell, being 1 if it does and 0 if it does not; P_i^j is the actual probability of the category to which the object in the cell belongs, and P̂_i^j is the predicted probability.
CN202111478855.1A 2021-12-06 2021-12-06 YOLOv4 concrete apparent disease detection method based on position relevance feature fusion Pending CN114359654A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111478855.1A CN114359654A (en) 2021-12-06 2021-12-06 YOLOv4 concrete apparent disease detection method based on position relevance feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111478855.1A CN114359654A (en) 2021-12-06 2021-12-06 YOLOv4 concrete apparent disease detection method based on position relevance feature fusion

Publications (1)

Publication Number Publication Date
CN114359654A true CN114359654A (en) 2022-04-15

Family

ID=81097422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111478855.1A Pending CN114359654A (en) 2021-12-06 2021-12-06 YOLOv4 concrete apparent disease detection method based on position relevance feature fusion

Country Status (1)

Country Link
CN (1) CN114359654A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117011688A (en) * 2023-07-11 2023-11-07 广州大学 Method, system and storage medium for identifying diseases of underwater structure
CN117011688B (en) * 2023-07-11 2024-03-08 广州大学 Method, system and storage medium for identifying diseases of underwater structure
CN117351356A (en) * 2023-10-20 2024-01-05 三亚中国农业科学院国家南繁研究院 Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle
CN117351356B (en) * 2023-10-20 2024-05-24 三亚中国农业科学院国家南繁研究院 Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle

Similar Documents

Publication Publication Date Title
CN110059554B (en) Multi-branch target detection method based on traffic scene
CN110569901B (en) Channel selection-based countermeasure elimination weak supervision target detection method
CN111626128B (en) Pedestrian detection method based on improved YOLOv3 in orchard environment
CN111275688A (en) Small target detection method based on context feature fusion screening of attention mechanism
CN112287788A (en) Pedestrian detection method based on improved YOLOv3 and improved NMS
CN106682697A (en) End-to-end object detection method based on convolutional neural network
CN113409314B (en) Unmanned aerial vehicle visual detection and evaluation method and system for corrosion of high-altitude steel structure
CN108038846A (en) Transmission line equipment image defect detection method and system based on multilayer convolutional neural networks
CN110348437B (en) Target detection method based on weak supervised learning and occlusion perception
CN110751195B (en) Fine-grained image classification method based on improved YOLOv3
CN110738355A (en) urban waterlogging prediction method based on neural network
CN104899883A (en) Indoor object cube detection method for depth image scene
CN114399719B (en) Transformer substation fire video monitoring method
CN111985325A (en) Aerial small target rapid identification method in extra-high voltage environment evaluation
CN111539422A (en) Flight target cooperative identification method based on fast RCNN
CN114299011A (en) Remote sensing target quadrilateral frame rapid detection method based on deep learning
CN111898419A (en) Partition landslide detection system and method based on cascade deep convolutional neural network
CN116662468A (en) Urban functional area identification method and system based on geographic object space mode characteristics
CN114359654A (en) YOLOv4 concrete apparent disease detection method based on position relevance feature fusion
Yang et al. C-RPNs: Promoting object detection in real world via a cascade structure of Region Proposal Networks
CN117171533B (en) Real-time acquisition and processing method and system for geographical mapping operation data
CN114003623A (en) Similar path typhoon retrieval method
CN111738086B (en) Composition method and system for point cloud segmentation and point cloud segmentation system and device
Guo et al. Safety monitoring in construction site based on unmanned aerial vehicle platform with computer vision using transfer learning techniques
Wangli et al. Foxtail Millet ear detection approach based on YOLOv4 and adaptive anchor box adjustment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination