CN114998185A - Mud bleeding rate real-time detection method based on YOLOv5 model

Info

Publication number
CN114998185A
Authority
CN
China
Prior art keywords
mud
yolov5
detection
bleeding
model
Prior art date
Legal status
Pending
Application number
CN202210241021.7A
Other languages
Chinese (zh)
Inventor
周强
丁燕
丁小华
Current Assignee
HUBEI CHIDGE TECHNOLOGY CO LTD
Original Assignee
HUBEI CHIDGE TECHNOLOGY CO LTD
Priority date
Filing date
Publication date
Application filed by HUBEI CHIDGE TECHNOLOGY CO LTD
Priority to CN202210241021.7A
Publication of CN114998185A
Legal status: Pending (current)

Classifications

    • G06T 7/0004 Industrial image inspection (image analysis, inspection of images)
    • G01N 21/84 Optical investigation of materials, systems specially adapted for particular applications
    • G01N 2021/8472 Investigation of composite materials
    • G06N 3/082 Neural network learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/084 Neural network learning methods, backpropagation, e.g. using gradient descent
    • G06T 5/73 Image enhancement or restoration, deblurring, sharpening
    • G06T 2207/10016 Image acquisition modality, video or image sequence
    • G06T 2207/20081 Special algorithmic details, training or learning
    • G06T 2207/20084 Special algorithmic details, artificial neural networks [ANN]
    • G06T 2207/30132 Industrial image inspection, masonry or concrete

Abstract

The invention discloses a real-time mud bleeding rate detection method based on the YOLOv5 model. A camera fixed in a mud recognition instrument shoots several segments of mud bleeding footage, and long-exposure frames are used to obtain a large number of pictures containing clearly visible bleeding mud; a data set is obtained after the image information is labeled and processed; transfer learning is used to load part of the pre-training weights of a YOLOv5 network and construct a YOLOv5s detection framework; the network is then trained, the hyper-parameters are adjusted according to the detection results, and the YOLOv5-s loss function is continuously optimized until the optimal network is obtained. The method detects the mud bleeding rate accurately and in real time and removes image blur caused by poor shooting light, which facilitates further mud bleeding detection; compared with previous manual identification, detection speed and precision are markedly improved, which favors the popularization of online mud bleeding identification.

Description

Mud bleeding rate real-time detection method based on YOLOv5 model
Technical Field
The invention relates to the technical field of machine vision application, in particular to a mud bleeding rate real-time detection method based on a YOLOv5 model.
Background
Automatic identification of the mud bleeding rate is an important task: according to the notice on the test regulations for cement and cement concrete in highway engineering, four cement slurry test methods were added, namely (i) the test of the bleeding rate between steel wires of the cement paste body, (ii) the test of the free bleeding rate and free expansion rate of the cement paste body, (iii) the test of the filling degree of the cement paste body, and (iv) the test of the pressure bleeding rate of the cement paste body.
At present, most cement-paste-related test quantities are measured by manual visual statistics, which is time-consuming and labor-intensive and is prone to error in complex and variable scenes.
In recent years, the rapid development of deep learning technologies has provided a new identification approach for the mud bleeding rate test, with which the mud bleeding rate can be obtained quickly and accurately even in complex and changeable scenes.
Disclosure of Invention
The invention aims to provide a real-time mud bleeding rate detection method based on the YOLOv5 model, which has the advantages of a fast network, a simple pipeline, a low background false-detection rate and strong generality, and solves the problems raised in the background art.
In order to achieve the purpose, the invention provides the following technical scheme:
a mud bleeding rate real-time detection method based on a YOLOv5 model comprises the following steps:
s1: shooting several segments of mud bleeding footage with a camera fixed in the mud recognition instrument, and using long-exposure frames to obtain a large number of pictures containing clearly visible bleeding mud;
s2: obtaining a data set after the image information is labeled and processed;
s3: using transfer learning to load part of the pre-training weights of a YOLOv5 network and construct a YOLOv5s detection framework;
s4: training the network, adjusting the hyper-parameters according to the detection results, and continuously optimizing the YOLOv5-s loss function until the optimal network is obtained.
Further, the specific method for obtaining the data set in S2 is as follows:
s201: in order to obtain an accurate mud bleeding rate, a high-speed continuous-shooting camera records a video of the standing mud bleeding in the mud bleeding rate recognition instrument for processing: the video is decomposed into an image sequence, and the mud scale and position information are manually marked in the clear pictures;
s202: the same image enhancement operations are applied to the blurred pictures and the clear pictures, specifically flipping, cropping, contrast change, saturation adjustment and splicing;
s203: the obtained image pairs and position information are combined into the model's data set, in which the model input comprises blurred or clear unlabeled pictures and the output comprises the corresponding labeled clear pictures and the position information of the mud bleeding.
Furthermore, the specific method for constructing the YOLOv5s detection framework in S3 is as follows:
s301: the feature extraction network adopts CSPDarknet53 with a Focus structure as the backbone to extract the low-level features of the input picture; the first layer (Focus) of the backbone periodically samples pixel points from the high-resolution image and reconstructs them into a low-resolution image, i.e. it stacks the four neighbouring positions of the image and folds the w-h dimension information into the channel space, which enlarges the receptive field of each point and reduces the loss of original information, thereby cutting the amount of computation and increasing speed;
s302: the backbone network outputs 7 feature maps of different levels; 3 of them are connected to the CSP structural blocks in YOLOv5, where the PANet (built on the Mask R-CNN and FPN frameworks) strengthens information propagation, and these maps are used to predict the mud bleeding positions in the input picture; the other 4 feature maps form the noise-reduction branch of the YOLOv5s network through convolution, upsampling and splicing, and are used to remove blur from the input picture and generate a clear picture.
Furthermore, after the model is trained, the original model, once the inference result is obtained, performs a series of operations: calling the drawing function to draw bounding boxes, saving the text result and saving the image result; the target detection coordinates and label information are intercepted before the drawing function is called, packaged into a json file and transmitted through Flask to the server side for recognition.
Further, in the YOLOv5 feature fusion layer and detection layer network, an FPN + PAN structure is used to strengthen the propagation of features and localization, and detection layers of 3 scales are output: 80 × 80, 40 × 40 and 20 × 20, for detecting small, medium and large targets respectively; since the mud bleeds little and the targets to be identified are small, the two detection layers of scale 40 × 40 and 20 × 20 are removed; meanwhile, one more upsampling operation is performed in the feature fusion stage, so that the final output scales of the detection layers are 160 × 160 and 80 × 80.
Furthermore, a BN (batch normalization) layer is adopted in place of dropout; in the deep neural network it is placed before the activation layer and serves to accelerate convergence during model training. The core formulas of the BN layer are:

Input: $B = \{x_{1 \ldots m}\}$; $\gamma, \beta$ (parameters to be learned)
Output: $\{y_i = \mathrm{BN}_{\gamma,\beta}(x_i)\}$

$$\mu_B \leftarrow \frac{1}{m}\sum_{i=1}^{m} x_i$$

$$\sigma_B^2 \leftarrow \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$

$$\hat{x}_i \leftarrow \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$

$$y_i \leftarrow \gamma \hat{x}_i + \beta$$

The input is the value set B together with the trainable parameters γ and β;
the specific operation of the BN layer is as follows: compute the mean and variance of B and transform the set B so that its mean and variance become 0 and 1 (the third formula above), then multiply every normalized element by γ, add β, and output; γ and β are trainable parameters and participate in the back propagation of the whole network;
normalization processing: the data are brought into a uniform interval, with γ and β acting as restoration parameters so that the distribution of the original data is preserved; the BN layer, which also has a certain regularization effect, applies a whitening preprocessing computed as:

$$\hat{x}^{(k)} = \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}$$

i.e. the input data of a given layer are normalized; with batch stochastic gradient descent adopted during training, $E[x^{(k)}]$ is the mean of the neuron activations $x^{(k)}$ over each batch of training data, and $\sqrt{\mathrm{Var}[x^{(k)}]}$ is the standard deviation of the activations $x^{(k)}$ over each batch of data.
Furthermore, the loss function of YOLOv5-s is computed in two parts: L1 regularization for the first task, target detection, and L2 regularization for the second task, deblurring. L1 is also called the mean absolute error (MAE); it is the mean of the absolute differences between the model prediction f(x) and the true value y:

$$L1 = \frac{1}{n}\sum_{i=1}^{n} \left| f(x_i) - y_i \right|$$

In the above formula, $f(x_i)$ and $y_i$ are the predicted and true values of the i-th sample and n is the number of samples; the derivative of the L1 loss function is a constant, giving a stable gradient.

As for L2, called the mean squared error (MSE), it is the mean of the squared differences between the model predictions and the true values:

$$L2 = \frac{1}{M}\sum_{i=1}^{M} (\hat{x}_i - x_i)^2$$

where M is the number of samples, $\hat{x}$ is the real data and x is the reconstructed data.
Compared with the prior art, the invention has the following beneficial effects:
1. The real-time mud bleeding rate detection method based on the YOLOv5 model detects the mud bleeding rate accurately and in real time and removes image blur caused by poor shooting light, which facilitates further mud bleeding detection.
2. In the real-time mud bleeding rate detection method based on the YOLOv5 model, online identification of mud bleeding is achieved through Flask and Vue; compared with previous manual identification, detection speed and precision are markedly improved.
3. The real-time mud bleeding rate detection method based on the YOLOv5 model retains the small model size and high detection speed of the YOLOv5s detection network, does not depend on an expensive graphics card, can run on most equipment, and thus favors the popularization of online mud bleeding identification.
Drawings
FIG. 1 is a flow chart of the detection method of the present invention;
FIG. 2 is a flow chart of the YOLOv5s detection framework of the present invention;
FIG. 3 is a schematic diagram of the convolution separation operation of the present invention;
FIG. 4 is an effect diagram of the online mud bleeding detection algorithm of the present invention;
FIG. 5 is a schematic diagram of picture uploading in the online mud bleeding detection of the invention;
FIG. 6 is an effect diagram of part of the mud bleeding during online detection.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings; obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort shall fall within the protection scope of the present invention.
Referring to FIGS. 1-6, an embodiment of the present invention provides a real-time mud bleeding rate detection method based on the YOLOv5 model, comprising the following steps:
the first step: shoot several segments of mud bleeding footage with a camera fixed in the mud recognition instrument, and use long-exposure frames to obtain a large number of pictures containing clearly visible bleeding mud;
the second step: obtain a data set after the image information is labeled and processed; the specific method is as follows:
in order to obtain an accurate mud bleeding rate, a high-speed continuous-shooting camera records a video of the standing mud bleeding in the mud bleeding rate recognition instrument for processing: the video is decomposed into an image sequence, and the mud scale and position information are manually marked in the clear pictures;
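For concreteness, a minimal sketch of this decomposition step follows, assuming OpenCV is available; the file paths and the sampling interval every_n are illustrative assumptions, not values given in the patent:

```python
# Minimal sketch: decompose a standing mud-bleeding video into an image sequence.
# Paths and the sampling interval are illustrative assumptions.
import cv2

def extract_frames(video_path: str, out_dir: str, every_n: int = 5) -> int:
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                      # end of video
            break
        if idx % every_n == 0:          # keep one frame in every_n to limit redundancy
            cv2.imwrite(f"{out_dir}/frame_{saved:05d}.jpg", frame)
            saved += 1
        idx += 1
    cap.release()
    return saved
```

The saved frames would then be annotated manually with the mud scale and position information.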
then the same image enhancement operations are applied to the blurred pictures and the clear pictures, specifically flipping, cropping, contrast change, saturation adjustment and splicing;
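A hedged sketch of the paired enhancement is given below; the key point is that the blurred picture and its clear counterpart receive the same random transform so that the pair stays aligned. The parameter ranges are illustrative assumptions:

```python
# Sketch: apply identical flip / crop / contrast / saturation transforms to a
# blurred-clear image pair. Ranges are illustrative, not from the patent.
import random
import cv2
import numpy as np

def _adjust(img: np.ndarray, alpha: float, sat: float) -> np.ndarray:
    img = np.clip(img.astype(np.float32) * alpha, 0, 255).astype(np.uint8)  # contrast
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[..., 1] = np.clip(hsv[..., 1] * sat, 0, 255)                        # saturation
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)

def augment_pair(blurred: np.ndarray, sharp: np.ndarray):
    if random.random() < 0.5:                       # horizontal flip
        blurred, sharp = cv2.flip(blurred, 1), cv2.flip(sharp, 1)
    h, w = blurred.shape[:2]                        # random crop to 90% of the size
    ch, cw = int(h * 0.9), int(w * 0.9)
    y, x = random.randint(0, h - ch), random.randint(0, w - cw)
    blurred, sharp = blurred[y:y + ch, x:x + cw], sharp[y:y + ch, x:x + cw]
    alpha, sat = random.uniform(0.8, 1.2), random.uniform(0.8, 1.2)
    return _adjust(blurred, alpha, sat), _adjust(sharp, alpha, sat)
```

The marked bounding boxes would need the same flip and crop applied to stay valid; that bookkeeping is omitted here.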
the obtained image pairs and position information are combined into the model's data set; the model input comprises blurred or clear unlabeled pictures, and the output comprises the corresponding labeled clear pictures and the position information of the mud bleeding;
the third step: use transfer learning to load part of the pre-training weights of a YOLOv5 network and construct a YOLOv5s detection framework; the specific method is as follows:
(1) the feature extraction network adopts CSPDarknet53 with a Focus structure as the backbone to extract the low-level features of the input picture; the first layer (Focus) of the backbone periodically samples pixel points from the high-resolution image and reconstructs them into a low-resolution image, i.e. it stacks the four neighbouring positions of the image and folds the w-h dimension information into the channel space, which enlarges the receptive field of each point and reduces the loss of original information; the module is mainly designed to cut the amount of computation and increase speed;
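The Focus slicing itself is compact. The sketch below (PyTorch, following the publicly known YOLOv5 Focus layer; the channel counts are the usual defaults rather than values confirmed by the patent) shows how the four pixel-parity sub-images are stacked along the channel axis, turning a (b, c, h, w) input into (b, 4c, h/2, w/2) before a convolution:

```python
# Focus slicing as in the public YOLOv5 code: stack the four neighbouring
# positions of the image so w-h information moves into the channel space.
import torch
import torch.nn as nn

class Focus(nn.Module):
    def __init__(self, c_in: int = 3, c_out: int = 32, k: int = 3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c_in * 4, c_out, k, stride=1, padding=k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (b, c, h, w) -> (b, 4c, h/2, w/2): the four pixel-parity sub-images
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                       x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
        return self.conv(x)
```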
(2) the backbone network outputs 7 feature maps of different levels; 3 of them are connected to the CSP structural blocks in YOLOv5, where the PANet (built on the Mask R-CNN and FPN frameworks) strengthens information propagation and accurately retains spatial information, so that pixels can be properly located to form a mask for predicting the position of mud bleeding in the input picture; the other 4 feature maps form the noise-reduction branch of the YOLOv5s network through convolution, upsampling, splicing and similar operations, and are used to remove blur from the input picture and generate a clear picture;
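How the 4 remaining feature maps could be convolved, upsampled and spliced into a deblurring output is sketched below; the channel counts and the single reconstruction head are illustrative assumptions, since the patent does not fix these numbers:

```python
# Hedged sketch of a noise-reduction branch: reduce each backbone map with a 1x1
# convolution, upsample all maps to a common size, splice (concatenate) them and
# reconstruct a clear RGB picture. Channel counts are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeblurBranch(nn.Module):
    def __init__(self, in_chans=(64, 128, 256, 512), mid: int = 64):
        super().__init__()
        self.reduce = nn.ModuleList(nn.Conv2d(c, mid, 1) for c in in_chans)
        self.head = nn.Conv2d(mid * len(in_chans), 3, 3, padding=1)

    def forward(self, feats, out_size):
        ups = [F.interpolate(conv(f), size=out_size, mode="bilinear",
                             align_corners=False)
               for conv, f in zip(self.reduce, feats)]
        fused = torch.cat(ups, dim=1)           # splice the upsampled maps
        return torch.sigmoid(self.head(fused))  # clear picture, values in [0, 1]
```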
the fourth step: train the network, adjust the hyper-parameters according to the detection results, and continuously optimize the YOLOv5-s loss function until the optimal network is obtained; after the model is trained, the original model, once the inference result is obtained, performs a series of operations: calling the drawing function to draw bounding boxes, saving the text result and saving the image result; the target detection coordinates and label information are intercepted before the drawing function is called, packaged into a json file and transmitted through Flask to the server side for recognition;
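A minimal sketch of the interception and packaging step follows; the endpoint URL and JSON field names are illustrative assumptions, as the patent only states that the coordinates and labels are packaged into a json file and sent through Flask:

```python
# Sketch: intercept detection coordinates and labels before the drawing function
# runs, package them as JSON and post them to the Flask server for recognition.
# URL and field names are assumptions.
import requests

def send_detections(boxes, labels, confs,
                    url: str = "http://127.0.0.1:5000/detect"):
    payload = {"detections": [
        {"bbox": [float(v) for v in box], "label": str(lab), "conf": float(c)}
        for box, lab, c in zip(boxes, labels, confs)
    ]}
    resp = requests.post(url, json=payload)  # Flask reads it via request.get_json()
    return resp.json()
```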
in the YOLOv5 feature fusion layer and detection layer network, an FPN + PAN structure is used to strengthen the propagation of features and localization, and detection layers of 3 scales are output: 80 × 80, 40 × 40 and 20 × 20; since the mud bleeds little and the targets to be identified are small, the detection layers of scale 40 × 40 and 20 × 20 are removed; meanwhile, one more upsampling operation is performed in the feature fusion stage, so that the final output scales of the detection layers are 160 × 160 and 80 × 80.
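The grid arithmetic behind these scales can be checked directly: removing the stride-16 and stride-32 heads and upsampling once more leaves heads at strides 4 and 8, which for a 640 × 640 input (an assumption; the patent does not state the input size here) give exactly the quoted grids:

```python
# Quick check of the detection-layer grid sizes after the modification.
input_size = 640                 # assumed input resolution
for stride in (4, 8):            # remaining head strides after the change
    print(input_size // stride)  # prints 160, then 80
```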
In the method, a BN layer is used in place of dropout; it is often placed before the activation layer in a deep neural network, where it accelerates the convergence of model training, makes the training process more stable, and avoids gradient explosion or gradient vanishing. The core formulas of the BN layer are:

Input: $B = \{x_{1 \ldots m}\}$; $\gamma, \beta$ (parameters to be learned)
Output: $\{y_i = \mathrm{BN}_{\gamma,\beta}(x_i)\}$

$$\mu_B \leftarrow \frac{1}{m}\sum_{i=1}^{m} x_i$$

$$\sigma_B^2 \leftarrow \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$

$$\hat{x}_i \leftarrow \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$

$$y_i \leftarrow \gamma \hat{x}_i + \beta$$

The input is the value set B together with the trainable parameters γ and β;
the specific operation of BN is as follows: compute the mean and variance of B, transform the set B so that its mean and variance become 0 and 1 (the third formula above), then multiply every normalized element by γ, add β, and output; γ and β are trainable parameters and participate in the back propagation of the whole network. The purpose of normalization is to bring the data into a uniform interval and reduce their divergence, which eases network learning.
The BN layer, which plays a certain regularization role, uses an approximate whitening preprocessing computed as:

$$\hat{x}^{(k)} = \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}$$

i.e. the input data of a given layer are normalized; during training, batch stochastic gradient descent is adopted, where $E[x^{(k)}]$ is the mean of the neuron activations $[x^{(k)}]$ over each batch of training data and $\sqrt{\mathrm{Var}[x^{(k)}]}$ is one standard deviation of the activations $[x^{(k)}]$ over each batch of data.
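The BN computation above can be reproduced in a few lines; the sketch below uses fixed γ and β for illustration, whereas in the network they are trainable:

```python
# NumPy sketch of the BN core formula: normalize the batch to zero mean and unit
# variance, then scale by gamma and shift by beta.
import numpy as np

def batch_norm(x: np.ndarray, gamma: float, beta: float, eps: float = 1e-5):
    mu = x.mean(axis=0)                      # mean of the batch B
    var = x.var(axis=0)                      # variance of the batch B
    x_hat = (x - mu) / np.sqrt(var + eps)    # whitened activations
    return gamma * x_hat + beta              # restore scale and shift

x = 3.0 * np.random.randn(8, 4) + 2.0        # toy batch: 8 samples, 4 neurons
y = batch_norm(x, gamma=1.0, beta=0.0)
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))  # approx. 0 and 1
```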
In the method, the computation of the YOLOv5 loss function for mud detection is divided into L1 regularization for the first task, target detection, and L2 regularization for the second task, deblurring. The L1 loss, also called the mean absolute error (MAE), is the average of the absolute differences between the model prediction f(x) and the true value y:

$$L1 = \frac{1}{n}\sum_{i=1}^{n} \left| f(x_i) - y_i \right|$$

where $f(x_i)$ and $y_i$ are the predicted and true values of the i-th sample and n is the number of samples.
The derivative of the L1 loss function is a constant with a stable gradient, so there is no gradient explosion problem, and the penalty for outliers is fixed rather than amplified.
The L2 loss, also known as the mean squared error (MSE), is the average of the squared differences between the model predictions and the true values:

$$L2 = \frac{1}{M}\sum_{i=1}^{M} (\hat{x}_i - x_i)^2$$

where M is the number of samples, $\hat{x}$ is the real data and x is the reconstructed data.
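Both terms map directly onto PyTorch's built-in reductions, as the toy check below shows:

```python
# Sketch of the two loss terms with toy tensors standing in for detection
# targets and reconstructed images.
import torch
import torch.nn.functional as F

pred = torch.tensor([2.5, 0.0, 2.0])
true = torch.tensor([3.0, -0.5, 2.0])

l1 = F.l1_loss(pred, true)   # MAE: mean |f(x_i) - y_i|, constant-magnitude gradient
l2 = F.mse_loss(pred, true)  # MSE: mean squared difference, used for deblurring
print(l1.item(), l2.item())  # 0.333..., 0.166...
```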
YOLOv5s is an excellent lightweight detection network, but the mud identification platform requires an even lighter model, so the network input size has to be reduced; however, simply shrinking the input, for example from 640 to 320, costs a great deal of detection accuracy while the model volume still stays around 14M. Therefore an L1 regularization term is added to constrain the coefficients of the BN layers and sparsify them.
The L1 regularization constraint is computed as:

$$L = \sum_{(x,y)} l\left(f(x, W), y\right) + \lambda \sum_{\gamma \in \Gamma} g(\gamma)$$

The first term above is the loss function of normal training and the second term is the constraint, where g(s) = |s| and λ is the regularization coefficient; the parameters can be sparsified by adjusting λ to the data set. When this term is added to the training loss, back propagation gives:

$$L' = \sum l' + \lambda \sum g'(\gamma) = \sum l' + \lambda \sum |\gamma|' = \sum l' + \lambda \sum \mathrm{sign}(\gamma)$$

so during back propagation in training it is only necessary to add, to the gradient of each BN layer weight, the regularization coefficient multiplied by the output of the sign function of that weight.
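In practice this amounts to one extra call per training step, applied between the backward pass and the optimizer step; the sketch below follows the network-slimming pattern, with λ an illustrative value:

```python
# Sketch: add lambda * sign(gamma) to the gradient of every BN scale factor,
# the subgradient of the L1 sparsity term above. lam is an assumed value.
import torch.nn as nn

def add_bn_l1_grad(model: nn.Module, lam: float = 1e-4) -> None:
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            m.weight.grad.add_(lam * m.weight.data.sign())

# usage per step (sketch): loss.backward(); add_bn_l1_grad(model); optimizer.step()
```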
In order to balance the magnitudes of the two loss functions during training, the task weights are updated with the gradient normalization (GradNorm) technique, computed as:

$$L(t) = \sum_i w_i(t) \cdot L_i(t)$$

where $L_i(t)$ is the loss function of each task and $w_i(t)$ the weight of each task. The weights are updated by:

$$w_i(t+1) = w_i(t) - \epsilon \, \frac{\partial L_{GL}}{\partial w_i(t)}$$

where $w_i(t)$ is the weight of each task, ε is the tuning parameter, and $L_{GL}$ is the GradLoss gradient loss, computed as:

$$L_{GL} = \sum_i \left| G_W^{(i)}(t) - \bar{G}_W(t) \cdot \left[ r_i(t) \right]^{\alpha} \right|$$

where $G_W^{(i)}(t)$ is the normalized gradient value of task i, $\bar{G}_W(t)$ is the expected value of the gradient normalization over the tasks, $r_i(t)$ is the relative inverse training rate and α is a set hyperparameter. $G_W^{(i)}(t)$ is computed as:

$$G_W^{(i)}(t) = \left\| \nabla_W \, w_i(t) L_i(t) \right\|_2$$

where W is the weights of the last shared layer.
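A simplified sketch of one GradNorm update is given below; the hyperparameters α and ε and the renormalization of the weights are illustrative choices in the spirit of the published GradNorm method rather than details fixed by the patent:

```python
# Simplified GradNorm step: measure each task's gradient norm on the last shared
# layer, compare it with the mean norm scaled by the relative inverse training
# rate, and nudge the task weights. alpha and eps are assumptions.
import torch

def gradnorm_step(w, losses, init_losses, shared_weight,
                  alpha: float = 1.5, eps: float = 0.025):
    # G_W^(i)(t) = || grad_W ( w_i * L_i ) ||_2 on the last shared layer
    norms = torch.stack([
        torch.autograd.grad(w[i] * losses[i], shared_weight,
                            retain_graph=True, create_graph=True)[0].norm()
        for i in range(len(losses))
    ])
    ratios = torch.stack([losses[i].detach() / init_losses[i]
                          for i in range(len(losses))])
    r = ratios / ratios.mean()                     # relative inverse training rate
    target = (norms.mean() * r ** alpha).detach()  # desired gradient norms
    grad_loss = (norms - target).abs().sum()       # L_GL
    gw = torch.autograd.grad(grad_loss, w)[0]
    with torch.no_grad():
        w -= eps * gw                              # w_i(t+1) = w_i(t) - eps * dL_GL/dw_i
        w *= len(losses) / w.sum()                 # keep the weights summing to #tasks
    return w
```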
In the method of the invention, the indexes for evaluating the experimental performance are specifically:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

where TP is the number of samples whose true value is bleeding and whose predicted value is bleeding; FP is the number whose true value is not bleeding but whose predicted value is bleeding; FN is the number whose true value is bleeding but whose predicted value is not bleeding.
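The two indexes reduce to a few lines of arithmetic on the counts; the values below are illustrative only:

```python
# Sketch: precision and recall from TP / FP / FN counts.
def precision_recall(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

print(precision_recall(tp=42, fp=3, fn=5))  # toy counts -> (0.933..., 0.893...)
```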
The clarity of the mud bleeding image is evaluated with the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM); the larger the values of PSNR and SSIM, the smaller the image distortion. They are computed as:

$$\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}}$$

$$\mathrm{SSIM}(Img_r, Img_f) = \frac{\left(2\mu_{Img_r}\mu_{Img_f} + c_1\right)\left(2\sigma_{Img_r Img_f} + c_2\right)}{\left(\mu_{Img_r}^2 + \mu_{Img_f}^2 + c_1\right)\left(\sigma_{Img_r}^2 + \sigma_{Img_f}^2 + c_2\right)}$$

where $\mu_{Img_r}$ and $\mu_{Img_f}$ are the pixel means of $Img_r$ and $Img_f$, $\sigma_{Img_r}^2$ and $\sigma_{Img_f}^2$ their pixel variances, $\sigma_{Img_r Img_f}$ their covariance, and $c_1$, $c_2$ are constants.
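A NumPy sketch of both image-quality metrics follows; note that SSIM is normally computed over local windows and averaged, so the single-window version below is a simplification for illustration:

```python
# Sketch: PSNR and a single-window SSIM between a reference image Img_r and a
# restored image Img_f (8-bit images assumed, hence max_val = 255).
import numpy as np

def psnr(img_r: np.ndarray, img_f: np.ndarray, max_val: float = 255.0) -> float:
    mse = np.mean((img_r.astype(np.float64) - img_f.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(img_r: np.ndarray, img_f: np.ndarray,
                max_val: float = 255.0) -> float:
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2   # usual constants
    r, f = img_r.astype(np.float64), img_f.astype(np.float64)
    mu_r, mu_f = r.mean(), f.mean()
    cov = ((r - mu_r) * (f - mu_f)).mean()
    return ((2 * mu_r * mu_f + c1) * (2 * cov + c2)) / \
           ((mu_r ** 2 + mu_f ** 2 + c1) * (r.var() + f.var() + c2))
```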
In summary: the invention provides a real-time mud bleeding rate detection method based on the YOLOv5 model. A camera fixed in the mud recognition instrument shoots several segments of mud bleeding footage, and long-exposure frames are used to obtain a large number of pictures containing clearly visible bleeding mud. A slurry two-phase flow data set is established, containing pictures of bleeding slurry under different slurry ratios and standing times; this data set, covering picture information of different bleeding degrees, is labeled and classified to obtain the data set to be trained. Transfer learning is used to load part of the YOLOv5 pre-training weights and construct a YOLOv5s detection framework, with which the model parameters are trained on the data set; the trained model then predicts image features, generates bounding boxes and identifies the bleeding rate. A web prediction interface is written: the back end implements its functions with the Flask framework, the front-end web framework is written with Vue, and the uploaded images are processed when the front end issues a POST request. The improved YOLOv5-small model serves as the network model: on the basis of YOLOv5 the network parameters are reduced to build a lightweight model, the YOLOv5s model trained on the established data set interacts with the server through Flask and Vue, and a real-time online detection method is realized in which a single original picture suffices for rapid detection and identification and output of the mud bleeding rate.
The above description covers only preferred embodiments of the present invention, and the scope of the invention is not limited thereto; any equivalent replacement or change made by a person skilled in the art according to the technical solution and the inventive concept of the present invention shall be covered by the protection scope of the invention.

Claims (7)

1. A real-time mud bleeding rate detection method based on the YOLOv5 model, characterized by comprising the following steps:
S1: shooting several segments of mud bleeding footage with a camera fixed in the mud recognition instrument, and using long-exposure frames to obtain a large number of pictures containing clearly visible bleeding mud;
S2: obtaining a data set after the image information is labeled and processed;
S3: using transfer learning to load part of the pre-training weights of a YOLOv5 network and construct a YOLOv5s detection framework;
S4: training the network, adjusting the hyper-parameters according to the detection results, and continuously optimizing the YOLOv5-s loss function until the optimal network is obtained.
2. The real-time mud bleeding rate detection method based on the YOLOv5 model according to claim 1, wherein the specific method for obtaining the data set in S2 is as follows:
S201: in order to obtain an accurate mud bleeding rate, a high-speed continuous-shooting camera records a video of the standing mud bleeding in the mud bleeding rate recognition instrument for processing: the video is decomposed into an image sequence, and the mud scale and position information are manually marked in the clear pictures;
S202: the same image enhancement operations are applied to the blurred pictures and the clear pictures, specifically flipping, cropping, contrast change, saturation adjustment and splicing;
S203: the obtained image pairs and position information are combined into the model's data set, in which the model input comprises blurred or clear unlabeled pictures and the output comprises the corresponding labeled clear pictures and the position information of the mud bleeding.
3. The real-time mud bleeding rate detection method based on the YOLOv5 model according to claim 1, wherein the specific method for constructing the YOLOv5s detection framework in S3 is as follows:
S301: the feature extraction network adopts CSPDarknet53 with a Focus structure as the backbone to extract the low-level features of the input picture; the first layer (Focus) of the backbone periodically samples pixel points from the high-resolution image and reconstructs them into a low-resolution image, i.e. it stacks the four neighbouring positions of the image and folds the w-h dimension information into the channel space, which enlarges the receptive field of each point and reduces the loss of original information, thereby cutting the amount of computation and increasing speed;
S302: the backbone network outputs 7 feature maps of different levels; 3 of them are connected to the CSP structural blocks in YOLOv5, where the PANet (built on the Mask R-CNN and FPN frameworks) strengthens information propagation, and these maps are used to predict the mud bleeding positions in the input picture; the other 4 feature maps form the noise-reduction branch of the YOLOv5s network through convolution, upsampling and splicing, and are used to remove blur from the input picture and generate a clear picture.
4. The real-time mud bleeding rate detection method based on the YOLOv5 model according to claim 3, wherein after the model is trained, the original model, once the inference result is obtained, performs a series of operations: calling the drawing function to draw bounding boxes, saving the text result and saving the image result; the target detection coordinates and label information are intercepted before the drawing function is called, packaged into a json file and transmitted through Flask to the server side for recognition.
5. The real-time mud bleeding rate detection method based on the YOLOv5 model according to claim 4, wherein in the YOLOv5 feature fusion layer and detection layer network, an FPN + PAN structure is used to strengthen the propagation of features and localization, and detection layers of 3 scales are output: 80 × 80, 40 × 40 and 20 × 20, for detecting small, medium and large targets respectively; since the mud bleeds little and the targets to be identified are small, the two detection layers of scale 40 × 40 and 20 × 20 are removed; meanwhile, one more upsampling operation is performed in the feature fusion stage, so that the final output scales of the detection layers are 160 × 160 and 80 × 80.
6. The real-time mud bleeding rate detection method based on the YOLOv5 model according to claim 5, wherein a BN layer is used in place of dropout; it is placed before the activation layer in the deep neural network and serves to accelerate convergence during model training, and its core formulas are:

Input: $B = \{x_{1 \ldots m}\}$; $\gamma, \beta$ (parameters to be learned)
Output: $\{y_i = \mathrm{BN}_{\gamma,\beta}(x_i)\}$

$$\mu_B \leftarrow \frac{1}{m}\sum_{i=1}^{m} x_i$$

$$\sigma_B^2 \leftarrow \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2$$

$$\hat{x}_i \leftarrow \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$

$$y_i \leftarrow \gamma \hat{x}_i + \beta$$

the input is the value set B together with the trainable parameters γ and β;
the specific operation of the BN layer is as follows: compute the mean and variance of B, transform the set B so that its mean and variance become 0 and 1 (the third formula above), then multiply every normalized element by γ, add β, and output; γ and β are trainable parameters and participate in the back propagation of the whole network;
normalization processing: the data are brought into a uniform interval, with γ and β acting as restoration parameters so that the distribution of the original data is preserved; the BN layer, which has a certain regularization effect, applies a whitening preprocessing computed as:

$$\hat{x}^{(k)} = \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}$$

i.e. the input data of a given layer are normalized; batch stochastic gradient descent is adopted in the training process, where $E[x^{(k)}]$ is the mean of the neuron activations $[x^{(k)}]$ over each batch of training data and $\sqrt{\mathrm{Var}[x^{(k)}]}$ is one standard deviation of the activations $[x^{(k)}]$ over each batch of data.
7. The real-time mud bleeding rate detection method based on the YOLOv5 model according to claim 1, wherein the YOLOv5-s loss function is computed in two parts: L1 regularization for the first task, target detection, and L2 regularization for the second task, deblurring; L1 is also called the mean absolute error MAE and is the mean of the absolute differences between the model prediction f(x) and the true value y:

$$L1 = \frac{1}{n}\sum_{i=1}^{n} \left| f(x_i) - y_i \right|$$

in the above formula, $f(x_i)$ and $y_i$ are the predicted and true values of the i-th sample, n is the number of samples, and the derivative of the L1 loss function is a constant with a stable gradient;
L2, called the mean squared error MSE, is the mean of the squared differences between the model predictions and the true values:

$$L2 = \frac{1}{M}\sum_{i=1}^{M} (\hat{x}_i - x_i)^2$$

where M is the number of samples, $\hat{x}$ is the real data and x is the reconstructed data.
CN202210241021.7A 2022-03-10 2022-03-10 Mud bleeding rate real-time detection method based on YOLOv5 model Pending CN114998185A (en)

Priority Applications (1)

Application Number: CN202210241021.7A
Priority Date / Filing Date: 2022-03-10
Title: Mud bleeding rate real-time detection method based on YOLOv5 model

Publications (1)

Publication Number: CN114998185A
Publication Date: 2022-09-02

Family

ID=83023774

Family Applications (1)

Application Number: CN202210241021.7A
Title: Mud bleeding rate real-time detection method based on YOLOv5 model
Priority Date / Filing Date: 2022-03-10

Country Status (1)

Country Link
CN (1) CN114998185A (en)

Cited By (2)

* Cited by examiner, † Cited by third party

Publication number Priority date Publication date Assignee Title
CN116362974A * 2023-06-02 2023-06-30 Tsinghua University Super-resolution cement hydration image generation method and device
CN116362974B * 2023-06-02 2023-08-15 Tsinghua University Super-resolution cement hydration image generation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination