CN114972316A - Battery case end surface defect real-time detection method based on improved YOLOv5

Info

Publication number: CN114972316A
Application number: CN202210713529.2A
Authority: CN
Inventors: 胡海兵; 朱振昊
Original assignee: Hefei University of Technology
Current assignee: Hefei University of Technology
Priority date: 2022-06-22
Filing date: 2022-06-22
Publication date: 2022-08-30

Abstract

The invention discloses a battery case end surface defect real-time detection method based on improved YOLOv5, which comprises the steps of obtaining a lithium battery steel case end surface defect image through lithium battery defect detection equipment; then, marking the defect type and the defect position and preprocessing a data set; then, improving based on a YOLOv5 network model, adding a CBAM attention mechanism, and training the optimized improved YOLOv5 network model based on a training set; and finally, detecting the defects of the end faces of the steel shells of the lithium batteries by using the trained improved YOLOv5 network model. The battery case end surface defect real-time detection method based on the improved YOLOv5 can be used for detecting and positioning the end surface defects of the steel cases of the common lithium batteries of different types in real time, improves the accuracy of identifying the defects of different types and similar structures, and has the advantages of high detection speed, high detection efficiency, high stability, high detection precision, low cost and the like.

Description

Battery case end surface defect real-time detection method based on improved YOLOv5

Technical Field

The invention relates to the technical field of deep learning and detection methods, and in particular to a real-time battery case end surface defect detection method based on improved YOLOv5.

Background

As a common energy storage device, the battery is widely used in production and daily life, for example in household appliances, electronic products, electronic instruments and automation equipment such as mobile phones, computers and new energy vehicles. With the rapid development of the manufacturing industry, quality requirements for batteries keep rising. Surface defects of battery products directly affect consumer safety, and the battery case is an important component whose quality determines the quality of the battery product, so battery case defect detection is an essential step in production. Owing to the raw materials or the production process, a battery case may develop various defects during production; the main defects on the end surface of the battery case include pits, R-corner damage (R-angle injury) and hard prints (hard printing). In the past, battery manufacturers relied mainly on manual visual inspection, which is highly subjective, slow, inefficient and uncertain, and can hardly meet the high-speed, high-accuracy inspection requirements of modern industry. Manufacturers therefore urgently need an efficient, low-cost detection means with a high detection rate to replace manual inspection.

In recent years, object detection frameworks have become a research focus. The deep learning networks commonly used for object detection fall into single-stage and two-stage networks. Single-stage networks are represented mainly by the YOLO series, and two-stage networks by the R-CNN series. The main difference between them is that a single-stage network performs classification and regression directly on the features extracted by the network, whereas a two-stage network first generates regions of interest and then classifies and regresses those regions. The YOLO series is a classic family of single-stage detection algorithms with high detection speed, and YOLOv5 greatly improves detection accuracy while keeping that speed, which suits the real-time requirement of industrial lithium battery steel shell production. At present, there is little research that applies deep learning to real-time detection of end surface defects of lithium battery steel shells; existing work concentrates on traditional vision algorithms such as image preprocessing and threshold segmentation.

For object detection, a traditional machine vision algorithm generally comprises three stages: region selection, feature extraction and feature classification. First, a sliding window algorithm selects the regions of the image in which an object may appear. Next, carefully hand-designed extractors such as SIFT (Scale-Invariant Feature Transform) and HOG (Histogram of Oriented Gradients) extract features. Finally, classifiers such as SVM (Support Vector Machine) and AdaBoost (Adaptive Boosting) classify the extracted features. However, because the position and size of the object are not fixed and the hand-designed extractors contain few parameters, traditional algorithms suffer from a large number of redundant boxes, high computational complexity, low robustness and poor feature extraction quality.

Disclosure of Invention

The purpose of the invention is to provide a real-time battery case end surface defect detection method based on improved YOLOv5 that overcomes the shortcomings described above.

In order to achieve the above purpose, the invention provides the following technical scheme:

The real-time battery case end surface defect detection method based on improved YOLOv5 comprises the following steps:

S1. Acquire defect images of the end surface of the lithium battery steel shell;

S2. Label the defect type and defect position in the acquired defect images to generate a data set, preprocess the data set, and divide the processed data set into a training set, a validation set and a test set;

S3. Improve the backbone feature extraction network of the YOLOv5 network model by adding a CBAM attention mechanism, and train the optimized, improved YOLOv5 network model on the training set;

S4. Detect end surface defects of the lithium battery steel shell using the trained improved YOLOv5 network model.

Preferably, in step S2, the defect type and defect position in the acquired end surface defect images of the lithium battery steel shell are labeled with the LabelImg software to generate the data set; preprocessing the data set specifically means performing data augmentation operations on it, including random flipping, brightness adjustment and noise addition.
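The three augmentation operations named above can be sketched as follows. This is a minimal illustration using NumPy on a grayscale image array; the function names, the flip probability of 0.5, the brightness factor and the noise level are assumptions for illustration and are not given in the text. Because flipping moves the defects, a helper that mirrors a YOLO-format box is included as well.

```python
import numpy as np

def random_flip(image, rng):
    """Flip the image horizontally with probability 0.5 (assumed probability)."""
    if rng.random() < 0.5:
        return image[:, ::-1].copy(), True
    return image, False

def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor` and clip to the valid 8-bit range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8)

def flip_yolo_box(box):
    """Mirror a YOLO-format box (class, x_center, y_center, w, h) horizontally."""
    cls, xc, yc, w, h = box
    return (cls, 1.0 - xc, yc, w, h)

# Hypothetical 64 x 64 grayscale image standing in for a steel shell photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

flipped, was_flipped = random_flip(image, rng)
brighter = adjust_brightness(image, 1.3)
noisy = add_gaussian_noise(image, sigma=10.0, rng=rng)
mirrored_box = flip_yolo_box((0, 0.25, 0.5, 0.1, 0.2))
```

Whenever an image is flipped, its labels have to be mirrored with it, which is why `flip_yolo_box` is kept next to `random_flip`.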

Preferably, in step S3, the backbone feature extraction network of the YOLOv5 network model is improved as follows: a CBAM (Convolutional Block Attention Module) attention mechanism is embedded after the penultimate convolutional layer of the backbone network. For a feature map generated by the convolutional neural network, this structure computes attention maps along two dimensions, channel and space, and then multiplies the attention maps with the input feature map to perform adaptive feature learning.

Preferably, in step S3, to compute channel attention efficiently, the feature map is compressed along the spatial dimensions using max pooling and average pooling, yielding two different spatial context descriptors, $F^c_{avg}$ and $F^c_{max}$. These two descriptors are passed through a shared network consisting of a multilayer perceptron to obtain the channel attention feature map $M_c(F)$, calculated as:

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) = \sigma\big(W_1(W_0(F^c_{avg})) + W_1(W_0(F^c_{max}))\big)$$

where $W_0$ denotes the first layer of the multilayer perceptron, $W_1$ denotes the second layer, $\sigma$ denotes the sigmoid activation function, $F^c_{max}$ denotes the max-pooled feature, and $F^c_{avg}$ denotes the average-pooled feature.
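The channel attention computation can be illustrated with a minimal NumPy sketch. The tensor sizes, the reduction ratio `r`, the random weights and the ReLU between the two perceptron layers (as in the original CBAM design) are assumptions for illustration; the text itself names only the two layers and the sigmoid.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feature, w0, w1):
    """Return the channel attention map M_c(F) with shape (C, 1, 1).

    feature: feature map F of shape (C, H, W)
    w0:      first MLP layer W_0, shape (C // r, C)
    w1:      second MLP layer W_1, shape (C, C // r)
    """
    f_avg = feature.mean(axis=(1, 2))  # F^c_avg: average pooling over H and W
    f_max = feature.max(axis=(1, 2))   # F^c_max: max pooling over H and W

    def shared_mlp(v):
        # The same weights are used for both descriptors.
        hidden = np.maximum(w0 @ v, 0.0)  # ReLU after W_0 (assumption)
        return w1 @ hidden

    m_c = sigmoid(shared_mlp(f_avg) + shared_mlp(f_max))
    return m_c[:, None, None]

# Hypothetical sizes: 8 channels, 4 x 4 spatial grid, reduction ratio 4.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
feature = rng.standard_normal((C, H, W))
w0 = rng.standard_normal((C // r, C)) * 0.1
w1 = rng.standard_normal((C, C // r)) * 0.1

m_c = channel_attention(feature, w0, w1)
refined = m_c * feature  # each channel is rescaled by its attention weight
```

Each entry of `m_c` lies strictly between 0 and 1, so multiplying it with the feature map rescales whole channels without changing the spatial layout.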

Preferably, in step S3, to compute spatial attention efficiently, max pooling and average pooling are first applied along the channel dimension to obtain two different feature descriptors, $F^s_{avg}$ and $F^s_{max}$. The two descriptors are then concatenated, and a convolution operation is used to generate the spatial attention feature map $M_s(F)$, calculated as:

$$M_s(F) = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\big) = \sigma\big(f^{7\times 7}([F^s_{avg};\ F^s_{max}])\big)$$

where $f^{7\times 7}$ denotes a convolution layer with a 7 × 7 kernel, $\sigma$ denotes the sigmoid activation function, $F^s_{max}$ denotes the max-pooled feature, and $F^s_{avg}$ denotes the average-pooled feature.
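The spatial attention computation can be sketched in the same way. This is a minimal NumPy illustration with an explicit loop for the 7 × 7 convolution; the zero padding that keeps the output at the input size, the tensor sizes and the random kernel are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feature, kernel):
    """Return the spatial attention map M_s(F) with shape (1, H, W).

    feature: feature map F of shape (C, H, W)
    kernel:  convolution weights of shape (2, k, k), with k = 7 in the text
    """
    f_avg = feature.mean(axis=0)         # F^s_avg: average over channels
    f_max = feature.max(axis=0)          # F^s_max: maximum over channels
    stacked = np.stack([f_avg, f_max])   # concatenation, shape (2, H, W)

    k = kernel.shape[-1]
    pad = k // 2
    # Zero padding so the output keeps the H x W size (assumption).
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))

    h, w = f_avg.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None, :, :]

# Hypothetical sizes: 8 channels on a 6 x 6 grid.
rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 6, 6))
kernel = rng.standard_normal((2, 7, 7)) * 0.05

m_s = spatial_attention(feature, kernel)
refined = m_s * feature  # every channel is reweighted position by position
```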

Preferably, in step S3, the neck network of the original YOLOv5 network model has an FPN-PAN structure; in the improved YOLOv5 network model, the neck network adopts the lightweight CARAFE upsampling module, and Bi-FPN replaces FPN-PAN.

The invention has the beneficial effects that:

In the real-time battery case end surface defect detection method based on improved YOLOv5, a CBAM attention mechanism is embedded in the YOLOv5 backbone network to weigh the importance of different spatial positions and channels, so that the model learns to extract the important features. The FPN-PAN feature pyramid structure is replaced by Bi-FPN, which enhances the representation capability of the features through simple residual operations and introduces weights so that feature information at different scales is well balanced. The lightweight CARAFE upsampling has a larger receptive field, so contextual information can be aggregated over a large region, and performance improves while only few parameters and little computation are added. The method can detect and locate end surface defects of different types of common lithium battery steel shells in real time, and it improves the accuracy with which defects of different types but similar structure are identified. Compared with manual visual inspection, it offers high detection speed, high detection efficiency, strong stability and low cost. Compared with traditional machine vision algorithms and with models such as YOLO and YOLOv5, it features low computational complexity, good feature extraction quality, simplicity, speed and efficiency; it achieves both high detection speed and high detection precision, and it meets the real-time and accuracy requirements of industrial lithium battery steel shell production.

Drawings

FIG. 1: structure diagram of the lithium battery defect detection equipment of the invention;

FIG. 2: sample images of end surface defects of the lithium battery steel shell;

FIG. 3: diagram of the backbone network with the embedded lightweight CBAM attention module;

FIG. 4: diagrams of the three structures FPN, PANet and Bi-FPN;

FIG. 5: confusion matrix of the original YOLOv5 network model;

FIG. 6: confusion matrix of the improved YOLOv5 network model of the invention;

FIG. 7: test set mAP curves of the original YOLOv5 network model and of the improved YOLOv5 network model of the invention.

Detailed Description

The present invention is further described with reference to the following examples, which are illustrative and explanatory only. Those skilled in the art may make various modifications, additions and substitutions to the specific embodiments described herein without departing from the spirit of the invention or exceeding the scope of the claims.

Example 1:

As shown in FIGS. 1-6, the real-time battery case end surface defect detection method based on improved YOLOv5 specifically comprises the following steps:

S1. Acquire defect images of the end surface of the lithium battery steel shell.

The end surface defect images of the lithium battery steel shell are acquired by lithium battery defect detection equipment. FIG. 1 is a structural diagram of this equipment. As shown in FIG. 1, the main control computer controls the motor, which drives the driving wheel to rotate, and the driving wheel moves the battery steel shell forward along the conveying direction. When a steel shell arrives below the camera, a trigger signal causes the camera to capture an image, which is stored on the image acquisition and processing computer. The conveyor then continues forward so that the next steel shell can be imaged.

Owing to the raw materials or the production process, a battery case may develop various defects during production. The main defects on the end surface of the lithium battery steel shell include pits, R-corner damage and hard prints. FIG. 2 shows sample end surface defects of the lithium battery steel shell: (a) shows a pit defect, which appears as a depression formed when the battery steel shell is squeezed or struck during production; (b) shows an R-corner damage defect, which appears as damage at the rounded corner of the end surface of the battery steel shell; (c) shows a hard print defect, which appears as a raised portion on the end surface of the battery case.

S2. Label the defect type and defect position in the acquired defect images with the LabelImg software to generate a data set. Preprocess the data set, specifically by performing data augmentation operations including random flipping, brightness adjustment and noise addition, and then divide the processed data set into a training set, a validation set and a test set in a ratio of 8:1:1.
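The 8:1:1 division can be sketched as follows. This is a minimal illustration using only the Python standard library; the file names, the fixed random seed and the rule that any remainder goes to the test set are assumptions, since the text states only the ratio.

```python
import random

def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle the items and split them into train, validation and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remainder goes to the test set
    return train, val, test

# Hypothetical file names standing in for the labeled defect images.
names = [f"img_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))
```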

In the early stage, 700 images of end surface defects of lithium battery steel shells were collected. Because the original data are few and the number of samples per defect type is insufficient, data augmentation is performed. By applying random flipping, brightness adjustment, noise addition and similar operations to the original images, data augmentation effectively reduces overfitting, so that the model adapts better to new samples and generalizes better. After data augmentation, the training set contains 2479 defect images with 1685 pits, 1041 R-corner damages and 269 hard prints. All experimental samples are labeled manually with the LabelImg annotation software (as shown in FIG. 2), which generates corresponding XML files; these are then converted into the TXT format used for YOLOv5 training.
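The conversion from the XML annotations to the TXT label format can be sketched as follows. This is a minimal illustration that assumes the XML files follow the Pascal VOC layout that LabelImg writes and that each TXT line holds a class index followed by the normalized box centre, width and height; the class names, their order and the sample annotation are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed label names and order; the actual label strings are not given in the text.
CLASS_NAMES = ["pit", "R-angle injury", "hard printing"]

def voc_xml_to_yolo(xml_text, class_names=CLASS_NAMES):
    """Convert one Pascal VOC style annotation into YOLO-format label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        cls_id = class_names.index(obj.find("name").text)
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        # Normalize the box centre and size by the image dimensions.
        xc = (xmin + xmax) / 2.0 / img_w
        yc = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines

# Hypothetical annotation with a single pit in a 640 x 480 image.
SAMPLE_XML = """<annotation>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>pit</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

yolo_lines = voc_xml_to_yolo(SAMPLE_XML)
print(yolo_lines)
```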

S3. Improve the backbone feature extraction network of the YOLOv5 network model by adding a CBAM (Convolutional Block Attention Module) attention mechanism, and train the optimized, improved YOLOv5 network model on the training set.

(1) Embedded lightweight attention CBAM module

To obtain more detailed information about the target defects, the method improves the backbone feature extraction network of the original YOLOv5 network model by embedding a lightweight CBAM attention module after the penultimate convolutional layer of the backbone network. For a feature map generated by the convolutional neural network, attention maps are computed along two dimensions, channel and space, and are then multiplied with the input feature map to perform adaptive feature learning.
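The refinement described in this paragraph can be written compactly. The following is a sketch that assumes the sequential channel-then-spatial arrangement of the original CBAM design; the paragraph states only that the attention maps are multiplied with the input feature map, so the ordering and the intermediate symbol $F'$ are assumptions.

```latex
F'  = M_c(F) \otimes F, \qquad
F'' = M_s(F') \otimes F'
```

Here $F \in \mathbb{R}^{C \times H \times W}$ is the input feature map, $M_c(F) \in \mathbb{R}^{C \times 1 \times 1}$ is the channel attention map, $M_s(F') \in \mathbb{R}^{1 \times H \times W}$ is the spatial attention map, and $\otimes$ denotes element-wise multiplication with broadcasting, so $F''$ has the same shape as $F$.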

FIG. 3 is a diagram of the backbone network with the embedded lightweight CBAM attention module. As shown in FIG. 3, adding the CBAM attention mechanism enables the model to focus on important features more efficiently. Channel attention is mainly concerned with what is meaningful in the input image. To compute channel attention efficiently, the feature map is compressed along the spatial dimensions using max pooling and average pooling, yielding two different spatial context descriptors, $F^c_{avg}$ and $F^c_{max}$. These two descriptors are passed through a shared network consisting of a multilayer perceptron to obtain the channel attention feature map. Spatial attention, unlike channel attention, focuses mainly on location information. To compute spatial attention, max pooling and average pooling are first applied along the channel dimension to obtain two different feature descriptors, $F^s_{avg}$ and $F^s_{max}$; the two descriptors are then concatenated, and a convolution operation is used to generate the spatial attention feature map.

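The channel attention feature map is calculated as:

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) = \sigma\big(W_1(W_0(F^c_{avg})) + W_1(W_0(F^c_{max}))\big)$$

where $W_0$ denotes the first layer of the multilayer perceptron, $W_1$ denotes the second layer, $\sigma$ denotes the sigmoid activation function, $F^c_{max}$ denotes the max-pooled feature, and $F^c_{avg}$ denotes the average-pooled feature.

The spatial attention feature map is calculated as:

$$M_s(F) = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)])\big) = \sigma\big(f^{7\times 7}([F^s_{avg};\ F^s_{max}])\big)$$

where $f^{7\times 7}$ denotes a convolution layer with a 7 × 7 kernel, $\sigma$ denotes the sigmoid activation function, $F^s_{max}$ denotes the max-pooled feature, and $F^s_{avg}$ denotes the average-pooled feature.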

(2) Weighted bidirectional feature pyramid Bi-FPN

Small objects have long been difficult to detect in object detection. A large object covers many pixels and a small object covers few, so as the convolutions go deeper, the features of large objects tend to be retained while those of small objects are easily lost. This is why small objects are difficult to detect.

FIG. 4 shows the three structures FPN, PANet and Bi-FPN. As shown in FIG. 4, the FPN structure (FIG. 4(a)) was introduced to fuse features from different layers and to improve multi-scale detection. FPN builds a top-down pathway, which gives the prediction feature maps rich high-level semantic information but loses some low-level position information. On the basis of FPN, the PANet structure (FIG. 4(b)) adds a bottom-up pathway that also passes the strong position information of the lower layers to the prediction feature maps, so that they carry both rich semantic information and position information, which benefits object detection.

In the method, a Bi-FPN structure (as shown in fig. 4(c)) with more complex bidirectional fusion is adopted to replace a PANet structure. Wherein, on the basis of the PANet structure, the Bi-FPN structure mainly has the following characteristics:

① Nodes with only one input edge are deleted, which has little effect on the network and simplifies the bidirectional network. This is because a node with only one input edge contributes little to feature fusion.

② If an original input node and an output node are at the same level, an extra edge is added between them. The benefit is that more features can be fused without adding much cost.

③ In the Bi-FPN structure, each bidirectional path is treated as one feature network layer, and the same layer is repeated multiple times to achieve higher-level feature fusion. Bi-FPN enhances the representation capability of the features through simple residual operations. It also introduces weights that learn the importance of the different input features, so that the inputs are fused with distinction and feature information at different scales is well balanced.
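The weighted fusion mentioned in the third point can be sketched as follows. This is a minimal NumPy illustration that assumes the fast normalized fusion rule from the original Bi-FPN design, in which each input gets a non-negative learnable weight and the weights are normalized by their sum; the text above does not state which normalization is used, and the inputs and weight values below are hypothetical.

```python
import numpy as np

def weighted_fusion(inputs, weights, eps=1e-4):
    """Fuse same-shaped feature maps with normalized non-negative weights.

    O = sum_i (w_i / (eps + sum_j w_j)) * I_i   (fast normalized fusion, assumed)
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # keep weights >= 0
    norm = w / (eps + w.sum())
    return sum(wi * x for wi, x in zip(norm, inputs))

# Two hypothetical feature maps already resized to the same resolution.
p_top_down = np.full((2, 2), 1.0)
p_same_level = np.full((2, 2), 3.0)

equal = weighted_fusion([p_top_down, p_same_level], weights=[1.0, 1.0])
skewed = weighted_fusion([p_top_down, p_same_level], weights=[-1.0, 2.0])
```

With equal weights the result is close to the plain average of the two inputs; with a weight driven to zero, the fusion passes through the other input almost unchanged, which is how the network can learn to favor one scale over another.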

(3) Lightweight CARAFE upsampling module

Because the defects on the end surface of the battery case are small, the CARAFE module is used for upsampling to make detection more accurate; the purpose of upsampling is to obtain a higher resolution. Two upsampling methods are in common use. The first is bilinear upsampling, which considers only the adjacent sub-pixel neighborhood and cannot capture sufficient semantic information. The second is deconvolution, in which the upsampling kernel is learned by the network rather than computed from distances between pixels; however, the same upsampling kernel is applied at every position of the feature map, so the response to local variation is limited, and a large number of parameters and computations are introduced, especially when the kernel is large. CARAFE upsampling has a larger receptive field and upsamples according to the input content, using the semantic correlation between the upsampling kernel and the feature map. It is also lightweight, improving performance without introducing many parameters or much computation.
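The content-aware reassembly step of CARAFE can be sketched as follows. This is a simplified NumPy illustration: the upsampling kernels are passed in as an argument, whereas in CARAFE they are predicted from the feature map by a small convolutional module and normalized with softmax, which is omitted here. The function name, tensor sizes, zero padding and kernel size are assumptions for illustration.

```python
import numpy as np

def carafe_reassemble(feature, kernels, scale=2):
    """Upsample `feature` by reassembling local neighbourhoods.

    feature: array of shape (C, H, W)
    kernels: array of shape (scale*H, scale*W, k, k); one k x k kernel per
             output position, each expected to sum to 1
    """
    c, h, w = feature.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(feature, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    out = np.zeros((c, h * scale, w * scale))
    for i in range(h * scale):
        for j in range(w * scale):
            si, sj = i // scale, j // scale           # source position
            patch = padded[:, si:si + k, sj:sj + k]   # k x k window centred on it
            # Position-specific weighted sum, shared across channels.
            out[:, i, j] = np.sum(patch * kernels[i, j], axis=(1, 2))
    return out

# Hypothetical 2-channel 3 x 3 feature map, upsampled by 2 with 5 x 5 kernels.
feature = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
kernels = np.zeros((6, 6, 5, 5))
kernels[..., 2, 2] = 1.0  # one-hot centre kernels reduce to nearest-neighbour upsampling
upsampled = carafe_reassemble(feature, kernels)
```

With one-hot centre kernels the sketch reproduces nearest-neighbour upsampling; the benefit of CARAFE comes from replacing them with kernels that differ per position according to the input content.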

(4) Network training

A deep learning model is built on the YOLOv5 framework. For training, the batch size is set to 16, the momentum parameter (momentum) to 0.937, the weight decay regularization term (decay) to 0.0005, and the number of training epochs (epoch) to 200.
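As a worked example of what these settings imply, the number of parameter updates can be estimated from the training set size of 2479 images reported for the augmented data. The sketch assumes that the last, partially filled batch of each epoch is kept; the variable names are illustrative.

```python
import math

train_images = 2479     # augmented training set size stated in the description
batch_size = 16
epochs = 200
momentum = 0.937        # listed for completeness; not used in the arithmetic
weight_decay = 0.0005   # listed for completeness; not used in the arithmetic

# Assumes the final partial batch of each epoch is kept rather than dropped.
steps_per_epoch = math.ceil(train_images / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)
```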

FIG. 5 is the confusion matrix of the original YOLOv5 network model, and FIG. 6 is the confusion matrix of the improved YOLOv5 network model of the invention. As can be seen from FIGS. 5 and 6, for the three categories "pit", "R angle injury" and "hard printing", the rate of correct predictions of the improved model (YOLOv5-Our) is higher than that of YOLOv5s, and its missed detection rate and false detection rate are lower than those of YOLOv5s, which indicates that the improved model performs well.

A comparison of the original YOLOv5 network model with the improved YOLOv5 network model of the invention shows the following:

① The original YOLOv5 network model can detect the various defects on the end surface of the lithium battery steel shell, but its precision is low, it tends to miss defects in regions where defects look similar, and it produces some false detections. The improved YOLOv5 network model of the invention detects the defects effectively, and its predicted bounding boxes lie closer to the real defect regions and cover them more completely than those of the original model, which shows that the improved model both raises detection precision and localizes defects more accurately.

② With traditional manual inspection, defects with a large area and high contrast can be detected by the naked eye, but small or blurred defects essentially cannot. In the invention, with a high-precision machine vision camera and the improved defect detection model, pits, R-corner damage and hard prints can be detected effectively, which reduces the missed detection rate in industrial production.

③ The improved YOLOv5 model reaches 73 FPS on an NVIDIA 2060 graphics card, which shows that it can meet the real-time requirement of industrial lithium battery steel shell production.
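The frame rate in point ③ can be converted into a per-image time budget. The conversion below assumes that the 73 FPS figure covers the whole per-image pipeline; the text does not say whether pre-processing and post-processing are included.

```latex
t_{\text{frame}} = \frac{1}{73\ \text{frames/s}} \approx 0.0137\ \text{s} = 13.7\ \text{ms}
```

Under this assumption, the conveyor can present a new steel shell to the camera roughly every 14 ms without the detector falling behind.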

The lithium battery defect detection equipment (for example, an industrial camera) collects end surface defect images of the lithium battery steel shell, and the improved YOLOv5 defect detection model detects the defective parts, so that the type and position of each defect are obtained accurately. In this embodiment, images of three defect types on the end surface of the lithium battery steel shell are tested: pits, R-corner damage and hard prints. FIG. 7 shows the test set mAP of the original YOLOv5 network model and of the improved YOLOv5 network model of the invention; the left plot in FIG. 7 is the test set mAP of the original YOLOv5 network model, and the right plot is the test set mAP of the improved YOLOv5 network model. As shown in FIG. 7, the improved YOLOv5 network model of the invention raises the test set mAP by 6.1%, which indicates that the method detects end surface defects of the lithium battery steel shell more effectively, can locate the three types of physical defect, and can identify the complex contours of the defects.
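For reference, the mAP metric used in this comparison is the mean over classes of the average precision (AP), the area under each class's precision-recall curve. The following is a minimal sketch of the all-point interpolated AP in plain Python; the text does not state which AP variant or IoU threshold was used, and the detections and ground-truth counts below are hypothetical.

```python
def average_precision(detections, num_gt):
    """All-point interpolated AP for one class.

    detections: list of (confidence, is_true_positive) pairs
    num_gt:     number of ground-truth defects of this class
    """
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in detections:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Make precision non-increasing as recall grows (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas under the envelope.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, num_gt)."""
    aps = [average_precision(d, n) for d, n in per_class.values()]
    return sum(aps) / len(aps)

# Hypothetical detections for two classes.
ap_pit = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)
ap_perfect = average_precision([(0.9, True), (0.8, True)], num_gt=2)
m_ap = mean_average_precision({
    "pit": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "hard printing": ([(0.9, True), (0.8, True)], 2),
})
```

Whether a detection counts as a true positive is decided beforehand by matching it to a ground-truth box at a chosen IoU threshold; that matching step is left out of the sketch.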

In the real-time battery case end surface defect detection method based on improved YOLOv5, a CBAM attention mechanism is embedded in the YOLOv5 backbone network to weigh the importance of different spatial positions and channels, so that the model learns to extract the important features. The FPN-PAN feature pyramid structure is replaced by Bi-FPN, which enhances the representation capability of the features through simple residual operations and introduces weights so that feature information at different scales is well balanced. The lightweight CARAFE upsampling has a larger receptive field, so contextual information can be aggregated over a large region, and performance improves while only few parameters and little computation are added.

The method can detect and locate end surface defects of different types of common lithium battery steel shells in real time, and it improves the accuracy with which defects of different types but similar structure are identified. Compared with manual visual inspection, it offers high detection speed, high detection efficiency, strong stability and low cost. Compared with traditional machine vision algorithms and with models such as YOLO and YOLOv5, it features low computational complexity, good feature extraction quality, simplicity, speed and efficiency; it achieves both high detection speed and high detection precision, and it meets the real-time and accuracy requirements of industrial lithium battery steel shell production.

The foregoing is an illustrative description of the invention, and it is clear that the specific implementation of the invention is not restricted to the above-described manner, but it is within the scope of the invention to apply the inventive concept and solution to other applications without substantial or direct modification.

Claims

1. The battery case end surface defect real-time detection method based on the improved YOLOv5 is characterized by comprising the following steps of:

2. The method for detecting the end surface defect of the battery case based on the improved YOLOv5 in real time as claimed in claim 1, wherein in step S2, the acquired end surface defect image of the steel case of the lithium battery is labeled with defect type and defect position by using Labelimg software, so as to generate a data set; the preprocessing is specifically to perform data enhancement operations including random flipping, brightness adjustment and noise addition on the data set.

3. The method for detecting defects of end faces of battery cases based on improved YOLOv5 in real time as claimed in claim 1, wherein in step S3, the backbone feature extraction network of the YOLOv5 network model is improved as follows: a CBAM (Convolutional Block Attention Module) attention mechanism is embedded after the penultimate convolutional layer of the backbone network; for a feature map generated by the convolutional neural network, this structure computes attention maps along two dimensions, channel and space, and then multiplies the attention maps with the input feature map to perform adaptive feature learning.

4. The method for detecting defects of end faces of battery cases based on improved YOLOv5 in real time as claimed in claim 3, wherein in step S3, to compute channel attention efficiently, the feature map is compressed along the spatial dimensions using max pooling and average pooling, yielding two different spatial context descriptors, $F^c_{avg}$ and $F^c_{max}$, where $F^c_{max}$ denotes the max-pooled feature and $F^c_{avg}$ denotes the average-pooled feature.

5. The method for detecting defects of end faces of battery cases based on improved YOLOv5 in real time as claimed in claim 3, wherein in step S3, to compute spatial attention efficiently, max pooling and average pooling are first applied along the channel dimension to obtain two different feature descriptors, $F^s_{avg}$ and $F^s_{max}$, where $F^s_{max}$ denotes the max-pooled feature and $F^s_{avg}$ denotes the average-pooled feature.

6. The method for detecting defects of end faces of battery cases based on the improved YOLOv5 in real time as claimed in claim 3, wherein in step S3, the neck network in the YOLOv5 network model is in FPN-PAN structure; after the YOLOv5 network model is improved, a lightweight upsampling CARAFE module is adopted in the neck network of the improved YOLOv5 network model, and a Bi-FPN structure is used for replacing an FPN-PAN structure.