CN111612751B - Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module - Google Patents

Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module Download PDF

Info

Publication number
CN111612751B
Authority
CN
China
Prior art keywords
characteristic
feature
layer
attention
tiny
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010405111.6A
Other languages
Chinese (zh)
Other versions
CN111612751A (en)
Inventor
陈海永
张泽智
刘卫朋
张建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei University of Technology filed Critical Hebei University of Technology
Priority to CN202010405111.6A priority Critical patent/CN111612751B/en
Publication of CN111612751A publication Critical patent/CN111612751A/en
Application granted granted Critical
Publication of CN111612751B publication Critical patent/CN111612751B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E60/00Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation
    • Y02E60/10Energy storage using batteries

Abstract

The invention discloses a lithium battery defect detection method based on a Tiny-yolov3 network embedded with a grouping attention module. The method acquires a lithium battery image containing a defect to be detected, extracts features of that image through the Tiny-yolov3 network embedded with the grouping attention module, and uses the grouping attention module to refine the detection of the defect features. The attention module separately screens the upsampled features, the output features of the fifth convolutional layer of the backbone network, and the features obtained by concatenating the two, so that the network attends both to the local features of each feature layer before concatenation and to the overall features of the concatenated layer. The network thus extracts more target feature information, small defects on the surface of the lithium battery are identified more easily, and the identification accuracy is improved.

Description

Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module
Technical Field
The invention relates to the technical field of industrial defect detection, in particular to a lithium battery defect detection method based on a Tiny-yolov3 network embedded with a grouping attention module.
Background
Lithium batteries are batteries that use lithium metal or a lithium alloy as the negative electrode material and a non-aqueous electrolyte solution. They offer high capacity, long life and environmental friendliness, and are widely used in many fields. If defects arise during production, the performance and service life of the battery are reduced and safety hazards may occur. Typical lithium battery defects include edge-sealing wrinkles, pole piece scratches, exposed foil, particles, perforations, dark spots, foreign matter, surface dents, stains, bulges and deformed ink-jet codes. At present, lithium batteries are inspected mainly by manual visual inspection, whose reliability, stability and efficiency cannot be effectively controlled; labor costs are also high, so manual inspection is poorly suited to industrial production.
Lithium battery defects are varied in type, random in shape and different in size, and the battery surface has a complex background with non-uniform texture, all of which make detection challenging. Wang Gang (Wang Gang, Research and implementation of a lithium battery wrinkle detection system based on deep learning [D]. Liaoning University, 2019) proposed a lithium battery defect detection method that builds a convolutional neural network based on the AlexNet and GoogLeNet networks and extracts defect features mainly from deep feature layers. Because the network lacks fusion of multi-scale feature layers, small-target defect features are hard to preserve after several downsampling steps, so the method detects small-target defects poorly and with low precision.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides a lithium battery defect detection method based on a Tiny-yolov3 network embedded with a grouping attention module. The method fuses shallow and deep features to strengthen the detection of defect features at different scales, and the embedded grouping attention module improves the detection precision of small-target defects on lithium batteries.
The technical solution adopted by the invention to solve this technical problem is as follows:
a lithium battery defect detection method based on a Tiny-yolov3 network embedded with a grouping attention module is characterized by: acquiring a lithium battery image containing a defect to be detected, extracting features of the image containing the defect to be detected through the Tiny-yolov3 network embedded with the grouping attention module, and refining the detection of the defect features through the grouping attention module;
the output feature of the seventh convolutional layer of the backbone network of the Tiny-yolov3 network is convolved twice and then enlarged by upsampling to the same scale as the output feature of the fifth convolutional layer of the backbone network, giving a feature x1'; the feature x1' and the output feature x2 of the fifth convolutional layer of the backbone network are concatenated channel-wise to form a feature z;
the grouping attention module comprises the following specific processes:
s1, grouping the characteristics z
The feature z is divided into two groups according to the number of channels, recorded as features m and n respectively, where
z ∈ R^(C×W×H) (1)
m ∈ R^(Cx×W×H) (2)
n ∈ R^(Cy×W×H) (3)
where R denotes the feature space; W and H are the width and height of the feature; Cx is the number of channels of feature m; Cy is the number of channels of feature n; and C is the number of channels of feature z, with C = Cx + Cy.
S2, attention calculation
Firstly, channel-wise global max pooling and channel-wise global average pooling are applied to the feature m, yielding a maxpool feature and an avgpool feature respectively;
the maxpool feature and the avgpool feature are then concatenated along the channel dimension to form an intermediate feature m1; the intermediate feature m1 is convolved to obtain an attention feature m2, and the attention feature m2 is passed through a sigmoid activation function to generate a spatial attention map M, where
M = sigmoid(f^(7×7)([AvgPool(m), MaxPool(m)])) (4)
where 7 × 7 is the size of the convolution kernel;
finally, the spatial attention map M is multiplied element-wise with the feature m to obtain the grouped attention feature M', which completes the attention operation on feature m;
the same attention calculation is then repeated on features n and z to generate grouped attention features N' and Z' respectively; the grouped attention features M' and N' are concatenated channel-wise to generate a feature O', and the feature O' is then added element-wise to the grouped attention feature Z' to obtain the feature O.
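For illustration, steps S1 and S2 can be written as a short Python/Keras sketch, shown below. It is a minimal sketch, not part of the patented embodiment: the helper names spatial_attention and group_attention are invented for clarity, equation (4) is realised with a 7 × 7 convolution followed by a sigmoid, and whether the three attention branches share convolution weights is left open by the description (here they do not).

```python
# Minimal sketch of the grouping attention module (steps S1 and S2); names and the
# decision not to share weights between the three attention branches are assumptions.
import tensorflow as tf
from tensorflow.keras import layers


def spatial_attention(x):
    """Eq. (4): M = sigmoid(f^(7x7)([AvgPool(x), MaxPool(x)])), then multiply x by M."""
    avg_pool = layers.Lambda(lambda t: tf.reduce_mean(t, axis=-1, keepdims=True))(x)
    max_pool = layers.Lambda(lambda t: tf.reduce_max(t, axis=-1, keepdims=True))(x)
    m1 = layers.Concatenate(axis=-1)([avg_pool, max_pool])        # intermediate feature, 2 channels
    attention_map = layers.Conv2D(1, 7, strides=1, padding="same",
                                  activation="sigmoid")(m1)       # spatial attention map M
    return layers.Multiply()([x, attention_map])                  # grouped attention feature


def group_attention(z, c_x):
    """S1: split z into m (first c_x channels) and n; S2: attend to m, n and z, then fuse."""
    m = layers.Lambda(lambda t: t[..., :c_x])(z)      # channels from the upsampled feature x1'
    n = layers.Lambda(lambda t: t[..., c_x:])(z)      # channels from the backbone feature x2
    m_att = spatial_attention(m)                      # M'
    n_att = spatial_attention(n)                      # N'
    z_att = spatial_attention(z)                      # Z'
    o_prime = layers.Concatenate(axis=-1)([m_att, n_att])   # O' = channel-wise concat of M' and N'
    return layers.Add()([o_prime, z_att])                    # O  = element-wise sum of O' and Z'
```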
The concrete structure of the Tiny-yolov3 network embedded with the grouping attention module is as follows:
the backbone network of the Tiny-yolov3 network comprises seven convolutional layers and six max pooling layers, a max pooling layer following each convolutional layer except the last one;
first, the output feature of the seventh convolutional layer of the backbone network is convolved to obtain a feature x1; the feature x1 feeds two branches: in one branch, x1 passes through two further convolutions to give a feature y1, which is the first feature layer of the yolo layer of the Tiny-yolov3 network; in the other branch, x1 is convolved and then enlarged by upsampling to the same scale as the output feature of the fifth convolutional layer of the backbone network, giving a feature x1'; the feature x1' is concatenated channel-wise with the feature x2 output by the fifth convolutional layer of the backbone network to form a feature z, and z is screened by the grouping attention module to obtain a feature O; the feature O then passes through two successive convolutions to obtain a feature y2, which is the second feature layer of the yolo layer of the Tiny-yolov3 network.
All convolutional layers of the backbone network use 3x3 convolution kernels with stride 1; the first to fifth max pooling layers use 2x2 pooling windows with stride 2; the sixth max pooling layer uses a 2x2 pooling window with stride 1.
The method comprises the following specific steps:
The first step: a lithium battery image is acquired with an industrial camera as the original image for defect detection; the original images include defect-free images and images containing defects to be detected;
The second step: all collected original images containing defects to be detected are cut, each such original image being uniformly cut into 16 small images of the same size; all small images containing defects to be detected are labeled to form labels, and all labels are divided into different data sets;
The third step: features of the image containing the defect to be detected are extracted through the Tiny-yolov3 network embedded with the grouping attention module;
The fourth step: first, the initial weights of the model and the training parameters are set, the training parameters including the number of categories and the category labels;
anchor boxes are generated automatically by K-means clustering and their sizes are stored; a training image is then read, scaled to 128 × 128 pixels and input into the Tiny-yolov3 network embedded with the grouping attention module; bounding boxes are obtained by box regression prediction using the anchor box sizes as priors, and a logistic classifier classifies the bounding boxes to obtain the defect class probability of each bounding box; the defect class probabilities of all bounding boxes are sorted and filtered by non-maximum suppression, and the defect class of each bounding box is determined to obtain the predicted value; the training loss between the predicted value and the ground truth is then calculated with a loss function;
finally, the learning rate and the number of iterations are adjusted dynamically according to the change of the training loss; training is divided into two stages: the first stage covers the first 100 epochs, with the initial learning rate fixed at 0.001; the second stage covers the epochs after the first 100, with the initial learning rate set to 0.0001; when the training loss levels off, the learning rate is reduced in turn to one tenth of its current value, the final learning rate being 0.00001, and training stops once the learning rate has dropped to 0.00001;
The fifth step: on-line testing
First, a test image is divided into 16 small images, each scaled to 128 × 128 pixels and input into the Tiny-yolov3 network embedded with the grouping attention module for testing; the detection time for a single image is 0.1 s.
Compared with the prior art, the invention has the beneficial effects that:
the improved backbone network of the Tiny-yolov3 network needs to be subjected to five times of down-sampling (each convolution kernel has a size of 2x2, and the largest pooling layer with a step length of 2 is one time of down-sampling), the size of a feature map after each time of down-sampling becomes half of the original size, but partial defect features are lost in the five times of down-sampling, if a defect of 16 pixels x 16 pixels exists in a lithium battery image, the size of the defect retained after the three times of down-sampling is 4 pixels x 4 pixels, and the size retained after the five times of down-sampling is 1 pixel x1 pixel, so that more detailed information of general small target defects is stored in shallow layer features, and deep layer features retain high-level semantic features of target defects; the improved Tiny-yolov3 network splices the output characteristics of the fifth layer of convolution layer and the output characteristics of the last layer of convolution layer together, so that the fusion of shallow characteristics and deep characteristics is realized, the detailed information of the target defect is reserved, the high-level semantic characteristics of the target defect are reserved, the target identification is more accurate, and the detection precision is improved.
During feature fusion, the shallow features contain redundant background information that can interfere with defect detection. To suppress this interference, an attention mechanism is introduced into the improved Tiny-yolov3 network. The attention mechanism lets the neural network learn which target regions deserve focused attention, obtain an attention focus, and devote more attention to those regions to extract more detailed information about the target while suppressing other useless information, so the background is suppressed and the defect target is highlighted. However, because shallow and deep features retain different kinds of information, a conventional attention mechanism applied after concatenation may lead the network to attend mainly to the deep features, which carry more semantic information, while the shallow features are suppressed and ignored as redundant; this harms the detection of small-target defects. To better handle the difference between shallow and deep features, the invention proposes a grouping attention module: the features are grouped before the attention operation, and the attention module separately screens the upsampled features, the output features of the fifth convolutional layer of the backbone network, and the features obtained by concatenating the two. In this way the network attends both to the local features of each feature layer before concatenation and to the overall features of the concatenated layer, extracts more target feature information, identifies small defects on the lithium battery surface more easily, and improves the identification accuracy. At the same time the attention mechanism still suppresses the background and highlights the target, making target feature extraction easier against a complex, non-uniformly textured background.
The method is applied to industrial defect detection and is particularly effective for small-target defects (target size smaller than 20 × 20 pixels); it can reduce quality inspection procedures and later manual inspection items, saving cost and improving detection efficiency.
The Tiny-yolov3 network adopted by the invention is a lightweight network with a small amount of computation during training, which greatly reduces computation cost and detection time, improves the quality and efficiency of quality inspection, further improves the quality and production efficiency of lithium battery modules, and meets real-time requirements. Compared with the original Tiny-yolov3 network, the improved Tiny-yolov3 network improves the detection precision of lithium battery surface defects, raising the average detection precision by 4.7%; false detections on defect-free samples are reduced and the recall rate is improved by 2.2%.
Drawings
FIG. 1 is an overall flow chart of a lithium battery defect detection method based on a Tiny-yolov3 network embedded with a grouping attention module according to an embodiment of the invention;
FIG. 2 is a diagram of an overall network architecture according to an embodiment of the present invention;
FIG. 3 is a flow diagram of the group attention module of the present invention.
Detailed Description
The technical solutions of the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, embodiments of the invention; all other embodiments obtained by a person skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.
The invention provides a lithium battery defect detection method (hereinafter the method; see FIGS. 1-3) based on a Tiny-yolov3 network embedded with a grouping attention module, which comprises the following steps:
The first step: image acquisition
A lithium battery image is acquired with an industrial camera as the original image for defect detection. The original images include defect-free images and images containing defects to be detected; an image containing defects to be detected may contain a single defect or several defects, and all types of defects to be detected must be represented;
The second step: producing the data set
The data set of the Tiny-yolov3 network is prepared using the standard Pascal VOC2007 format as a template, with the following specific steps:
2-1, establishing the data set storage folders;
A VOCdevkit folder is created, with a VOC2007 folder under it; under the VOC2007 folder, an Annotations folder, a JPEGImages folder and an ImageSets folder are created, and a Main folder is created under the ImageSets folder; under the Main folder, train.txt, val.txt, test.txt and trainval.txt files are created, used respectively for storing the training set, the validation set, the test set and the training-validation set; the Annotations folder is used for storing the xml files of all labeled images, and the JPEGImages folder is used for storing the divided small images;
2-2, cutting the images;
All collected original images containing defects to be detected are cut by a sliding segmentation method, each original image containing defects to be detected being uniformly cut into 16 small images of the same size, and all small images are saved in the JPEGImages folder (a tiling sketch is given after step 2-4 below). If the original image were used directly, its large size would make the computation for a single image too heavy, the detection time too long and the hardware requirements of the detection system too high; if the original image were simply shrunk, the small-target defects, which occupy only a small proportion of the image, would occupy too few pixels after shrinking and would be difficult to detect. The original image is therefore segmented so that the original pixels of the defects to be detected are preserved;
2-3, labeling the images;
All small images containing defects to be detected from step 2-2 are calibrated manually with the LabelImg software, and the defective regions are marked; an xml file containing the picture name, defect type and defect position coordinates is then generated for each labeled image, one xml file constituting one label, and all xml files are stored in the Annotations folder;
2-4, grouping the data sets
All xml files are divided into a training set, a validation set, a training-validation set and a test set in proportion, following the VOC2007 data set convention. First, all xml files in the Annotations folder are extracted and divided into two groups according to a set proportion; the file names of the xml files assigned to each subset are then written into the corresponding txt file, for example the file names of all xml files classified into the training set are saved into the train.txt file.
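As an aside for reproduction, the cutting of step 2-2 and the grouping of step 2-4 can be sketched in Python as below. The 4 × 4 grid, the directory names and the 4:1 split ratio used in this sketch are illustrative assumptions; the patent only fixes 16 equally sized tiles and the VOC2007 folder layout.

```python
# Illustrative sketch of step 2-2 (cut each original image into 16 equal tiles) and
# step 2-4 (write VOC2007-style split lists); paths and the split ratio are assumptions.
import os
import random
from PIL import Image


def cut_into_16_tiles(image_path, out_dir):
    img = Image.open(image_path)
    w, h = img.size
    tile_w, tile_h = w // 4, h // 4
    stem = os.path.splitext(os.path.basename(image_path))[0]
    for row in range(4):
        for col in range(4):
            box = (col * tile_w, row * tile_h, (col + 1) * tile_w, (row + 1) * tile_h)
            img.crop(box).save(os.path.join(out_dir, f"{stem}_{row}_{col}.jpg"))


def write_split_lists(annotations_dir, main_dir, trainval_ratio=0.8):
    names = [os.path.splitext(f)[0] for f in os.listdir(annotations_dir) if f.endswith(".xml")]
    random.shuffle(names)
    k = int(len(names) * trainval_ratio)
    splits = {"trainval.txt": names[:k], "test.txt": names[k:]}
    for filename, subset in splits.items():
        with open(os.path.join(main_dir, filename), "w") as f:
            f.write("\n".join(subset))
```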
The third step: improved Tiny-yolov3 network model
3-1, constructing a backbone network
The network is an improvement of the Tiny-yolov3 network. Tiny-yolov3 is a simplified version of yolov3 with higher running speed and strong real-time performance; it can meet the real-time requirement of defect detection on an industrial production line and detect with high efficiency. The backbone network of the original Tiny-yolov3 network comprises seven convolutional layers (conv) and six max pooling layers (Maxpool), a max pooling layer following each convolutional layer except the last one. All convolutional layers use 3x3 convolution kernels with stride 1; the first to fifth max pooling layers use 2x2 pooling windows with stride 2; the sixth max pooling layer uses a 2x2 pooling window with stride 1, which reduces features and parameters while keeping the feature scale unchanged;
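A minimal Keras sketch of this backbone is given below. The channel widths 16 to 1024 are the standard Tiny-yolov3 values and are consistent with the 13 × 13 × 1024 and 26 × 26 × 256 figures in the dimension example later in this description, but they are not listed explicitly by the patent; the choice of activation is likewise simplified.

```python
# Sketch of the backbone: 7 conv layers (3x3, stride 1), max pooling after each of the
# first six convs, poolings 1-5 with stride 2 and pooling 6 with stride 1.
# Channel widths and the plain ReLU activation are simplifying assumptions.
from tensorflow.keras import layers, Model, Input


def build_backbone(input_shape=(416, 416, 3)):
    inputs = Input(shape=input_shape)
    x = inputs
    filters = [16, 32, 64, 128, 256, 512, 1024]
    feature_x2 = None
    for i, f in enumerate(filters):
        x = layers.Conv2D(f, 3, strides=1, padding="same", activation="relu")(x)
        if i == 4:                       # output of the fifth conv layer -> feature x2
            feature_x2 = x
        if i < 6:                        # a max pooling layer follows conv layers 1-6
            stride = 2 if i < 5 else 1   # first five poolings downsample, the sixth keeps scale
            x = layers.MaxPooling2D(pool_size=2, strides=stride, padding="same")(x)
    return Model(inputs, [feature_x2, x], name="tiny_yolov3_backbone")
```

With a 416 × 416 × 3 input this sketch yields a 26 × 26 × 256 feature x2 and a 13 × 13 × 1024 output for the seventh convolutional layer.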
3-2, constructing the yolo layer
The yolo layer of the improved Tiny-yolov3 network comprises two feature layers. First, the output feature of the seventh convolutional layer of the backbone network in step 3-1 is convolved with a 1 x 1 kernel and stride 1 to obtain a feature x1. The feature x1 feeds two branches: in one branch, x1 passes through two convolutions with kernel sizes 3 x 3 and 1 x 1, both with stride 1, giving a feature y1, the first feature layer of the yolo layer of the improved Tiny-yolov3 network; in the other branch, x1 is convolved with a 1 x 1 kernel and stride 1 and then enlarged by one upsampling to the same scale as the output feature of the fifth convolutional layer of the backbone network, giving the upsampled feature x1'. Because the feature x1' has the same scale as the feature x2 output by the fifth convolutional layer of the backbone network, the two are concatenated channel-wise to form the concatenated feature z. The feature z is screened by the grouping attention module (Attention) to obtain a feature O, and the feature O passes through two convolutions, with kernel size 3 x 3 and stride 1 followed by kernel size 1 x 1 and stride 1, to obtain a feature y2, the second feature layer of the yolo layer of the improved Tiny-yolov3 network;
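Continuing the sketch, the head wiring of step 3-2 can be assembled as below, reusing build_backbone and group_attention from the earlier sketches. The head channel widths (128, 256, 512) and the prediction depth formula num_anchors × (5 + num_classes) are assumptions consistent with the dimension example that follows and with common YOLO practice; they are not prescribed verbatim by the patent.

```python
# Sketch of the two yolo feature layers of step 3-2, reusing build_backbone and
# group_attention from the earlier sketches; head widths and prediction depth are assumptions.
from tensorflow.keras import layers, Model, Input


def build_detector(num_classes=7, num_anchors=3, input_shape=(416, 416, 3)):
    inputs = Input(shape=input_shape)
    feature_x2, conv7_out = build_backbone(input_shape)(inputs)     # 26x26x256 and 13x13x1024
    pred_depth = num_anchors * (5 + num_classes)                    # box, objectness and classes

    x1 = layers.Conv2D(128, 1, strides=1, padding="same")(conv7_out)          # feature x1

    # Branch 1: two convolutions -> first yolo feature layer y1
    y1 = layers.Conv2D(512, 3, strides=1, padding="same")(x1)
    y1 = layers.Conv2D(pred_depth, 1, strides=1, padding="same")(y1)

    # Branch 2: 1x1 conv + upsampling -> x1'; concat with x2 -> z; grouping attention -> O
    x1_up = layers.Conv2D(128, 1, strides=1, padding="same")(x1)
    x1_up = layers.UpSampling2D(size=2)(x1_up)                                 # feature x1'
    z = layers.Concatenate(axis=-1)([x1_up, feature_x2])                       # feature z
    o = group_attention(z, c_x=128)                                            # refined feature O

    # Two convolutions on O -> second yolo feature layer y2
    y2 = layers.Conv2D(256, 3, strides=1, padding="same")(o)
    y2 = layers.Conv2D(pred_depth, 1, strides=1, padding="same")(y2)

    return Model(inputs, [y1, y2], name="tiny_yolov3_group_attention")
```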
the grouping attention module comprises the following specific processes:
s1, grouping the characteristics z
Since the feature z is formed by concatenating the feature x1' and the feature x2, z is divided into two groups according to the channel counts of the original features: the channels belonging to the feature x1' form one group, recorded as feature m, and the channels belonging to the feature x2 form the other group, recorded as feature n, where
z ∈ R^(C×W×H) (1)
m ∈ R^(Cx×W×H) (2)
n ∈ R^(Cy×W×H) (3)
where R denotes the feature space; W and H are the width and height of the feature; Cx is the number of channels of feature x1'; Cy is the number of channels of feature x2; and C is the number of channels of feature z, with C = Cx + Cy.
S2, attention calculation
First, channel-wise global max pooling (GMP) and channel-wise global average pooling (GAP) are applied to the feature m, yielding a maxpool feature and an avgpool feature respectively;
the maxpool feature and the avgpool feature are then concatenated along the channel dimension to form an intermediate feature m1 with 2 channels; the intermediate feature m1 is convolved with a 7 × 7 kernel and stride 1 to obtain an attention feature m2 with 1 channel, and the attention feature m2 is passed through a sigmoid activation function to generate a spatial attention map M, where
M = sigmoid(f^(7×7)([AvgPool(m), MaxPool(m)])) (4)
where 7 × 7 is the size of the convolution kernel;
finally, the spatial attention map M is multiplied element-wise with the feature m to obtain the grouped attention feature M', completing the attention operation on feature m; the attention calculation suppresses the background and highlights the target, so that target defects on the object under inspection appear more clearly;
the same attention calculation is then repeated on features n and z, generating grouped attention features N' and Z' respectively; the grouped attention features M' and N' are concatenated channel-wise to generate a feature O', and the feature O' is then added element-wise to the grouped attention feature Z' to obtain the refined feature O;
because the features m and n come from two feature maps of different scales that were concatenated, and the feature map of each scale needs to attend to different information, applying the attention operation to m, n and z lets the network attend both to the local features of each scale and to the overall concatenated features, so small defects on the surface of the object under inspection are identified more easily;
For example, suppose the input image has dimensions 416 × 416 × 3, i.e. width and height are both 416 and the number of channels is 3. After all convolution and pooling operations of the backbone network, the feature output by the seventh convolutional layer of the backbone network has dimensions 13 × 13 × 1024; a convolution with a 1 × 1 kernel and stride 1 then converts this feature to 13 × 13 × 128, i.e. the feature x1 has dimensions 13 × 13 × 128. After a further 1 × 1 convolution with stride 1 and one upsampling, the width and height of x1 are doubled, so the feature x1' has dimensions 26 × 26 × 128. The output feature x2 of the fifth convolutional layer of the backbone network has dimensions 26 × 26 × 256, and concatenating x1' with x2 gives the feature z with dimensions 26 × 26 × 384; concatenation only merges the channel counts, i.e. the channel numbers of the two features are added while the information under each channel is not increased. The feature z is then processed by the grouping attention module to suppress the background and highlight the target features;
because the feature z is formed by concatenating the feature x1' and the output feature x2, the channels originally belonging to x1' form one group and the channels originally belonging to x2 form the other group during grouping; the grouped attention operations produce the grouped attention features M' and N' respectively, whose sizes are unchanged. The feature z itself is also subjected to the grouped attention operation to generate the grouped attention feature Z', whose scale is the same as that of z. Concatenating M' and N' generates the feature O', whose dimensions are 26 × 26 × 384; the feature O' is then added element-wise to the grouped attention feature Z' to obtain the refined feature O. The number of channels is unchanged, the two features in each channel are superposed into one, and the information under each channel increases and becomes richer, which helps identify small-target defects on the surface of the object under inspection. Since the grouped attention feature Z' and the feature O' both have dimensions 26 × 26 × 384, the superposed feature also has dimensions 26 × 26 × 384, i.e. the feature O has dimensions 26 × 26 × 384;
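The dimension bookkeeping above can be checked quickly against the group_attention sketch given earlier: a dummy feature z of shape 26 × 26 × 384, split into 128 and 256 channels, yields an output of the same shape.

```python
# Quick shape check of the dimension example, reusing the group_attention sketch above.
from tensorflow.keras import Input, Model

z_in = Input(shape=(26, 26, 384))         # feature z = concat(x1' with 128 ch, x2 with 256 ch)
o_out = group_attention(z_in, c_x=128)    # refined feature O
print(Model(z_in, o_out).output_shape)    # expected: (None, 26, 26, 384)
```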
the fourth step: model training
4-1, setting model training parameters
The number of categories and the category labels of the improved Tiny-yolov3 network are modified according to the number and the names of the defect types to be detected in the training set. For example, if the training set contains 7 types of defects to be detected in total, the number of categories of the improved Tiny-yolov3 network is 8, comprising the background and the 7 defect categories to be detected; the category labels of the improved Tiny-yolov3 network are modified correspondingly according to the names of the defects to be detected;
4-2, setting initial weight of model
To accelerate convergence, shorten training time and prevent overfitting, a Tiny-yolov3 model file pre-trained on the ImageNet data set is used as the initialization weights of the improved Tiny-yolov3 network;
4-3, calculating training loss
The number of anchor boxes is determined by the number of feature layers output by the network model; anchor boxes (anchors) are generated automatically by K-means clustering and their sizes are stored. Images in the training set are read, together with the image data including image name, defect type and defect position coordinates. The training image is scaled to 128 × 128 pixels and its features are extracted by the improved Tiny-yolov3 network; bounding boxes are obtained by box regression prediction using the anchor box sizes as priors, and a logistic classifier then classifies the bounding boxes to obtain the defect class probability of each bounding box. The defect class probabilities of all bounding boxes are sorted and filtered by non-maximum suppression (NMS), and the defect class of each bounding box is determined to obtain the predicted value, which includes the defect class and the defect position information used to frame the location of the defect; the NMS threshold is 0.5. The training loss (loss) between the predicted value and the ground truth is then calculated with a loss function;
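For reference, the automatic anchor generation of step 4-3 can be sketched as K-means clustering over the labeled box widths and heights, as below. The IoU-based distance 1 − IoU and the choice of 6 clusters (3 anchors for each of the two yolo feature layers) are common YOLO conventions assumed here; the patent only states that the anchors are generated automatically by K-means and that their number follows the number of output feature layers.

```python
# Sketch of K-means anchor generation over the (width, height) pairs of the labeled boxes.
# The IoU-based distance and the choice of k=6 clusters are assumptions.
import numpy as np


def iou_wh(boxes, anchors):
    """IoU between boxes and anchors when compared at a shared top-left corner."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union


def kmeans_anchors(boxes_wh, k=6, iterations=100, seed=0):
    boxes_wh = np.asarray(boxes_wh, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iterations):
        assignment = np.argmax(iou_wh(boxes_wh, anchors), axis=1)    # nearest = highest IoU
        for j in range(k):
            members = boxes_wh[assignment == j]
            if len(members):
                anchors[j] = members.mean(axis=0)                    # update the cluster centre
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]        # anchors sorted by area
```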
4-4, training phase
The learning rate and the number of iterations are adjusted dynamically according to the change of the training loss so as to update the parameters of the whole network. Training is divided into two stages: the first stage covers the first 100 epochs, with the initial learning rate fixed at 0.001 to accelerate convergence; the second stage covers the epochs after the first 100, with the initial learning rate set to 0.0001. When the training loss levels off, the learning rate is reduced in turn to one tenth of its current value; the final learning rate is 0.00001, and training stops once the learning rate has dropped to 0.00001;
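Expressed with standard Keras callbacks, the two-stage schedule of step 4-4 might look as follows; the patience values, the epoch budget of the second stage and the use of ReduceLROnPlateau/EarlyStopping to realise the stopping condition are illustrative assumptions.

```python
# Sketch of the two-stage training schedule; patience values and the second-stage epoch
# budget are assumptions, only the learning rates 1e-3 -> 1e-4 -> ... -> 1e-5 follow the text.
import tensorflow as tf
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping


def train_two_stage(model, train_data, loss_fn):
    # Stage 1: first 100 epochs with the learning rate fixed at 0.001 to speed up convergence.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss=loss_fn)
    model.fit(train_data, epochs=100)

    # Stage 2: restart at 0.0001; divide the learning rate by 10 whenever the loss plateaus,
    # down to the final value 0.00001, and stop once the loss no longer improves.
    callbacks = [
        ReduceLROnPlateau(monitor="loss", factor=0.1, patience=5, min_lr=1e-5, verbose=1),
        EarlyStopping(monitor="loss", patience=15, restore_best_weights=True, verbose=1),
    ]
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4), loss=loss_fn)
    model.fit(train_data, epochs=400, callbacks=callbacks)
```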
the fifth step: on-line testing
The online test is carried out on a Windows 10 platform with a Core i7 series CPU, 16 GB of memory and dual GTX1080 graphics cards, and is implemented with a Keras program. First, each test image (400 images covering the various defects) is divided into 16 small images, each scaled to 128 × 128 pixels and input into the improved Tiny-yolov3 network for detection. After all 16 small images have been detected, their detection results are stitched back together into one complete large image for output; the detection time of a single large image is 0.1 s, which meets the real-time requirement of production. The defects in each small image are framed on the picture; if several different defects are present, all of them are framed, and the lithium battery is regarded as defective as long as any defect exists. Finally the 16 small images are stitched together into one complete large image, and the position prediction is used to frame the location of each defect.
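The online test flow of the fifth step (4 × 4 tiling, per-tile detection at 128 × 128, stitching the annotated tiles back into one large image) can be sketched as below; the detector output format, the box drawing with OpenCV and the helper names are illustrative assumptions.

```python
# Sketch of the online test flow: 4x4 tiling, per-tile detection, stitching back into one image.
# The detect_tile output format and the use of OpenCV for drawing are assumptions.
import cv2


def detect_large_image(image, detect_tile):
    """image: HxWx3 array; detect_tile(tile) returns a list of (x1, y1, x2, y2, label) boxes."""
    h, w = image.shape[:2]
    tile_h, tile_w = h // 4, w // 4
    annotated = image.copy()
    defective = False
    for row in range(4):
        for col in range(4):
            y0, x0 = row * tile_h, col * tile_w
            tile = image[y0:y0 + tile_h, x0:x0 + tile_w]
            small = cv2.resize(tile, (128, 128))               # scale the tile to 128 x 128
            boxes = detect_tile(small)                         # run the improved Tiny-yolov3 network
            sx, sy = tile_w / 128.0, tile_h / 128.0            # map detections back to the tile
            for (x1, y1, x2, y2, label) in boxes:
                defective = True
                p1 = (x0 + int(x1 * sx), y0 + int(y1 * sy))
                p2 = (x0 + int(x2 * sx), y0 + int(y2 * sy))
                cv2.rectangle(annotated, p1, p2, (0, 0, 255), 2)
                cv2.putText(annotated, str(label), p1, cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
    return annotated, defective    # the battery is flagged defective as soon as any defect is found
```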
In this embodiment, 7 kinds of defect images were tested, covering lithium battery surface dents, stains, bulges, wrinkles, pole piece scratches, particles and dark spots. The recognition accuracy for stains and dark spots reaches about 85%, and the recognition rate for the other defects is above 90%. The method can also determine the position on the lithium battery surface where a defect is located, and both the recall rate and the accuracy for lithium batteries are significantly improved.
Matters not described in detail in this specification belong to the prior art known to those skilled in the art.

Claims (3)

1. A lithium battery defect detection method based on a Tiny-yolov3 network embedded with a grouping attention module, characterized by: acquiring a lithium battery image containing a defect to be detected, extracting features of the image containing the defect to be detected through the Tiny-yolov3 network embedded with the grouping attention module, and refining the detection of the defect features through the grouping attention module;
the output feature of the seventh convolutional layer of the backbone network of the Tiny-yolov3 network is convolved twice and then enlarged by upsampling to the same scale as the output feature of the fifth convolutional layer of the backbone network, giving a feature x1'; the feature x1' and the output feature x2 of the fifth convolutional layer of the backbone network are concatenated channel-wise to form a feature z;
the grouping attention module comprises the following specific processes:
s1, grouping the characteristics z
The feature z is divided into two groups according to the channel counts of the original features: the channels belonging to the feature x1' form one group, recorded as feature m, and the channels belonging to the feature x2 form the other group, recorded as feature n, where
z ∈ R^(C×W×H) (1)
m ∈ R^(Cx×W×H) (2)
n ∈ R^(Cy×W×H) (3)
where R denotes the feature space; W and H are the width and height of the feature; Cx is the number of channels of feature m; Cy is the number of channels of feature n; and C is the number of channels of feature z, with C = Cx + Cy;
S2, attention calculation
firstly, channel-wise global max pooling and channel-wise global average pooling are applied to the feature m, yielding a maxpool feature and an avgpool feature respectively;
the maxpool feature and the avgpool feature are then concatenated along the channel dimension to form an intermediate feature m1; the intermediate feature m1 is convolved to obtain an attention feature m2, and the attention feature m2 is passed through a sigmoid activation function to generate a spatial attention map M, where
M = sigmoid(f^(7×7)([AvgPool(m), MaxPool(m)])) (4)
where 7 × 7 is the size of the convolution kernel;
finally, the spatial attention map M is multiplied element-wise with the feature m to obtain the grouped attention feature M', which completes the attention operation on feature m;
the same attention calculation is then repeated on features n and z to generate grouped attention features N' and Z' respectively; the grouped attention features M' and N' are concatenated channel-wise to generate a feature O', and the feature O' is then added element-wise to the grouped attention feature Z' to obtain a feature O;
the concrete structure of the Tiny-yolov3 network embedded with the grouping attention module is as follows:
the backbone network of the Tiny-yolov3 network comprises seven convolutional layers and six max pooling layers, a max pooling layer following each convolutional layer except the last one;
first, the output feature of the seventh convolutional layer of the backbone network is convolved to obtain a feature x1; the feature x1 feeds two branches: in one branch, x1 passes through two further convolutions to give a feature y1, which is the first feature layer of the yolo layer of the Tiny-yolov3 network; in the other branch, x1 is convolved and then enlarged by upsampling to the same scale as the output feature of the fifth convolutional layer of the backbone network, giving a feature x1'; the feature x1' is concatenated channel-wise with the feature x2 output by the fifth convolutional layer of the backbone network to form a feature z, and z is screened by the grouping attention module to obtain a feature O; the feature O then passes through two successive convolutions to obtain a feature y2, which is the second feature layer of the yolo layer of the Tiny-yolov3 network.
2. The detection method according to claim 1, characterized in that all convolutional layers of the backbone network use 3x3 convolution kernels with stride 1; the first to fifth max pooling layers use 2x2 pooling windows with stride 2; and the sixth max pooling layer uses a 2x2 pooling window with stride 1.
3. The detection method according to any one of claims 1-2, characterized in that the method comprises the following specific steps:
the first step: a lithium battery image is acquired with an industrial camera as the original image for defect detection; the original images include defect-free images and images containing defects to be detected;
the second step: all collected original images containing defects to be detected are cut, each such original image being uniformly cut into 16 small images of the same size; all small images containing defects to be detected are labeled to form labels, and all labels are divided into different data sets;
the third step: features of the image containing the defect to be detected are extracted through the Tiny-yolov3 network embedded with the grouping attention module;
the fourth step: first, the initial weights of the model and the training parameters are set, the training parameters including the number of categories and the category labels;
anchor boxes are generated automatically by K-means clustering and their sizes are stored; a training image is then read, scaled to 128 × 128 pixels and input into the Tiny-yolov3 network embedded with the grouping attention module; bounding boxes are obtained by box regression prediction using the anchor box sizes as priors, and a logistic classifier classifies the bounding boxes to obtain the defect class probability of each bounding box; the defect class probabilities of all bounding boxes are sorted and filtered by non-maximum suppression, and the defect class of each bounding box is determined to obtain the predicted value; the training loss between the predicted value and the ground truth is then calculated with a loss function;
finally, the learning rate and the number of iterations are adjusted dynamically according to the change of the training loss; training is divided into two stages: the first stage covers the first 100 epochs, with the initial learning rate fixed at 0.001; the second stage covers the epochs after the first 100, with the initial learning rate set to 0.0001; when the training loss levels off, the learning rate is reduced in turn to one tenth of its current value, the final learning rate being 0.00001, and training stops once the learning rate has dropped to 0.00001;
the fifth step: on-line testing
first, a test image is divided into 16 small images, each scaled to 128 × 128 pixels and input into the Tiny-yolov3 network embedded with the grouping attention module for testing; the detection time for a single image is 0.1 s.
CN202010405111.6A 2020-05-13 2020-05-13 Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module Active CN111612751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010405111.6A CN111612751B (en) 2020-05-13 2020-05-13 Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010405111.6A CN111612751B (en) 2020-05-13 2020-05-13 Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module

Publications (2)

Publication Number Publication Date
CN111612751A CN111612751A (en) 2020-09-01
CN111612751B true CN111612751B (en) 2022-11-15

Family

ID=72201339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010405111.6A Active CN111612751B (en) 2020-05-13 2020-05-13 Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module

Country Status (1)

Country Link
CN (1) CN111612751B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112184686B (en) * 2020-10-10 2022-08-23 深圳大学 Segmentation algorithm for detecting laser welding defects of safety valve of power battery
CN112419232A (en) * 2020-10-16 2021-02-26 国网天津市电力公司电力科学研究院 Method for detecting state of low-voltage circuit breaker by integrating YOLOv3 with attention module
CN112257786A (en) * 2020-10-23 2021-01-22 南京大量数控科技有限公司 Feature detection method based on combination of convolutional neural network and attention mechanism
CN112465759A (en) * 2020-11-19 2021-03-09 西北工业大学 Convolutional neural network-based aeroengine blade defect detection method
CN112465790A (en) * 2020-12-03 2021-03-09 天津大学 Surface defect detection method based on multi-scale convolution and trilinear global attention
CN112418345B (en) * 2020-12-07 2024-02-23 深圳小阳软件有限公司 Method and device for quickly identifying small targets with fine granularity
CN112446372B (en) * 2020-12-08 2022-11-08 电子科技大学 Text detection method based on channel grouping attention mechanism
CN112464910A (en) * 2020-12-18 2021-03-09 杭州电子科技大学 Traffic sign identification method based on YOLO v4-tiny
CN112651326B (en) * 2020-12-22 2022-09-27 济南大学 Driver hand detection method and system based on deep learning
CN112884709A (en) * 2021-01-18 2021-06-01 燕山大学 Yoov 3 strip steel surface defect detection and classification method introducing attention mechanism
CN112950547B (en) * 2021-02-03 2024-02-13 佛山科学技术学院 Machine vision detection method for lithium battery diaphragm defects based on deep learning
CN113129260B (en) * 2021-03-11 2023-07-21 广东工业大学 Automatic detection method and device for internal defects of lithium battery cell
CN113362032A (en) * 2021-06-08 2021-09-07 贵州开拓未来计算机技术有限公司 Verification and approval method based on artificial intelligence image recognition
CN113327243B (en) * 2021-06-24 2024-01-23 浙江理工大学 PAD light guide plate defect visual detection method based on Ayolov3-Tiny new framework
CN113780434B (en) * 2021-09-15 2024-04-02 辽宁工程技术大学 Deep learning-based solar cell module defect EL detection method
CN114119464B (en) * 2021-10-08 2023-06-16 厦门微亚智能科技有限公司 Deep learning-based lithium battery cell top cover weld appearance detection method
CN116051446A (en) * 2021-10-26 2023-05-02 江苏时代新能源科技有限公司 Pole piece wrinkling detection method, pole piece wrinkling detection system, pole piece wrinkling detection terminal and storage medium
CN114240885B (en) * 2021-12-17 2022-08-16 成都信息工程大学 Cloth flaw detection method based on improved Yolov4 network
CN114677355A (en) * 2022-04-06 2022-06-28 淮阴工学院 Electronic component surface defect detection method based on GAYOLOv3_ Tiny
CN114749342B (en) * 2022-04-20 2023-09-26 华南理工大学 Lithium battery pole piece coating defect identification method, device and medium
CN114842019B (en) * 2022-07-06 2022-09-09 山东建筑大学 Battery plate surface defect detection method, system, storage medium and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011024727A (en) * 2009-07-23 2011-02-10 Olympus Corp Image processing device, program and method
CN109033998A (en) * 2018-07-04 2018-12-18 北京航空航天大学 Remote sensing image atural object mask method based on attention mechanism convolutional neural networks
CN110503079A (en) * 2019-08-30 2019-11-26 山东浪潮人工智能研究院有限公司 A kind of monitor video based on deep neural network describes method
CN110533084A (en) * 2019-08-12 2019-12-03 长安大学 A kind of multiscale target detection method based on from attention mechanism
CN111079584A (en) * 2019-12-03 2020-04-28 东华大学 Rapid vehicle detection method based on improved YOLOv3

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3016953A1 (en) * 2017-09-07 2019-03-07 Comcast Cable Communications, Llc Relevant motion detection in video
KR20190113119A (en) * 2018-03-27 2019-10-08 삼성전자주식회사 Method of calculating attention for convolutional neural network
CN108510012B (en) * 2018-05-04 2022-04-01 四川大学 Target rapid detection method based on multi-scale feature map
CN110084210B (en) * 2019-04-30 2022-03-29 电子科技大学 SAR image multi-scale ship detection method based on attention pyramid network
CN110210452A (en) * 2019-06-14 2019-09-06 东北大学 It is a kind of based on improve tiny-yolov3 mine truck environment under object detection method
CN110309836B (en) * 2019-07-01 2021-05-18 北京地平线机器人技术研发有限公司 Image feature extraction method, device, storage medium and equipment
CN110717856A (en) * 2019-09-03 2020-01-21 天津大学 Super-resolution reconstruction algorithm for medical imaging
CN110781923B (en) * 2019-09-27 2023-02-07 重庆特斯联智慧科技股份有限公司 Feature extraction method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011024727A (en) * 2009-07-23 2011-02-10 Olympus Corp Image processing device, program and method
CN109033998A (en) * 2018-07-04 2018-12-18 北京航空航天大学 Remote sensing image atural object mask method based on attention mechanism convolutional neural networks
CN110533084A (en) * 2019-08-12 2019-12-03 长安大学 A kind of multiscale target detection method based on from attention mechanism
CN110503079A (en) * 2019-08-30 2019-11-26 山东浪潮人工智能研究院有限公司 A kind of monitor video based on deep neural network describes method
CN111079584A (en) * 2019-12-03 2020-04-28 东华大学 Rapid vehicle detection method based on improved YOLOv3

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Application Research of Improved YOLO V3 Algorithm in PCB Electrobic Component Detection";Jing Li.et al;《applied sciences》;20191231;第1-21页 *
"Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition";Heliang Zheng.et al;《IEEE》;20171231;第5219-5227页 *
"基于分层注意力机制的神经网络垃圾评论检测模型";刘雨心等;《计算机应用》;20180719(第11期);第1-7页 *

Also Published As

Publication number Publication date
CN111612751A (en) 2020-09-01

Similar Documents

Publication Publication Date Title
CN111612751B (en) Lithium battery defect detection method based on Tiny-yolov3 network embedded with grouping attention module
CN111598861B (en) Improved Faster R-CNN model-based non-uniform texture small defect detection method
CN111598860B (en) Lithium battery defect detection method based on yolov3 network embedded into self-attention door module
CN112966684B (en) Cooperative learning character recognition method under attention mechanism
CN111784673B (en) Defect detection model training and defect detection method, device and storage medium
CN115829999A (en) Insulator defect detection model generation method, device, equipment and storage medium
CN113139543B (en) Training method of target object detection model, target object detection method and equipment
CN111626279B (en) Negative sample labeling training method and highly-automatic bill identification method
US11783474B1 (en) Defective picture generation method and apparatus applied to industrial quality inspection
CN111461212A (en) Compression method for point cloud target detection model
CN112348787A (en) Training method of object defect detection model, object defect detection method and device
CN115272330B (en) Defect detection method, system and related equipment based on battery surface image
CN113066047A (en) Method for detecting impurity defects of tire X-ray image
CN114117614A (en) Method and system for automatically generating building facade texture
CN111353544A (en) Improved Mixed Pooling-Yolov 3-based target detection method
CN114627106A (en) Weld defect detection method based on Cascade Mask R-CNN model
CN113012153A (en) Aluminum profile flaw detection method
CN111401421A (en) Image category determination method based on deep learning, electronic device, and medium
CN115147418A (en) Compression training method and device for defect detection model
CN114882204A (en) Automatic ship name recognition method
CN114782355A (en) Gastric cancer digital pathological section detection method based on improved VGG16 network
CN116030050A (en) On-line detection and segmentation method for surface defects of fan based on unmanned aerial vehicle and deep learning
CN114898088A (en) Photovoltaic cell appearance defect detection method based on embedded cosine self-attention module
CN111179278A (en) Image detection method, device, equipment and storage medium
CN116188361A (en) Deep learning-based aluminum profile surface defect classification method and device

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant