CN116542932A - Injection molding surface defect detection method based on improved YOLOv5s - Google Patents

Injection molding surface defect detection method based on improved YOLOv5s

Info

Publication number
CN116542932A
CN116542932A (application CN202310513694.8A)
Authority
CN
China
Prior art keywords
loss
injection molding
yolov5s
model
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310513694.8A
Other languages
Chinese (zh)
Inventor
孙力
钱威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University
Priority to CN202310513694.8A
Publication of CN116542932A
Pending legal-status Current


Classifications

    • G  PHYSICS
      • G06  COMPUTING; CALCULATING OR COUNTING
        • G06T  IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 7/00  Image analysis
            • G06T 7/0002  Inspection of images, e.g. flaw detection
              • G06T 7/0004  Industrial image inspection
          • G06T 2207/00  Indexing scheme for image analysis or image enhancement
            • G06T 2207/20  Special algorithmic details
              • G06T 2207/20081  Training; Learning
              • G06T 2207/20084  Artificial neural networks [ANN]
        • G06N  COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00  Computing arrangements based on biological models
            • G06N 3/02  Neural networks
              • G06N 3/04  Architecture, e.g. interconnection topology
                • G06N 3/0464  Convolutional networks [CNN, ConvNet]
              • G06N 3/08  Learning methods
          • G06N 5/00  Computing arrangements using knowledge-based models
            • G06N 5/01  Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • Y  GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
      • Y02  TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
        • Y02P  CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
          • Y02P 90/00  Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
            • Y02P 90/30  Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an injection molding surface defect detection method based on improved YOLOv5s, which comprises the following steps: acquiring an injection molding surface defect data set; preprocessing and labeling the injection molding surface defect data set, then dividing it into a training set, a validation set and a test set; constructing a YOLOv5s network structure model and improving it to obtain an improved YOLOv5s network structure model; training both the unimproved and the improved YOLOv5s network models with the training set to obtain injection molding surface defect detection models; testing with the injection molding surface defect detection models and comparing the test results. By detecting injection molding surface defects with a YOLOv5s target detection algorithm that introduces an attention mechanism, depthwise separable convolution and a small-target detection layer, the invention improves the detection speed of injection molding surface defects compared with the unimproved YOLOv5s network structure model and greatly improves defect detection accuracy.

Description

Injection molding surface defect detection method based on improved YOLOv5s
Technical Field
The invention belongs to the field of computer vision and deep learning, and particularly relates to an injection molding surface defect detection method based on improved YOLOv5s.
Background
With the rapid development of modern industry, injection-molded plastic workpieces are increasingly widely used, and appearance quality is a key factor affecting their overall quality. Visual inspection of the surface of injection moldings is therefore an important link in injection molding quality inspection.
Manual visual inspection is still the method mainly adopted by injection molding manufacturers. It is easily limited by subjective human factors, and detection accuracy is hard to guarantee. Meanwhile, visual and mental fatigue make it difficult to sustain inspection efficiency, and working for long periods in an environment filled with pungent chemical plastic odors can also affect workers' health.
Machine vision inspection can overcome these shortcomings of manual inspection. Traditional machine vision techniques, however, cannot automatically extract image classification features, whereas convolutional neural network algorithms in deep learning meet this need well. A convolutional neural network extracts features layer by layer through convolution operations, and through weight updates the network can fit a target classification function. Applying convolutional neural networks to the recognition of injection molding surface defect images both enables visual detection and recognition of surface defects and overcomes the limitations of traditional feature engineering algorithms.
Disclosure of Invention
The invention aims to solve the technical problem of providing an injection molding surface defect detection method based on improved YOLOv5s that significantly improves both the efficiency and the accuracy of injection molding surface defect detection.
In order to solve the technical problems, the invention adopts the following technical scheme:
An injection molding surface defect detection method based on improved YOLOv5s, comprising the following steps:
step 1, acquiring an injection molding surface defect data set;
step 2, preprocessing and labeling the injection molding surface defect data set from step 1, then dividing it into a training set for training, a validation set for validation and a test set for testing;
step 3, constructing a YOLOv5s network structure model, and improving the YOLOv5s network structure model to obtain an improved YOLOv5s network structure model;
step 4, training the unimproved and improved YOLOv5s network models from step 3 with the training set from step 2 to obtain injection molding surface defect detection models;
and step 5, testing the injection molding surface defect detection models obtained in step 4 with the test set from step 2, and comparing the test results.
The step 1 specifically comprises the following: a multi-station visual inspection system photographs the injection molding from below, above and around it, and the data set is divided into three parts according to the three stations: the lower data set contains only pictures taken below the injection molding, the upper data set only pictures taken above it, and the peripheral data set only pictures taken around it.
The step 2 specifically comprises the following: preprocessing the injection molding surface defect data set means performing data enhancement on it, the data enhancement including horizontal, vertical and diagonal mirror flipping of 10% of the pictures in each data set.
In this step, labeling the injection molding surface defect data set means using the image annotation tool labelimg to label the defects in each image of the preprocessed injection molding surface defect data.
In this step, the ratio of the training set, validation set and test set is set to 6:2:2; the split is applied per station, i.e., the upper, lower and peripheral data sets are each divided separately.
The step 3 specifically comprises the following: the YOLOv5s network structure model is improved in three respects.
First improvement: a CBAM convolutional attention mechanism module is added to the Backbone and Neck structures of the YOLOv5s network structure model. The attention module is not added to the model directly but fused into the C3 modules of the Backbone and Neck structures; the fused C3 module is called C3CBAM. The C3 modules in the YOLOv5s network are then all replaced with C3CBAM, which improves the accuracy of the network model.
Second improvement: the ordinary convolution kernels with kernel_size 6 × 6 or 3 × 3 in the Backbone and Neck structures of the YOLOv5s network structure model are replaced with depthwise separable convolutions, reducing the number of network model parameters and increasing detection speed.
Third improvement: a small-target detection layer is added to the Prediction structure of the YOLOv5s network structure model, improving the detection accuracy for small-target objects.
The step 4 specifically comprises the following: the training epochs are set to 300, batch_size to 16, image_size to 1280 × 1280, Adam is selected as the optimizer, and cosine annealing is selected as the learning-rate schedule.
In this step, the basic YOLOv5s network structure model is trained with the training set from step 2 to obtain the weight files of the defect detection models for the three stations, together with precision, recall and mean average precision (mAP) curves recorded during training; the improved YOLOv5s network structure model is trained with the training set from step 2 in the same way to obtain the corresponding weight files and precision, recall and mAP curves.
The step 5 specifically comprises the following: the weight files of the unimproved defect detection models from step 4 and the test set from step 2 are used for testing to obtain the confusion matrix, precision, recall and forward inference time of the unimproved models on the test set; the weight files of the improved defect detection models from step 4 and the test set from step 2 are then used for testing to obtain the confusion matrix, precision, recall and forward inference time of the improved models on the test set.
The invention has the beneficial effects that:
1. The injection molding surface defect detection method based on the improved YOLOv5s network structure model targets the defects produced during the production and transportation of injection moldings, such as material shortage, crush damage, bruises, copper debris, cracks, PIN distortion and gaps. The improved YOLOv5s target detection algorithm is used to train and deploy models for the defects at the three different stations of a single injection molding; compared with existing injection molding defect detection methods, it achieves higher detection accuracy and detection speed.
2. Data sets are built separately for the different stations of the injection molding and models are trained separately, so the defects of each station are trained and detected by a dedicated model; this reduces the difficulty of model training and improves model accuracy and detection speed.
3. Four right-angle triangular prisms are arranged on the four sides of the injection molding. Through total reflection at the prisms' inclined surfaces, a single industrial camera can capture complete images of all four sides at once, so multiple cameras are not needed to photograph the periphery of the workpiece simultaneously.
4. In the peripheral defect data set of the injection molding, non-defect (OK) labels are added so that the number of labeled boxes in each picture is fixed at 4. Because the number of targets is fixed and their positions are roughly fixed, the model's box_loss (bounding-box loss) and obj_loss (confidence loss) converge more easily and the model is easier to train.
5. Adding the CBAM convolutional attention mechanism module to the Backbone and Neck structures of the YOLOv5s network structure model improves model accuracy.
6. Replacing the ordinary convolution kernels with kernel_size 6 × 6 or 3 × 3 in the Backbone and Neck structures of the YOLOv5s network structure model with depthwise separable convolutions reduces the number of network model parameters and achieves a high detection speed.
7. Adding a small-target detection layer to the Prediction structure of the YOLOv5s network structure model improves the detection accuracy for small targets and thus the overall accuracy of the defect detection model.
8. By detecting injection molding surface defects with a YOLOv5s target detection algorithm that introduces an attention mechanism, depthwise separable convolution and a small-target detection layer, the invention improves model speed compared with the unimproved YOLOv5s network structure model and greatly improves defect detection accuracy.
9. The model training strategy warms up for the first 30 epochs, allowing the model to stabilize slowly under the small warm-up learning rate; after the model is stable, cosine annealing is selected to adjust the learning rate, so the network model converges faster and more stably during training without large oscillations.
Drawings
FIG. 1 is a schematic flow chart of the injection molding surface defect detection method based on improved YOLOv5s
FIG. 2 shows the peripheral defect detection station of the injection molding
FIG. 3 is a diagram of the unimproved YOLOv5s network model
FIG. 4 shows the basic building blocks of the unimproved YOLOv5s network model
FIG. 5 is a schematic diagram of the CBAM attention mechanism module
FIG. 6 is a schematic diagram of the improved C3CBAM module
FIG. 7 is a schematic flow chart of ordinary convolution
FIG. 8 is a schematic diagram of a channel-by-channel convolution process
FIG. 9 is a schematic diagram of a point-by-point convolution process
FIG. 10 is a schematic diagram of a DBS convolutional layer
FIG. 11 is a schematic diagram of the improved YOLOv5s network model
FIGS. 12 (a), (b) and (c) are the Precision, Recall and mAP curves, respectively, from training of the peripheral defect detection model for the injection molding
FIGS. 13 (a), (b) and (c) are the Precision, Recall and mAP curves, respectively, from training of the defect detection model for the area below the injection molding
FIGS. 14 (a), (b) and (c) are the Precision, Recall and mAP curves, respectively, from training of the defect detection model for the area above the injection molding
FIGS. 15 (a) and (b) are the confusion matrices of the unimproved and improved peripheral defect detection models, respectively, on the test set
FIGS. 16 (a) and (b) are the confusion matrices of the unimproved and improved lower defect detection models, respectively, on the test set
FIGS. 17 (a) and (b) are the confusion matrices of the unimproved and improved upper defect detection models, respectively, on the test set
FIG. 18 shows detection results of the improved peripheral defect detection model on defects around the injection molding
FIG. 19 shows detection results of the improved lower defect detection model on defects below the injection molding
FIG. 20 shows detection results of the improved upper defect detection model on defects above the injection molding
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention relates to an injection molding surface defect detection method based on improved YOLOv5s; the flow chart is shown in FIG. 1, and the method mainly comprises the following steps:
Step S1: obtain injection molding surface defect pictures.
Specifically: the injection molding is roughly cuboid in shape, and pictures below, above and around it are taken by a multi-station visual inspection system. As shown in FIG. 2, four right-angle triangular prisms are arranged on the four sides of the injection molding and a light source illuminates the periphery of the workpiece; through reflection at the prisms' inclined surfaces, a single industrial camera can capture complete images of all four sides at once, so multiple cameras are not needed to photograph the periphery of the workpiece simultaneously, and the pictures below and above are each taken with a single camera. The pictures taken at each station form their own data set, giving three data sets: the lower defect data set, the upper defect data set and the peripheral defect data set of the injection molding.
Step S2: and preprocessing, marking and dividing the data set of the surface defect of the injection molding.
Step S21: image preprocessing of the injection molding surface defect data set refers to data enhancement of the data set.
Specifically, OpenCV is used to mirror-flip 10% of the images in each data set horizontally, vertically and diagonally. This preprocessing does not increase the number of pictures; the size of each data set is unchanged before and after preprocessing. The operation enhances the diversity of the data and prevents overfitting during training of the network model.
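As an illustration of this preprocessing step, the following is a minimal sketch of the mirror-flip augmentation, assuming OpenCV (cv2) is used as stated above; the single-folder layout, the .jpg extension and the in-place overwriting (which keeps the data set size unchanged) are illustrative assumptions, not details taken from the patent.

```python
# Hedged sketch of the 10% mirror-flip augmentation described above.
# Assumption: images are .jpg files in one folder and are overwritten in place,
# so the total number of pictures stays unchanged, as stated in the text.
import random
from pathlib import Path

import cv2

def mirror_flip_augment(image_dir: str, ratio: float = 0.1, seed: int = 0) -> None:
    images = sorted(Path(image_dir).glob("*.jpg"))
    rng = random.Random(seed)
    for img_path in rng.sample(images, int(len(images) * ratio)):
        img = cv2.imread(str(img_path))
        if img is None:
            continue
        # cv2.flip: 1 = horizontal mirror, 0 = vertical mirror, -1 = both axes ("diagonal")
        flip_code = rng.choice([1, 0, -1])
        cv2.imwrite(str(img_path), cv2.flip(img, flip_code))
```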
Step S22: and marking the surface defect data set of the injection molding.
Specifically, the image annotation tool labelimg is used to label the defects in each image of the preprocessed injection molding surface defect data. Because neither the data labels nor the model design can contain Chinese, the defect categories are written in pinyin. The defects in the data set below the injection molding are divided into 5 categories: material shortage (QueLiao), crush damage (YaShang), oil stain (YouZangWu), crack (LieWen) and copper debris (TongXie). The defects in the data set above the injection molding are divided into 7 categories: material shortage (QueLiao), crush damage (YaShang), bruise (PengShang), PIN distortion (PINNiuQu), oil stain (YouZangWu), crack (LieWen) and copper debris (TongXie). The labels of the peripheral defect data set are divided into 2 categories: defect gap (JianGe) and non-defect (OK). Unlike the data sets of the other stations, the defect positions in the peripheral data set are fixed, with 4 positions in total; each position either has a gap (JianGe) defect or is intact, i.e., non-defect (OK). By adding the non-defect (OK) label, the number of labeled boxes in each picture is fixed at 4 and their positions are roughly fixed, so the model's box_loss (bounding-box loss) and obj_loss (confidence loss) converge more easily and the model is easier to train.
Step S23: and dividing the surface defect data set of the injection molding into a training set, a verification set and a test set.
Specifically, the training set, validation set and test set are divided in a 6:2:2 ratio, applied per station, i.e., the upper, lower and peripheral data sets are each divided separately. As can be seen from Tables 1, 2 and 3, the number of pictures differs between stations and the number of each defect type differs within a single data set, so the data sets are divided separately and the models are trained separately.
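A minimal sketch of this per-station 6:2:2 split is given below; the directory names and the use of a fixed random seed are illustrative assumptions rather than the patented procedure.

```python
# Hedged sketch of the 6:2:2 train/validation/test split, applied per station.
import random
from pathlib import Path

def split_station(image_dir: str, seed: int = 0):
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n = len(images)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (images[:n_train],                      # training set (~60%)
            images[n_train:n_train + n_val],       # validation set (~20%)
            images[n_train + n_val:])              # test set (remaining ~20%)

# Each station is divided separately, e.g.:
# for station in ("below", "above", "around"):
#     train, val, test = split_station(f"dataset/{station}/images")
```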
TABLE 1 Data distribution of the defect data set around the injection molding
TABLE 2 Data distribution of the defect data set below the injection molding
TABLE 3 Data distribution of the defect data set above the injection molding
Step S3: construct the YOLOv5s network structure model and improve it to obtain the improved YOLOv5s network structure model. The basic network structure of YOLOv5s is shown in FIG. 3; modules marked with the subscript P_n perform downsampling, i.e., the stride of the convolution operation is 2. In the output features, N = 3 × (1 + 4 + nc), where 3 is the number of anchor boxes, 1 is the target confidence score, 4 is the center-point coordinate offsets and width-height offsets of the bbox, and nc is the number of target detection classes. The network has four main parts, namely the input end, Backbone, Neck and Prediction end, and its basic modules include CBS, BottleNeck1, BottleNeck2, C3 and SPPF. The invention makes three innovative improvements to the YOLOv5s model, namely introducing an attention mechanism, introducing depthwise separable convolution and adding a small-target detection layer; the improved model's detection results are compared with those of the unimproved YOLOv5s network structure model, achieving fast and accurate detection of injection molding surface defects.
Step S31: the input end uses Mosaic data augmentation and adaptive anchor-box calculation. Mosaic data augmentation splices 4 pictures together by random scaling, random cropping and random arrangement, which greatly enriches the detection data set; in particular, random scaling adds many small targets, making the network more robust. Adaptive anchor-box calculation recalculates the initial anchor boxes for each data set so that they fit the data set and give the optimal anchors on the training set, reducing the difficulty of training the network model and speeding up its convergence. The input end also needs to resize (rescale) the input data set pictures. YOLOv5s normally uses one of two input picture sizes, 640 × 640 or 1280 × 1280. The injection molding samples are generally 2100 × 2100, and some defects such as cracks and copper debris have particularly fine granularity, with sizes around 10 × 10 that are hard to find in some cases; resizing to 640 × 640 would lose much detail, so the original pictures are resized to 1280 × 1280 as the input picture size of the whole network model.
The Backbone extracts feature parameters from the picture, and the Neck fuses network features of different scales using an FPN+PAN structure; the Prediction part outputs the prediction feature maps. The official YOLOv5s prediction feature maps have three scales, used respectively for detecting large, medium and small targets. During training the loss functions are converged by regression, so the prediction feature vectors keep fitting the ground-truth values until the network model finally converges. The network model has three loss functions, box_loss, cls_loss and obj_loss. A loss function measures the distance between the prediction of the neural network and the expected information (the label); the closer the prediction is to the expectation, the smaller the loss value. box_loss characterizes the gap between the predicted target's size and exact position and the label values. obj_loss characterizes the gap between the confidence of the predicted rectangular box and the label value; the larger the confidence value, which ranges from 0 to 1, the more likely a target lies inside the box. cls_loss characterizes the gap between the predicted target class and the label value; to reduce overfitting and increase training stability, the one-hot labels are usually smoothed. The overall loss function during model training is:
Loss = a × loss_obj + b × loss_box + c × loss_cls
The overall loss is a weighted sum of the three losses. The confidence loss loss_obj usually takes the largest weight, and the rectangular-box loss loss_box and the classification loss loss_cls take smaller weights; in the present invention a = 0.4, b = 0.3 and c = 0.3.
Step S32: the basic modules mainly used by the YOLOv5s network include CBS, BottleNeck1, BottleNeck2, C3 and SPPF.
The CBS module, shown in FIG. 4, consists of Conv (an ordinary convolution module), BN (batch normalization) and SiLU (the activation function).
The BottleNeck1 module, shown in FIG. 4, is a residual module:
Suppose the input to the neural network of the residual unit is x, the desired output is H(x), and the intermediate parameter network layer is F; then the output of x after passing through F becomes F(x). Since the residual network structure passes the input directly to the output as an initial result, the learning target is F(x) = H(x) - x, i.e., the residual, and the original mapping thus becomes F(x) + x. The original residual unit can be seen as consisting of two parts, a linear direct mapping x and a nonlinear mapping F(x).
The residual unit can be defined as:
y = f[F(x, w_i) + x]
where x and y are the input and output of the residual unit, F(x, w_i) is the residual mapping to be learned, w_i is the convolution kernel, and f is the activation function. The BottleNeck1 module is designed on the principle of the residual module, with its parameter layer formed by two CBS modules connected in series.
The BottleNeck2 module, shown in FIG. 4, differs from BottleNeck1 in that it has no shortcut; despite the similar name it is not a residual structure but simply two CBS modules connected in series.
As shown in FIG. 4, the C3 module comes in two types, C3_1_n and C3_2_n; the difference is that C3_1_n uses the residual module BottleNeck1 as its basic building block while C3_2_n uses the non-residual module BottleNeck2, and the parameter n of both types is the number of BottleNecks inside the C3.
The SPPF module is implemented in the same way as the SPP (spatial pyramid pooling) module, as shown in FIG. 4, except that the SPPF module applies max pooling with k = 1 × 1, 5 × 5, 9 × 9 and 13 × 13 and then performs a Concat operation on the feature maps of different scales. This increases the receptive field without affecting the network's computing speed and extracts the most important context information.
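To make these module descriptions concrete, the following PyTorch sketch shows plausible implementations of CBS, BottleNeck1/BottleNeck2, C3 and SPPF in the style of the public YOLOv5 code; the channel arrangement and the pooling-kernel set follow the text but are assumptions, not the exact patented layers.

```python
# Hedged PyTorch sketch of the basic YOLOv5s building blocks described above.
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Conv + BatchNorm + SiLU."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class Bottleneck(nn.Module):
    """BottleNeck1 when shortcut=True (residual), BottleNeck2 when shortcut=False."""
    def __init__(self, c, shortcut=True):
        super().__init__()
        self.cv1 = CBS(c, c, 1, 1)   # two CBS modules in series form the parameter layer F
        self.cv2 = CBS(c, c, 3, 1)
        self.add = shortcut
    def forward(self, x):
        y = self.cv2(self.cv1(x))
        return x + y if self.add else y   # F(x) + x for the residual variant

class C3(nn.Module):
    """C3_1_n (shortcut=True) or C3_2_n (shortcut=False) containing n Bottlenecks."""
    def __init__(self, c_in, c_out, n=1, shortcut=True):
        super().__init__()
        c_ = c_out // 2
        self.cv1, self.cv2 = CBS(c_in, c_, 1), CBS(c_in, c_, 1)
        self.m = nn.Sequential(*(Bottleneck(c_, shortcut) for _ in range(n)))
        self.cv3 = CBS(2 * c_, c_out, 1)
    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))

class SPPF(nn.Module):
    """Max-pool the features at several kernel sizes and Concat them, as described above."""
    def __init__(self, c_in, c_out, ks=(5, 9, 13)):
        super().__init__()
        c_ = c_in // 2
        self.cv1 = CBS(c_in, c_, 1)
        self.cv2 = CBS(c_ * (len(ks) + 1), c_out, 1)
        # the k=1x1 branch is the identity path; the 5/9/13 pools use stride 1 and padding k//2
        self.pools = nn.ModuleList(nn.MaxPool2d(k, 1, k // 2) for k in ks)
    def forward(self, x):
        x = self.cv1(x)
        return self.cv2(torch.cat([x] + [p(x) for p in self.pools], dim=1))
```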
Step S33: a CBAM convolution attention mechanism module is added in the Backbone and Neck structure of the YOLOv5s network structure model.
CBAM module:
The CBAM (Convolutional Block Attention Module) adopted by the invention is an attention mechanism combining channel and spatial attention: it sequentially infers attention maps along two independent dimensions and then multiplies them with the input feature map for adaptive feature refinement. In applications, the CBAM attention mechanism can achieve better results than SENet, which uses channel attention only.
The channel attention mechanism, shown as the CAM module in FIG. 5, can be expressed as:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)))
The spatial attention mechanism, shown as the SAM module in FIG. 5, can be expressed as:
M_s(F) = σ(f^(7×7)([AvgPool(F); MaxPool(F)])) = σ(f^(7×7)([F_avg^s; F_max^s]))
where σ is the sigmoid operation, MLP denotes a multi-layer perceptron, AvgPool denotes average pooling, MaxPool denotes maximum pooling, F denotes the input features, W_1 and W_0 denote the MLP weights, F_avg^c and F_avg^s denote the channel and spatial average-pooled features respectively, F_max^c and F_max^s denote the channel and spatial max-pooled features, M_c(F) denotes the channel attention feature, M_s(F) denotes the spatial attention feature, f denotes the convolution operation, and 7 × 7 denotes the convolution kernel size.
As shown in FIG. 5, the overall flow of CBAM is as follows: the input features are first processed by two parallel, non-interfering branches, one left unchanged and the other passed through the CAM, and the results of the two branches are multiplied directly. The product is again split into two parallel branches, one left unchanged and the other passed through the SAM, and the two results are multiplied once more; the result obtained is the output of the CBAM attention mechanism operation.
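The CBAM flow just described can be written compactly in PyTorch as below; the reduction ratio of 16 and the 7 × 7 spatial kernel follow the original CBAM paper and are assumptions here, not values taken from the patent.

```python
# Hedged PyTorch sketch of CBAM: channel attention M_c followed by spatial attention M_s.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c, r=16):
        super().__init__()
        # shared MLP (W_1(W_0(.))) applied to both the average- and max-pooled descriptors
        self.mlp = nn.Sequential(nn.Conv2d(c, c // r, 1, bias=False), nn.ReLU(),
                                 nn.Conv2d(c // r, c, 1, bias=False))
    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)                      # M_c(F)

class SpatialAttention(nn.Module):
    def __init__(self, k=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, k, padding=k // 2, bias=False)
    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # M_s(F)

class CBAM(nn.Module):
    def __init__(self, c, r=16, k=7):
        super().__init__()
        self.ca, self.sa = ChannelAttention(c, r), SpatialAttention(k)
    def forward(self, x):
        x = x * self.ca(x)      # multiply the input by the channel attention map
        return x * self.sa(x)   # then by the spatial attention map
```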
The attention module is not added to the model directly but fused into the C3 modules of the Backbone and Neck structures. The fusion process is shown in FIG. 6: a CBAM module is connected in series after the two serial CBS modules in BottleNeck1 and BottleNeck2, and the fused BottleNeck is called CBAMBottleNeck. The invention converts the BottleNecks in the original C3 module into CBAMBottleNecks, and the fused C3 module is called C3CBAM. Because there are two types of CBAMBottleNeck, C3CBAM also comes in two types, C3CBAM_1_n and C3CBAM_2_n, where the parameter n has the same meaning as in C3 and denotes the number of CBAMBottleNecks in the module. The improvement to the whole model is to replace C3 with C3CBAM, which improves the accuracy of the network model.
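A possible reading of this fusion, reusing the CBS, C3 and CBAM sketches given earlier, is shown below; it is an interpretation of the text, not the patented implementation.

```python
# Hedged sketch of CBAMBottleNeck (CBS -> CBS -> CBAM, optional residual shortcut)
# and C3CBAM (a C3 whose Bottlenecks are replaced by CBAMBottleNecks).
import torch.nn as nn

class CBAMBottleneck(nn.Module):
    def __init__(self, c, shortcut=True):
        super().__init__()
        self.cv1 = CBS(c, c, 1, 1)
        self.cv2 = CBS(c, c, 3, 1)
        self.cbam = CBAM(c)          # CBAM in series after the two CBS modules
        self.add = shortcut          # True -> C3CBAM_1_n variant, False -> C3CBAM_2_n
    def forward(self, x):
        y = self.cbam(self.cv2(self.cv1(x)))
        return x + y if self.add else y

class C3CBAM(C3):
    def __init__(self, c_in, c_out, n=1, shortcut=True):
        super().__init__(c_in, c_out, n, shortcut)
        c_ = c_out // 2
        self.m = nn.Sequential(*(CBAMBottleneck(c_, shortcut) for _ in range(n)))
```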
Step S34: the ordinary convolution kernels with kernel_size 6 × 6 or 3 × 3 in the Backbone and Neck structures of the YOLOv5s network structure model are all replaced with depthwise separable convolutions. The CBS modules marked with the subscript P_n in FIG. 3 perform downsampling; their Conv module is an ordinary convolution with kernel_size 6 × 6 or 3 × 3 and a stride of 2. To improve the network's inference speed, the invention replaces the ordinary convolution in this type of CBS with depthwise separable convolution. FIG. 7 shows the flow of an ordinary convolution operation; the parameter count P_conv and computation S_conv of an ordinary convolution are:
P_conv = C·N·K²
S_conv = C·N·H·W·K²
where C is the number of input channels, N is the number of output channels, W and H are the width and height of the output layer, and K is the convolution kernel size.
Depthwise separable convolution (Depthwise Separable Convolution) mainly comprises two processes, channel-by-channel convolution (Depthwise Convolution) and point-by-point convolution (Pointwise Convolution); their schematics are shown in FIGS. 8 and 9 respectively. The channel-by-channel convolution extracts features from the input image layer, with each convolution kernel responsible for only one channel, so the number of channels of the output feature map equals the number of channels of the input. Finally, a 1 × 1 point-by-point convolution concentrates the features and adjusts the channel count to the output channels. The parameter count P_dsc and computation S_dsc of a depthwise separable convolution are:
P_dsc = C·K² + N·C
S_dsc = C·H·W·K² + N·C·H·W
where C is the number of input channels, N is the number of output channels, W and H are the width and height of the output layer, and K is the convolution kernel size.
The ratio of the computation of a depthwise separable convolution to that of an ordinary convolution is:
S_dsc / S_conv = (C·H·W·K² + N·C·H·W) / (C·N·H·W·K²) = 1/N + 1/K²
For the ordinary convolution with kernel_size = 6 × 6 that needs to be replaced, K = 6 and the number of output channels N = 64; by the formula above, replacing it with a depthwise separable convolution cuts the computation to about 1/23 of the original. For the first ordinary convolution with kernel_size = 3 × 3 to be replaced, K = 3 and the number of output channels N = 128; the formula likewise shows that the computation falls to about 1/8 of the original.
As can be seen from the calculations above, for the same input features a depthwise separable convolution needs far less computation than an ordinary convolution to produce the same output feature map. For the same amount of computation, depthwise separable convolution allows a deeper neural network, thereby reducing the number of model parameters and increasing detection speed. Concretely, depthwise separable convolution is introduced into YOLOv5s by replacing the ordinary convolution (Conv) in CBS with a depthwise separable convolution (DSConv); the replaced module is named DBS, and its schematic is shown in FIG. 10.
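A hedged PyTorch sketch of the DSConv/DBS replacement described above follows; the stride, kernel size and the placement of BN and SiLU mirror the text's description but are assumptions about the exact layer.

```python
# Hedged sketch of DSConv (depthwise + pointwise) and the DBS block that replaces CBS.
import torch.nn as nn

class DSConv(nn.Module):
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        # channel-by-channel (depthwise) convolution: one filter per input channel
        self.dw = nn.Conv2d(c_in, c_in, k, s, k // 2, groups=c_in, bias=False)
        # point-by-point (1x1) convolution adjusts the channel count to c_out
        self.pw = nn.Conv2d(c_in, c_out, 1, 1, 0, bias=False)
    def forward(self, x):
        return self.pw(self.dw(x))

class DBS(nn.Module):
    """DSConv + BatchNorm + SiLU, standing in for the stride-2 down-sampling CBS blocks."""
    def __init__(self, c_in, c_out, k=3, s=2):
        super().__init__()
        self.conv = DSConv(c_in, c_out, k, s)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
```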
Step S35: a small-target detection layer is added to the Prediction structure of the YOLOv5s network structure model to address the difficulty the original YOLOv5s model has in detecting small targets.
As shown in FIG. 3, the unimproved YOLOv5s model upsamples only twice in the Neck structure: the first upsampled result is concatenated with the 16× downsampled result from the Backbone, and the second with the 8× downsampled result from the Backbone. The improved model adds a third upsampling, i.e., one more FPN layer; the first two upsampling and concatenation operations are the same as in the unimproved model, while the newly added third upsampled result is concatenated with the 4× downsampled result from the Backbone, and this result is passed through a 1 × 1 convolution and output as the added small-target detection layer. At the same time, one more downsampling is added in the Neck structure, i.e., one more PAN layer.
As shown in FIG. 11, in the improved YOLOv5s network model of the invention a small-target detection layer is added to the Prediction structure, and the number of anchor boxes in the input end's adaptive anchor-box calculation is increased from 9 to 12. The prediction feature vector of the small-target detection layer has size 320 × 320 × n and detects targets of size 4 × 4 and above; on the corresponding original image this is a target size of about 4 × (2100/1280) ≈ 6, i.e., targets of size 6 × 6 and above are detected. For small targets of around 10 × 10, such as the copper debris mentioned in step S31, the small-target detection layer is effective at avoiding missed detections, solving the difficulty the unimproved model has with small targets. Because the structure of the whole network model changes and the Prediction end now has four prediction feature vectors, the loss functions of the improved model also change:
The loss_box calculation formula is:
loss_box = a1 × loss_box,320 + a2 × loss_box,160 + a3 × loss_box,80 + a4 × loss_box,40
The loss_obj calculation formula is:
loss_obj = a1 × loss_obj,320 + a2 × loss_obj,160 + a3 × loss_obj,80 + a4 × loss_obj,40
The loss_cls calculation formula is:
loss_cls = a1 × loss_cls,320 + a2 × loss_cls,160 + a3 × loss_cls,80 + a4 × loss_cls,40
In the above formulas, a1, a2, a3 and a4 are the weight coefficients of the loss values of the 320 × 320, 160 × 160, 80 × 80 and 40 × 40 prediction feature layers, respectively. As a general principle, the weight a1 of the 320 × 320 grid is set largest, a2 of the 160 × 160 grid second largest, a3 of the 80 × 80 grid third and a4 of the 40 × 40 grid smallest; for example, a1, a2, a3 and a4 take 0.4, 0.3, 0.2 and 0.1 in turn. This ratio is not absolute and can be adjusted to different situations: if large targets dominated the data set, the order of a1 to a4 could be reversed. Since small targets are the most numerous in the present invention, a1 is the largest and a4 the smallest.
The calculation formula for the total loss function is unchanged:
Loss = a × loss_obj + b × loss_box + c × loss_cls
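The weighted combination of the four-scale losses and the total loss can be expressed as in the sketch below; the per-scale loss terms are assumed to be computed elsewhere (for example by a YOLOv5-style loss module), and the weights use the example values from the text.

```python
# Hedged sketch of the improved loss weighting: four prediction scales per loss term,
# then the weighted sum of the obj/box/cls losses. Per-scale values are placeholders.
SCALE_WEIGHTS = {320: 0.4, 160: 0.3, 80: 0.2, 40: 0.1}   # a1, a2, a3, a4
TERM_WEIGHTS = {"obj": 0.4, "box": 0.3, "cls": 0.3}      # a, b, c

def total_loss(per_scale: dict) -> float:
    """per_scale[s][t] holds loss_{t,s} for scale s in {320,160,80,40} and term t in {obj,box,cls}."""
    term_losses = {t: sum(SCALE_WEIGHTS[s] * per_scale[s][t] for s in SCALE_WEIGHTS)
                   for t in TERM_WEIGHTS}
    return sum(TERM_WEIGHTS[t] * term_losses[t] for t in TERM_WEIGHTS)
```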
step S4: the training model, model training environment is shown in table 4, and training parameters are shown in table 5.
Table 4 model training environment
Table 5 model training parameters
Step S41: at the start of training the model weights are randomly initialized, and choosing a large learning rate would make the model oscillate, i.e., make training unstable. Using the Warmup warm-up learning rate keeps the learning rate small during the first epochs or steps of training, so the model slowly stabilizes under the small warm-up learning rate; once the model is relatively stable, training continues at the preset learning rate, which makes convergence faster and the final model better. The warm-up epochs are 0 to 29, i.e., the first 30 epochs are warmed up; afterwards the learning-rate adjustment strategy selects cosine annealing (CosineAnnealingLR), which decays the learning rate for each epoch strictly according to the following formula:
η_t = η_min + (1/2) × (η_max - η_min) × (1 + cos(T_cur/T_max × π))
where η_t represents the current learning rate, η_min the minimum learning rate, η_max the maximum learning rate, T_cur the current epoch and T_max the maximum epoch.
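A minimal sketch of this warm-up plus cosine-annealing schedule is given below; the linear warm-up ramp and the eta_max/eta_min values are illustrative assumptions, and only the 30 warm-up epochs, the 300 total epochs and the cosine formula come from the text.

```python
# Hedged sketch of the learning-rate schedule: linear warm-up for the first 30 epochs,
# then cosine annealing eta_t = eta_min + 0.5*(eta_max - eta_min)*(1 + cos(pi*T_cur/T_max)).
import math

def learning_rate(epoch: int, eta_max: float = 1e-3, eta_min: float = 1e-5,
                  warmup_epochs: int = 30, total_epochs: int = 300) -> float:
    if epoch < warmup_epochs:
        # warm-up: ramp from a small value up to eta_max
        return eta_min + (eta_max - eta_min) * (epoch + 1) / warmup_epochs
    t_cur = epoch - warmup_epochs
    t_max = total_epochs - warmup_epochs
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_max))
```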
Step S42: train the basic YOLOv5s network structure model to obtain the weight files of the trained defect detection models for the three station data sets, together with the precision, recall and mean average precision (mAP) curves of the network structure model on the training set. Train the improved YOLOv5s network structure model in the same way to obtain the weight files of the injection molding surface defect detection models for the three station data sets, together with the precision, recall and mAP curves of the improved defect detection models on the training set.
FIGS. 12, 13 and 14 compare the training index curves of the two network models for the three stations of the injection molding. The precision, recall and mean average precision (mAP) of the improved YOLOv5s model at convergence are higher than those of the unimproved model, and the improved model converges faster. At the end of training the performance indexes of the unimproved model are still unstable, oscillating up and down with a noticeably larger amplitude than those of the improved model.
Step S5: compare the two models on the test set.
Step S51: test with the weight files of the unimproved defect detection models trained in step S42 and the test set to obtain the confusion matrix results of the unimproved defect detection models on the test set. FIGS. 15(a), 16(a) and 17(a) are the confusion matrix results on the test set of the unimproved defect detection models for the peripheral, lower and upper stations of the injection molding, respectively.
Step S52: test with the weight files of the improved defect detection models trained in step S42 and the test set to obtain the confusion matrix results of the improved defect detection models on the test set. FIGS. 15(b), 16(b) and 17(b) are the confusion matrix results on the test set of the improved defect detection models for the peripheral, lower and upper stations of the injection molding, respectively.
Step S53: as shown in Tables 5, 6 and 7, the effects of the unimproved and improved defect detection models are compared; the compared parameters include precision (Precision), recall (Recall), mean average precision (mAP) and forward inference time.
Table 5 Model effect comparison for the peripheral defect data set
Table 6 Model effect comparison for the lower defect data set
Table 7 Model effect comparison for the upper defect data set
Step S54: FIGS. 18, 19 and 20 show the detection results of the improved defect detection models on defects around, below and above the injection molding, respectively; the network models accurately locate and classify the various defects.
In summary, compared with the unimproved YOLOv5s peripheral defect detection model, the improved YOLOv5s peripheral defect detection model increases the mean Precision by 0.34%, the mean Recall by 0.48% and the mAP by 0.4%, while the forward inference time decreases by 1 ms.
Compared with the unimproved YOLOv5s lower defect detection model, the improved YOLOv5s lower defect detection model increases the mean Precision by 2.83%, the mean Recall by 1.4% and the mAP by 3.74%, while the forward inference time decreases by 1 ms.
Compared with the unimproved YOLOv5s upper defect detection model, the improved YOLOv5s upper defect detection model increases the mean Precision by 6.43%, the mean Recall by 2.25% and the mAP by 1.83%, while the forward inference time decreases by 1 ms.
Compared with the unimproved YOLOv5s model, the improved YOLOv5s model shows clear gains in Precision, Recall and mAP on all three station data sets; the improvement is most obvious for the defect detection models above and below the injection molding, and the defect detection accuracy of the improved peripheral defect detection model can reach 100%. The experiments reduce the network's parameters by introducing depthwise separable convolution, but the added small-target detection layer and attention mechanism increase the parameter count again, so the forward inference time of the improved model is only 1 ms lower than that of the unimproved model; the detection speed is therefore not greatly improved, but it still meets the requirement on the network model's detection speed. In summary, the improved defect detection models increase detection speed and greatly improve defect detection accuracy.
The detailed description above concerns only specific practical embodiments of the invention and does not limit its scope of protection; all equivalent implementations or modifications that do not depart from the technical spirit of the invention shall fall within its scope of protection.

Claims (10)

1. An injection molding surface defect detection method based on improved YOLOv5s, characterized by comprising the following steps:
step 1, acquiring an injection molding surface defect data set;
step 2, after preprocessing and marking the surface defect data set of the injection molding piece in the step 1, dividing the surface defect data set of the injection molding piece into a training set, a verification set and a test set;
step 3, constructing a basic YOLOv5s network structure model, and improving the basic YOLOv5s network structure model to obtain an improved YOLOv5s network structure model;
step 4, training the improved YOLOv5s network model in the step 3 by using the training set in the step 2 to obtain a model for detecting the surface defects of the injection molding;
and 5, performing defect detection by using the injection molding surface defect detection model obtained in the step 4.
2. The injection molding surface defect detection method based on improved YOLOv5s according to claim 1, wherein the specific implementation of step 1 comprises:
photographing the injection molding from below, above and around it with a multi-station visual inspection system, the data set being divided into three parts according to the three stations: the lower data set contains only pictures taken below the injection molding, the upper data set only pictures taken above it, and the peripheral data set only pictures taken around it.
3. The injection molding surface defect detection method based on improved YOLOv5s according to claim 1, wherein the specific implementation of step 2 comprises:
preprocessing the injection molding surface defect data set, i.e., performing data enhancement on it, the data enhancement including horizontal, vertical and diagonal mirror flipping of 10% of the pictures in each data set;
labeling the injection molding surface defect data set, i.e., using the image annotation tool labelimg to label the defects in each image of the preprocessed injection molding surface defect data; the specific labeling is as follows:
the defects in the data set below the injection molding are divided into 5 categories, namely material shortage (QueLiao), crush damage (YaShang), oil stain (YouZangWu), crack (LieWen) and copper debris (TongXie); the defects in the data set above the injection molding are divided into 7 categories, namely material shortage (QueLiao), crush damage (YaShang), bruise (PengShang), PIN distortion (PINNiuQu), oil stain (YouZangWu), crack (LieWen) and copper debris (TongXie); the labels of the defect data set around the injection molding are divided into 2 categories, one being the defect gap (JianGe) and the other non-defect (OK);
the ratio of the training set, validation set and test set is set to 6:2:2, the split being applied per station, i.e., the upper, lower and peripheral data sets are each divided separately.
4. The injection molding surface defect detection method based on improved YOLOv5s according to claim 1, wherein constructing the basic YOLOv5s network structure model in step 3 comprises the following steps:
S1: the input end uses Mosaic data enhancement and adaptive anchor-box calculation; the input end resizes the input data set pictures, and the original pictures are resized to 1280 × 1280 as the input picture size of the whole network model;
S2: the Backbone extracts feature parameters from the picture, and the Neck part fuses network features of different scales using an FPN+PAN structure; the Prediction part outputs the prediction feature maps; the loss functions of the network model include the rectangular-box loss function loss_box, the classification loss function loss_cls and the confidence loss loss_obj, and the overall loss function during model training is:
Loss = a × loss_obj + b × loss_box + c × loss_cls
the overall loss being a weighted sum of the three losses;
s3: the basic modules of the YOLOv5 network include CBS, bottleNeck, bottleckNeck2, C3 and SPPF;
the CBS module consists of Conv (normal convolution module), BN (batch normalization) and SiLU (activation function);
the BottleNeck1 module is a residual module: assuming that the input of the neural network of the residual unit is x, the expected output is H (x), the middle parametric network layer is set to F, then the output after x passes through F becomes F (x), the residual network structure directly transmits the input to the output as an initial result, so that the learning target is F (x) =h (x) -x, i.e. the residual, and thus the original mapping becomes F (x) +x, and the original residual unit can be regarded as being composed of two parts, namely, a linear direct mapping x and another nonlinear mapping F (x);
the residual unit may be defined as:
y=f[F(x,w i )+x]
wherein x and y are each a residueInput and output of difference unit, F (x, w i ) For residual mapping to be learned, w i Is a convolution kernel, f is an activation function;
the BottleNeck2 module is connected in series by two CBS modules;
the C3 modules are of two types, namely C3_1_n and C3_2_n, wherein the C3_1_n uses a residual module BottleNeck1 as a basic module, the C3 modules use a non-residual module BottleNeck2 as a basic constituting module, and the parameters n of the two modules represent the number of BottleNeck in the C3;
the SPPF module performs the Concat operation on the feature maps with different scales by using the mode of maximum pooling of k=1×1,5×5,9×9,13×13, so as to increase the receptive field without affecting the network computing speed and extract the most important context information.
5. The injection molding surface defect detection method based on improved YOLOv5s according to claim 4, wherein a = 0.4, b = 0.3 and c = 0.3 are set in S2.
6. The injection molding surface defect detection method based on improved YOLOv5s according to claim 4, wherein improving the YOLOv5s network structure model in step 3 comprises:
first improvement: a CBAM convolutional attention mechanism module is added to the Backbone and Neck structures of the YOLOv5s network structure model; the attention module is not added to the model directly but fused into the C3 modules in the Backbone and Neck structures; a CBAM module is connected in series after the two serial CBS modules in BottleNeck1 and BottleNeck2, the fused BottleNeck being CBAMBottleNeck; the BottleNecks in the original C3 module are changed to CBAMBottleNecks, and the fused C3 module is C3CBAM; C3CBAM is divided into two types, C3CBAM_1_n and C3CBAM_2_n, where n denotes the number of CBAMBottleNecks in the module; the C3 modules in the YOLOv5s network are then replaced with C3CBAM;
second improvement: the ordinary convolution kernels with kernel_size 6 × 6 or 3 × 3 in the Backbone and Neck structures of the YOLOv5s network structure model are replaced with depthwise separable convolution (Depthwise Separable Convolution); the depthwise separable convolution mainly comprises two processes, channel-by-channel convolution (Depthwise Convolution) and point-by-point convolution (Pointwise Convolution); the channel-by-channel convolution extracts features from the input image layer, with each convolution kernel responsible for only one channel, so the number of channels of the output feature map obtained by this convolution equals the number of channels of the input image; finally a 1 × 1 point-by-point convolution concentrates the features and adjusts the channel count to the output channels; the parameter count P_dsc and computation S_dsc of the depthwise separable convolution are:
P_dsc = C·K² + N·C
S_dsc = C·H·W·K² + N·C·H·W
where C is the number of input channels, N is the number of output channels, W and H are the width and height of the output layer, and K is the convolution kernel size;
third improvement: a small-target detection layer is added to the Prediction structure of the YOLOv5s network structure model; the prediction feature vector of the small-target detection layer has size 320 × 320 × n and detects targets of size 4 × 4 and above, the corresponding target size on the original image being about 4 × (2100/1280) ≈ 6, i.e., targets of size 6 × 6 and above are detected, which improves the detection accuracy for small-target objects.
7. The injection molding surface defect detection method based on improved YOLOv5s according to claim 6, wherein the loss functions of the improved YOLOv5s network structure model are calculated as follows:
the loss_box calculation formula is:
loss_box = a1 × loss_box,320 + a2 × loss_box,160 + a3 × loss_box,80 + a4 × loss_box,40
the loss_obj calculation formula is:
loss_obj = a1 × loss_obj,320 + a2 × loss_obj,160 + a3 × loss_obj,80 + a4 × loss_obj,40
the loss_cls calculation formula is:
loss_cls = a1 × loss_cls,320 + a2 × loss_cls,160 + a3 × loss_cls,80 + a4 × loss_cls,40
wherein a1, a2, a3 and a4 are the weight coefficients of the 320 × 320, 160 × 160, 80 × 80 and 40 × 40 grids, respectively, and a1 > a2 > a3 > a4.
8. The injection molding surface defect detection method based on improved YOLOv5s according to claim 6, wherein a1, a2, a3 and a4 are 0.4, 0.3, 0.2 and 0.1 in this order.
9. The injection molding surface defect detection method based on improved YOLOv5s according to claim 1, wherein in the training of step 4 the training epochs are set to 300, the batch_size is set to 12, the image_size is set to 1280 × 1280, Adam is selected as the optimizer, the Warmup warm-up learning rate is used with the first 30 epochs warmed up, and afterwards the learning-rate adjustment strategy selects cosine annealing to adjust the learning rate.
10. The injection molding surface defect detection method based on improved YOLOv5s according to claim 9, wherein the learning rate is decayed for each epoch strictly according to the following formula:
η_t = η_min + (1/2) × (η_max - η_min) × (1 + cos(T_cur/T_max × π))
where η_t represents the current learning rate, η_min the minimum learning rate, η_max the maximum learning rate, T_cur the current epoch and T_max the maximum epoch.
CN202310513694.8A 2023-05-08 2023-05-08 Injection molding surface defect detection method based on improved YOLOv5s Pending CN116542932A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310513694.8A CN116542932A (en) 2023-05-08 2023-05-08 Injection molding surface defect detection method based on improved YOLOv5s

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310513694.8A CN116542932A (en) 2023-05-08 2023-05-08 Injection molding surface defect detection method based on improved YOLOv5s

Publications (1)

Publication Number Publication Date
CN116542932A true CN116542932A (en) 2023-08-04

Family

ID=87453782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310513694.8A Pending CN116542932A (en) 2023-05-08 2023-05-08 Injection molding surface defect detection method based on improved YOLOv5s

Country Status (1)

Country Link
CN (1) CN116542932A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116704317A (en) * 2023-08-09 2023-09-05 深圳华付技术股份有限公司 Target detection method, storage medium and computer device
CN116704317B (en) * 2023-08-09 2024-04-19 深圳华付技术股份有限公司 Target detection method, storage medium and computer device
CN116935069A (en) * 2023-09-15 2023-10-24 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Man-machine asynchronous detection method, device and medium based on improved attention mechanism
CN116935069B (en) * 2023-09-15 2023-11-21 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Man-machine asynchronous detection method, device and medium based on improved attention mechanism
CN117236511A (en) * 2023-09-26 2023-12-15 中交广州航道局有限公司 Big data prediction method and device for vacuum degree of underwater pump of cutter suction dredger
CN117671458A (en) * 2023-12-20 2024-03-08 云南神火铝业有限公司 Construction method and application of block anode scrap detection model capable of automatically identifying block anode scrap

Similar Documents

Publication Publication Date Title
CN116542932A (en) Injection molding surface defect detection method based on improved YOLOv5s
CN109299274B (en) Natural scene text detection method based on full convolution neural network
CN106875373B (en) Mobile phone screen MURA defect detection method based on convolutional neural network pruning algorithm
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN112465790A (en) Surface defect detection method based on multi-scale convolution and trilinear global attention
CN104850865B (en) A kind of Real Time Compression tracking of multiple features transfer learning
CN108256426A (en) A kind of facial expression recognizing method based on convolutional neural networks
CN110837870A (en) Sonar image target identification method based on active learning
CN112434586B (en) Multi-complex scene target detection method based on domain self-adaptive learning
CN111368637B (en) Transfer robot target identification method based on multi-mask convolutional neural network
CN116612106A (en) Method for detecting surface defects of optical element based on YOLOX algorithm
CN115457026A (en) Paper defect detection method based on improved YOLOv5
CN116258990A (en) Cross-modal affinity-based small sample reference video target segmentation method
CN112837281B (en) Pin defect identification method, device and equipment based on cascade convolution neural network
CN113538342A (en) Convolutional neural network-based quality detection method for coating of aluminum aerosol can
CN113810683A (en) No-reference evaluation method for objectively evaluating underwater video quality
CN113496480A (en) Method for detecting weld image defects
CN117456330A (en) MSFAF-Net-based low-illumination target detection method
CN117218101A (en) Composite wind power blade defect detection method based on semantic segmentation
CN115861595B (en) Multi-scale domain self-adaptive heterogeneous image matching method based on deep learning
CN115861239A (en) Small sample industrial part surface defect detection method based on meta-learning
CN114743045A (en) Small sample target detection method based on double-branch area suggestion network
Cheng et al. Improvements to YOLOv4 for Steel Surface Defect Detection
Li et al. GLAGAN image inpainting algorithm based on global and local consistency
CN117115123A (en) Novel reference-free image quality assessment method based on text-image pair

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination