CN114022446A - Leather flaw detection method and system based on improved YOLOv3 - Google Patents
- Publication number
- CN114022446A (application CN202111302497.9A)
- Authority
- CN
- China
- Prior art keywords
- leather
- neural network
- network model
- target neural
- yolov3
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Abstract
The invention provides a leather flaw detection method and system based on improved YOLOv3, relating to the technical field of leather flaw detection. The method comprises the following steps: first, constructing a sample leather flaw image data set, carrying out frame marking on the leather flaws in the data set, and dividing the data set into a training set, a verification set and a test set; then constructing a YOLOv3 target neural network model, improving the model based on a k-means clustering algorithm and a label smoothing technique, training the improved YOLOv3 target neural network model, and verifying its performance during training through the verification set to obtain a trained YOLOv3 target neural network model; and finally, performing leather flaw detection on the test set image data based on the trained model. By improving YOLOv3, the method improves the generalization of the model, reduces the probability of false detection and missed detection of flaws, and further improves the accuracy of leather flaw detection.
Description
Technical Field
The invention relates to the technical field of leather flaw detection, and in particular to a leather flaw detection method and system based on improved YOLOv3.
Background
Artificial leather is a leather-like product made from shaped plastic materials. With the continuous development of science and technology and the improvement of people's quality of life, requirements for the surface quality of leather have become increasingly demanding. For example, automobile interior leather is an indispensable decorative accessory, and automobile purchasers are increasingly critical of it. Owing to insect bites or fine flaws introduced during processing, various defects such as holes, stripes, scratches, bubbles and pads may inevitably appear on the raw leather surface. These defects seriously affect the surface quality of the leather, so leather defect detection is essential to guaranteeing the quality of finished leather products, and it has become an important link in leather production and quality management.
The traditional leather flaw detection method mainly relies on manual inspection, which is inefficient and of low accuracy. To overcome the shortcomings of the manual inspection mode, defect detection based on deep learning has been widely applied as a higher-performance alternative. For example, the Chinese invention patent published on 1/5/2021 (publication number CN112184679A), belonging to the technical field of defect detection, detects defects of wine bottles based on YOLOv3. That scheme uses the YOLOv3 network structure to learn defect features from manually collected and annotated wine bottle defect images, obtaining a model that can automatically identify whether a wine bottle in production contains defects; the model is then loaded into a real-time wine bottle recognition system that photographs bottles and returns recognition results in real time, solving the low efficiency and precision of manual inspection. Although leather defects are similar to such defects, the background of leather defects is more complex, and for leather defect types with large size differences the basic YOLOv3 algorithm is prone to false detection and missed detection, resulting in low detection accuracy.
Disclosure of Invention
To solve the problem that detecting leather defects with the conventional YOLOv3 algorithm is prone to missed detection and false detection, the invention provides a leather defect detection method and system based on improved YOLOv3. By improving YOLOv3, the generalization of the method and system is improved, the probability of missed detection and false detection of defects is reduced, and the accuracy of leather defect detection is improved.
In order to achieve the technical effects, the technical scheme of the invention is as follows:
a leather flaw detection method based on improved YOLOv3, the method comprising the following steps:
constructing a sample leather flaw image data set, and performing frame marking on leather flaws in the data set;
dividing a sample leather flaw image data set marked by a frame into a training set, a verification set and a test set;
constructing a YOLOv3 target neural network model, and improving the YOLOv3 target neural network model based on a k-means clustering algorithm and a label smoothing technique;
training the improved YOLOv3 target neural network model with the training set, and verifying the performance of the YOLOv3 target neural network model during training with the verification set to obtain a finally trained YOLOv3 target neural network model;
and performing leather flaw detection on the test set image data based on the trained YOLOv3 target neural network model.
Preferably, the types of leather flaws in the data set include: bubbles, stains, stripes, holes and band-shaped colored lines.
Preferably, when the sample leather flaw image data set is divided into a training set, a verification set and a test set, the images assigned to each subset are all drawn from the data set by random sampling, which ensures that the divided data better reflects the random uncertainty of data in practical situations.
Preferably, the constructed YOLOv3 target neural network model comprises a Darknet-53 feature extraction module and a multi-scale prediction and feature fusion module. The Darknet-53 feature extraction module extracts sample leather defect image features and outputs them to the multi-scale prediction and feature fusion module. The multi-scale prediction and feature fusion module comprises three convolutional networks (Convs) and a detection output layer. The Darknet-53 module outputs features at three prediction scales; after the features of each scale pass through the corresponding convolutional network Convs, one branch is taken as a detection output, while the other branch is 2x upsampled and fused with the features of the next scale output by the Darknet-53 module. Finally, the detection output layer identifies the defects.
Preferably, the Darknet-53 feature extraction module comprises an initial convolutional layer and first to fifth residual units connected in sequence. The first residual unit comprises a first single convolutional layer and a first repeated convolutional block 1x connected in sequence; the second residual unit comprises a second single convolutional layer and a second repeated convolutional block 2x; the third residual unit comprises a third single convolutional layer and a third repeated convolutional block 8x; the fourth residual unit comprises a fourth single convolutional layer and a fourth repeated convolutional block 8x; and the fifth residual unit comprises a fifth single convolutional layer and a fifth repeated convolutional block 4x. The initial convolutional layer uses 32 convolution kernels of size 3 x 3, and its input is the sample leather flaw image data set. The repeated convolutional blocks of the five residual units are executed 1, 2, 8, 8 and 4 times respectively; within each block the number of filters is first halved and then restored, giving 52 layers in total. The input of the multi-scale prediction and feature fusion module comprises the first prediction scale (Scale1) features output by the fifth repeated block 4x, the second prediction scale (Scale2) features output by the fourth repeated block 8x, and the third prediction scale (Scale3) features output by the third repeated block 8x.
Preferably, the process of improving the YOLOv3 target neural network model based on the k-means clustering algorithm is as follows:
S101, randomly selecting k marked frames from the frame-marked sample leather defect image data set as initial cluster centers, where each initial cluster center stores the width and height of a defect frame;
S102, based on the intersection-over-union IoU, calculating the distance d_i from each real defect frame x_j in the data set to each initial cluster center k_i, and then assigning each real defect frame to the nearest cluster according to the minimum-distance principle. The distance d_i is given by:
d_i(x_j) = 1 - IoU(x_j, k_i)
where x_j denotes the j-th frame among all real defect frames in the data set and k_i denotes the i-th of the k initial cluster centers;
S103, updating each cluster center with the mean of the real defect frames in its cluster;
S104, repeating steps S102 and S103 until the cluster centers no longer change, obtaining the final cluster centers and k clusters;
S105, since the detection output layer of the YOLOv3 target neural network model identifies defects on the basis of anchor boxes, using the k final cluster centers as the new anchor values of the anchor boxes, thereby improving the YOLOv3 target neural network model.
Considering the differences in size and type among leather defects, the anchor boxes are re-clustered by the k-means clustering algorithm and the anchor boxes of the YOLOv3 detection output layer are replaced accordingly. Using the k cluster centers as the new anchor values makes the anchors better suited to leather defects with large size differences and improves the accuracy of leather defect detection.
Preferably, the process of improving the YOLOv3 target neural network model based on the label smoothing technique is:
A cross-entropy loss function is taken as the prediction loss of the YOLOv3 target neural network model, with the expression:
L = -[y log p + (1 - y) log(1 - p)]
where y denotes the true label and p the predicted probability; the target is 1 for a correctly detected defect class and 0 otherwise;
Label smoothing is applied to the targets of the cross-entropy loss of the YOLOv3 target neural network model; the smoothed label is:
y' = y(1 - ε) + ε/l
where l denotes the total number of defect classes and ε is a constant between 0 and 1 serving as a penalty parameter. After label smoothing, the target for the correct class becomes 1 - ε + ε/l instead of 1, and ε/l instead of 0 otherwise.
In the method, considering the complexity and interference of the leather background, label smoothing is added to the cross-entropy loss of the YOLOv3 target neural network model. This prevents overfitting, improves the generalization of the model, reduces the false detection rate, and alleviates to some extent the problem of falsely detecting the leather background as a defect.
Preferably, before the improved YOLOv3 target neural network model is trained with the training set, data enhancement is performed on the image data of the training set, the verification set and the test set; the data enhancement includes color augmentation, translation, scale transformation and random cropping.
In the method, data enhancement increases the diversity of the data, thereby improving the generalization ability of the model when the data is subsequently applied to it.
Preferably, the process of training the improved YOLOv3 target neural network model by using the training set is as follows:
Setting basic training parameters, selecting an optimizer and setting a maximum number of iteration rounds N, the improved YOLOv3 target neural network model is trained by the gradient descent method, with the mean average precision (mAP) of the YOLOv3 target neural network model as the training target. After each iteration round, the performance of the model in training is verified with the verification set, measured by Precision and Recall. Training ends when the maximum iteration round N is reached, yielding the finally trained YOLOv3 target neural network model.
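The Precision and Recall used to verify the model after each iteration round can be illustrated with a short Python sketch (illustrative only, not part of the patent; the function name and counting convention are assumptions):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN).

    tp: correctly detected defects; fp: background or wrong-class detections;
    fn: real defects the model missed.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For instance, a verification round that finds 8 true defects, raises 2 false alarms and misses 4 defects yields a precision of 0.8 and a recall of 2/3.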
The invention also provides a leather flaw detection system based on the improved YOLOv3, which comprises:
the image data set constructing and labeling module is used for constructing a sample leather flaw image data set and performing frame labeling on leather flaws in the data set;
the data set dividing module is used for dividing the sample leather flaw image data set marked by the frame into a training set, a verification set and a test set;
the target neural network model building module is used for building a YOLOv3 target neural network model, and improving the YOLOv3 target neural network model based on a k-means clustering algorithm and a label smoothing technique;
the target neural network model training module is used for training the improved YOLOv3 target neural network model with the training set and verifying its performance during training with the verification set to obtain the finally trained YOLOv3 target neural network model;
and the test module is used for performing leather flaw detection on the test set image data based on the trained YOLOv3 target neural network model.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention provides a leather flaw detection method and system based on improved YOLOv3, which comprises the steps of firstly constructing a YOLOv3 target neural network model, then considering the problems of flaw size in leather and complex leather flaw background, improving the YOLOv3 target neural network model based on a kmeans clustering algorithm and a label smoothing technology, and training and verifying the improved YOLOv3 target neural network model based on a sample leather flaw image data set to obtain a finally trained YOLOv3 target neural network model for leather flaw detection. According to the technical scheme, the generalization of the model is improved by improving YOLOv3, the false detection and missing detection probability of the flaws is reduced, and the accuracy of leather flaw detection is further improved.
Drawings
FIG. 1 is a schematic flow chart of the leather flaw detection method based on improved YOLOv3 in embodiment 1 of the present invention;
FIG. 2 is a structural diagram of the improved YOLOv3 target neural network model proposed in embodiment 1 of the present invention;
FIG. 3 is a diagram showing the effect of a first leather flaw detection using the classic YOLOv3 algorithm in embodiment 1 of the present invention;
FIG. 4 is a diagram showing the effect of the first leather flaw detection based on the improved YOLOv3 target neural network model in embodiment 1 of the present invention;
FIG. 5 is a diagram showing the effect of a second leather flaw detection using the classic YOLOv3 algorithm in embodiment 1 of the present invention;
FIG. 6 is a diagram showing the effect of the second leather flaw detection based on the improved YOLOv3 target neural network model in embodiment 1 of the present invention;
FIG. 7 is a diagram showing the trend of each index over iterative training rounds when the classic YOLOv3 algorithm is used in embodiment 1 of the present invention;
FIG. 8 is a diagram showing the trend of each index over iterative training rounds for the improved YOLOv3 target neural network model in embodiment 1 of the present invention;
FIG. 9 is a schematic structural diagram of the leather flaw detection system based on improved YOLOv3 in embodiment 2 of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for better illustration of the present embodiment, certain parts of the drawings may be omitted, enlarged or reduced, and do not represent actual dimensions;
it will be understood by those skilled in the art that certain well-known descriptions of the figures may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a leather flaw detection method based on improved YOLOv3, the method including the following steps:
s1, constructing a sample leather flaw image data set, and performing frame marking on leather flaws in the data set; in this embodiment, the sample leather blemish image datasets are each from an actual leather production facility, and the types of leather blemishes in the datasets include: bubble, dirt stain, stripe, hole and ribbon color line, and leather flaw can be marked on the frame by a special software marking pen.
S2, dividing a sample leather flaw image data set marked by the frame into a training set, a verification set and a test set;
In this embodiment, to ensure that each image in the divided data set reflects the random uncertainty of data in practical situations, the sample leather flaw images assigned to the training set, the verification set and the test set are all drawn from the data set by random sampling.
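The random-sampling division described above can be sketched in Python as follows (an illustrative sketch, not the patent's implementation; the function name and the 70/20/10 split ratios are assumptions):

```python
import random

def split_dataset(image_paths, train=0.7, val=0.2, seed=42):
    """Randomly shuffle sample image paths, then split into train/val/test.

    A fixed seed keeps the split reproducible across runs.
    """
    rng = random.Random(seed)
    paths = list(image_paths)
    rng.shuffle(paths)                       # random sampling without replacement
    n = len(paths)
    n_train = int(n * train)
    n_val = int(n * val)
    return (paths[:n_train],                 # training set
            paths[n_train:n_train + n_val],  # verification set
            paths[n_train + n_val:])         # test set

train_set, val_set, test_set = split_dataset([f"img_{i}.jpg" for i in range(100)])
```

Shuffling before slicing guarantees every image had an equal chance of landing in each subset.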
S3, constructing a YOLOv3 target neural network model, and improving the YOLOv3 target neural network model based on the k-means clustering algorithm and the label smoothing technique;
Referring to fig. 2, the constructed YOLOv3 target neural network model includes a Darknet-53 feature extraction module 1 and a multi-scale prediction and feature fusion module 2. The Darknet-53 feature extraction module 1 extracts the sample leather defect image features and outputs them to the multi-scale prediction and feature fusion module 2. The multi-scale prediction and feature fusion module 2 includes three convolutional networks Convs, denoted by reference numeral 21, and a detection output layer 22. The Darknet-53 feature extraction module 1 outputs sample leather defect image features at three prediction scales. After the features of each scale pass through the corresponding convolutional network Convs, one branch is taken as a detection output, while the other branch is 2x upsampled and fused with the features of the next scale output by the Darknet-53 feature extraction module 1; finally, the defects are identified through the detection output layer 22. The Darknet-53 feature extraction module 1 comprises an initial convolutional layer 11, a first residual unit 12, a second residual unit 13, a third residual unit 14, a fourth residual unit 15 and a fifth residual unit 16 connected in sequence. The first residual unit 12 comprises a first single convolutional layer and a first repeated convolutional block 1x connected in sequence; the second residual unit 13 comprises a second single convolutional layer and a second repeated convolutional block 2x; the third residual unit 14 comprises a third single convolutional layer and a third repeated convolutional block 8x; the fourth residual unit 15 comprises a fourth single convolutional layer and a fourth repeated convolutional block 8x; and the fifth residual unit 16 comprises a fifth single convolutional layer and a fifth repeated convolutional block 4x. The initial convolutional layer 11 uses 32 convolution kernels of size 3 x 3, and its input is the sample leather flaw image data set. The repeated convolutional blocks of the five residual units are executed 1, 2, 8, 8 and 4 times respectively; within each block the number of filters is first halved and then restored, giving 52 layers in total. The input of the multi-scale prediction and feature fusion module 2 includes the first prediction scale (Scale1) features output by the fifth repeated block 4x, the second prediction scale (Scale2) features output by the fourth repeated block 8x, and the third prediction scale (Scale3) features output by the third repeated block 8x.
When the multi-scale prediction and feature fusion module performs upsampling, YOLOv3 upsamples the feature map obtained two layers earlier by a factor of 2 and concatenates it with the feature map of the earlier layer. In this way both the semantic information of the upsampled layer and the fine-grained information of the earlier layer are obtained; the merged feature map is then processed by several convolutional layers, finally yielding a tensor of twice the previous size.
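The 2x upsampling and concatenation step can be illustrated with a minimal Python sketch (illustrative only; real implementations operate on framework tensors, and the function names are assumptions). A single channel is a 2-D list, and a multi-channel feature map is a list of channels:

```python
def upsample2x(channel):
    """Nearest-neighbour 2x upsampling of one 2-D feature-map channel."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in (0, 1)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                   # duplicate each row
    return out

def concat_channels(fmap_a, fmap_b):
    """Channel-wise concatenation of two feature maps of equal spatial size,
    as done when fusing an upsampled map with an earlier fine-grained map."""
    return fmap_a + fmap_b
```

The concatenated map keeps the coarse map's semantics and the fine map's detail; subsequent convolutions then mix the two.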
For the basic YOLOv3 target neural network model, considering the differing sizes and types of leather flaws, the anchor boxes are re-clustered by the k-means clustering algorithm to improve the anchor boxes of the YOLOv3 detection output layer. Improving the YOLOv3 target neural network model based on the k-means clustering algorithm comprises the following steps:
S101, randomly selecting k marked frames from the frame-marked sample leather defect image data set as initial cluster centers, where each initial cluster center stores the width and height of a defect frame;
S102, based on the intersection-over-union IoU, calculating the distance d_i from each real defect frame x_j in the data set to each initial cluster center k_i, and then assigning each real defect frame to the nearest cluster according to the minimum-distance principle. The distance d_i is given by:
d_i(x_j) = 1 - IoU(x_j, k_i)
where x_j denotes the j-th frame among all real defect frames in the data set and k_i denotes the i-th of the k initial cluster centers. The intersection-over-union is computed as:
IoU(A, B) = area(A ∩ B) / area(A ∪ B)
that is, the ratio of the overlap area of two frames to the area of their union;
S103, updating each cluster center with the mean of the real defect frames in its cluster; since the specific implementation of cluster center updating is well known in the art, it is not described here.
S104, repeating steps S102 and S103 until the cluster centers no longer change, obtaining the final cluster centers and k clusters;
S105, since the detection output layer of the YOLOv3 target neural network model identifies defects on the basis of anchor boxes, using the k final cluster centers as the new anchor values of the anchor boxes, thereby improving the YOLOv3 target neural network model.
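Steps S101 to S105 can be sketched in Python as follows (an illustrative sketch, not the patent's implementation; the function names are assumptions). Because only widths and heights are clustered, the IoU here treats two boxes as if they share a corner:

```python
import random

def iou_wh(box, anchor):
    """IoU of two (width, height) boxes assumed to share the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, seed=0, max_iter=100):
    """Cluster ground-truth (w, h) frames into k anchors with d = 1 - IoU."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)                 # S101: random initial centers
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for b in boxes:                            # S102: assign to nearest center
            nearest = min(range(k), key=lambda i: 1 - iou_wh(b, centers[i]))
            clusters[nearest].append(b)
        new_centers = [                            # S103: update centers with means
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:                 # S104: stop when centers are stable
            break
        centers = new_centers
    return sorted(centers)                         # S105: these become the anchor values
```

On width-height data with two well-separated size groups, the two returned anchors converge to the per-group means.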
In addition, considering the complexity and interference of the leather background itself, improving the YOLOv3 target neural network model also includes improving its cross-entropy loss function based on the label smoothing technique. The cross-entropy loss is set as the prediction loss of the YOLOv3 target neural network model, with the expression:
L = -[y log p + (1 - y) log(1 - p)]
where y denotes the true label and p the predicted probability; the target is 1 for a correctly detected defect class and 0 otherwise;
performing label smoothing on the cross entropy loss function of the YOLOv3 target neural network model by using the label smoothing technique, the smoothed label being expressed as:
y' = y × (1 − ε) + ε/l
wherein l represents the total number of flaw types, and ε is a penalty parameter, a constant between 0 and 1; after label smoothing, when the detected flaw is accurate, p is close to 1 − ε, and otherwise close to ε. In the embodiment of the present invention, ε is set to 0.1, and label smoothing reduces the over-confidence of the model; for example, an input x = [0, 1] is modified by label smoothing as follows:
[0,1]×(1-0.1)+0.5×0.1=[0.05,0.95]
The input is changed from the original [0, 1] to [0.05, 0.95]; that is, before smoothing p is 1 for an accurate classification and 0 for an inaccurate one, and after label smoothing these become approximately 1 − ε and ε respectively (here 0.95 and 0.05).
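The smoothing operation can be sketched in a few lines; the function name and the use of NumPy are illustrative, and the formula is the one implied by the [0.05, 0.95] example:

```python
import numpy as np

def smooth_labels(y, eps=0.1):
    """Label smoothing: y' = y * (1 - eps) + eps / l,
    where l is the total number of classes (flaw types)."""
    l = y.shape[-1]
    return y * (1.0 - eps) + eps / l

y = np.array([0.0, 1.0])           # one-hot label for a two-class example
print(smooth_labels(y, eps=0.1))   # -> [0.05 0.95], matching the example above
```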
S4, training the improved YOLOv3 target neural network model by using a training set, and verifying the performance of the YOLOv3 target neural network model in the training by using a verification set to obtain a finally trained YOLOv3 target neural network model;
Before training the improved YOLOv3 target neural network model with the training set, the method further comprises performing data enhancement on the image data of the training set, the verification set and the test set, the data enhancement modes including color augmentation, translation, scale transformation and random cropping. These data enhancement modes increase the diversity of the data and thus improve the generalization capability of the model when the data is subsequently applied to it.
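A minimal NumPy-only sketch of the four enhancement modes, each applied independently with probability 0.5 as in the experiment settings described later. All ranges (brightness/contrast limits, shift fraction, scale limits, crop ratio) are illustrative assumptions, and a real detection pipeline must apply the same geometric transforms to the bounding-box labels:

```python
import numpy as np

def random_augment(img, rng):
    """Apply each enhancement independently with probability 0.5
    (image-only sketch; boxes must be transformed identically in practice)."""
    h, w = img.shape[:2]
    if rng.random() < 0.5:                      # color augmentation (contrast/brightness)
        img = np.clip(img * rng.uniform(0.7, 1.3) + rng.uniform(-20, 20), 0, 255)
    if rng.random() < 0.5:                      # translation (wrap-around shift)
        dx = rng.integers(-w // 10, w // 10 + 1)
        dy = rng.integers(-h // 10, h // 10 + 1)
        img = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    if rng.random() < 0.5:                      # scale transformation (nearest-neighbour)
        s = rng.uniform(0.8, 1.2)
        ys = np.clip((np.arange(h) / s).astype(int), 0, h - 1)
        xs = np.clip((np.arange(w) / s).astype(int), 0, w - 1)
        img = img[ys][:, xs]
    if rng.random() < 0.5:                      # random crop to 90% of each side
        ch, cw = int(h * 0.9), int(w * 0.9)
        y0 = rng.integers(0, h - ch + 1)
        x0 = rng.integers(0, w - cw + 1)
        img = img[y0:y0 + ch, x0:x0 + cw]
    return img
```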
The process of training the improved YOLOv3 target neural network model using the training set is:
Setting basic training parameters, selecting an optimizer, and setting a maximum iteration round N; after the training of each iteration round is finished, the performance of the YOLOv3 target neural network model in training is verified with the verification set and measured by the mean average precision (mAP); when the maximum iteration round N is reached, the training ends and the finally trained YOLOv3 target neural network model is obtained.
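The procedure above can be summarized as a framework-agnostic skeleton; `optimizer_step` and `evaluate_map` are hypothetical placeholders for one gradient update and for the verification-set mAP measurement, not functions from the original text:

```python
def train_model(model, optimizer_step, train_batches, evaluate_map, max_rounds=100):
    """Skeleton of the described procedure: a fixed maximum iteration round N,
    one pass of gradient-descent updates per round, then verification-set
    evaluation. `optimizer_step` and `evaluate_map` are assumed placeholders."""
    history = []
    for _ in range(max_rounds):              # maximum iteration round N
        for batch in train_batches:
            optimizer_step(model, batch)     # one parameter update on one batch
        history.append(evaluate_map(model))  # measure mAP on the verification set
    return model, history
```

In practice training may also be stopped early once the validation mAP saturates, as the experiments below note.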
The calculation formulas of Precision and Recall are as follows:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
If a detection frame contains a real target, it is counted as TP (true positive); otherwise it is counted as FP (false positive). If a real target is not covered by any detection frame, it is counted as FN (false negative); otherwise TN (true negative).
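A minimal sketch of these metrics, together with the F1 index also reported in the experiments; the counts in the usage line are arbitrary illustration values, not data from the patent:

```python
def detection_metrics(tp, fp, fn):
    """Precision, Recall and the F1 index from detection counts,
    using the standard formulas consistent with the text above."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Usage with arbitrary illustration counts (not data from the experiments):
p, r, f1 = detection_metrics(tp=80, fp=20, fn=27)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")  # precision=0.800 recall=0.748 F1=0.773
```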
And S5, performing leather flaw detection on the image data of the test set based on the trained YOLOv3 target neural network model.
The validity of the method proposed in Embodiment 1 of the present invention is verified below in connection with practical tests. The tests were completed on a PC whose main configuration is: CPU Intel i7-9700K, GPU NVIDIA RTX 2070 8GB, 16GB memory, Ubuntu 18.04, the PyTorch framework and the Python language. The main parameters are: Batch_size = 4, LeakyReLU activation function, NMS test threshold 0.6, learning rate 0.001, Adam optimizer, policy = steps, image size 1080 × 1080, and epoch (namely the maximum iteration round N) = 100. In addition, to improve the diversity of the data and thus the generalization capability of the model, data enhancement was performed on the pictures; the data enhancement used color augmentation, translation, scale transformation and random cropping, each step being applied with a probability of 0.5. By random sampling, 16095 images of the sample leather flaw image data set were used as the training set for training the target neural network model, 179 images as the verification set, and 1609 images as the test set for testing the target neural network model. The leather flaws in the data set were frame-marked with a labelling tool, and k marked frames were randomly selected from the frame-marked sample leather flaw image data set as initial clustering centers; in the experiment k = 9, and after steps S101 - S105 the anchor values shown in Table 1 were obtained by clustering.
TABLE 1
The 9 new cluster centers are taken as the anchors of the anchor box, and the YOLOv3 target neural network model is improved accordingly to obtain the final improved YOLOv3 target neural network model shown in FIG. 2; label smoothing is then applied to the cross entropy loss function of the YOLOv3 target neural network model. With the above parameter settings, the improved YOLOv3 target neural network model was trained on the training set, the training target being measured by the mAP of the YOLOv3 target neural network model; after each iteration round, the performance of the model in training was verified on the verification set and measured by Precision and Recall, and the training ended when the maximum iteration round N was reached (in practice there is no absolute stopping mark, and training may be ended when the mAP is nearly saturated and no longer improves noticeably), yielding the finally trained YOLOv3 target neural network model. To verify the effectiveness of the improved YOLOv3 target neural network model proposed in the embodiment of the present invention and compare it with the classic YOLOv3 algorithm, FIG. 3 shows the detection effect of a first leather flaw detection performed by the classic YOLOv3 algorithm, and FIG. 4 shows the detection effect of the same detection based on the improved YOLOv3 target neural network model; both figures concern flaw detection on the same kind of leather, and the frames in FIG. 3 and FIG. 4 mark different leather flaws. FIG. 5 shows the effect of a second leather flaw detection using the classic YOLOv3 algorithm, and FIG. 6 shows the effect of the same detection based on the improved YOLOv3 target neural network model; again, both figures concern flaw detection on the same leather, and the frames in FIG. 5 and FIG. 6 mark different leather flaws. Comparing FIG. 3 with FIG. 4, and FIG. 5 with FIG. 6, it can be seen that the classic YOLOv3 algorithm has low precision and misses some flaws, whereas the improved model alleviates the missed detections and improves the detection precision; the leather flaw detection algorithm based on the improved YOLOv3 thus improves the leather flaw detection accuracy.
FIG. 7 is a schematic diagram showing the trend of each index with the iterative training round when the classic YOLOv3 algorithm is used; FIG. 8 is a schematic diagram showing the corresponding trends for the improved YOLOv3 target neural network model. In FIG. 7 and FIG. 8, the abscissa of each trend graph is the iterative training round, and the ordinate shows the trend of the intersection ratio IoU, the loss function, the Precision, the Recall, the mAP and the F1 index as the training round changes; mAP@0.5 means that IoU = 0.5 is used when measuring the accuracy of the model. The F1 index is calculated as:
F1 = 2 × Precision × Recall / (Precision + Recall)
Taking the execution results of the first and the last epoch, the test results shown in Table 2 are obtained:
TABLE 2
As can be seen from Table 2, after 100 rounds of iterative training the effect of the improved YOLOv3 is better than that of the classic YOLOv3. Analyzing the common indicators at epoch = 99: the precision of the classic YOLOv3 is 0.488 and that of the improved YOLOv3 is 0.804, an improvement of 0.316; the recall of the classic YOLOv3 is 0.619 and that of the improved YOLOv3 is 0.747, an improvement of 0.128; the mAP@0.5 of the classic YOLOv3 is 0.498 and that of the improved YOLOv3 is 0.751, an improvement of 0.253; the F1 of the classic YOLOv3 is 0.524 and that of the improved YOLOv3 is 0.751, an improvement of 0.227. Although only 100 epochs were trained, the results already demonstrate that the improved YOLOv3 algorithm is superior to the algorithm before the improvement.
Embodiment 2
As shown in FIG. 9, the present invention further provides a leather flaw detection system based on improved YOLOv3, for implementing the method described in Embodiment 1, comprising:
the image data set constructing and labeling module is used for constructing a sample leather flaw image data set and performing frame labeling on leather flaws in the data set;
the data set dividing module is used for dividing the sample leather flaw image data set marked by the frame into a training set, a verification set and a test set;
the target neural network model building module is used for building a YOLOv3 target neural network model, and improving the YOLOv3 target neural network model based on a kmeans clustering algorithm and a label smoothing technology;
the target neural network model training module is used for training the improved YOLOv3 target neural network model by using a training set, and verifying the performance of the YOLOv3 target neural network model in the training by using a verification set to obtain a finally trained YOLOv3 target neural network model;
and the test module is used for carrying out leather flaw detection on the test set image data based on the trained YOLOv3 target neural network model.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaust all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (10)
1. A leather flaw detection method based on improved YOLOv3 is characterized by comprising the following steps:
constructing a sample leather flaw image data set, and performing frame marking on leather flaws in the data set;
dividing a sample leather flaw image data set marked by a frame into a training set, a verification set and a test set;
constructing a YOLOv3 target neural network model, and improving the YOLOv3 target neural network model based on a kmeans clustering algorithm and a label smoothing technology;
training the improved YOLOv3 target neural network model by using a training set, and verifying the performance of the YOLOv3 target neural network model in the training by using a verification set to obtain a finally trained YOLOv3 target neural network model;
and performing leather flaw detection on the image data of the test set based on the trained YOLOv3 target neural network model.
2. The improved YOLOv 3-based leather blemish detection method of claim 1, wherein the types of leather blemishes in the data set comprise: bubble, dirt stain, stripe, hole, ribbon color line.
3. The improved YOLOv 3-based leather defect detection method according to claim 1, wherein when the sample leather defect image dataset is divided into a training set, a verification set and a test set, the sample leather defect images divided into the training set, the verification set and the test set are obtained from the sample leather defect image dataset by a random sampling method.
4. The method as claimed in claim 1, wherein the built YOLOv3 target neural network model comprises a Darknet-53 feature extraction module and a multi-scale prediction and feature fusion module; the Darknet-53 feature extraction module is used for extracting sample leather flaw image features and outputting them to the multi-scale prediction and feature fusion module; the multi-scale prediction and feature fusion module comprises three convolutional layer networks Convs and a detection output layer; the Darknet-53 feature extraction module outputs sample leather flaw image features at three prediction scales, and after these features pass through the three convolutional layer networks Convs, one branch of each convolutional layer network Convs is output for detection, while the other branch is up-sampled by a factor of 2 and fused with the features of the three prediction scales output by the Darknet-53 feature extraction module, flaws finally being identified by the detection output layer.
5. The leather flaw detection method based on the improved YOLOv3 as claimed in claim 4, characterized in that the Darknet-53 feature extraction module comprises an initial convolutional layer, a first residual unit, a second residual unit, a third residual unit, a fourth residual unit and a fifth residual unit connected in sequence; the first residual unit comprises a first single convolutional layer and a first repeatedly executed convolutional layer 1x connected in sequence, the second residual unit comprises a second single convolutional layer and a second repeatedly executed convolutional layer 2x connected in sequence, the third residual unit comprises a third single convolutional layer and a third repeatedly executed convolutional layer 8x connected in sequence, the fourth residual unit comprises a fourth single convolutional layer and a fourth repeatedly executed convolutional layer 8x connected in sequence, and the fifth residual unit comprises a fifth single convolutional layer and a fifth repeatedly executed convolutional layer 4x connected in sequence; the initial convolutional layer is a convolution kernel with 32 channels and a size of 3 × 3, its input being the sample leather flaw image data set; the first to fifth repeatedly executed convolutional layers are repeated 1, 2, 8, 8 and 4 times respectively, the number of filters being first halved and then restored, for a total of 52 layers; the input of the multi-scale prediction and feature fusion module comprises the first prediction Scale1 sample leather flaw image features output by the fifth repeatedly executed convolutional layer 4x, the second prediction Scale2 sample leather flaw image features output by the fourth repeatedly executed convolutional layer 8x, and the third prediction Scale3 sample leather flaw image features output by the third repeatedly executed convolutional layer 8x.
6. The method for detecting leather flaws based on improved YOLOv3 as claimed in claim 5, wherein the process of improving the YOLOv3 target neural network model based on the kmeans clustering algorithm is as follows:
S101, randomly selecting k marked frames from the frame-marked sample leather flaw image data set as initial clustering centers, each initial clustering center storing the width and height of a flaw frame;
S102, calculating, based on the intersection ratio IoU, the distance d_i from each real flaw frame x_j in the data set to the centroid of each initial clustering center k_i, and then assigning each real flaw frame to the nearest cluster according to the minimum-distance principle, the distance d_i being solved by the expression:
d_i(x_j) = 1 − IoU(x_j, k_i)
wherein x_j represents the jth frame among all real flaw frames in the data set, and k_i represents the ith of the k initial clustering centers;
S103, updating each clustering center with the mean of the real flaw frame samples in its cluster;
S104, repeating step S102 and step S103 until the clustering centers no longer change, obtaining the final clustering centers and k divided clusters;
S105, since the detection output layer of the YOLOv3 target neural network model identifies flaws based on anchor boxes, taking the k cluster centers as the new anchor values of the anchor box, thereby improving the YOLOv3 target neural network model.
7. The method for detecting leather flaws based on improved YOLOv3 as claimed in claim 6, wherein the process of improving YOLOv3 target neural network model based on tag smoothing technique is:
and (3) taking a cross entropy loss function as the predicted loss of the Yolov3 target neural network model, wherein the expression is as follows:
L=-[y log p+(1-y)log(1-p)]
wherein y represents a true value and p represents a predicted value; when the detected flaw is accurate, p is 1, otherwise, p is 0;
performing label smoothing on the cross entropy loss function of the YOLOv3 target neural network model by using the label smoothing technique, the smoothed label being expressed as:
y' = y × (1 − ε) + ε/l
wherein l represents the total number of flaw types, and ε is a penalty parameter, a constant between 0 and 1; after label smoothing, when the detected flaw is accurate, p is close to 1 − ε, and otherwise close to ε.
8. The method of claim 1, wherein before the training of the improved YOLOv3 target neural network model with the training set, the method further comprises performing data enhancement on the image data of the training set, the verification set and the test set, the data enhancement comprising color augmentation, translation, scale transformation and random cropping.
9. The method for detecting leather flaws according to claim 7 based on improved YOLOv3, wherein the process of training the improved YOLOv3 target neural network model by using the training set comprises:
setting basic training parameters, selecting an optimizer, setting a maximum iteration round N, training the improved YOLOv3 target neural network model by a gradient descent method, the training target being measured by the mAP of the YOLOv3 target neural network model; after the training of each iteration round is finished, verifying the performance of the YOLOv3 target neural network model in the training with the verification set, measured by Precision and Recall; and ending the training when the maximum iteration round N is reached, to obtain the finally trained YOLOv3 target neural network model.
10. A leather flaw detection system based on improved YOLOv3, characterized in that the system comprises:
the image data set constructing and labeling module is used for constructing a sample leather flaw image data set and performing frame labeling on leather flaws in the data set;
the data set dividing module is used for dividing the sample leather flaw image data set marked by the frame into a training set, a verification set and a test set;
the target neural network model building module is used for building a YOLOv3 target neural network model, and improving the YOLOv3 target neural network model based on a kmeans clustering algorithm and a label smoothing technology;
the target neural network model training module is used for training the improved YOLOv3 target neural network model by using a training set, and verifying the performance of the YOLOv3 target neural network model in the training by using a verification set to obtain a finally trained YOLOv3 target neural network model;
and the test module is used for carrying out leather flaw detection on the test set image data based on the trained YOLOv3 target neural network model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111302497.9A CN114022446A (en) | 2021-11-04 | 2021-11-04 | Leather flaw detection method and system based on improved YOLOv3 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111302497.9A CN114022446A (en) | 2021-11-04 | 2021-11-04 | Leather flaw detection method and system based on improved YOLOv3 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114022446A true CN114022446A (en) | 2022-02-08 |
Family
ID=80061330
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111302497.9A Pending CN114022446A (en) | 2021-11-04 | 2021-11-04 | Leather flaw detection method and system based on improved YOLOv3 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114022446A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114612472A (en) * | 2022-05-11 | 2022-06-10 | 泉州装备制造研究所 | SegNet improvement-based leather defect segmentation network algorithm |
CN115049619A (en) * | 2022-06-16 | 2022-09-13 | 浙江理工大学 | Efficient flaw detection method for complex scene |
CN115980322A (en) * | 2023-02-09 | 2023-04-18 | 百福工业缝纫机(张家港)有限公司 | Intelligent detection method and system for fabric defects |
CN116106319A (en) * | 2023-02-14 | 2023-05-12 | 浙江迈沐智能科技有限公司 | Automatic detection method and system for defects of synthetic leather |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111104906A (en) * | 2019-12-19 | 2020-05-05 | 南京工程学院 | Transmission tower bird nest fault detection method based on YOLO |
WO2020164282A1 (en) * | 2019-02-14 | 2020-08-20 | 平安科技(深圳)有限公司 | Yolo-based image target recognition method and apparatus, electronic device, and storage medium |
CN111754498A (en) * | 2020-06-29 | 2020-10-09 | 河南科技大学 | Conveyor belt carrier roller detection method based on YOLOv3 |
WO2020206861A1 (en) * | 2019-04-08 | 2020-10-15 | 江西理工大学 | Yolo v3-based detection method for key object at transportation junction |
CN112184679A (en) * | 2020-09-30 | 2021-01-05 | 佛山市南海区广工大数控装备协同创新研究院 | YOLOv 3-based wine bottle flaw automatic detection method |
CN113192040A (en) * | 2021-05-10 | 2021-07-30 | 浙江理工大学 | Fabric flaw detection method based on YOLO v4 improved algorithm |
2021-11-04: application CN202111302497.9A filed (CN), published as patent CN114022446A (status: active, Pending)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020164282A1 (en) * | 2019-02-14 | 2020-08-20 | 平安科技(深圳)有限公司 | Yolo-based image target recognition method and apparatus, electronic device, and storage medium |
WO2020206861A1 (en) * | 2019-04-08 | 2020-10-15 | 江西理工大学 | Yolo v3-based detection method for key object at transportation junction |
CN111104906A (en) * | 2019-12-19 | 2020-05-05 | 南京工程学院 | Transmission tower bird nest fault detection method based on YOLO |
CN111754498A (en) * | 2020-06-29 | 2020-10-09 | 河南科技大学 | Conveyor belt carrier roller detection method based on YOLOv3 |
CN112184679A (en) * | 2020-09-30 | 2021-01-05 | 佛山市南海区广工大数控装备协同创新研究院 | YOLOv 3-based wine bottle flaw automatic detection method |
CN113192040A (en) * | 2021-05-10 | 2021-07-30 | 浙江理工大学 | Fabric flaw detection method based on YOLO v4 improved algorithm |
Non-Patent Citations (3)
Title |
---|
朱凌云; 严飞华; 李汶松: "Leather surface defect detection based on visual saliency", Computer Engineering & Science, vol. 38, no. 3, 15 March 2016 (2016-03-15) *
胡贵桂: "Application of the improved YOLOv3 algorithm to object detection in road environments", Automobile Applied Technology, no. 05, 15 March 2020 (2020-03-15) *
钟润阳; 戴青云; 周科; 王美林; 刘源; 刘泽禧: "Construction and implementation of an RFID-based real-time Web system", Modern Computer (Professional Edition), no. 09, 25 September 2008 (2008-09-25) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114612472A (en) * | 2022-05-11 | 2022-06-10 | 泉州装备制造研究所 | SegNet improvement-based leather defect segmentation network algorithm |
CN115049619A (en) * | 2022-06-16 | 2022-09-13 | 浙江理工大学 | Efficient flaw detection method for complex scene |
CN115049619B (en) * | 2022-06-16 | 2024-04-09 | 浙江理工大学 | Efficient flaw detection method for complex scene |
CN115980322A (en) * | 2023-02-09 | 2023-04-18 | 百福工业缝纫机(张家港)有限公司 | Intelligent detection method and system for fabric defects |
CN116106319A (en) * | 2023-02-14 | 2023-05-12 | 浙江迈沐智能科技有限公司 | Automatic detection method and system for defects of synthetic leather |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114022446A (en) | Leather flaw detection method and system based on improved YOLOv3 | |
CN110660052B (en) | Hot-rolled strip steel surface defect detection method based on deep learning | |
CN109214399B (en) | Improved YOLOV3 target identification method embedded in SENET structure | |
CN110675370A (en) | Welding simulator virtual weld defect detection method based on deep learning | |
CN109241808A (en) | Two dimensional code auto-screening method in a kind of fashion images | |
CN109978872B (en) | White matter microstructure characteristic screening system and method based on white matter fiber tracts | |
CN109509170A (en) | A kind of die casting defect inspection method and device | |
CN109255289A (en) | A kind of across aging face identification method generating model based on unified formula | |
CN106650670A (en) | Method and device for detection of living body face video | |
CN113393438B (en) | Resin lens defect detection method based on convolutional neural network | |
CN109685030A (en) | A kind of mug rim of a cup defects detection classification method based on convolutional neural networks | |
CN114170411A (en) | Picture emotion recognition method integrating multi-scale information | |
CN114511710A (en) | Image target detection method based on convolutional neural network | |
CN113011243A (en) | Facial expression analysis method based on capsule network | |
CN109523514A (en) | To the batch imaging quality assessment method of Inverse Synthetic Aperture Radar ISAR | |
CN108154199B (en) | High-precision rapid single-class target detection method based on deep learning | |
CN110188662A (en) | A kind of AI intelligent identification Method of water meter number | |
CN117576038A (en) | Fabric flaw detection method and system based on YOLOv8 network | |
CN117437186A (en) | Transparent part surface defect detection method and system based on deep learning algorithm | |
CN117058716A (en) | Cross-domain behavior recognition method and device based on image pre-fusion | |
CN117079099A (en) | Illegal behavior detection method based on improved YOLOv8n | |
CN107563327B (en) | Pedestrian re-identification method and system based on self-walking feedback | |
CN113627522B (en) | Image classification method, device, equipment and storage medium based on relational network | |
CN113450321B (en) | Single-stage target detection method based on edge detection | |
CN114898215A (en) | Automatic arrangement method of sound barrier |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||