CN117576016A

CN117576016A - Intelligent detection method for appearance defects of smoke box

Info

Publication number: CN117576016A
Application number: CN202311498745.0A
Authority: CN
Inventors: 陈波; 黄文祥; 瞿睿; 陈驰; 李继斌; 罗琛; 陈婷; 李民; 陈鑫; 罗鹏宇; 何壮丽; 叶运吉; 龚喜; 唐闻鸿
Original assignee: Yunnan Provincial Tobacco Co Ltd Honghe Prefecture Branch
Current assignee: Yunnan Provincial Tobacco Co Ltd Honghe Prefecture Branch
Priority date: 2023-12-08
Filing date: 2023-12-08
Publication date: 2024-02-20

Abstract

The invention discloses an intelligent detection method for appearance defects of a smoke box, based on a hybrid network model that combines a bi-level routing attention mechanism with a convolutional neural network. For smoke box appearance inspection, shooting equipment collects an image dataset of smoke boxes with appearance defects, and the defective parts in each picture are annotated. The dataset is input into the hybrid network for model training; the network extracts multi-scale features of the images, and these features are used to complete the smoke box appearance defect detection task. Deformation, damage and stain defects of the smoke box appearance are thereby detected intelligently with higher accuracy and efficiency. The deep learning method can detect appearance defects during transportation and storage in real time, so that countermeasures can be taken according to the detected defects, reducing the product quality problems and the loss of transportation and storage efficiency caused by packaging defects.

Description

Intelligent detection method for appearance defects of smoke box

Technical Field

The invention relates to the technical field of target detection, in particular to an intelligent detection method for appearance defects of a smoke box.

Background

After a cigarette carton (hereinafter, smoke box) is packaged, its appearance may become deformed, damaged or stained during transportation and storage. When the appearance defects of the outer smoke box are serious, the quality of the cigarettes inside can be affected, and selling cigarettes with quality problems seriously damages the quality reputation of a tobacco enterprise and weakens its competitiveness in the market. Handling products whose quality problems stem from smoke box appearance defects also increases production cost and labor input, which reduces the economic benefit of the enterprise. Monitoring smoke box appearance defects during transportation and storage is therefore important for guaranteeing product quality. With the continuous development of technology, more and more enterprises are making production, processing, transportation and related processes intelligent, and with the continuous development of computer vision, deep learning methods are expected to monitor smoke box appearance defects in real time. Such intelligent detection can, on the one hand, promptly discover appearance defects arising during transportation and storage and feed them back to the relevant staff so that countermeasures can be taken. On the other hand, it can identify the link in which a defect arose and the degree of the defect, so that staff can investigate the problem effectively.
In short, detecting smoke box appearance defects with a deep learning method is an efficient and accurate intelligent means that lets tobacco enterprises control product quality in the transportation and storage links.

The current solution suffers from several drawbacks:

1. When inspecting smoke boxes for appearance defects, some tobacco enterprises still rely on manual visual inspection in part of the transportation and storage stages, so the smoke boxes cannot be monitored in real time or throughout the whole process. If a defect arises in an area outside manual monitoring, a large amount of time is needed for investigation, which increases labor cost and prevents a timely response.

2. Judgment is subjective: different people may apply different standards when assessing the severity of a smoke box appearance defect, and long working hours cause visual fatigue, which greatly increases the probability of misjudgment. Rapid and comprehensive inspection is also impossible when the volume of goods is large.

Disclosure of Invention

The invention aims to provide an intelligent detection method for appearance defects of a smoke box, which can effectively detect deformation, damage and dirt defects of the appearance of the smoke box by utilizing a target detection technology.

In order to solve the technical problems, the invention adopts the following technical scheme:

An intelligent detection method for appearance defects of a smoke box comprises the following steps:

collecting an image dataset of smoke boxes with appearance defects by using shooting equipment;

expanding the image dataset by artificially adding appearance defects and by data preprocessing in the form of random cropping and horizontal flipping;

annotating the defective parts in the smoke box appearance dataset by using LabelImg;

inputting the image dataset of smoke box appearance defects into a Faster R-CNN network to train a model;

and completing the intelligent detection task for smoke box appearance defects with the trained network model.

Collecting the image dataset of smoke boxes with appearance defects by using a monitoring camera or a mobile phone specifically comprises: shooting deformation, damage and stain defects of the smoke box appearance with the monitoring camera or the mobile phone, shooting multiple times from various angles, at various distances and against various backgrounds, and finally classifying the images by defect type.

Expanding the image dataset by artificially adding appearance defects and by random cropping and horizontal flipping specifically comprises: simulating deformation, damage and stain defect effects on smoke box pictures with image editing software, then expanding the image dataset by random cropping and horizontal flipping, and finally dividing the dataset into a training set, a validation set and a test set in a ratio of 8:1:1.

Annotating the defective parts in the smoke box appearance defect image dataset with an image annotation tool specifically comprises: marking a ground-truth box on each defective part with the LabelImg tool and generating a corresponding XML file.

Inputting the image dataset of smoke box appearance defects into the Faster R-CNN network to train a model and obtain a defect detection module specifically comprises:

S5.1, Input end: the prepared dataset is augmented by the Mosaic method, in which four pictures are randomly drawn from the dataset and spliced into one picture that is then scaled to a designated size; the pictures are finally fed into the Faster R-CNN network model at a resolution of 640 x 640 with a batch size of 4;

S5.2, Backbone end: a MobileViTv3 network is used as the Backbone. It is a hybrid network model combining a CNN and a Transformer, which allows better multi-scale feature extraction from the input image. The network has 5 stages in total, and each stage halves the height and width of the feature map. Stage1 consists of a 3*3 convolution layer and an MV2 (MV2=Conv+BN+ReLU6) module, and the convolution layer also performs 2x downsampling; Stage2 consists of three MV2 modules, the first of which performs 2x downsampling on the feature map; Stage3, Stage4 and Stage5 share the same structure, each consisting of an MV2 module, which performs 2x downsampling, and an MViTv3 block module, and they differ only in the number of Transformer iterations in the MViTv3 block, which is 2, 4 and 3 respectively;

S5.3, Neck end: an RPN predicts on the feature map output by the Backbone end and extracts candidate boxes; 9 candidate boxes are drawn at each center point in the original image, and the classification result is combined with the candidate box information to obtain more accurate candidate boxes;

S5.4, Head end: the features of the candidate boxes obtained by the Neck end are sent into a classification and regression network to obtain the classification result and the predicted box of the final detection target;

S5.5, a loss is calculated between the predicted boxes and the ground-truth box labels, and the weights of the Faster R-CNN are then updated by backpropagation to iterate the network parameters;

S5.6, steps S5.1-S5.5 are repeated until the iteration ends, yielding the target detection module.

Completing the intelligent detection task for smoke box appearance defects with the trained network model specifically comprises: deploying the trained smoke box appearance defect detection model on monitoring equipment in the production, transportation and storage processes, monitoring the smoke box appearance in real time, recording the types and counts of detected defects, and reminding the relevant staff to take corresponding countermeasures.

The beneficial effects of the invention are as follows:

(1) Compared with manual visual inspection, the method detects smoke box appearance defects intelligently with higher accuracy and efficiency. Appearance defects arising during transportation and storage, and the links in which they arise, can be found promptly and fed back to the relevant staff, who can take countermeasures and quickly investigate the problematic link. The deep learning method addresses the inconsistent standards caused by subjective human inspection, reduces misjudgment caused by visual fatigue, and gives tobacco enterprises better control of product quality in the transportation and storage links.

(2) The method uses a hybrid network model combining a bi-level routing attention mechanism with a convolutional neural network. During smoke box appearance inspection, an image dataset of smoke boxes with appearance defects is collected by shooting equipment, the dataset is expanded by artificially adding appearance defects and by random cropping and horizontal flipping, and the defective parts in the pictures are annotated. The image dataset is then input into the hybrid network for model training; the network extracts multi-scale features of the images, and the detection task is completed using these multi-scale features.

(3) The invention detects deformation, damage and stain defects of the smoke box appearance with higher accuracy and efficiency. The deep learning method can detect appearance defects in real time during transportation and storage, so that staff can take countermeasures according to the detected defects, which reduces the product quality problems and the loss of transportation and storage efficiency caused by packaging defects.

Drawings

FIG. 1 is a flow chart of an embodiment of the present invention for detecting defects in the appearance of a smoke box;

FIG. 2 is a block diagram of a Faster R-CNN network;

FIG. 3 is a detailed structural diagram of the MV2 module in the backbone network MobileViTv3;

FIG. 4 is a detailed structural diagram of the MViTv3 block in the backbone network MobileViTv3;

FIG. 5 is a detailed structural diagram of the BRA in the MViTv3 block;

FIG. 6 is a detailed structural diagram of the MLP in the MViTv3 block.

Detailed Description

To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to be limiting.

Example 1:

As shown in FIGS. 1-6, an intelligent detection method for appearance defects of a smoke box includes: collecting an image dataset of smoke boxes with appearance defects by using shooting equipment; expanding the image dataset by artificially adding appearance defects and by data preprocessing in the form of random cropping and horizontal flipping; annotating the defective parts in the smoke box appearance dataset by using LabelImg; inputting the image dataset of smoke box appearance defects into a Faster R-CNN network to train a model; and completing the intelligent detection task for smoke box appearance defects with the trained network model.

Further, expanding the image dataset by artificially adding appearance defects and by random cropping and horizontal flipping, and marking ground-truth boxes on the defective parts with an image annotation tool, may comprise: simulating deformation, damage and stain defect effects on smoke box pictures from the image dataset with image editing software, expanding the image dataset by random cropping and horizontal flipping, dividing the dataset into a training set, a validation set and a test set in a ratio of 8:1:1, marking a ground-truth box on each defective part in the smoke box appearance defect image dataset with the LabelImg tool, and finally generating a corresponding XML file.

Further, the prepared image dataset of smoke box appearance defects may be input into the Faster R-CNN network to obtain a target detection module.

Still further, completing the intelligent detection task for smoke box appearance defects with the trained network model specifically comprises: deploying the trained smoke box appearance defect detection model on monitoring equipment in the production, transportation and storage processes, monitoring the smoke box appearance in real time, recording the types and counts of detected defects, and reminding the relevant staff to take corresponding countermeasures.

Example 2: as shown in fig. 1-6, an intelligent detection method for appearance defects of a smoke box includes:

S1, constructing an image dataset of smoke box appearance defects;

Specifically, deformation, damage and stain defects of the smoke box appearance may be photographed by the monitoring equipment used in a tobacco enterprise's production, transportation and storage processes and by mobile phone on site, shooting multiple times from various angles, at various distances and against various backgrounds; the images are finally classified by defect type.

S2, expanding the dataset by data preprocessing;

Specifically, the dataset acquired by the shooting equipment is organized, and image editing software is used to simulate deformation, damage and stain defect effects on the smoke box pictures. To improve the training effect of the model, defect effects are simulated on both intact and defective smoke boxes: a defect in one image is deformed and scaled and then transferred to another image. The image dataset is then expanded by random cropping and horizontal flipping, and finally divided into a training set, a validation set and a test set in a ratio of 8:1:1.
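The expansion and split described in S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are represented as plain nested lists of pixel values, and the helper names (`random_crop`, `horizontal_flip`, `split_dataset`) are hypothetical. Only the operations themselves (random cropping, horizontal flipping, an 8:1:1 split) come from the text.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window from a random position of a 2-D pixel grid."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def horizontal_flip(image):
    """Mirror every pixel row left to right."""
    return [row[::-1] for row in image]


def split_dataset(items, rng, ratios=(8, 1, 1)):
    """Shuffle items and split them into training, validation and test sets."""
    items = list(items)
    rng.shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]           # remainder goes to the test set
    return train, val, test
```

Passing a seeded `random.Random` instance as `rng` keeps the crops and the split reproducible between runs.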

S3, marking a ground-truth box on each defective part in the smoke box appearance defect image dataset by using an image annotation tool;

Specifically, the LabelImg tool is used to mark a ground-truth box on each defect in the image data and to generate a corresponding XML file, which records the name, position, size, defect type and coordinate information of the defect. LabelImg marks both the ground-truth box position of a defect in the image and its category. The defects comprise three types (deformation, damage and stain), each of which is further graded as mild, moderate or severe, giving 9 defect categories: mild deformation, moderate deformation, severe deformation, mild damage, moderate damage, severe damage, mild stain, moderate stain and severe stain.
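As an illustration of how such an annotation file might be consumed, the sketch below parses one XML annotation in the Pascal VOC layout that LabelImg writes. The label strings (for example `severe_damage`), the file name in the sample and the function name `parse_annotation` are assumptions made for this example; the text fixes only the nine categories, not how they are spelled in the files.

```python
import xml.etree.ElementTree as ET

DEFECT_TYPES = ("deformation", "damage", "stain")
SEVERITIES = ("mild", "moderate", "severe")
# Hypothetical label strings; the order follows the nine categories in the text.
CLASS_NAMES = [f"{severity}_{defect}" for defect in DEFECT_TYPES for severity in SEVERITIES]


def parse_annotation(xml_text):
    """Read one Pascal VOC style annotation and return image size and boxes."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        box = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({"label": label, "class_id": CLASS_NAMES.index(label), "box": box})
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


# Hand-written sample annotation, not taken from the patent's dataset.
SAMPLE_XML = """<annotation>
  <filename>box_0001.jpg</filename>
  <size><width>640</width><height>640</height><depth>3</depth></size>
  <object>
    <name>severe_damage</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>260</xmax><ymax>210</ymax></bndbox>
  </object>
</annotation>"""

annotation = parse_annotation(SAMPLE_XML)
```

Encoding type and severity in a single label keeps the detector a plain 9-class problem, which matches the category list above.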

S4, inputting the image dataset of smoke box appearance defects into the Faster R-CNN network for model training to obtain a defect detection module;

S4.1, Input end: the prepared dataset is augmented by the Mosaic method, in which four pictures are randomly drawn from the dataset and spliced into one picture that is then scaled to a designated size; the pictures are finally fed into the Faster R-CNN network model at a resolution of 640 x 640 with a batch size of 4;

S4.2, Backbone end: a MobileViTv3 network is used as the Backbone. It is a hybrid network model combining a CNN and a Transformer, which allows better multi-scale feature extraction from the input image. The network has 5 stages in total, and each stage halves the height and width of the feature map. Stage1 consists of a 3*3 convolution layer and an MV2 (MV2=Conv+BN+ReLU6) module, and the convolution layer also performs 2x downsampling; Stage2 consists of three MV2 modules, the first of which performs 2x downsampling on the feature map; Stage3, Stage4 and Stage5 share the same structure, each consisting of an MV2 module, which performs 2x downsampling, and an MViTv3 block module, and they differ only in the number of Transformer iterations in the MViTv3 block, which is 2, 4 and 3 respectively;

S4.3, Neck end: an RPN predicts on the feature map output by the Backbone end and extracts candidate boxes; 9 candidate boxes are drawn at each center point in the original image, and the classification result is combined with the candidate box information to obtain more accurate candidate boxes;

S4.4, Head end: the features of the candidate boxes obtained by the Neck end are sent into a classification and regression network to obtain the classification result and the predicted box of the final detection target;

S4.5, a loss is calculated between the predicted boxes and the ground-truth box labels, and the weights of the Faster R-CNN are then updated by backpropagation to iterate the network parameters;

S4.6, steps S4.1-S4.5 are repeated until the iteration ends, yielding the target detection module.
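The stage layout given in S4.2 can be written down as a small table and checked arithmetically. The sketch below is only a bookkeeping aid: the function name `backbone_plan` and the module strings are illustrative, and no layer is actually built.

```python
def backbone_plan(input_size=640):
    """List each Backbone stage with its modules and output feature-map size."""
    # (stage name, modules in order, Transformer iterations in the MViTv3 block)
    stages = [
        ("Stage1", ["Conv3x3 (2x downsampling)", "MV2"], 0),
        ("Stage2", ["MV2 (2x downsampling)", "MV2", "MV2"], 0),
        ("Stage3", ["MV2 (2x downsampling)", "MViTv3 block"], 2),
        ("Stage4", ["MV2 (2x downsampling)", "MViTv3 block"], 4),
        ("Stage5", ["MV2 (2x downsampling)", "MViTv3 block"], 3),
    ]
    plan = []
    size = input_size
    for name, modules, transformer_iterations in stages:
        size //= 2  # every stage halves the height and width
        plan.append({
            "stage": name,
            "modules": modules,
            "transformer_iterations": transformer_iterations,
            "output_size": (size, size),
        })
    return plan
```

For a 640 x 640 input the last stage produces a 20 x 20 feature map, that is, a total stride of 32 between the feature map and the original image.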

The following details S4.1:

The prepared dataset is augmented by the Mosaic method: four pictures are randomly drawn from the dataset, spliced into one picture and scaled to a specified size. Increasing the diversity and quantity of the data improves the generalization ability and robustness of the model and reduces the risk of overfitting. The pictures are finally fed into the Faster R-CNN network model at a resolution of 640 x 640 with a batch size of 4;
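A simplified version of the Mosaic step can be sketched as follows. It assumes a fixed 2 x 2 layout in which each of the four randomly drawn pictures is resized to one quadrant; the text does not specify the layout, and widely used Mosaic implementations also randomize the split point and remap the bounding boxes, both of which are omitted here. Images are nested lists of pixel values, and the names `resize_nearest` and `mosaic` are hypothetical.

```python
import random


def resize_nearest(image, out_h, out_w):
    """Resize a 2-D pixel grid with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def mosaic(images, out_size=640, rng=random):
    """Splice four randomly drawn images into one out_size x out_size picture."""
    picks = rng.sample(images, 4)            # four distinct pictures
    half = out_size // 2
    tiles = [resize_nearest(img, half, half) for img in picks]
    top = [left + right for left, right in zip(tiles[0], tiles[1])]
    bottom = [left + right for left, right in zip(tiles[2], tiles[3])]
    return top + bottom
```

In a training pipeline, four such mosaics would then be stacked to form the batch of size 4 mentioned in the text.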

The following details S4.2:

The Backbone is a MobileViTv3 network with a bi-level routing attention mechanism. It is a hybrid network model combining a CNN and a Transformer, which allows better multi-scale feature extraction from the input image. The network has 5 stages in total, each of which halves the height and width of the feature map: for an original image size of 640 x 640, the feature maps of the successive stages are 320 x 320, 160 x 160, 80 x 80, 40 x 40 and 20 x 20. Stage1 consists of a 3*3 convolution layer and an MV2 module, and the convolution layer also performs 2x downsampling; Stage2 consists of three MV2 modules, the first of which performs 2x downsampling on the feature map; Stage3, Stage4 and Stage5 share the same structure, each consisting of an MV2 module, which performs 2x downsampling, and an MViTv3 block module, and they differ only in the number of Transformer iterations in the MViTv3 block, which is 2, 4 and 3 respectively. The modules are composed as follows:

The MV2 module consists of Conv, BN (batch normalization) and SiLU. Its main function is to model the feature map locally, and the first MV2 in each stage performs downsampling to obtain multi-scale features of the image;

The MViTv3 block module consists of Conv and a Transformer. After entering the module, the feature map is split into two branches. One branch undergoes local modeling through a 3*3 depthwise separable convolution and a 1*1 convolution and is then split into two sub-branches; one sub-branch passes through the Transformer module and is concatenated with the other sub-branch, and the result passes through a 1*1 convolution and is concatenated with the other branch of the first split to obtain the final output feature map;

The Transformer module is the key part of the whole network. A conventional Transformer uses a Multi-Head Self-Attention (MSA) mechanism, which suffers from high computational complexity and information redundancy. A Bi-Level Routing Attention (BRA) mechanism instead computes attention for each query only over the small subset of key-value pairs that are semantically most relevant to it, filtering out most irrelevant key-value pairs and greatly reducing information redundancy and computation. The Transformer module is therefore modified by replacing MSA with BRA. The Transformer module consists of LN (layer normalization), BRA and MLP (multi-layer perceptron), with residual connections to prevent vanishing gradients; the MLP consists of Linear, SiLU and Dropout, which enhances the generalization ability of the model so that it can predict accurately on unseen inputs.
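The routing idea behind BRA can be illustrated with a deliberately small sketch: tokens are grouped into regions, each region first picks the `top_k` regions whose mean token is most similar to its own, and token-level attention is then computed only over the tokens of those routed regions. The sketch makes several simplifying assumptions that are not in the text: queries, keys and values are the raw tokens (no learned projections), there is a single head, and any additional local context term is omitted. The function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bi_level_routing_attention(regions, top_k):
    """regions: list of regions, each a list of equally sized token vectors."""
    # Level 1: coarse region-to-region routing on mean-pooled tokens.
    region_means = [mean_vector(tokens) for tokens in regions]
    routed = []
    for query_mean in region_means:
        affinity = [dot(query_mean, key_mean) for key_mean in region_means]
        order = sorted(range(len(regions)), key=lambda j: affinity[j], reverse=True)
        routed.append(order[:top_k])
    # Level 2: fine token-to-token attention restricted to the routed regions.
    outputs = []
    for i, tokens in enumerate(regions):
        gathered = [tok for j in routed[i] for tok in regions[j]]
        scale = 1.0 / math.sqrt(len(tokens[0]))
        region_out = []
        for query in tokens:
            weights = softmax([dot(query, key) * scale for key in gathered])
            region_out.append([
                sum(w * value[d] for w, value in zip(weights, gathered))
                for d in range(len(query))
            ])
        outputs.append(region_out)
    return outputs, routed
```

With R regions and `top_k` routed regions per query region, each token attends to roughly `top_k / R` of the keys it would see under full self-attention, which is where the saving in computation comes from.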

The following details S4.3:

An RPN (Region Proposal Network) predicts on the feature map output by the Backbone end and generates candidate boxes for the target regions. A 3*3 sliding window traverses the feature map output by the Backbone; for each window position, the point on the original image corresponding to the window center is calculated, and 9 candidate boxes are drawn at each such center point on the original image. Bounding-box regression loss calculation on the feature map gives the candidate box output box_pred of the target region, and classification loss calculation gives the defect class probability output Cls_logits. Candidate boxes are screened into positive and negative samples, and the Proposal layer combines the classification result with the screened candidate box information, where im_info describes the change in picture size. The loss function of the RPN consists of a classification loss and a bounding-box regression loss, and is calculated as follows:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*)$$

where $L_{cls}$ represents the classification loss; $N_{cls}$ represents the number of all samples in a batch; $p_i$ represents the probability that the i-th target region is predicted to be the true label; $p_i^*$ is 1 when the current sample is a positive sample and 0 when it is a negative sample; $\lambda$ is a balance parameter; $L_{reg}$ represents the bounding box regression loss; $N_{reg}$ indicates the number of sliding windows; $t_i$ represents the regression parameters of the i-th target region candidate box; and $t_i^*$ represents the regression parameters of the real bounding box of the i-th target region.

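$L_{cls}$ uses the multi-class cross-entropy loss (CrossEntropyLoss), calculated as follows:

$$L_{cls} = -\log p_i$$

$L_{reg}$ uses the Smooth L1 loss, calculated as follows, where $t_i^*$ denotes the regression parameters of the real bounding box of the i-th target region:

$$L_{reg}(t_i, t_i^*) = \sum_{j \in \{x,y,w,h\}} \mathrm{smooth}_{L1}(t_{i,j} - t_{i,j}^*), \qquad \mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$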
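The "9 candidate boxes at each center point" in S4.3 can be sketched as an anchor generator. The text gives only the count; the choice of 3 scales times 3 aspect ratios, the concrete values (128, 256, 512 and 0.5, 1, 2), the stride of 32 and the half-cell center offset are assumptions made for this example.

```python
import math


def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return 9 (xmin, ymin, xmax, ymax) boxes centred on (cx, cy)."""
    boxes = []
    for scale in scales:
        for ratio in ratios:               # ratio is height / width
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)   # width * height == scale ** 2
            boxes.append((cx - width / 2, cy - height / 2,
                          cx + width / 2, cy + height / 2))
    return boxes


def generate_anchors(feat_h, feat_w, stride):
    """Place 9 anchors at the image point of every feature-map cell."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx = (col + 0.5) * stride      # cell centre mapped to the image
            cy = (row + 0.5) * stride
            anchors.extend(anchors_at(cx, cy))
    return anchors
```

Under these assumptions a 20 x 20 feature map from a 640 x 640 input yields 20 * 20 * 9 = 3600 candidate boxes before screening.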

The following details S4.4:

The ROI (Region of Interest) pooling layer maps candidate box regions of different sizes to a fixed-size feature representation, achieving scale invariance and positional invariance. The features are then flattened by a Flatten layer, the channel number is adjusted by two fully connected layers FC1 and FC2, and the result is sent through the fully connected layers FC3 and FC4 into the regression network and the classification network respectively, giving the predicted box output box_pred and the classification result output Cls_logits of the final target region. The loss function of the Faster R-CNN network consists of a classification loss and a localization loss, and is calculated as follows:

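$$L(p, u, t^u, v) = L_{cls}(p, u) + \lambda [u \ge 1] L_{loc}(t^u, v)$$

where $L_{cls}$ represents the classification loss; $p = (p_1, p_2, \ldots, p_k)$ represents the probability distribution predicted by the model; $u$ represents the true class label of the target; $t^u$ represents the regression parameters $(t^u_x, t^u_y, t^u_w, t^u_h)$ predicted by the bounding box regressor for the corresponding category $u$; and $v$ represents the regression parameters $(v_x, v_y, v_w, v_h)$ of the real bounding box. The bracket $[u \ge 1]$ equals 1 when $u \ge 1$ and 0 otherwise.

$$L_{cls}(p, u) = -\log p_u$$

$L_{loc}$ uses the Smooth L1 loss, calculated as follows:

$$L_{loc}(t^u, v) = \sum_{i \in \{x,y,w,h\}} \mathrm{smooth}_{L1}(t^u_i - v_i), \qquad \mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$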
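For a single candidate box, the multi-task loss above can be evaluated as in the sketch below. It assumes that class index 0 denotes background (so the localization term applies only when u >= 1), uses the standard Smooth L1 definition with a threshold of 1, and takes lambda = 1 by default; the function names are hypothetical.

```python
import math


def smooth_l1(x):
    """Quadratic near zero, linear elsewhere (threshold 1)."""
    magnitude = abs(x)
    return 0.5 * x * x if magnitude < 1.0 else magnitude - 0.5


def detection_loss(probs, true_class, pred_deltas, target_deltas, lam=1.0):
    """L = L_cls + lam * [u >= 1] * L_loc for a single candidate box.

    probs: predicted class probabilities, index 0 assumed to be background.
    pred_deltas / target_deltas: (x, y, w, h) regression parameters t^u and v.
    """
    cls_loss = -math.log(probs[true_class])
    if true_class >= 1:
        loc_loss = sum(smooth_l1(t - v) for t, v in zip(pred_deltas, target_deltas))
    else:
        loc_loss = 0.0                     # background boxes carry no box target
    return cls_loss + lam * loc_loss
```

Averaging this value over the sampled candidate boxes of a batch would give the scalar that is backpropagated in step S4.5.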

S5, the trained smoke box appearance defect detection model is deployed on monitoring equipment in the production, transportation and storage links; the detected defect types and severities and the number of occurrences of each defect are recorded, and the smoke box appearance defect status is fed back to the relevant staff.
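The record keeping in S5 could look like the sketch below. It assumes that the deployed detector emits (defect type, severity, confidence score) tuples and that detections under a confidence threshold of 0.5 are ignored; the class name `DefectLog`, its methods and the threshold are hypothetical and not part of the described method.

```python
from collections import Counter


class DefectLog:
    """Accumulate how often each (defect type, severity) pair is detected."""

    def __init__(self, min_score=0.5):
        self.min_score = min_score
        self.counts = Counter()

    def record(self, detections):
        """Count detections given as (defect_type, severity, score) tuples."""
        for defect_type, severity, score in detections:
            if score >= self.min_score:
                self.counts[(defect_type, severity)] += 1

    def summary(self):
        """Return report lines for staff, most frequent defect first."""
        return [
            f"{severity} {defect_type}: {count}"
            for (defect_type, severity), count in self.counts.most_common()
        ]

    def severe_defects(self):
        """Return counts of severe defects per type, e.g. to trigger a reminder."""
        return {
            defect_type: count
            for (defect_type, severity), count in self.counts.items()
            if severity == "severe"
        }
```

Keeping one log per monitoring point would additionally show in which link (production, transportation or storage) the defects occur.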

Although the invention has been described herein with reference to a number of illustrative embodiments thereof, it should be understood that numerous other modifications and embodiments can be devised by those skilled in the art that will fall within the scope and spirit of the principles of this disclosure. More specifically, various variations and modifications may be made to the component parts and/or arrangements of the subject combination arrangement within the scope of the disclosure, drawings and claims of this application. In addition to variations and modifications in the component parts and/or arrangements, other uses will be apparent to those skilled in the art.

Claims

1. An intelligent detection method for appearance defects of a smoke box, characterized by comprising the following steps:

(1) Collecting an image dataset of smoke boxes with appearance defects by using shooting equipment;

(2) Expanding the image dataset by artificially adding appearance defects and by data preprocessing in the form of random cropping and horizontal flipping;

(3) Annotating the defective parts in the smoke box appearance dataset by using LabelImg;

(4) Inputting the image dataset of smoke box appearance defects into a Faster R-CNN network to train a model;

(5) Completing the intelligent detection task for smoke box appearance defects with the trained network model.

2. The intelligent detection method for appearance defects of a smoke box according to claim 1, characterized in that: in step (1), collecting the image dataset of smoke boxes with appearance defects by using shooting equipment means constructing the image dataset with a monitoring camera or mobile phone shooting equipment, specifically: shooting deformation, damage and stain defects of the smoke box appearance with the monitoring camera or mobile phone shooting equipment, shooting multiple times from various angles, at various distances and against various backgrounds, and classifying the images by defect type.

3. The intelligent detection method for appearance defects of a smoke box according to claim 1, characterized in that: in step (2), expanding the image dataset by artificially adding appearance defects and by random cropping and horizontal flipping specifically comprises: simulating deformation, damage and stain defect effects on smoke box pictures with image editing software, then expanding the image dataset by random cropping and horizontal flipping, and finally dividing the dataset into a training set, a validation set and a test set in a ratio of 8:1:1.

4. The intelligent detection method for appearance defects of a smoke box according to claim 1, characterized in that: in step (3), annotating the defective parts in the smoke box appearance defect image dataset with an image annotation tool specifically comprises: marking a ground-truth box on each defective part in the smoke box appearance defect image dataset with the LabelImg tool and generating a corresponding XML file.

5. The intelligent detection method for appearance defects of a smoke box according to claim 1, characterized in that: in step (4), inputting the image dataset of smoke box appearance defects into the Faster R-CNN network to train a model and obtain a defect detection module specifically comprises:

S5.5, a loss is calculated between the predicted boxes and the ground-truth box labels, and the weights of the Faster R-CNN are then updated by backpropagation to iterate the network parameters;

6. The intelligent detection method for appearance defects of a smoke box according to claim 1, characterized in that: in step (5), completing the intelligent detection task for smoke box appearance defects with the trained network model specifically comprises: deploying the trained smoke box appearance defect detection model on monitoring equipment in the production, transportation and storage processes, monitoring the smoke box appearance in real time, recording the types and counts of detected defects, and reminding the relevant staff to take corresponding countermeasures.