CN117274176A - Variable data printing defect detection method and system based on YOLOv5s model - Google Patents

Variable data printing defect detection method and system based on YOLOv5s model

Info

Publication number
CN117274176A
Authority
CN
China
Prior art keywords
feature
yolov5s
model
generate
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311161707.6A
Other languages
Chinese (zh)
Inventor
刘杰
刘振泳
蔡泽龙
李志聪
茹家荣
林显信
黄成锵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan University
Original Assignee
Foshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan University filed Critical Foshan University
Priority to CN202311161707.6A priority Critical patent/CN117274176A/en
Publication of CN117274176A publication Critical patent/CN117274176A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • G06T7/0008Industrial image inspection checking presence/absence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application belongs to the technical field of industrial detection and provides a variable data printing defect detection method and system based on a YOLOv5s model. The method comprises: acquiring an image to be detected of a printed matter to be detected based on a preset camera; inputting the image to be detected into a pre-trained improved YOLOv5s model to generate a feature detection image; and determining whether the printed matter to be detected has printing defects according to the feature detection image. Printing defects possibly existing in the printed matter to be detected can thus be determined efficiently and accurately, with a detection efficiency far higher than that of manual detection. The method can adapt to different types of printed matter and to detection under different backgrounds, so that products with printing quality problems are found in time, preventing them from flowing into the market and harming the enterprise's image, protecting consumers' interests, and improving the experience of consumers who buy or use the products.

Description

Variable data printing defect detection method and system based on YOLOv5s model
Technical Field
The application relates to the technical field of industrial detection, in particular to a variable data printing defect detection method and system based on a YOLOv5s model.
Background
Variable data printing, also called variable information printing, personalized printing or custom printing, is a printing mode that takes an electronic file as the carrier and transmits the electronic file to the printing equipment through a network for direct printing. Compared with traditional printing, variable data printing offers characteristics such as printing from a single copy, no plate making, immediate availability, timely error correction, variable content and on-demand printing; its main elements are characters, patterns and bar codes.
In the actual printing production process, various printing defects are generated under the influence of uncertain factors such as the production technology, the production equipment and the industrial-site production environment; according to their form, printing defects can be classified into three categories: points, lines and blocks. As the carrier of key commodity information, the printed matter is an important way for enterprises to track product quality control, but printing defects impair the readability of the printed matter and the image of the enterprise and bring unavoidable losses to commodity manufacturers. Detecting printing defects and evaluating the quality of variable data printed matter is therefore of great significance.
At present, printing defects are mainly detected by manual inspection or by a machine vision method based on digital image processing. In the manual method, printing defects are observed by human eyes, either directly or with the aid of a tool such as a magnifying glass, but human eyes fatigue easily when inspecting a large number of printed matters. The machine vision method based on digital image processing requires the light source, the shooting distance and the shooting angle to be finely adjusted before detection. Both approaches therefore suffer from low detection efficiency and need to be further improved.
Disclosure of Invention
Based on the above, the embodiment of the application provides a variable data printing defect detection method and system based on a YOLOv5s model, so as to solve the problem of low detection efficiency in the prior art.
In a first aspect, embodiments of the present application provide a variable data printing defect detection method based on a YOLOv5s model, the method including:
acquiring a to-be-detected image of a to-be-detected printed matter based on a preset camera;
inputting the image to be detected into an improved YOLOv5s model obtained by training in advance to generate a feature detection image;
and determining whether the to-be-detected printed matter has printing defects according to the feature detection image.
Compared with the prior art, the beneficial effects are as follows. According to the variable data printing defect detection method based on the YOLOv5s model, the terminal device first acquires the image to be detected of the printed matter to be detected based on the preset camera, and then inputs the image to be detected into the trained improved YOLOv5s model to obtain the feature detection image. When the printed matter to be detected has printing defects, the feature detection image output by the improved YOLOv5s model marks those defects, so whether the printed matter has printing defects can be determined from the feature detection image. Printing defects of the printed matter to be detected can thus be found timely and accurately, the detection efficiency is greatly improved and is far higher than that of manual detection, the method can be adapted to detection of different types of printed matter, and the problem of low detection efficiency is solved to a certain extent.
In a second aspect, embodiments of the present application provide a variable data printing defect detection system based on a YOLOv5s model, the system comprising:
an image-to-be-detected acquisition module: used for acquiring an image to be detected of a printed matter to be detected based on a preset camera;
a feature detection image output module: used for inputting the image to be detected into a pre-trained improved YOLOv5s model to generate a feature detection image;
a printing defect determining module: used for determining whether the printed matter to be detected has printing defects according to the feature detection image.
In a third aspect, embodiments of the present application provide a terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to the first aspect as described above when the computer program is executed.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method of the first aspect described above.
It will be appreciated that the advantages of the second to fourth aspects may be found in the relevant description of the first aspect and are not repeated here.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flow chart of a method for detecting defects in variable data printing according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a step S201 in a variable data printing defect detection method according to an embodiment of the present application;
FIG. 3 is a network architecture diagram of an improved YOLOv5s model provided by an embodiment of the present application;
FIG. 4 is a first flowchart illustrating a step S204 in a variable data printing defect detection method according to an embodiment of the present application;
FIG. 5 is a network block diagram of a C3 module according to one embodiment of the present application;
FIG. 6 is a flowchart of step S2042 in a variable data printing defect detection method according to an embodiment of the present application;
FIG. 7 is a second flow chart of step S204 in a variable data printing defect detection method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a bi-directional multi-scale feature fusion network architecture provided in an embodiment of the present application;
FIG. 9 is a third flow chart of step S204 in a variable data printing defect detection method according to an embodiment of the present application;
FIG. 10 is a network block diagram of a CA module provided in an embodiment of the present application;
FIG. 11 is a flowchart illustrating a method for detecting a defect in variable data printing according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a print defect provided in an embodiment of the present application, wherein FIG. 12 (a) is a first schematic diagram of an image to be detected with a print defect, and FIG. 12 (b) is a first schematic diagram of the feature detection image output by the improved YOLOv5s model;
FIG. 13 is a schematic diagram of a print defect provided in an embodiment of the present application, wherein FIG. 13 (a) is a second schematic diagram of an image to be detected with a print defect, and FIG. 13 (b) is a second schematic diagram of the feature detection image output by the improved YOLOv5s model;
FIG. 14 is a schematic diagram of a print defect provided in an embodiment of the present application, wherein FIG. 14 (a) is a third schematic diagram of an image to be detected with a print defect, and FIG. 14 (b) is a third schematic diagram of the feature detection image output by the improved YOLOv5s model;
FIG. 15 is a block diagram of a variable data printing defect detection system according to one embodiment of the present application;
fig. 16 is a schematic diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
In the description of this application and the claims that follow, the terms "first," "second," "third," etc. are used merely to distinguish between descriptions and should not be construed to indicate or imply relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
In order to illustrate the technical solutions described in the present application, the following description is made by specific examples.
Referring to fig. 1, fig. 1 is a flow chart of a variable data printing defect detection method based on the YOLOv5s model according to an embodiment of the present application. In this embodiment, the execution subject of the variable data printing defect detection method is a terminal device. It will be appreciated that the types of terminal devices include, but are not limited to, cell phones, tablet computers, notebook computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), etc., and the embodiments of the present application do not impose any limitation on the specific type of terminal device.
Referring to fig. 1, the variable data printing defect detection method provided in the embodiment of the present application includes, but is not limited to, the following steps:
in S100, a to-be-detected image of a to-be-detected print is acquired based on a preset camera.
The camera may be a video image capturing apparatus that can be stably and efficiently applied to an industrial site, i.e., an industrial camera, without loss of generality; the printed matter to be detected is used for describing whether the printed matter with the printing defects exists or not to be detected; above the conveyor belt of the printing apparatus, an industrial camera may be mounted in advance, which can continuously take video about the print to be detected.
Specifically, the terminal device may acquire the video related to the print to be detected in real time through the industrial camera, and then the terminal device may determine each frame image in the video as the image to be detected of the print to be detected.
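As a rough illustration of this step only, the sketch below grabs frames with OpenCV; the camera index, the bounded frame count and the variable names are assumptions made for the example and are not details fixed by the embodiment.

```python
import cv2

frames = []
cap = cv2.VideoCapture(0)              # industrial camera exposed as a video device (assumed index)
for _ in range(100):                   # grab a bounded number of frames for this sketch
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)               # each frame is treated as one image to be detected
cap.release()
```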
In some possible implementations, in order to facilitate accurate and efficient detection of printing defects later on, after the terminal device obtains the image to be detected, the terminal device may pre-process it based on a preset computer vision library (such as OpenCV): segment the printed matter to be detected in the image from the background, and/or segment a plurality of printed matters in the image to be detected, and then perform mean-filtering noise reduction to obtain the denoised image to be detected.
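A minimal preprocessing sketch along these lines is shown below, assuming OpenCV is used; the Otsu threshold, the minimum-area filter and the 3x3 mean-filter kernel are illustrative choices rather than values fixed by the embodiment.

```python
import cv2
import numpy as np

def preprocess_frame(frame: np.ndarray) -> list:
    """Segment printed regions from the background and mean-filter them."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Separate the printed matter from the conveyor-belt background (assumed Otsu threshold).
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    crops = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w * h < 1000:                         # assumed minimum area, skips noise blobs
            continue
        crop = frame[y:y + h, x:x + w]
        crops.append(cv2.blur(crop, (3, 3)))     # mean filtering noise reduction
    return crops
```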
In S200, an image to be detected is input into the improved YOLOv5S model trained in advance, and a feature detection image is generated.
Specifically, after the terminal device acquires the image to be detected, the terminal device may input the image to be detected into the improved YOLOv5s model obtained by training in advance, and generate a feature detection image, where the feature detection image is used to describe the image output by the improved YOLOv5s model; when the print to be detected has a printing defect, the printing defect in the feature detection image can be marked.
In some possible implementations, to achieve accurate and efficient determination of the printing defect, the method further includes, but is not limited to, the following steps:
in S201, an improved YOLOv5S model is constructed based on the initial YOLOv5S model, the bi-directional multi-scale feature fusion network structure, and the CA attention mechanism.
Specifically, the initial YOLOv5s model is used for describing an initial YOLOv5s model, wherein the initial YOLOv5s model comprises a trunk feature extraction network and an enhanced feature extraction network, and the trunk feature extraction network is the trunk network, and the enhanced feature extraction network is the neck network; the improved YOLOv5s model is used for describing the improved YOLOv5s model, and the improved YOLOv5s model can accurately and efficiently determine printing defects. The terminal equipment can construct an improved YOLOv5s model by combining a bidirectional multiscale feature fusion network structure and a CA attention mechanism on the basis of an initial YOLOv5s model.
In some possible implementations, to implement the building of the improved YOLOv5S model, referring to fig. 2, step S201 includes, but is not limited to, the following steps:
in S202, an initial YOLOv5S model is acquired.
In particular, the terminal device may obtain an initial YOLOv5s model.
In S203, the network is extracted for the reinforcement features in the initial YOLOv5S model: and replacing the feature pyramid structure of the enhanced feature extraction network by using a bidirectional multi-scale feature fusion network structure to generate a first optimized YOLOv5s model.
Specifically, after the terminal device acquires the initial YOLOv5s model, the terminal device may perform the following processing for the enhanced feature extraction network in the initial YOLOv5s model: and replacing the feature pyramid structure in the original enhanced feature extraction network by using a bidirectional multi-scale feature fusion network structure to generate a first optimized YOLOv5s model.
In S204, a CA attention mechanism is embedded in the trunk feature extraction network corresponding to the first optimized YOLOv5S model, and an improved YOLOv5S model is generated.
Specifically, after the terminal device generates the first optimized YOLOv5s model, the terminal device may embed a CA attention mechanism in the trunk feature extraction network corresponding to the first optimized YOLOv5s model to generate the improved YOLOv5s model. Referring to fig. 3, "Backbone" in fig. 3 is the trunk feature extraction network of the improved YOLOv5s model, "BiFusion Neck" in fig. 3 is the enhanced feature extraction network of the improved YOLOv5s model, and "Head" in fig. 3 is the head network of the improved YOLOv5s model; the trunk feature extraction network corresponding to the improved YOLOv5s model is sequentially connected, from the shallow layers to the deep layers, to the CBS module, C3 module, CA module, CBS module, C3 module, SPPF module and CA module.
In some possible implementations, to improve accuracy and stability of the YOLOv5S model, referring to fig. 4, step S204 includes, but is not limited to, the following steps:
in S2041, the input feature x is input to the CBS module in the main path and the CBS module in the branching path, respectively, and after the two CBS modules perform convolution operation, the main path and the branching path generate the output feature y1.
Without loss of generality, the CBS module consists of a convolution layer (Conv), a batch normalization layer (BN) and an activation function layer (SiLU), where the SiLU activation function employed by the activation function layer may be:

SiLU(x) = x · sigmoid(x) = x / (1 + e^(-x))

in the formula, x is the image to be detected, sigmoid() is a preset S-shaped growth curve function, and e is a preset constant.
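As an illustration only, a PyTorch sketch of such a CBS block could look as follows; the kernel size, stride and padding are left to the caller and are not prescribed by the embodiment.

```python
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Conv + BatchNorm + SiLU block, as described above."""
    def __init__(self, c_in: int, c_out: int, k: int = 1, s: int = 1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()  # SiLU(x) = x * sigmoid(x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.conv(x)))
```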
Illustratively, referring to FIG. 5, the C3 module is composed of a main path, which includes the BottleNeck module and the CBS module, and a branching path, which includes the CBS module but not the BottleNeck module. The input feature x may be an image to be detected that has already been processed by a CBS module; for example, relative to the C3 module at the third layer of the trunk feature extraction network, the input feature x is the output of the second CBS module in the trunk feature extraction network. The terminal device may input the input feature x into the CBS module in the main path and the CBS module in the branching path respectively, and after the two CBS modules perform convolution operations, the main path and the branching path each generate an output feature y1.
In S2042, the output feature y1 in the main path is input to a preset BottleNeck module, and an output feature y4 is generated.
Specifically, referring to fig. 5, the terminal device may input the output feature y1 in the main path to a preset BottleNeck module to generate the output feature y4.
In some possible implementations, to improve the efficiency of outputting the result and reduce the situation of gradient extinction or gradient explosion, referring to fig. 6, step S2042 includes, but is not limited to, the following steps:
in S20421, when the logical attribute of the residual bypass is True, the output feature y1 in the main path is input to the CBS module, the output feature y2 is generated, and the output feature y1 is input to the residual bypass.
For example, referring to fig. 5, the BottleNeck module includes a residual bypass and two CBS modules connected in series, where the logic attribute of the residual bypass is True or False. In one possible implementation, referring to fig. 3, the logic attribute of the residual bypass corresponding to the C3 modules in the trunk feature extraction network is True, and the logic attribute of the residual bypass corresponding to the C3 modules in the enhanced feature extraction network is False.
Specifically, referring to fig. 5, when the logical attribute of the residual bypass is True, the terminal device may input the output feature y1 in the main path to the CBS module to generate the output feature y2, and simultaneously input the output feature y1 to the residual bypass.
In S20422, the output feature y2 is input to the CBS module, and the output feature y3 is generated.
Specifically, referring to fig. 5, after the terminal device generates the output feature y2, the terminal device may input the output feature y2 to the CBS module to generate the output feature y3.
In S20423, the output feature y3 is stacked with the output feature y1 in the residual bypass, and the output feature y4 is generated.
Specifically, referring to fig. 5, after the terminal device generates the output feature y3, the terminal device may stack the output feature y3 with the output feature y1 in the residual bypass to generate the output feature y4.
In S20424, when the logical attribute of the residual bypass is False, the output feature y1 in the main path is input to the CBS module, and the output feature y2 is generated.
Specifically, referring to fig. 5, when the logic attribute of the residual bypass is False, the terminal device may input the output feature y1 in the main path to the CBS module to generate the output feature y2.
In S20425, the output feature y2 is input to the CBS module, and the output feature y3 is generated.
Specifically, referring to fig. 5, after the terminal device generates the output feature y2, the terminal device may input the output feature y2 to the CBS module to generate the output feature y3.
In S20426, the output feature y3 is determined as the output feature y4.
Specifically, referring to fig. 5, after the terminal device generates the output feature y3, the terminal device may determine the output feature y3 as the output feature y4, so that when the logic attribute of the residual bypass in the BottleNeck module is False, the BottleNeck module operates only as the two CBS modules connected in series in the main path.
In S2043, the output feature y1 generated by the branching and the output feature y4 are subjected to a splicing process, and an output feature y5 is generated.
Specifically, the terminal device may perform a splicing process on the output feature y1 generated by the branching and the output feature y4 to generate an output feature y5.
In S2044, the output feature y5 is input to the CBS module, and after the CBS module performs a convolution operation, an output feature y6 is generated.
Specifically, after the terminal device generates the output feature y5, the terminal device may input the output feature y5 to the CBS module, and generate the output feature y6 after the CBS module performs the convolution operation.
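A sketch of the BottleNeck and C3 blocks described in S2041 to S2044 is given below, reusing the CBS class from the previous sketch. The split into two equal channel halves follows the common YOLOv5 implementation, and the "stacking" of y3 with the bypassed y1 is realized as an element-wise residual addition; both are assumptions about details the text does not fix.

```python
import torch
import torch.nn as nn  # CBS is taken from the earlier sketch

class BottleNeck(nn.Module):
    def __init__(self, c: int, shortcut: bool = True):
        super().__init__()
        self.cbs1 = CBS(c, c, k=1)
        self.cbs2 = CBS(c, c, k=3)
        self.shortcut = shortcut          # the True/False "logical attribute" of the residual bypass

    def forward(self, y1: torch.Tensor) -> torch.Tensor:
        y3 = self.cbs2(self.cbs1(y1))     # y1 -> y2 -> y3
        return y1 + y3 if self.shortcut else y3   # y4

class C3(nn.Module):
    def __init__(self, c_in: int, c_out: int, shortcut: bool = True):
        super().__init__()
        c_mid = c_out // 2
        self.main_cbs = CBS(c_in, c_mid, k=1)     # main path
        self.branch_cbs = CBS(c_in, c_mid, k=1)   # branching path (no BottleNeck)
        self.bottleneck = BottleNeck(c_mid, shortcut)
        self.out_cbs = CBS(2 * c_mid, c_out, k=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y4 = self.bottleneck(self.main_cbs(x))    # main path: y1 -> y4
        y1_branch = self.branch_cbs(x)            # branching path: y1
        y5 = torch.cat((y1_branch, y4), dim=1)    # splicing -> y5
        return self.out_cbs(y5)                   # y6
```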
In some possible implementations, in order to further enable the improved YOLOv5S model to accurately and efficiently detect printing defects, referring to fig. 7, step S204 further includes, but is not limited to, the following steps:
In S2045, a feature map output by a ninth layer CA module in the trunk feature extraction network is subjected to 1×1 convolution dimension reduction processing, and a first convolution dimension reduction feature map is generated.
Specifically, referring to fig. 3 and fig. 8, the terminal device may perform a 1×1 convolution dimension reduction process on the feature map output by the ninth layer CA module in the trunk feature extraction network to generate a first convolution dimension reduction feature map, where the 1×1 convolution represents a convolution with a size of 1×1.
In S2046, the feature map output by the thirteenth CA module in the trunk feature extraction network is input to the enhanced feature extraction network, and a 2×2 transposed convolution upsampling process is performed to generate a convolution upsampled feature map.
Specifically, referring to fig. 3 and fig. 8, after the terminal device generates the first convolution dimension-reduction feature map, the terminal device may input the feature map output by the thirteenth layer CA module in the trunk feature extraction network into the enhanced feature extraction network to perform 2×2 transposed convolution upsampling processing, so as to generate a convolution upsampled feature map, where the 2×2 transposed convolution represents a transposed convolution with a size of 2×2.
In S2047, the feature map output by the sixth layer CA module in the trunk feature extraction network is input to the enhanced feature extraction network, and a 1×1 convolution dimension reduction process is performed to generate a second convolution dimension reduction feature map.
Specifically, referring to fig. 3 and fig. 8, the terminal device may input the feature map output by the sixth layer CA module in the trunk feature extraction network into the enhanced feature extraction network to perform 1×1 convolution dimension reduction processing, so as to generate a second convolution dimension reduction feature map.
In S2048, in the enhanced feature extraction network, a convolution downsampling process of 3×3 in size and 2 in step size is performed on the second convolution dimensionality reduction feature map, so as to generate a third convolution dimensionality reduction feature map.
Specifically, referring to fig. 3 and 8, in the enhanced feature extraction network, the terminal device may perform convolution downsampling processing on the second convolution dimension reduction feature map with a size of 3×3 and a step size of 2, to generate a third convolution dimension reduction feature map.
In S2049, based on a preset Concat function, the first convolution dimension-reduction feature map, the convolution up-sampling feature map, and the third convolution dimension-reduction feature map are subjected to a stitching fusion process, so as to generate a stitching fusion feature map.
Specifically, referring to fig. 3 and fig. 8, the terminal device may perform a stitching and fusing process on the first convolution dimension reduction feature map, the convolution up-sampling feature map, and the third convolution dimension reduction feature map based on a preset Concat function, so as to generate a stitching and fusing feature map.
In S20491, a 1×1 convolution process is performed on the concatenated fusion feature map to generate a convolution fusion feature map.
Specifically, referring to fig. 3, after the terminal device generates the stitching fusion feature map, the terminal device may perform 1×1 convolution processing on the stitching fusion feature map to generate a convolution fusion feature map, so as to implement bidirectional multi-scale feature fusion with large scale downward and small scale upward, and the features after deep and shallow fusion can have better semantic information and spatial information.
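Purely as a sketch of one such fusion node, and assuming all three inputs are reduced to a common channel width c_out before concatenation (a detail the text does not specify), the operations of S2045 to S20491 could be written as follows.

```python
import torch
import torch.nn as nn

class BiFusionNode(nn.Module):
    """p9, p13 and p6 are the feature maps output by the ninth-, thirteenth- and
    sixth-layer CA modules; p13 is one stride level deeper and p6 one level
    shallower than p9, so that all three reach the same spatial size here."""
    def __init__(self, c9: int, c13: int, c6: int, c_out: int):
        super().__init__()
        self.reduce9 = nn.Conv2d(c9, c_out, kernel_size=1)                       # 1x1 dimension reduction
        self.up13 = nn.ConvTranspose2d(c13, c_out, kernel_size=2, stride=2)      # 2x2 transposed-conv upsampling
        self.reduce6 = nn.Conv2d(c6, c_out, kernel_size=1)                       # 1x1 dimension reduction
        self.down6 = nn.Conv2d(c_out, c_out, kernel_size=3, stride=2, padding=1) # 3x3, stride-2 downsampling
        self.fuse = nn.Conv2d(3 * c_out, c_out, kernel_size=1)                   # 1x1 conv after Concat

    def forward(self, p9, p13, p6):
        f1 = self.reduce9(p9)
        f2 = self.up13(p13)
        f3 = self.down6(self.reduce6(p6))
        return self.fuse(torch.cat((f1, f2, f3), dim=1))   # splicing fusion + 1x1 convolution
```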
In some possible implementations, to further facilitate the improvement of the YOLOv5S model, the printing defect can be accurately and efficiently determined, referring to fig. 9, step S204 further includes, but is not limited to, the following steps:
in S20492, the size of the input feature map is averaged and pooled from the horizontal direction and the vertical direction, respectively, based on the CA attention mechanism, to obtain a horizontal-direction one-dimensional vector and a vertical-direction one-dimensional vector.
Without loss of generality, the size of the input feature map is C×H×W, the horizontal one-dimensional vector is C×H×1, and the vertical one-dimensional vector is C×1×W, where C is the number of channels, H is the height and W is the width.
specifically, referring to fig. 10, the terminal device may firstly average and pool the size of the input feature map from the horizontal direction based on the CA attention mechanism to obtain a horizontal one-dimensional vector, and then average and pool the size of the input feature map from the vertical direction to obtain a vertical one-dimensional vector.
In S20493, in the spatial dimension, the horizontal one-dimensional vector and the vertical one-dimensional vector are spliced to generate a spliced one-dimensional vector, and the number of channels is compressed with a 1x1 convolution to 1/r of the original number, r being a preset reduction ratio.
Specifically, after the terminal device generates the horizontal one-dimensional vector and the vertical one-dimensional vector, the terminal device may perform a splicing operation on the horizontal one-dimensional vector and the vertical one-dimensional vector in the spatial dimension to generate a spliced one-dimensional vector, and at the same time compress the number of channels with a 1x1 convolution to 1/r of the original number, thereby adjusting the dimension.
In S20494, the spatial information in the vertical direction and the spatial information in the horizontal direction in the concatenated one-dimensional vector are encoded based on the batch normalization operation and the nonlinear transformation operation, and the feature vector is generated.
Specifically, the terminal device may encode and splice spatial information in a vertical direction and spatial information in a horizontal direction in one-dimensional vectors through batch normalization (batch norm) operation and nonlinear transformation (Non-linear) operation to generate feature vectors.
In S20495, the feature vectors are subjected to separation processing, and a new horizontal one-dimensional vector and a new vertical one-dimensional vector are generated.
Specifically, the terminal device may perform a separation (Split) process on the feature vector to generate a new horizontal one-dimensional vector and a new vertical one-dimensional vector, so as to implement the re-separation of the feature vector into a horizontal vector and a vertical vector.
In S20496, 1x1 convolution processing is performed on the new horizontal one-dimensional vector and the new vertical one-dimensional vector, respectively, to determine the target channel numbers of the new horizontal one-dimensional vector and the new vertical one-dimensional vector.
Specifically, the number of target channels of the new horizontal one-dimensional vector and the number of target channels of the new vertical one-dimensional vector are the same as the number of channels of the input feature map; the terminal device may first perform 1x1 convolution processing on the new horizontal one-dimensional vector to determine the number of target channels of the new horizontal one-dimensional vector, and then perform 1x1 convolution processing on the new vertical one-dimensional vector to determine the number of target channels of the new vertical one-dimensional vector.
In S20497, a new horizontal one-dimensional vector and a new vertical one-dimensional vector are input to a predetermined activation function, respectively, to obtain a horizontal output vector and a vertical output vector.
Specifically, the new horizontal one-dimensional vector and the new vertical one-dimensional vector are respectively input into a preset activation function (sigmoid) to obtain a horizontal output vector and a vertical output vector, wherein the activation function may be:

S(x) = 1 / (1 + e^(-x))

in the formula, S() is the preset sigmoid activation function, e is a preset constant, and x is the image to be detected.
In S20498, the horizontal output vector and the vertical output vector are subjected to normalization weighting processing to generate a target output vector.
Specifically, the terminal device may perform normalized weighting processing on the horizontal output vector and the vertical output vector, and generate a target output vector, where the target output vector may be output by the CA module.
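A sketch of the CA module following S20492 to S20498 is given below; the reduction ratio r and the choice of ReLU for the "non-linear transformation" are assumptions, since the embodiment only names batch normalization and a non-linear operation.

```python
import torch
import torch.nn as nn

class CAModule(nn.Module):
    """Coordinate-attention block: pool along H and W, encode jointly, re-weight."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)               # compress channels to roughly C/r
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))     # C x H x 1 (horizontal-direction pooling)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))     # C x 1 x W (vertical-direction pooling)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)                  # non-linear transformation
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)  # restore the target channel number
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                              # n x c x h x 1
        x_w = self.pool_w(x).permute(0, 1, 3, 2)          # n x c x w x 1
        y = self.act(self.bn(self.conv1(torch.cat((x_h, x_w), dim=2))))  # splice along spatial dim
        y_h, y_w = torch.split(y, [h, w], dim=2)          # separate back into two vectors
        a_h = torch.sigmoid(self.conv_h(y_h))                             # horizontal output vector
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))         # vertical output vector
        return x * a_h * a_w                              # normalized weighting of the input
```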
In some possible implementations, to facilitate building a better improved YOLOv5S model, referring to fig. 11, before step S201, the method further includes, but is not limited to, the following steps:
in S2011, a print defect image set is obtained based on a preset camera.
Specifically, the set of printing defect images includes a plurality of printing defect images for describing images containing printing defects; the terminal device may obtain a set of print defect images based on the industrial camera and the vision platform.
In S2012, for each printed defect image: and marking the defect type and the defect position of the printing defect image, and generating a marked defect image.
Specifically, the terminal device may perform the following processing for each print defect image: and marking the defect type and the defect position of the printing defect image based on LabelImg software, and generating a marked defect image.
In one possible implementation, after the annotation defect image is generated, the duplicate data, the missing value data and/or the outlier data may be manually deleted, thereby ensuring accuracy, integrity, consistency and availability of the data, and enabling the subsequent data set to have higher quality and reliability.
In S2013, the plurality of marked defect images are divided into a training set and a verification set according to a preset ratio.
Specifically, the terminal device may divide the plurality of labeling defect images into a training set and a verification set according to a preset proportion, where the training set and the verification set are data sets, the number of labeling defect images in the training set is greater than the number of labeling defect images in the verification set, and the proportion between the training set and the verification set may be 7:3, so that training of the model is facilitated.
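As a small illustration, the 7:3 division could be done as below; shuffling before the split is an assumption, since the embodiment only fixes the ratio and that the training set is the larger part.

```python
import random

def split_dataset(labelled_images: list, ratio: float = 0.7):
    """Return (training set, verification set) split at the given ratio."""
    items = labelled_images[:]
    random.shuffle(items)                 # assumed random shuffle before splitting
    cut = int(len(items) * ratio)
    return items[:cut], items[cut:]
```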
In S2014, based on a preset mosaic data enhancement algorithm, data enhancement processing is performed on the training set and the verification set, and an enhanced training set and an enhanced verification set are generated.
Specifically, after the terminal device divides the plurality of labeled defect images into the training set and the verification set, the terminal device may first use the computer vision library OpenCV to convert the labeled defect images in the data sets into images of lower resolution, which helps to further improve the accuracy and stability of the improved YOLOv5s model and to reduce over-fitting; the terminal device may then perform data enhancement processing on the training set and the verification set based on a preset mosaic data enhancement algorithm to generate the enhanced training set and the enhanced verification set.
For example, the terminal device may first randomly select four different images and then randomly select one image among the four images as the center image of the composite image. Next, the algorithm will randomly crop out three adjacent images around this center image and stitch them together according to certain rules to form a new composite image. During stitching, algorithms may use some random transformations, such as rotation, scaling, and horizontal flipping, to increase the diversity of the composite image. Finally, the algorithm takes the label of the central area of the composite image as the label of the whole composite image, thereby generating a new training sample.
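The sketch below gives a simplified mosaic-style composition for illustration only; it places four randomly chosen images around a random centre and omits the label handling and the random rotation, scaling and flipping mentioned above, so it should not be read as the exact algorithm of the embodiment.

```python
import random
import cv2
import numpy as np

def mosaic(images, out_size=640):
    """Stitch four randomly chosen images into one composite training image."""
    chosen = random.sample(images, 4)
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    cx = random.randint(out_size // 4, 3 * out_size // 4)   # random mosaic centre
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    cells = [(0, 0, cx, cy), (cx, 0, out_size, cy),
             (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    for img, (x1, y1, x2, y2) in zip(chosen, cells):
        canvas[y1:y2, x1:x2] = cv2.resize(img, (x2 - x1, y2 - y1))  # scale each patch to its cell
    return canvas
```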
In some possible implementations, to facilitate efficient and accurate detection of printing defects, after step S201, the method further includes, but is not limited to, the steps of:
In S2015, the enhanced training set is input into the improved YOLOv5s model for training, and the hyperparameters of the improved YOLOv5s model are adjusted based on the enhanced verification set during training until the improved YOLOv5s model converges or reaches the maximum training round, so as to generate the trained improved YOLOv5s model.
Specifically, the terminal device may input the enhanced training set into the improved YOLOv5s model for training; in the training process, the hyper-parameters of the improved YOLOv5s model are adjusted based on the enhancement verification set until the improved YOLOv5s model converges or reaches the maximum training round, and the trained improved YOLOv5s model is generated.
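A schematic training-loop sketch is shown below; the loss returned by the model, the SGD settings and the hypothetical evaluate() helper are placeholders standing in for the YOLOv5 training pipeline and are not the configuration used in the embodiment.

```python
import torch

def train(model, train_loader, val_loader, epochs=300, patience=50):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.937)
    best_map, stale = 0.0, 0
    for epoch in range(epochs):                     # maximum number of training rounds
        model.train()
        for images, targets in train_loader:
            loss = model(images, targets)           # assumed: the model returns its training loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        current_map = evaluate(model, val_loader)   # evaluate() is a hypothetical mAP helper on the verification set
        if current_map > best_map:
            best_map, stale = current_map, 0
        else:
            stale += 1                              # no improvement this round
        if stale >= patience:                       # treat a long plateau as convergence
            break
    return model
```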
In one possible implementation, the terminal device may draw a LOSS curve and an mAP curve from the training data of the improved YOLOv5s model, and then evaluate the improved YOLOv5s model based on three indexes: accuracy (Precision), recall rate (Recall) and average accuracy value (mean Average Precision, mAP).
Specifically, the accuracy (Precision) can be calculated by the following formula:

Precision = TP / (TP + FP)

and the recall rate (Recall) can be calculated by the following formula:

Recall = TP / (TP + FN)

in the formulas, Precision is the accuracy; Recall is the recall rate; TP is the number of samples correctly classified as positive, i.e. samples that are actually positive and are classified as positive by the model; FP is the number of samples wrongly classified as positive, i.e. samples that are actually negative but are classified as positive by the model; FN is the number of samples wrongly classified as negative, i.e. samples that are actually positive but are classified as negative by the model.
Without loss of generality, when determining the average accuracy value, the average accuracy, i.e., AP, may be determined according to the area under the Precision-Recall curve, and then the average accuracy value may be determined according to the average value of APs of each class.
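The two formulas above translate directly into code, for example as below; the AP/mAP computation is left to an existing evaluation toolbox, since it additionally requires IoU-based matching of predictions to ground truth.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```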
Accordingly, step S200 includes, but is not limited to, the following steps:
in S210, the image to be detected is input into the trained modified YOLOv5S model, and a feature detection image is determined.
Specifically, the terminal device may input the image to be detected into the trained improved YOLOv5s model, and determine the feature detection image.
In S300, it is determined whether or not the print to be detected has a print defect based on the feature detection image.
Specifically, referring to fig. 12-14, the terminal device may determine whether the print to be detected has a printing defect according to the feature detection image, for example, when the printing defect is detected in the feature detection image, it indicates that the print to be detected has a printing defect; and when the printing defects are not detected in the feature detection image, indicating that the printing product to be detected has no printing defects.
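For illustration, an end-to-end check could look like the sketch below; loading custom weights through torch.hub follows the public YOLOv5 interface, and the weights file name and confidence threshold are assumptions, since the improved model of the embodiment would ship with its own loading code.

```python
import torch

# Hypothetical weights file produced by the training step above.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='improved_yolov5s.pt')

def has_print_defect(image_path: str, conf_threshold: float = 0.25) -> bool:
    """Return True if any defect is marked in the feature detection image."""
    results = model(image_path)
    detections = results.xyxy[0]          # rows: (x1, y1, x2, y2, confidence, class)
    return bool((detections[:, 4] > conf_threshold).any())
```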
It should be noted that Table 1 below is obtained from a plurality of experimental results. As can be seen from Table 1, the trained improved YOLOv5s model provided by the embodiment of the invention is superior to the other models: compared with the original YOLOv5s model, the accuracy is improved by 1.7% and the average accuracy value by 2.9%, achieving the most advanced performance in both processing efficiency and accuracy.
Table 1 shows the changes in various indices for the improved YOLOv5s model
The implementation principle of the variable data printing defect detection method based on the YOLOv5s model in the embodiment of the application is as follows: the terminal equipment can acquire a to-be-detected image of the to-be-detected printed matter according to a preset industrial camera; then inputting the image to be detected into an improved YOLOv5s model obtained by training, and outputting a feature detection image; and determining whether the printed matter to be detected has printing defects or not according to the feature detection image.
It should be noted that, the sequence number of each step in the above embodiment does not mean the sequence of execution sequence, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
Embodiments of the present application also provide a variable data printing defect detection system based on the YOLOv5s model, only a portion relevant to the present application is shown for convenience of illustration, as shown in fig. 15, the system 150 includes:
the image-to-be-detected acquisition module 151: used for acquiring an image to be detected of a printed matter to be detected based on a preset camera;
the feature detection image output module 152: used for inputting the image to be detected into a pre-trained improved YOLOv5s model to generate a feature detection image;
the printing defect determination module 153: used for determining whether the printed matter to be detected has printing defects according to the feature detection image.
Optionally, the system 150 further includes:
improved YOLOv5s model building block: the method comprises the steps of constructing an improved YOLOv5s model based on an initial YOLOv5s model, a bidirectional multi-scale feature fusion network structure and a CA attention mechanism, wherein the initial YOLOv5s model comprises a trunk feature extraction network and an enhanced feature extraction network;
Optionally, the improved YOLOv5s model building module includes:
the initial YOLOv5s model acquisition submodule: for obtaining an initial YOLOv5s model;
the first optimized YOLOv5s model generation sub-module: for extracting the network for the enhanced features in the initial YOLOv5s model: replacing a feature pyramid structure of the enhanced feature extraction network by using a bidirectional multi-scale feature fusion network structure to generate a first optimized YOLOv5s model;
improved YOLOv5s model generation submodule: used for embedding a CA attention mechanism in the trunk feature extraction network corresponding to the first optimized YOLOv5s model to generate the improved YOLOv5s model, wherein the trunk feature extraction network corresponding to the improved YOLOv5s model is sequentially connected with a CBS module, a C3 module, a CA module, a CBS module, a C3 module, an SPPF module and a CA module from a shallow layer to a deep layer.
Optionally, the CBS module consists of a convolution layer, a batch normalization layer and an activation function layer; the C3 module consists of a main path and a shunt, wherein the main path comprises a BottleNeck module, and the shunt does not comprise the BottleNeck module; the improved YOLOv5s model generation submodule comprises:
A first output characteristic generation unit: the system comprises a CBS module and a CBS module, wherein the CBS module is used for inputting an input characteristic x into a main path and the CBS module in a branching path respectively, and after the two CBS modules perform convolution operation, the main path and the branching path generate an output characteristic y1;
a second output characteristic generation unit: the method comprises the steps of inputting an output characteristic y1 in a main path into a preset BottleNeck module to generate an output characteristic y4;
a third output characteristic generation unit: the method comprises the steps of splicing an output characteristic y1 generated by branching with an output characteristic y4 to generate an output characteristic y5;
fourth output characteristic generation unit: the output feature y5 is input to the CBS module, and the output feature y6 is generated after the CBS module performs convolution operation.
The embodiment of the present application further provides a terminal device, as shown in fig. 16, where the terminal device 160 of this embodiment includes: a processor 161, a memory 162, and a computer program 163 stored in the memory 162 and executable on the processor 161. The processor 161, when executing the computer program 163, implements the steps in the above-described flow processing method embodiment, such as steps S100 to S300 shown in fig. 1; alternatively, the processor 161, when executing the computer program 163, implements the functions of the respective modules in the above-described apparatus, such as the functions of the modules 151 to 153 shown in fig. 15.
The terminal device 160 may be a desktop computer, a notebook computer, a palm top computer, a cloud server, etc., and the terminal device 160 includes, but is not limited to, a processor 161, a memory 162. It will be appreciated by those skilled in the art that fig. 16 is merely an example of terminal device 160 and is not limiting of terminal device 160, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., terminal device 160 may also include input-output devices, network access devices, buses, etc.
The processor 161 may be a central processing unit (Central Processing Unit, CPU), other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, etc.; a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 162 may be an internal storage unit of the terminal device 160, such as a hard disk or a memory of the terminal device 160, or the memory 162 may be an external storage device of the terminal device 160, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided on the terminal device 160; further, the memory 162 may also include both an internal storage unit and an external storage device of the terminal device 160, the memory 162 may also store the computer program 163 and other programs and data required by the terminal device 160, and the memory 162 may also be used to temporarily store data that has been output or is to be output.
An embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the various method embodiments described above. Wherein the computer program comprises computer program code, the computer program code can be in the form of source code, object code, executable file or some intermediate form, etc.; the computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
The foregoing are all preferred embodiments of the present application, and are not intended to limit the scope of the present application in any way, therefore: all equivalent changes of the method, principle and structure of the present application should be covered in the protection scope of the present application.

Claims (10)

1. A variable data printing defect detection method based on YOLOv5s model, the method comprising:
acquiring a to-be-detected image of a to-be-detected printed matter based on a preset camera;
Inputting the image to be detected into an improved YOLOv5s model obtained by training in advance to generate a feature detection image;
and determining whether the to-be-detected printed matter has printing defects according to the feature detection image.
2. The method of claim 1, wherein prior to said inputting the image to be detected into the pre-trained improved YOLOv5s model, generating a feature detection image, the method further comprises:
constructing the improved YOLOv5s model based on an initial YOLOv5s model, a bidirectional multi-scale feature fusion network structure and a CA attention mechanism, wherein the initial YOLOv5s model comprises a trunk feature extraction network and an enhanced feature extraction network;
wherein the constructing the improved YOLOv5s model based on the initial YOLOv5s model, the bi-directional multi-scale feature fusion network structure and the CA attention mechanism comprises:
acquiring an initial YOLOv5s model;
extracting a network for the reinforcement features in the initial YOLOv5s model: replacing the feature pyramid structure of the enhanced feature extraction network with a bidirectional multi-scale feature fusion network structure to generate a first optimized YOLOv5s model;
and embedding a CA attention mechanism into a trunk feature extraction network corresponding to the first optimized YOLOv5s model to generate an improved YOLOv5s model, wherein the trunk feature extraction network corresponding to the improved YOLOv5s model is sequentially connected with a CBS module, a C3 module, a CA module, a CBS module, a C3 module, an SPPF module and a CA module from a shallow layer to a deep layer.
3. The method of claim 2, wherein the CBS module consists of a convolution layer, a batch normalization layer, and an activation function layer; the C3 module consists of a main path and a branching path, wherein the main path comprises a BottleNeck module, and the branching path does not comprise the BottleNeck module; embedding a CA attention mechanism in a trunk feature extraction network corresponding to the first optimized YOLOv5s model to generate an improved YOLOv5s model, wherein the method comprises the following steps of:
the input feature x is respectively input into a CBS module in the main path and a CBS module in the branching path, and after the two CBS modules perform convolution operation, the main path and the branching path generate an output feature y1;
inputting the output characteristic y1 in the main path into a preset BottleNeck module to generate an output characteristic y4;
splicing the output characteristic y1 generated by the branching with the output characteristic y4 to generate an output characteristic y5;
inputting the output characteristic y5 to a CBS module, and generating an output characteristic y6 after the CBS module carries out convolution operation;
the BottleNeck module comprises a residual bypass, wherein the logic attribute of the residual bypass is True or False; the inputting the output feature y1 in the main path into the preset BottleNeck module to generate an output feature y4 includes:
When the logic attribute of the residual bypass is True, inputting an output feature y1 in the main path to a CBS module, generating an output feature y2, and inputting the output feature y1 to the residual bypass;
inputting the output characteristic y2 to a CBS module to generate an output characteristic y3;
stacking the output feature y3 with the output feature y1 in the residual bypass to generate an output feature y4;
when the logic attribute of the residual bypass is False, inputting the output characteristic y1 in the main path to a CBS module to generate an output characteristic y2;
inputting the output characteristic y2 to a CBS module to generate an output characteristic y3;
the output feature y3 is determined as output feature y4.
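By way of non-limiting illustration, the CBS, BottleNeck and C3 structures recited in claim 3 could be sketched as follows. The channel widths, kernel sizes and SiLU activation are assumptions borrowed from common YOLOv5s practice; only the y1–y6 data flow follows the claim.

```python
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Convolution + batch normalization + activation layer (claim 3)."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class BottleNeck(nn.Module):
    """Two CBS blocks with an optional residual bypass (True / False)."""
    def __init__(self, c, shortcut=True):
        super().__init__()
        self.cbs1 = CBS(c, c, k=1)   # y1 -> y2
        self.cbs2 = CBS(c, c, k=3)   # y2 -> y3
        self.shortcut = shortcut
    def forward(self, y1):
        y3 = self.cbs2(self.cbs1(y1))
        return y3 + y1 if self.shortcut else y3   # y4

class C3(nn.Module):
    """Main path (CBS + BottleNeck) and branch path (CBS), spliced and fused."""
    def __init__(self, c_in, c_out, shortcut=True):
        super().__init__()
        c_hidden = c_out // 2
        self.cbs_main = CBS(c_in, c_hidden)
        self.cbs_branch = CBS(c_in, c_hidden)
        self.bottleneck = BottleNeck(c_hidden, shortcut)
        self.cbs_out = CBS(2 * c_hidden, c_out)
    def forward(self, x):
        y1_main = self.cbs_main(x)        # main-path output feature y1
        y1_branch = self.cbs_branch(x)    # branch-path output feature y1
        y4 = self.bottleneck(y1_main)
        y5 = torch.cat((y1_branch, y4), dim=1)   # splice y1 (branch) with y4
        return self.cbs_out(y5)                  # y6

# Quick shape check on a random feature map (sizes assumed for illustration).
print(C3(64, 128)(torch.rand(1, 64, 40, 40)).shape)  # torch.Size([1, 128, 40, 40])
```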
4. The method of claim 3, wherein the embedding of a CA attention mechanism in the backbone feature extraction network corresponding to the first optimized YOLOv5s model to generate an improved YOLOv5s model further comprises:
performing 1×1 convolution dimension-reduction processing on the feature map output by the ninth-layer CA module in the backbone feature extraction network to generate a first convolution dimension-reduction feature map;
inputting the feature map output by the thirteenth-layer CA module in the backbone feature extraction network into the enhanced feature extraction network, and performing 2×2 transposed-convolution upsampling processing to generate a convolution upsampling feature map;
inputting the feature map output by the sixth-layer CA module in the backbone feature extraction network into the enhanced feature extraction network, and performing 1×1 convolution dimension-reduction processing to generate a second convolution dimension-reduction feature map;
in the enhanced feature extraction network, performing convolution downsampling with a kernel size of 3×3 and a stride of 2 on the second convolution dimension-reduction feature map to generate a third convolution dimension-reduction feature map;
performing, based on a preset Concat function, splicing and fusion processing on the first convolution dimension-reduction feature map, the convolution upsampling feature map and the third convolution dimension-reduction feature map to generate a spliced fusion feature map;
and performing 1×1 convolution processing on the spliced fusion feature map to generate a convolution fusion feature map.
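By way of non-limiting illustration, the fusion step of claim 4 could be sketched as follows. The channel width of 256 and the example feature-map sizes (80×80, 40×40 and 20×20 for the sixth-, ninth- and thirteenth-layer outputs) are assumptions; only the operator types, kernel sizes and strides follow the claim.

```python
import torch
import torch.nn as nn

class FusionStep(nn.Module):
    def __init__(self, c6, c9, c13, c_mid=256):
        super().__init__()
        self.reduce9 = nn.Conv2d(c9, c_mid, kernel_size=1)                    # 1x1 dim reduction
        self.up13 = nn.ConvTranspose2d(c13, c_mid, kernel_size=2, stride=2)   # 2x2 transposed-conv upsampling
        self.reduce6 = nn.Conv2d(c6, c_mid, kernel_size=1)                    # 1x1 dim reduction
        self.down6 = nn.Conv2d(c_mid, c_mid, kernel_size=3, stride=2, padding=1)  # 3x3, stride-2 downsampling
        self.fuse = nn.Conv2d(3 * c_mid, c_mid, kernel_size=1)                # 1x1 conv after Concat

    def forward(self, f6, f9, f13):
        a = self.reduce9(f9)                 # first convolution dimension-reduction map
        b = self.up13(f13)                   # convolution upsampling map
        c = self.down6(self.reduce6(f6))     # second -> third convolution dimension-reduction map
        fused = torch.cat((a, b, c), dim=1)  # spliced fusion map (Concat)
        return self.fuse(fused)              # convolution fusion map

# Example sizes assumed for illustration: layer-6, -9 and -13 outputs at
# 80x80, 40x40 and 20x20 with 128, 256 and 512 channels respectively.
f6, f9, f13 = torch.rand(1, 128, 80, 80), torch.rand(1, 256, 40, 40), torch.rand(1, 512, 20, 20)
print(FusionStep(128, 256, 512)(f6, f9, f13).shape)  # torch.Size([1, 256, 40, 40])
```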
5. The method of claim 4, wherein the embedding of a CA attention mechanism in the backbone feature extraction network corresponding to the first optimized YOLOv5s model to generate an improved YOLOv5s model further comprises:
based on the CA attention mechanism, performing average pooling on an input feature map along the horizontal direction and the vertical direction respectively to obtain a horizontal-direction one-dimensional vector and a vertical-direction one-dimensional vector, wherein the size of the input feature map is C×H×W, the horizontal-direction one-dimensional vector is C×H×1, the vertical-direction one-dimensional vector is C×1×W, C is the number of channels, H is the height, and W is the width;
performing, in the spatial dimension, a splicing operation on the horizontal-direction one-dimensional vector and the vertical-direction one-dimensional vector to generate a spliced one-dimensional vector, and compressing the number of channels of the spliced one-dimensional vector by a 1×1 convolution to a preset fraction of the original number of channels;
encoding, based on a batch normalization operation and a nonlinear transformation operation, the vertical-direction spatial information and the horizontal-direction spatial information in the spliced one-dimensional vector to generate a feature vector;
separating the feature vector to generate a new horizontal-direction one-dimensional vector and a new vertical-direction one-dimensional vector;
performing 1×1 convolution processing on the new horizontal-direction one-dimensional vector and the new vertical-direction one-dimensional vector respectively to restore each to a target channel number, wherein the target channel number is the same as the number of channels of the input feature map;
respectively inputting the new horizontal-direction one-dimensional vector and the new vertical-direction one-dimensional vector into a preset activation function to obtain a horizontal-direction output vector and a vertical-direction output vector;
and performing normalization weighting processing on the horizontal-direction output vector and the vertical-direction output vector to generate a target output vector.
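By way of non-limiting illustration, the coordinate-attention steps of claim 5 could be sketched as follows. The reduction ratio of 32 and the Hardswish/sigmoid activations are assumptions taken from the original Coordinate Attention design; the claim itself only requires channel compression by a 1×1 convolution, batch normalization, a nonlinear transformation, restoration of the channel number and a preset activation function.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)              # compressed channel number (assumed ratio)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))    # horizontal pooling -> C x H x 1
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))    # vertical pooling   -> C x 1 x W
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()                        # nonlinear transformation
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                             # (n, c, h, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)         # (n, c, w, 1)
        y = torch.cat([x_h, x_w], dim=2)                 # spliced one-dimensional vectors
        y = self.act(self.bn(self.conv1(y)))             # compress channels, BN, nonlinearity
        y_h, y_w = torch.split(y, [h, w], dim=2)         # separate into new h / w vectors
        y_w = y_w.permute(0, 1, 3, 2)
        a_h = torch.sigmoid(self.conv_h(y_h))            # horizontal-direction output vector
        a_w = torch.sigmoid(self.conv_w(y_w))            # vertical-direction output vector
        return x * a_h * a_w                             # weighted target output

# Quick shape check on a random 64 x 32 x 32 feature map (assumed for illustration).
print(CoordAttention(64)(torch.rand(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```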
6. The method of claim 2, wherein, prior to the constructing of the improved YOLOv5s model based on the initial YOLOv5s model, the bidirectional multi-scale feature fusion network structure and the CA attention mechanism, the method further comprises:
obtaining a printing defect image set based on a preset camera, wherein the printing defect image set comprises a plurality of printing defect images;
for each of the printing defect images: marking the defect type and the defect position of the printing defect image to generate a marked defect image;
dividing a plurality of marked defect images into a training set and a verification set according to a preset proportion;
and carrying out data enhancement processing on the training set and the verification set based on a preset mosaic data enhancement algorithm to generate an enhanced training set and an enhanced verification set.
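By way of non-limiting illustration, the dataset splitting of claim 6 could be sketched as follows. The 8:2 ratio, the directory name and the file extension are assumptions; the mosaic data enhancement itself is not re-implemented here, since the YOLOv5 training pipeline typically applies mosaic augmentation when its corresponding hyper-parameter is enabled.

```python
import random
from pathlib import Path

def split_dataset(image_dir: str, train_ratio: float = 0.8, seed: int = 0):
    """Shuffle the marked defect images and split them into train / validation lists."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * train_ratio)
    return images[:n_train], images[n_train:]

# "print_defect_dataset/images" is a hypothetical directory name.
train_imgs, val_imgs = split_dataset("print_defect_dataset/images")
print(f"{len(train_imgs)} training images, {len(val_imgs)} validation images")
```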
7. The method of claim 6, wherein, after the constructing of the improved YOLOv5s model based on the initial YOLOv5s model, the bidirectional multi-scale feature fusion network structure and the CA attention mechanism, the method further comprises:
inputting the enhanced training set into the improved YOLOv5s model for training, and adjusting the hyper-parameters of the improved YOLOv5s model based on the enhanced verification set during training, until the improved YOLOv5s model converges or the maximum number of training rounds is reached, so as to generate the trained improved YOLOv5s model;
correspondingly, the inputting of the image to be detected into a pre-trained improved YOLOv5s model to generate a feature detection image comprises:
and inputting the image to be detected into the trained improved YOLOv5s model to generate a feature detection image.
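By way of non-limiting illustration, the training procedure of claim 7 could be sketched as follows. The SGD optimizer, learning rate, maximum of 300 epochs and the patience-based convergence test are assumptions; validation-driven hyper-parameter adjustment is reduced here to checkpointing the best weights and stopping early, whereas the claim only requires training on the enhanced training set, validating on the enhanced verification set, and stopping at convergence or at the maximum training round.

```python
import torch

def train(model, train_loader, val_loader, compute_loss,
          max_epochs: int = 300, patience: int = 30, lr: float = 0.01):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.937)
    best_val, epochs_without_improvement = float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for images, targets in train_loader:              # enhanced training set
            optimizer.zero_grad()
            loss = compute_loss(model(images), targets)
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():                              # enhanced verification set
            val_loss = sum(compute_loss(model(x), y).item() for x, y in val_loader)
        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
            torch.save(model.state_dict(), "improved_yolov5s_best.pt")  # hypothetical path
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:     # treated as convergence here
                break
    return model
```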
8. A variable data printing defect detection system based on a YOLOv5s model, the system comprising:
an image acquisition module, configured to acquire an image to be detected of a printed matter to be detected based on a preset camera;
a feature detection image output module, configured to input the image to be detected into a pre-trained improved YOLOv5s model to generate a feature detection image;
and a printing defect determination module, configured to determine whether the printed matter to be detected has printing defects according to the feature detection image.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
CN202311161707.6A 2023-09-08 2023-09-08 Variable data printing defect detection method and system based on YOLOv5s model Pending CN117274176A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311161707.6A CN117274176A (en) 2023-09-08 2023-09-08 Variable data printing defect detection method and system based on YOLOv5s model

Publications (1)

Publication Number Publication Date
CN117274176A true CN117274176A (en) 2023-12-22

Family

ID=89216948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311161707.6A Pending CN117274176A (en) 2023-09-08 2023-09-08 Variable data printing defect detection method and system based on YOLOv5s model

Country Status (1)

Country Link
CN (1) CN117274176A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination