CN110930356A - Industrial two-dimensional code reference-free quality evaluation system and method - Google Patents

Industrial two-dimensional code reference-free quality evaluation system and method

Info

Publication number
CN110930356A
CN110930356A
Authority
CN
China
Prior art keywords
layer
dimensional code
industrial
task
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910966425.0A
Other languages
Chinese (zh)
Other versions
CN110930356B
Inventor
翟广涛
杨小康
车朝晖
朱文瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201910966425.0A
Publication of CN110930356A
Application granted
Publication of CN110930356B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 19/00 Record carriers for use with machines and with at least a part designed to carry digital markings
    • G06K 19/06 Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
    • G06K 19/06009 Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code with optically detectable marking
    • G06K 19/06037 Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code with optically detectable marking multi-dimensional coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30168 Image quality inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30204 Marker
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a no-reference quality evaluation system for industrial two-dimensional code images, comprising an industrial two-dimensional code database module and a multi-task learning (MTL) convolutional neural network module. The database module provides the industrial two-dimensional code test samples to be evaluated and the training samples; the MTL convolutional neural network module trains the multi-task network on the training samples and uses the trained network to complete the quality evaluation of the two-dimensional codes under test. After an industrial two-dimensional code is processed by the shallow and deep multi-task convolutional neural networks, the system can both identify the image's distortion types and predict its quality grade. The invention greatly reduces the time and computational cost of preprocessing and decoding industrial two-dimensional codes, improves decoding efficiency, and increases the throughput of industrial production lines.

Description

Industrial two-dimensional code reference-free quality evaluation system and method
Technical Field
The invention relates to the technical field of image quality evaluation of industrial two-dimensional codes, in particular to a system and a method for evaluating the reference-free quality of industrial two-dimensional codes.
Background
Industrial two-dimensional (2D) codes are ubiquitous throughout automated assembly lines. They are widely used in automatic identification tasks such as semiconductor wafer marking, industrial component tracing, and fault diagnosis. Compared with one-dimensional barcodes, two-dimensional codes offer larger data capacity, smaller size, and built-in error detection and correction mechanisms. With the spread of smartphones, the high-quality two-dimensional codes common in daily life can be decoded quickly and accurately. However, decoding low-quality industrial two-dimensional codes quickly and reliably remains a challenging task and has long been an important topic in machine vision. Most industrial two-dimensional codes are corrupted by various unavoidable distortions, and even state-of-the-art decoding algorithms cannot handle the resulting artifacts directly. A distorted two-dimensional code image must first undergo preprocessing appropriate to its specific distortion type, such as morphological filtering, median filtering, guided filtering, mean filtering, or sharpening filtering. Quickly and accurately selecting the proper preprocessing filter according to the distortion type and quality level of a given code is therefore crucial in practical applications. Industrial two-dimensional codes suffer from distortions of varying degree and type, caused by the geometry of the printed surface, defects of the marking process, poor lighting conditions, motion blur, scratches, and smudges.
Whatever the distortion type, current decoding algorithms such as ZXing, ZBar, libdmtx, and 2DTG cannot directly decode distorted two-dimensional codes. In practice these decoders must be supplemented with preprocessing methods that reduce the distortion of the image to be decoded. Because most industrial two-dimensional code images are corrupted by complex distortions that make them hard to decode, this preprocessing step occupies most of the decoding time and greatly reduces the efficiency of the production line.
At present, industrial two-dimensional code decoding is mainly based on full-reference methods. Take the KEYENCE SR-1000, a common industrial decoding device, as an example. For a low-quality two-dimensional code printed on an industrial part, it first captures thousands of sample images of the part by adjusting exposure time, optical gain, and polarization filters, and then passes each captured image through seven common preprocessing filters (smoothing, dilation, erosion, opening, closing, median, and sharpening). All filtered images are then handed to the decoding algorithm to determine whether they can be successfully decoded. Reliable decoding is achieved only through this exhaustive procedure.
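The cost of this exhaustive strategy is easy to see in rough numbers. A minimal sketch, where the capture count is an illustrative lower bound for "thousands of sample images" rather than a figure from the patent:

```python
# The seven preprocessing filters listed above.
FILTERS = ["smoothing", "dilation", "erosion", "opening",
           "closing", "median", "sharpening"]
captures = 1000  # illustrative: "thousands" of captures per part

# Every capture is pushed through every filter before a decode attempt,
# so the decoder runs filters x captures times for a single part.
decode_attempts = len(FILTERS) * captures
```

Even at this conservative estimate, a single part triggers thousands of filter-plus-decode attempts, which is the time cost the proposed no-reference method aims to avoid.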
However, the conventional full-reference industrial two-dimensional code quality evaluation approach has no a priori knowledge of the distortion type of the input image, so it can only exhaustively test a large number of combinations of filters and illumination conditions, which incurs very high time and computational costs. Moreover, the full-reference approach depends heavily on the decoding algorithm. Conventional no-reference quality assessment methods based on hand-crafted features, for their part, only support two single distortion types, defocus blur and illumination artifacts, and provide neither fine-grained distortion type classification nor quality level estimation.
Therefore, the industry urgently needs a no-reference quality evaluation method for industrial two-dimensional codes that reduces preprocessing time, detects more distortion types, and provides finer-grained quality information, so as to meet the requirements of real industrial pipeline applications.
Disclosure of Invention
In view of the above-mentioned shortcomings in the prior art, an object of the present invention is to provide a system and a method for non-reference quality evaluation of an industrial two-dimensional code, which can determine a plurality of distortion types in advance to enable automatic selection of a filter, and can adjust decoding time according to a quality level of an input two-dimensional code to reduce a time cost of preprocessing.
The invention is realized by the following technical scheme.
According to an aspect of the present invention, there is provided an industrial two-dimensional code reference-free quality evaluation system, including:
the industrial two-dimensional code database module is used for providing the industrial two-dimensional code test samples to be evaluated and the industrial two-dimensional code training samples, and for inputting them to the multi-task MTL convolutional neural network module;
the multi-task MTL convolutional neural network module is used for training the multi-task MTL convolutional neural network by utilizing the industrial two-dimensional code for training, and completing a quality evaluation task on the industrial two-dimensional code to be evaluated by adopting the trained multi-task MTL convolutional neural network;
wherein:
the multitask MTL convolutional neural network module comprises: the system comprises a shallow layer multi-task MTL convolutional neural network and a deep layer multi-task MTL convolutional neural network; the shallow layer multi-task MTL convolutional neural network is suitable for an industrial two-dimensional code quality evaluation task in a global distortion application scene; the deep multi-task MTL convolutional neural network is suitable for industrial two-dimensional code quality evaluation tasks under application scenes including global distortion and local distortion;
the quality evaluation task comprises two main tasks, distortion type determination and quality level prediction, and seven auxiliary tasks related to these two main tasks.
Preferably, the global distortion types include: axial inconsistency, grid inconsistency, over-coarse printing, and over-fine printing; the local distortion types include: defective locator, occluded error correction code, and distorted modulation ratio.
Preferably, the seven auxiliary tasks include: locator defects, axial inconsistencies, grid inconsistencies, error correction code occlusions, modulation ratio distortions, over-coarse and over-fine printing.
Preferably, the industrial two-dimensional code database comprises 39×N industrial two-dimensional codes for training, of which 30×N contain a single distortion and 9×N contain combined distortions; wherein:
the 30×N single-distortion codes are obtained by corrupting N original industrial two-dimensional code images with one distortion mode at a time, each original image yielding multiple single-distortion images;
the 9×N combined-distortion codes are obtained by corrupting the N original images with combinations of distortion modes, each original image yielding multiple combined-distortion artifacts, fewer in number than its single-distortion images.
Preferably, the N original industrial two-dimensional code images include any one or a combination of any more of the following four types:
-an image printed on the surface of a smooth metal workpiece;
-an image printed on the surface of a frosted metal workpiece;
-an image printed on the surface of the resinous workpiece;
- simulated images generated by the encoding software libdmtx on a computer.
Preferably, five methods are used to generate the distortion types of the single-distortion industrial two-dimensional codes:
- simulating artifacts caused by rough printing surfaces with 5-level speckle noise (SN);
- simulating artifacts caused by the conveyor belt with 5-level motion blur (MB);
- simulating artifacts caused by limited marking techniques with 5-level over-printing (OP) and 5-level under-printing (UP);
- degrading the image with a 5-level perspective transformation to simulate the grid non-uniformity (GN) caused by poor camera angles;
- simulating axial non-uniformity (AN) with 5 levels of axial stretching;
the distortion types of the combined-distortion industrial two-dimensional codes comprise: SN+AN, SN+GN, MB+AN, MB+GN, OP+SN, OP+MB, SN+GN+AN, MB+AN+UP, and AN+GN.
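As an illustration of how such single-distortion samples might be synthesized, here is a minimal NumPy sketch of two of the five methods, speckle noise and horizontal motion blur. The per-level noise scale and the blur kernel length are illustrative choices, not values taken from the patent:

```python
import numpy as np

def add_speckle_noise(img, level, rng=None):
    """Multiplicative speckle noise; `level` in 1..5 scales the variance.
    The 0.04 step per level is an illustrative assumption."""
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.standard_normal(img.shape) * 0.04 * level
    return np.clip(img * (1.0 + noise), 0.0, 1.0)

def apply_motion_blur(img, level):
    """Horizontal motion blur with a length-(2*level+1) averaging kernel,
    a simple stand-in for conveyor-belt blur."""
    k = 2 * level + 1
    kernel = np.ones(k) / k
    # Convolve each row; mode="same" keeps the image size unchanged.
    return np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, img)

code = np.random.default_rng(1).random((64, 64))  # stand-in for a code image
noisy = add_speckle_noise(code, level=3)
blurred = apply_motion_blur(code, level=2)
```

The combined-distortion samples (e.g. SN+AN) would simply chain such operators on the same image.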
Preferably, the shallow multi-task MTL convolutional neural network module adopts an MTL convolutional neural network model with 4 convolutional layers, constructed following the hard-parameter-sharing paradigm: the shallow convolutional layers of the model are set as feature layers shared by all tasks, while the deeper convolutional layers are set as task-specific feature layers, forming a shared shallow convolutional layer sub-module, a task-specific deep convolutional layer sub-module, a hidden layer sub-module, and an output layer sub-module; wherein:
the shared shallow convolutional layer sub-module comprises two shared shallow convolutional layers that extract image features relevant to both distortion type and quality level from the industrial two-dimensional code image, including texture, contrast, geometric structure, noise intensity, and edge information;
the task-specific deep convolutional layer sub-module comprises one task-specific deep convolutional layer unit for distortion type determination and two for quality level prediction; these units separately extract high-level image features, realizing the two main tasks of distortion type determination and quality level prediction as well as the seven closely related auxiliary tasks;
the hidden layer sub-module is a fully-connected layer that integrates and fuses the image features extracted by the shared shallow and task-specific deep convolutional layer sub-modules, so as to extract feature information highly relevant to the main tasks;
the output layer sub-module comprises a sigmoid layer, which outputs the distortion type determination main task, and a softmax layer, which outputs the quality level prediction main task and all auxiliary tasks.
Preferably, the shared shallow convolutional layers comprise a shallow convolutional layer with 8 kernels and a shallow convolutional layer with 16 kernels, each followed in turn by a rectified linear unit (ReLU) activation layer and a max-pooling layer;
the task-specific deep convolutional layer units for distortion type determination and for quality level prediction each comprise two deep convolutional layer groups, where each group consists of a deep convolutional layer with 32 kernels and a deep convolutional layer with 64 kernels, each followed by a ReLU activation layer;
the fully-connected layer comprises 1024 neurons.
Preferably, the max-pooling layers use a 2×2 kernel with stride 2; the shallow and deep convolutional layers use a 5×5 kernel with stride 1 and edge padding mode padding='same'.
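To get a feel for the size of such a network, the convolutional parameter counts implied by these hyperparameters can be tallied. A sketch assuming a single-channel grayscale input, which the patent does not state explicitly:

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of one kernel x kernel convolutional layer."""
    return kernel * kernel * c_in * c_out + c_out

# Shared shallow layers: 8 kernels then 16 kernels, 5x5 each.
shared = conv_params(5, 1, 8) + conv_params(5, 8, 16)
# One deep convolutional layer group of a task-specific branch: 32 then 64 kernels.
group = conv_params(5, 16, 32) + conv_params(5, 32, 64)
```

The shared trunk is tiny next to a task-specific group, which is what makes hard parameter sharing cheap to duplicate across tasks.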
Preferably, "padding" denotes how pixels are filled at the edges of the feature map when the convolutional neural network extracts features from an image, and generally has three modes: "full", "same", and "valid". "full" inserts zero-valued pixels at the edges so that a W×H feature map becomes (W+2k-2)×(H+2k-2) before convolution (where k is the kernel size, set to 5 in the invention); "same" inserts zero-valued pixels so that the feature map becomes (W+4)×(H+4) for k=5, ensuring that the convolution output keeps the same resolution as the input; "valid" performs no edge padding and convolves directly. The invention uses "same" so that the input and output of each convolutional layer have the same resolution.
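The three padding modes translate into simple output-size formulas for a stride-1 convolution. A sketch matching the W+4 example given for k=5:

```python
def padded_size(w, k, mode):
    """Feature-map width after edge zero-padding, before the convolution."""
    if mode == "full":
        return w + 2 * k - 2
    if mode == "same":
        return w + k - 1  # w + 4 when k = 5, as in the text
    if mode == "valid":
        return w          # no padding
    raise ValueError(mode)

def output_size(w, k, mode):
    """Width after a stride-1 convolution with a k x k kernel."""
    return padded_size(w, k, mode) - k + 1
```

So for a 28-pixel-wide map and k=5: "same" preserves 28, "valid" shrinks to 24, and "full" grows to 32.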
Preferably, the deep multi-task MTL convolutional neural network module comprises one standard convolutional layer and 13 depthwise separable convolution sub-modules, followed by an average pooling layer, a hidden layer, and an output layer; each separable convolution sub-module consists of a depthwise convolutional layer and a pointwise convolutional layer; the two main tasks and seven auxiliary tasks share the feature-extraction convolutional layers at the front end;
the standard convolutional layer contains 32 kernels;
the 13 separable convolution sub-modules comprise: sub-modules whose depthwise and pointwise layers contain 32 and 128 kernels respectively, sub-modules whose depthwise and pointwise layers each contain 256 kernels, sub-modules whose layers each contain 512 kernels, and sub-modules whose layers each contain 1024 kernels.
Preferably, the average pooling layer contains 1024 channels;
the hidden layer is a fully-connected layer with 1024 neurons;
the output layer sub-module comprises a sigmoid layer, which outputs the distortion type determination main task, and a softmax layer, which outputs the quality level prediction main task and all auxiliary tasks;
the standard convolution uses a 3×3 kernel with stride 2 and padding='same';
the depthwise convolution uses a 3×3 kernel with stride 1 or 2 and padding='same';
the pointwise convolution uses a 1×1 kernel with stride 1 and padding='same';
the average pooling layer uses a 7×7 window with stride 1 and padding='same'.
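The appeal of depthwise separable convolutions is easy to quantify: splitting a standard convolution into a depthwise step and a pointwise step sharply cuts the weight count. A sketch for the first sub-module's dimensions (32 input channels, 128 output channels, 3×3 kernels, biases ignored):

```python
def standard_conv_weights(k, c_in, c_out):
    """Weight count of an ordinary k x k convolution."""
    return k * k * c_in * c_out

def separable_conv_weights(k, c_in, c_out):
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution mixing channels
    return depthwise + pointwise

dense = standard_conv_weights(3, 32, 128)
separable = separable_conv_weights(3, 32, 128)
```

For these dimensions the separable form uses roughly 8x fewer weights, which is why the deep network can reach 15 convolutional layers while staying practical on a production line.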
According to another aspect of the present invention, there is provided an industrial two-dimensional code non-reference quality evaluation method implemented by using the above industrial two-dimensional code non-reference quality evaluation system, including the following steps:
s1, selecting an industrial two-dimensional code image for training from an industrial two-dimensional code database;
s2, performing sample augmentation on the selected original industrial two-dimensional code images: rotating each original image by 11 different angles to obtain, per original, 11 new rotated industrial two-dimensional code training images;
s3, inputting the augmented training images into the multi-task MTL convolutional neural network module and training its two convolutional neural networks: in each batch, M industrial two-dimensional code images are input (M being the batch size), the loss function is computed, its gradients are back-propagated, and the network parameters are updated, until the model converges on the training samples, i.e. until the model's predictions agree with the ground-truth training labels on each training sample;
s4, selecting an industrial two-dimensional code image to be evaluated from the database and feeding it to the trained multi-task MTL convolutional neural network module to obtain the predictions of the two main tasks: the distortion type and the quality score of the image. According to the predicted distortion type, the corresponding image preprocessing step is selected before decoding, improving decoding efficiency; according to the predicted quality score, images scoring below a set threshold are excluded from the decoding queue, preventing erroneous decoding from seriously affecting the whole production line.
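Step s4 amounts to a small dispatch rule: pick a preprocessing filter from the predicted distortion type, and drop codes whose predicted quality falls below a threshold. A hypothetical sketch, where the type-to-filter mapping and the grade threshold are illustrative assumptions, not values specified by the patent:

```python
# Illustrative mapping from predicted distortion type to a preprocessing filter.
FILTER_FOR_TYPE = {
    "speckle_noise": "median",
    "motion_blur": "sharpening",
    "over_printing": "erosion",
    "under_printing": "dilation",
}

def route(distortion_types, quality_grade, min_grade=1):
    """Return the filters to apply, or None to exclude the code from decoding."""
    if quality_grade < min_grade:
        return None  # below threshold: skip decoding entirely
    return [FILTER_FOR_TYPE.get(t, "none") for t in distortion_types]

plan = route(["motion_blur", "speckle_noise"], quality_grade=3)
skipped = route(["motion_blur"], quality_grade=0)
```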
Compared with the prior art, the invention has the following beneficial effects:
the system and the method for evaluating the reference-free quality of the industrial two-dimensional code solve the problem of high dependency of the traditional full-reference method on a decoding algorithm, and greatly reduce time cost and operation cost; in addition, the invention also solves the problems that the current full reference method only supports two single distortion types and can not carry out fine-grained prediction on the distortion types, and the like; the method for judging the distortion type of the industrial two-dimensional code and the quality grade prediction method can effectively reduce the preprocessing time and the calculation cost.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
fig. 1 is a schematic flow chart of a method for evaluating quality of an industrial two-dimensional code without reference according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a shallow MTL convolutional neural network module in an industrial two-dimensional code non-reference quality evaluation system according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a deep MTL convolutional neural network module in an industrial two-dimensional code non-reference quality evaluation system according to an embodiment of the present invention.
In the figures: 1 is the input industrial two-dimensional code; 2 is the proposed shallow and deep MTL convolutional neural network models; 3 is distortion type determination; 4 is quality level prediction; 5 is automatic filter selection; 6 is quality level classification of the industrial two-dimensional code; 7 is a two-dimensional code whose predicted quality level is 0; 8 is the two-dimensional code decoding algorithm; 9 is decoding time limit adjustment; 10 is the input industrial two-dimensional code database; 11 and 12 are shared convolutional layers; 13 and 14 are task-specific convolutional layers for the distortion type determination main task; 15 and 16 are task-specific convolutional layers for the quality level prediction task; 17 is a fully-connected layer; 18 is a sigmoid layer; 19 is a softmax layer; 20 is the input industrial two-dimensional code database; 21 is a standard convolutional layer; 22 is the 13 depthwise separable convolution modules; 23 is an average pooling layer; 24 is a fully-connected layer; 25 is a sigmoid layer; 26 is a softmax layer.
Detailed Description
The following embodiments illustrate the invention in detail. They are implemented on the premise of the technical scheme of the invention and give detailed implementation modes and specific operation processes. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention.
The embodiment of the invention provides an industrial two-dimensional code reference-free quality evaluation system.
First, the industrial two-dimensional code no-reference quality evaluation system provided by the embodiment includes an industrial two-dimensional code database module, which is a brand-new industrial two-dimensional code dataset. The dataset contains 39×N two-dimensional code images in Data Matrix format printed on industrial parts. These images contain six distortion types common in practical applications: grid inconsistency caused by non-ideal imaging angles, noise generated by the lens during imaging, motion blur caused by the conveyor belt, axial inconsistency caused by the surface of the industrial part, and morphological distortion introduced during printing (over-coarse and over-fine, counted as two types).
Second, the system provides a multi-task MTL convolutional neural network module containing two new convolutional neural network models: a shallow convolutional neural network (4 convolutional layers) and a deep convolutional neural network (15 convolutional layers). The shallow network has low latency and few parameters and suits quality evaluation of lightly distorted industrial two-dimensional codes; the deep network has higher accuracy and suits quality evaluation of severely distorted codes.
The proposed network models serve two main tasks and seven auxiliary tasks. The two main tasks are: (1) distortion type classification of the industrial two-dimensional code image, and (2) quality evaluation of the distorted image. The seven auxiliary tasks each compute a different fine-grained distortion score closely related to the main tasks: (1) locator defect, (2) axial inconsistency, (3) grid inconsistency, (4) error correction code occlusion, (5) modulation ratio distortion, (6) over-coarse printing, and (7) over-fine printing.
The modules cooperate as follows:
the dataset contains (39 × N) industrial-scale two-dimensional code images. In order to obtain the training label of each image, the embodiment labels each image in detail by using the evaluation standard specified by the international organization for standardization. Specifically, the two-dimension code image is labeled by adopting two ISO international standards, namely ISO-15415 and ISO-16022, and the distortion type and the quality score of each two-dimension code image and the quality score of each specific distortion type (namely, the distortion degree corresponding to seven auxiliary tasks) are obtained through labeling. Finally, the labeled data set can be used to train a convolutional neural network.
Trained on the labeled dataset, the convolutional neural network can predict the distortion class and quality of new industrial two-dimensional code images. Specifically, the proposed model first classifies the distortion type of the input image; given the distortion class, the corresponding image preprocessing step can be selected to process the image and improve subsequent decoding efficiency. In addition, once the model predicts a quality score, images scoring below a certain threshold can be excluded from the decoding queue, preventing erroneous decoding from seriously affecting the whole production line.
The seven auxiliary tasks are highly correlated with the two main tasks (i.e., distortion classification and quality assessment) and therefore share the same image features as the main task. The method utilizes the auxiliary tasks as regularization conditions, and simultaneously performs combined optimization on the main task and the auxiliary tasks in the model training process, so that a network model is prompted to pay attention to finer-grained distortion details, overfitting of the model is avoided, and the generalization performance of the model is improved.
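The joint optimization described above can be sketched as a weighted sum of the main-task and auxiliary-task losses, with the down-weighted auxiliary terms acting as regularizers. This is a minimal illustration, not the patent's disclosed code; the function name and weight values are assumptions:

```python
# Hypothetical sketch of the joint main/auxiliary loss described in the text.
# The weights are illustrative; the patent does not disclose exact values.

def joint_loss(loss_type_cls, loss_quality, aux_losses,
               w_main=(1.0, 1.0), w_aux=0.1):
    """Weighted sum of two main-task losses and seven auxiliary losses."""
    total = w_main[0] * loss_type_cls + w_main[1] * loss_quality
    # Down-weighted auxiliary losses steer the shared layers toward
    # fine-grained distortion features without dominating training.
    total += w_aux * sum(aux_losses)
    return total
```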
Specifically,
the embodiment of the invention provides an industrial two-dimensional code reference-free quality evaluation system, which comprises:
the industrial two-dimensional code database module is used for providing an industrial two-dimensional code to be evaluated and an industrial two-dimensional code for training and outputting the industrial two-dimensional code to the multitask MTL convolutional neural network module;
the multi-task MTL convolutional neural network module is used for training the multi-task MTL convolutional neural network by utilizing the industrial two-dimensional code for training, and completing a quality evaluation task on the industrial two-dimensional code to be evaluated by adopting the trained multi-task MTL convolutional neural network;
wherein:
the multitask MTL convolutional neural network module comprises: the system comprises a shallow layer multi-task MTL convolutional neural network and a deep layer multi-task MTL convolutional neural network; the shallow layer multi-task MTL convolutional neural network is suitable for an industrial-level two-dimensional code quality evaluation task in a global distortion application scene, and specifically, the global distortion type comprises the following steps: axial inconsistency, grid inconsistency, over-coarse printing, over-fine printing. The deep multi-task MTL convolutional neural network is suitable for industrial two-dimensional code quality evaluation tasks under application scenes including global distortion and local distortion, and specifically, the global distortion type comprises the following steps: axial inconsistency, grid inconsistency, over-coarse printing and over-fine printing; the local distortion types include: defective locator, blocked error correcting code, distorted modulation ratio.
The quality evaluation task comprises the following steps: two main tasks, namely distortion type classification and quality level prediction; and seven auxiliary tasks that are closely related to the two main tasks, the seven auxiliary tasks including: locator defects, axial inconsistencies, grid inconsistencies, error correction code occlusions, modulation ratio distortions, over-coarse and over-fine printing.
The industrial two-dimensional code database comprises (39 × N) industrial two-dimensional codes. Input images are fed to the proposed shallow multi-task MTL convolutional neural network module and deep multi-task MTL convolutional neural network architecture module. Through the processing of these two modules, on the one hand the distortion type of the image is determined, enabling automatic filter selection; on the other hand the quality level of the image is predicted, and the decoding time limit is adjusted for images whose quality level both models predict to be zero. Finally, decoding is performed with a two-dimensional code decoding algorithm.
The established industrial two-dimensional code database containing (39 × N) industrial two-dimensional codes comprises (30 × N) codes corrupted by a single distortion type and (9 × N) codes corrupted by combinations of distortion types. The two-dimensional code images in the database cover six individual distortion types, each with five different distortion degrees, and nine composite distortion types, each composite distortion combining several individual distortion types. Wherein:
the industrial two-dimensional code database is provided to the shallow multi-task MTL convolutional neural network model and the deep multi-task MTL convolutional neural network architecture for learning.
The shallow multi-task MTL convolutional neural network model is constructed following the hard-parameter-sharing paradigm and comprises shared shallow convolutional layers, additional task-specific deep convolutional layers, subtasks, and an output layer. The additional task-specific deep convolutional layers sit on top of the shared shallow convolutional layers, and the final output layer follows all convolutional layers.
The deep multi-task MTL convolutional neural network architecture takes MobileNet-V1 as a basic network and comprises a shared convolutional layer, an average pooling layer, subtasks and an added output layer. The average pooling layer is above the shared convolutional layer, and the added output layer is above the average pooling layer.
The distortion type determination is realized by the two multi-task MTL convolutional neural network models.
Automatic filter selection is realized through the distortion type determination.
The quality level is determined by the two multi-task MTL convolutional neural network models.
The decoding time limit is adjusted according to the determined quality level: a shorter decoding time limit is set for two-dimensional codes whose quality level both models predict to be zero.
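As a minimal sketch of the quality-gated decoding policy above: codes predicted at quality level zero get a short decoding budget so a likely-undecodable symbol cannot stall the production line. The timeout values and the function interface are hypothetical, since the patent does not specify them:

```python
# Hypothetical decoding time-limit policy; the millisecond budgets are
# illustrative assumptions, not values disclosed in the patent.

def decode_time_limit(predicted_level, normal_ms=500, short_ms=50):
    """Return the decoding time budget (ms) for a predicted quality level."""
    return short_ms if predicted_level == 0 else normal_ms
```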
According to the first aspect of the present invention, the (30 × N) singly-distorted images are obtained by applying single distortion types to the N original industrial two-dimensional code images, covering six distortion types with five different distortion degrees each. Therefore, 30 images of a single distortion type are generated for each original two-dimensional code.
The (9 × N) images subjected to the composite distortion type are destroyed by combining a plurality of single distortions on N original industrial two-dimensional code images, and for each original two-dimensional code, 9 images containing composite distortion are generated.
The N original industrial two-dimensional code images comprise N/4 images printed on a smooth metal surface, N/4 images printed on a frosted metal surface, N/4 images printed on a resin surface and N/4 images printed on a computer by encoding software libdmtx.
The simulation of six distortion types by five techniques includes 1) simulating artifacts caused by a rough printed surface with 5-level Speckle Noise (SN); 2) simulating artifacts caused by conveyor belts using 5-level Motion Blur (MB); 3) simulating artifacts caused by limited marking techniques using level 5 Overprinting (OP) and level 5 Underprinting (UP); 4) in order to simulate grid unevenness (GN) caused by a poor photographing angle, the present embodiment degrades and distorts a two-dimensional code image by 5-level perspective transformation; 5) axial non-uniformity (AN) was simulated with 5 levels of axial stretching.
The nine distortion combinations are: 1) SN + AN; 2) SN + GN; 3) MB + AN; 4) MB + GN; 5) OP + SN; 6) OP + MB; 7) SN + GN + AN; 8) MB + AN + UP; 9) AN + GN.
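A few of the single distortions listed above can be approximated in plain NumPy. This is a hedged sketch: the patent does not disclose its simulation parameters, so the noise variance, blur length, and stretch factor below are illustrative choices:

```python
import numpy as np

# Illustrative approximations of three single distortions (values in [0, 1]).

def speckle_noise(img, var=0.05, seed=0):
    """SN: multiplicative noise imitating a rough printing surface."""
    rng = np.random.default_rng(seed)
    noisy = img * (1.0 + rng.normal(0.0, np.sqrt(var), img.shape))
    return np.clip(noisy, 0.0, 1.0)

def motion_blur(img, length=5):
    """MB: horizontal box blur imitating conveyor-belt motion."""
    kernel = np.ones(length) / length
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, img)

def axial_stretch(img, factor=1.2):
    """AN: stretch one axis by nearest-neighbour resampling."""
    h, w = img.shape
    cols = np.clip((np.arange(int(w * factor)) / factor).astype(int), 0, w - 1)
    return img[:, cols]
```

Perspective distortion (GN) would additionally require a homography warp, which is omitted here for brevity.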
The shallow MTL convolutional neural network module adopts an MTL convolutional neural network architecture built on the hard-parameter-sharing paradigm, dividing the quality evaluation problem into nine subtasks: distortion type determination and quality level prediction are the two main tasks, and the remaining seven are auxiliary tasks closely related to the main tasks. In particular, a dedicated convolution module is added for the distortion type determination main task.
The shallow MTL convolutional neural network model comprises a shallow shared convolutional layer submodule (comprising two shared shallow convolutional layer units), a specific deep convolutional submodule for judging a main task according to a distortion type, a specific deep convolutional submodule for predicting the main task according to a quality grade, a hidden layer submodule and an output layer submodule. Wherein:
the specific deep convolution submodule for judging the main task of the distortion type consists of two convolution layer groups.
The specific deep layer convolution submodule of the quality level prediction main task is composed of two convolution layer groups.
The hidden layer sub-module is a full connection layer.
The output layer sub-module is a sigmoid layer for the distortion type determination main task, and a softmax layer for the quality level prediction main task and all auxiliary tasks.
The shallow MTL convolutional neural network module is used to extract image features. All subtasks extract low-dimensional basic features of the image through the two shared shallow convolutional layer units. The specific deep convolution sub-module of the distortion type determination main task extracts high-dimensional semantic features of the image to determine the distortion type. The specific deep convolution sub-module of the quality level prediction main task likewise extracts high-dimensional semantic features to predict the quality level, and the seven auxiliary tasks share this specific convolutional layer with the quality level prediction main task.
The two shared shallow convolutional layer units each comprise: two shallow convolutional layers containing 8 and 16 kernels respectively, each followed by a rectified linear unit (ReLU) activation function and a 2 × 2 max pooling layer.
The two additional specific deep convolutional layer units of the distortion type determination main task comprise two deep convolutional layers containing 32 and 64 kernels respectively, each followed by a rectified linear unit (ReLU) activation function.
The two specific convolutional layer units of the quality level prediction main task comprise two convolutional layers containing 32 and 64 kernels respectively, each followed by a rectified linear unit (ReLU) activation function.
The fully-connected layer includes 1024 neurons.
The max pooling layer uses a filter with a kernel size of 2 x 2 with a step size set to 2.
The kernel size of the shallow and deep convolutional layers is 5 × 5, the stride is 1, and the edge padding mode is "same". In a convolutional neural network, "padding" refers to how pixels are filled at the edges of a feature map before convolution; three modes are common: "full", "same", and "valid". "Full" pads the edges with zero-valued pixels so that a W × H feature map becomes (W + 2k − 2) × (H + 2k − 2) before convolution (where k is the kernel size, set to 5 in the invention). "Same" pads with zeros so that the feature map becomes (W + 4) × (H + 4) for k = 5, ensuring the convolution output keeps the same resolution as the input. "Valid" performs no padding and convolves directly. The invention uses "same" padding so that the input and output of each convolutional layer have the same resolution.
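The output sizes implied by the three padding modes can be checked with a few lines of arithmetic (stride 1, square kernel of size k, as in the text):

```python
# Output width of a stride-1 convolution under the three padding modes
# described above (pure arithmetic, no framework required).

def conv_output_size(w, k, mode):
    if mode == "full":    # pad k-1 per side: (w + 2k - 2) - k + 1 = w + k - 1
        return w + k - 1
    if mode == "same":    # pad (k-1)/2 per side: output equals input
        return w
    if mode == "valid":   # no padding: w - k + 1
        return w - k + 1
    raise ValueError(mode)
```

For example, a 28-pixel-wide input and a 5 × 5 kernel yield widths of 32, 28, and 24 under "full", "same", and "valid" respectively.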
The deep multi-task MTL convolutional neural network module adopts MobileNet-V1 as its base network. MobileNet-V1 mainly comprises a common convolutional layer and 13 depthwise separable convolution sub-modules, followed by an average pooling layer, a hidden layer, and an output layer. Each depthwise separable convolution sub-module can be divided into two parts: a depth-wise convolutional layer and a point-wise convolutional layer.
The MobileNet-V1 is used as a basic network and a powerful feature extractor to extract image features. All subtasks share all convolutional layers. Wherein:
the common convolutional layer has 32 cores.
The 13 separable convolution sub-modules comprise: sub-modules whose depth-wise and point-wise convolutional layers contain 32 and 128 kernels respectively, sub-modules with 128 and 256 kernels respectively, sub-modules with 256 and 512 kernels respectively, sub-modules with 512 and 1024 kernels respectively, and a sub-module whose depth-wise and point-wise convolutional layers each contain 1024 kernels.
The average pooling layer contains 1024 cores.
The hidden layer is a fully connected layer with 1024 neurons.
The output layer is used for judging a main task aiming at the distortion type and is a sigmoid layer; and predicting the main task and all auxiliary tasks aiming at the quality level, wherein the main task and all auxiliary tasks are softmax layers.
The common convolutional layer has a 3 × 3 kernel, a stride of 2, and "same" padding.
The depth-wise convolutional layers have 3 × 3 kernels, a stride of 1 or 2, and "same" padding.
The point-wise convolutional layers have 1 × 1 kernels, a stride of 1, and "same" padding.
The average pooling layer has a 7 × 7 kernel, a stride of 1, and "valid" padding.
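The motivation for MobileNet-V1's depthwise separable convolutions can be seen from a parameter-count comparison. This is illustrative arithmetic consistent with the design described above, not code from the patent:

```python
# Parameter counts (ignoring biases) for a k x k convolution mapping
# c_in channels to c_out channels.

def standard_conv_params(c_in, c_out, k=3):
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k=3):
    # depthwise k x k filter per input channel, then 1 x 1 pointwise mixing
    return k * k * c_in + c_in * c_out
```

For a 512-to-1024-channel layer with 3 × 3 kernels, the standard convolution needs 4,718,592 parameters while the separable version needs 528,896, roughly a 9× reduction.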
The technical solutions provided by the embodiments of the present invention are further described in detail below.
1) Evaluating the proposed industrial two-dimensional code database. For fair comparison, this embodiment performs 5-fold cross-validation. In each validation round, the database is randomly divided into 5 folds, 4 of which are used as the training set and the remaining 1 as the test set. All deep-learning-based models are trained and tested on the same training and test sets, while all classical methods based on hand-crafted features are evaluated on the same test set.
The same number of training samples is selected from each of the 5 quality classes to form a class-balanced training set for MTL-s and MTL-d to learn from.
For fair comparison, this embodiment uses the same optimization method and parameter initialization strategy (pre-training on ImageNet) to fine-tune the heavyweight networks, including AlexNet, Inception-V3, Inception-ResNet-V2, and MTL-d. Likewise, this embodiment trains the lightweight networks, including IQA-CNN++, MEON, and MTL-s, with a unified training setup.
The International Organization for Standardization (ISO) promulgates two standards, ISO-15415 and ISO-16022, which divide the quality of a two-dimensional Data Matrix symbol into 5 discrete levels according to several hand-crafted features. The test set of this embodiment is labeled with these full-reference ISO standards, providing supervision for reference-free model learning. In the experiments, the ground-truth quality levels are compared with the predicted quality levels to check the prediction accuracy of the models.
2) MTL-s is trained from scratch in two stages. In the first stage, all parameters of MTL-s are randomly initialized. Because the distortion type determination main task (hereinafter task A) and the quality level prediction main task (hereinafter task B) are the main tasks, the loss weights are set so that they receive more attention than the auxiliary tasks, and MTL-s is trained jointly by minimizing a weighted sum of the loss functions of the subtasks. However, practice shows that the quality evaluation task converges much faster than the distortion type classification task; keeping the fixed loss weights of the first stage for too long would overfit the quality evaluation task while underfitting the distortion type classification task. The second stage therefore changes the training weights to accelerate convergence of the distortion type classification task while avoiding overfitting of the quality evaluation task. The invention optimizes the model parameters with the momentum-based adaptive gradient descent method Adam and a fixed learning rate of 1 × 10⁻³.
For MTL-d, this embodiment first initializes MobileNet-V1 with weights and biases pre-trained on ImageNet, and then randomly initializes the task-specific fully-connected layers and the sigmoid/softmax layers. A two-stage training strategy similar to that of MTL-s is adopted to fine-tune MTL-d. This embodiment trains MTL-d with the RMSprop optimization method, a batch size of 32 images, and 120 epochs, setting the initial learning rate to 1 × 10⁻² and the momentum to 0.9.
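The two-stage weighting strategy described above might be expressed as a simple schedule that shifts emphasis toward the slower-converging distortion classification task in stage two. The specific weight values here are assumptions; the patent states only that the weighting changes between stages:

```python
# Hypothetical two-stage loss-weight schedule: stage 1 weights both main
# tasks equally; stage 2 favors distortion classification (task A), whose
# convergence is slower than quality prediction (task B).

def loss_weights(stage):
    """Return (w_task_A, w_task_B, w_aux) for the given training stage."""
    if stage == 1:
        return 1.0, 1.0, 0.1
    return 2.0, 0.5, 0.1
```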
3) Accuracy of distortion type determination: experimental results show that MTL-s outperforms IQA-CNN++ and MEON, which have similar numbers of model parameters, and achieves the best accuracy on the GN distortion group. In addition, compared with prior-art deep classification networks, MTL-d has far fewer model parameters, runs faster, and still achieves very high accuracy.
Most depth models achieve unsatisfactory performance on the AN and GN groups, while better accuracy can be achieved in the MB and SN groups. For MTL-d, most of the pictures degraded by AN distortion type are erroneously decided as SN type.
Prediction performance of quality level: classical quality assessment methods based on hand-crafted features produce continuous quality scores, while deep-learning-based models produce discrete quality levels, so the two cannot be compared directly. Therefore, this embodiment first remaps the raw quality scores S with the five-parameter logistic regression function recommended by the international Video Quality Expert Group (VQEG):

S′ = β1 (1/2 − 1/(1 + exp(β2 (S − β3)))) + β4 S + β5

where βλ (λ ∈ {1, 2, ..., 5}) are free parameters determined during curve fitting. For fair comparison, this embodiment then uses the mapped quality score S′ to compute three widely used evaluation criteria: the Pearson Linear Correlation Coefficient (PLCC), the Spearman Rank-order Correlation Coefficient (SRCC), and the Root Mean Square Error (RMSE).
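Under the definitions above, the VQEG five-parameter logistic and the three criteria can be sketched in NumPy. The β values are placeholders to be obtained by curve fitting, and the rank computation ignores ties, so this is an illustration rather than a full Spearman implementation:

```python
import numpy as np

# VQEG-recommended five-parameter logistic mapping of raw scores.
def vqeg_logistic(s, b1, b2, b3, b4, b5):
    s = np.asarray(s, dtype=float)
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (s - b3)))) + b4 * s + b5

# Pearson Linear Correlation Coefficient.
def plcc(x, y):
    return float(np.corrcoef(x, y)[0, 1])

# Spearman Rank-order Correlation Coefficient (tie handling omitted).
def srcc(x, y):
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return plcc(rx, ry)

# Root Mean Square Error.
def rmse(x, y):
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(y)) ** 2)))
```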
MTL-s outperforms the lightweight depth models, including IQA-CNN++ and MEON, and most classical methods except FSIM. The experimental results show that FSIM performs slightly better, but at a much higher time cost than MTL-s. Furthermore, MTL-d is comparable in prediction accuracy to the most advanced deep networks, including AlexNet, Inception-V3 and Inception-ResNet-V2, but with lower time cost and a smaller model size. Notably, the difference between two-dimensional codes at adjacent levels is not very pronounced, and most depth models easily misidentify the quality level of some two-dimensional codes as an adjacent level. In addition, most depth models perform better on higher-quality-level two-dimensional codes.
The system and method for reference-free quality evaluation of industrial two-dimensional codes provided in the above embodiments of the invention establish, for the first time, a database containing (39 × N) industrial two-dimensional codes, and for the first time perform reference-free quality measurement of industrial two-dimensional symbols (i.e., industrial two-dimensional codes) using CNNs. By processing the industrial two-dimensional code with the shallow and deep multi-task MTL convolutional neural network modules, the system can on the one hand determine the various distortion types of an image, making automatic filter selection possible, and on the other hand predict the quality level of the image, adjusting the decoding time limit for images whose quality level both modules predict to be zero. The system and method provided by the embodiments of the invention can greatly reduce the time and computational cost of preprocessing and decoding industrial two-dimensional codes, improve decoding efficiency, and increase the throughput of industrial production lines.
The foregoing description of specific embodiments of the present invention has been presented. In particular, the present invention is not limited to the above-described specific embodiments, and various changes or modifications may be made by those skilled in the art within the scope of the claims without affecting the essence of the present invention.

Claims (10)

1. An industrial two-dimensional code non-reference quality evaluation system, characterized by comprising:
the industrial two-dimensional code database module is used for providing an industrial two-dimensional code to be evaluated and an industrial two-dimensional code for training and outputting the industrial two-dimensional code to the multitask MTL convolutional neural network module;
the multi-task MTL convolutional neural network module is used for training the multi-task MTL convolutional neural network by utilizing the industrial two-dimensional code for training, and finishing a quality evaluation task aiming at an industrial two-dimensional code test sample to be evaluated by adopting the trained multi-task MTL convolutional neural network;
wherein:
the multitask MTL convolutional neural network module comprises: the system comprises a shallow layer multi-task MTL convolutional neural network and a deep layer multi-task MTL convolutional neural network; the shallow layer multi-task MTL convolutional neural network is suitable for an industrial two-dimensional code quality evaluation task in a global distortion application scene; the deep multi-task MTL convolutional neural network is suitable for industrial two-dimensional code quality evaluation tasks under application scenes including global distortion and local distortion;
the quality evaluation task comprises the following steps: the distortion type determination and quality level prediction two main tasks and seven auxiliary tasks related to the two main tasks.
2. The industrial two-dimensional code non-reference quality evaluation system according to claim 1, wherein the seven auxiliary tasks comprise: locator defects, axial inconsistencies, grid inconsistencies, error correction code occlusions, modulation ratio distortions, over-coarse and over-fine printing.
3. The industrial two-dimensional code non-reference quality evaluation system as claimed in claim 1, wherein the industrial two-dimensional code database comprises 39 × N industrial two-dimensional code images for training, among which 30 × N images contain a single distortion and 9 × N images contain a combination of multiple distortions, wherein N is a positive integer representing the number of original industrial two-dimensional code images; wherein:
the single-distortion industrial two-dimensional code is obtained by performing single-mode destruction on a plurality of original industrial two-dimensional code images by utilizing five modes to enable the images to be distorted, and a plurality of images containing single distortion are generated for each original industrial two-dimensional code image;
the industrial two-dimensional codes with multiple combined distortions are obtained by corrupting the original industrial two-dimensional code images with combinations of multiple distortion types; for each original industrial two-dimensional code image, a plurality of images corrupted by combined distortions are generated, fewer in number than the images containing a single distortion.
4. The industry two-dimensional code non-reference quality evaluation system according to claim 3, wherein the N original industry two-dimensional code images are from scenes of four types:
-an image printed on the surface of a smooth metal workpiece;
-an image printed on the surface of a frosted metal workpiece;
-an image printed on the surface of the resinous workpiece;
-images generated by the encoding software libdmtx on a computer;
the distortion types of the industrial two-dimensional code with multiple single distortions are obtained by five ways, including:
-simulating artifacts caused by a rough printing surface with a level 5 speckle noise SN;
-simulating conveyor belt induced artifacts with a 5-level motion blur MB;
-simulating artifacts caused by limited marking techniques using 5-level over-printing (OP) and 5-level under-printing (UP);
-degrading the two-dimensional code image with a 5-level perspective transformation to distort it to simulate grid non-uniformities GN caused by poor camera angles;
-simulating axial inhomogeneity AN with 5 axial stretch levels;
the distortion types of the industrial two-dimensional code with various distortion combinations comprise: SN + AN, SN + GN, MB + AN, MB + GN, OP + SN, OP + MB, SN + GN + AN, MB + AN + UP, and AN + GN.
5. The industrial two-dimensional code non-reference quality evaluation system according to claim 1, wherein the shallow layer multitask MTL convolutional neural network module adopts an MTL convolutional neural network model comprising 4 convolutional layers and is constructed according to the guidance of a hard parameter sharing example; in the hard parameter sharing example, a shallow convolutional neural network layer of the MTL convolutional neural network model is set as a feature layer shared by a plurality of tasks, and a deep convolutional neural network layer of the MTL convolutional neural network model is set as a specific feature layer aiming at different tasks, so that a shared shallow convolutional layer submodule, a task-specific deep convolutional layer submodule, a hidden layer submodule and an output layer submodule are formed; wherein:
the shared shallow layer convolutional layer submodule comprises two shared shallow layer convolutional layers which are respectively used for extracting image characteristics related to distortion types and quality levels from the industrial two-dimensional code image;
the task-specific deep convolutional layer submodule comprises a specific deep convolutional layer unit for distortion type judgment and two specific deep convolutional layer units for quality grade prediction; the specific deep convolutional layer unit for distortion type judgment and the specific deep convolutional layer unit for quality grade prediction respectively extract high-level features of the image, and a main task of distortion type judgment and quality grade prediction and an auxiliary task closely related to the main task are realized;
the hidden layer submodule is a full-connection layer and is used for integrating image features extracted by the shared shallow layer convolutional layer submodule and the task-specific deep layer convolutional layer submodule and fusing the features so as to extract feature information related to the main task;
the output layer submodule comprises a sigmoid layer and a softmax layer, wherein the sigmoid layer is used for outputting a distortion type judgment main task, and the softmax layer is used for outputting a quality grade prediction main task and all auxiliary tasks.
6. The industrial two-dimensional code non-reference quality evaluation system according to claim 5, wherein each shared shallow convolutional layer unit comprises: a shallow convolutional layer containing 8 kernels and a shallow convolutional layer containing 16 kernels, wherein each shallow convolutional layer is followed in sequence by a rectified linear unit (ReLU) activation function layer and a max pooling layer;
the specific deep convolutional layer unit for distortion type determination and the specific deep convolutional layer unit for quality level prediction each include two deep convolutional layer groups, wherein each deep convolutional layer group includes: the device comprises a deep convolutional layer containing 32 kernels and a deep convolutional layer containing 64 kernels, wherein a rectifying linear unit activation function layer is arranged behind each deep convolutional layer;
the fully-connected layer includes 1024 neurons.
7. The industrial two-dimensional code non-reference quality evaluation system according to claim 6, wherein the maximum pooling layer is obtained using a convolution kernel size of 2 x 2 filter and a convolution step size of 2; the convolution kernel size of the shallow convolution layer and the deep convolution layer is 5 multiplied by 5, the step length is 1, and the edge interpolation mode is as follows: padding ═ same.
8. The industrial two-dimensional code non-reference quality evaluation system according to claim 1, wherein the deep multi-task MTL convolutional neural network module comprises a common convolutional layer and 13 separable convolution sub-modules, with an average pooling layer, a hidden layer and an output layer further arranged after the convolutional layers; each separable convolution sub-module comprises a depth-wise convolutional layer and a point-wise convolutional layer; the two main tasks and the seven auxiliary tasks share the front-end feature extraction convolutional layers;
the common convolutional layer comprises 32 cores;
the 13 separable convolution sub-modules comprise: sub-modules whose depth-wise and point-wise convolutional layers contain 32 and 128 kernels respectively, sub-modules with 128 and 256 kernels respectively, sub-modules with 256 and 512 kernels respectively, sub-modules with 512 and 1024 kernels respectively, and a sub-module whose depth-wise and point-wise convolutional layers each contain 1024 kernels.
9. The industrial two-dimensional code non-reference quality evaluation system according to claim 8, wherein the average pooling layer contains 1024 kernels;
the hidden layer is a fully-connected layer containing 1024 neurons;
the output layer sub-module comprises a sigmoid layer and a softmax layer, wherein the sigmoid layer outputs the distortion-type judgment main task, and the softmax layer outputs the quality-grade prediction main task and all auxiliary tasks;
the kernel size of the common convolutional layer is 3 × 3, the convolution stride is 2, and the edge padding mode is padding = same;
the kernel size of the depthwise convolutional layer is 3 × 3, the convolution stride is 1 or 2, and the edge padding mode is padding = same;
the kernel size of the pointwise convolutional layer is 1 × 1, the convolution stride is 1, and the edge padding mode is padding = same;
the kernel size of the average pooling layer is 7 × 7, the convolution stride is 1, and the edge padding mode is padding = same.
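One separable convolution sub-module from claims 8-9 (a 3 × 3 depthwise convolution at stride 1 or 2, followed by a 1 × 1 pointwise convolution at stride 1, in the MobileNet style) can be sketched as follows. The class name, the specific channel counts in the usage line, and the ReLU activations are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

# Sketch of a depthwise-separable sub-module per claims 8-9 (assumed form).
class SeparableConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # depthwise: one 3x3 filter per input channel (groups=in_ch),
        # stride 1 or 2, padding of 1 ("same"-style for stride 1)
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   stride=stride, padding=1, groups=in_ch)
        # pointwise: 1x1 convolution mixing channels, stride 1
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.pointwise(self.act(self.depthwise(x))))

# Illustrative use: 32 -> 128 channels, stride 2 halves the spatial size
y = SeparableConvBlock(32, 128, stride=2)(torch.randn(1, 32, 56, 56))
print(y.shape)  # torch.Size([1, 128, 28, 28])
```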
10. An industrial two-dimensional code non-reference quality evaluation method implemented with the industrial two-dimensional code non-reference quality evaluation system of any one of claims 1 to 9, characterized by comprising the following steps:
s1, selecting industrial two-dimensional code image training samples from an industrial two-dimensional code database;
s2, performing sample enhancement on each selected original industrial two-dimensional code image: rotating each original image by 11 different rotation angles to obtain 11 new industrial two-dimensional code training images with different rotation angles;
s3, inputting the enhanced industrial two-dimensional code training images into the multi-task MTL convolutional neural network module and training the two convolutional neural networks therein: inputting N industrial two-dimensional code images per batch, computing the loss function, back-propagating its gradients and updating the network parameters until the network converges on the training samples, i.e. the network's predictions agree with the ground-truth training labels on each training sample;
s4, selecting an industrial two-dimensional code image to be evaluated from the industrial two-dimensional code database and inputting it into the trained multi-task MTL convolutional neural network module to obtain the prediction outputs of the two main tasks: the distortion type and the quality score of the two-dimensional code image; selecting the corresponding image pre-processing step for decoding according to the predicted distortion type; and checking the predicted quality score, and if it is below a set threshold, excluding the industrial two-dimensional code image from the decoding queue.
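The multi-task objective in steps S3-S4 can be sketched as a joint loss over the two main-task heads. This is an assumed form, not the patent's disclosed loss: the head sizes (5 distortion types, 4 quality grades), the BCE/cross-entropy pairing for the sigmoid and softmax heads, and the 0.25 threshold are all illustrative.

```python
import torch
import torch.nn as nn

# Assumed sketch of the S3 joint loss and S4 threshold filter.
features = torch.randn(8, 1024)          # batch of N=8 shared feature vectors
distortion_head = nn.Linear(1024, 5)     # sigmoid head, 5 assumed distortion types
grade_head = nn.Linear(1024, 4)          # softmax head, 4 assumed quality grades

distortion_labels = torch.randint(0, 2, (8, 5)).float()  # multi-label targets
grade_labels = torch.randint(0, 4, (8,))                 # class-index targets

# S3: joint loss = BCE-with-logits (sigmoid task) + cross-entropy (softmax task)
loss = (nn.BCEWithLogitsLoss()(distortion_head(features), distortion_labels)
        + nn.CrossEntropyLoss()(grade_head(features), grade_labels))
loss.backward()  # gradient back-propagation updates the shared parameters

# S4: exclude images whose predicted quality confidence is below a threshold
scores = torch.softmax(grade_head(features), dim=1).max(dim=1).values
keep = scores >= 0.25  # illustrative threshold; excluded images leave the queue
```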
CN201910966425.0A 2019-10-12 2019-10-12 Industrial two-dimensional code reference-free quality evaluation system and method Active CN110930356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910966425.0A CN110930356B (en) 2019-10-12 2019-10-12 Industrial two-dimensional code reference-free quality evaluation system and method


Publications (2)

Publication Number Publication Date
CN110930356A true CN110930356A (en) 2020-03-27
CN110930356B CN110930356B (en) 2023-02-28

Family

ID=69848873

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910966425.0A Active CN110930356B (en) 2019-10-12 2019-10-12 Industrial two-dimensional code reference-free quality evaluation system and method

Country Status (1)

Country Link
CN (1) CN110930356B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102984541A (en) * 2012-12-07 2013-03-20 浙江大学 Video quality assessment method based on pixel domain distortion factor estimation
CN109215028A (en) * 2018-11-06 2019-01-15 福州大学 A kind of multiple-objection optimization image quality measure method based on convolutional neural networks
JP2019091788A (en) * 2017-11-14 2019-06-13 マッハコーポレーション株式会社 Solid-state imaging device and forming method thereof


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444026A (en) * 2020-04-20 2020-07-24 北京工业大学 Deep learning training resource allocation prediction method in cloud environment
CN112016591A (en) * 2020-08-04 2020-12-01 杰创智能科技股份有限公司 Training method of image recognition model and image recognition method
CN112966811A (en) * 2021-02-04 2021-06-15 北京邮电大学 Method and network for solving task conflict in MTL convolutional neural network
CN112966811B (en) * 2021-02-04 2023-04-14 北京邮电大学 Method and network for solving task conflict in MTL convolutional neural network
CN115660971A (en) * 2022-10-08 2023-01-31 镕铭微电子(济南)有限公司 Method for realizing USM sharpening based on deep learning hardware accelerator
CN115660971B (en) * 2022-10-08 2024-02-23 镕铭微电子(济南)有限公司 Method for realizing USM sharpening based on deep learning hardware accelerator
CN116432676A (en) * 2023-06-12 2023-07-14 北京紫光青藤微系统有限公司 Method and device for decoding bar code and electronic equipment
CN116432676B (en) * 2023-06-12 2023-11-07 北京紫光青藤微系统有限公司 Method and device for decoding bar code and electronic equipment
CN117520825A (en) * 2024-01-04 2024-02-06 东北大学 Industrial master machining workpiece quality prediction method based on multi-scale feature fusion
CN117520825B (en) * 2024-01-04 2024-05-10 东北大学 Industrial master machining workpiece quality prediction method based on multi-scale feature fusion

Also Published As

Publication number Publication date
CN110930356B (en) 2023-02-28

Similar Documents

Publication Publication Date Title
CN110930356B (en) Industrial two-dimensional code reference-free quality evaluation system and method
Hosu et al. KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment
Bergmann et al. Improving unsupervised defect segmentation by applying structural similarity to autoencoders
CN111179229B (en) Industrial CT defect detection method based on deep learning
US9036905B2 (en) Training classifiers for deblurring images
CN111754446A (en) Image fusion method, system and storage medium based on generation countermeasure network
CN111626176B (en) Remote sensing target rapid detection method and system based on dynamic attention mechanism
CN111680690B (en) Character recognition method and device
CN114663346A (en) Strip steel surface defect detection method based on improved YOLOv5 network
CN111401374A (en) Model training method based on multiple tasks, character recognition method and device
CN109977834B (en) Method and device for segmenting human hand and interactive object from depth image
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN116824307B (en) Image labeling method and device based on SAM model and related medium
CN112085017A (en) Tea tender shoot image segmentation method based on significance detection and Grabcut algorithm
Manfredi et al. Shift equivariance in object detection
CN116012291A (en) Industrial part image defect detection method and system, electronic equipment and storage medium
CN114972316A (en) Battery case end surface defect real-time detection method based on improved YOLOv5
CN114943708A (en) Image defect detection method, apparatus, device, storage medium and program product
CN115731198A (en) Intelligent detection system for leather surface defects
CN115731597A (en) Automatic segmentation and restoration management platform and method for mask image of face mask
CN116342536A (en) Aluminum strip surface defect detection method, system and equipment based on lightweight model
Saleem et al. A non-reference evaluation of underwater image enhancement methods using a new underwater image dataset
CN114882204A (en) Automatic ship name recognition method
Park et al. Down-scaling with learned kernels in multi-scale deep neural networks for non-uniform single image deblurring
CN108810319B (en) Image processing apparatus, image processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant