CN117877045A

CN117877045A - Bar code identification method, terminal equipment and storage medium

Info

Publication number: CN117877045A
Application number: CN202311760223.3A
Authority: CN
Inventors: 王珂; 周璐; 李晶; 张博
Original assignee: Zhejiang Huaray Technology Co Ltd
Current assignee: Zhejiang Huaray Technology Co Ltd
Priority date: 2023-12-19
Filing date: 2023-12-19
Publication date: 2024-04-12

Abstract

The application discloses a bar code identification method, terminal equipment and storage medium, wherein the identification method comprises the following steps: acquiring a bar code image to be identified; and inputting the bar code image to be identified into a Transformer model to obtain an identification result, wherein the identification result is a character sequence corresponding to the bar code to be identified. By performing bar code recognition with deep learning, the method improves the accuracy of bar code recognition.

Description

Bar code identification method, terminal equipment and storage medium

Technical Field

The present disclosure relates to the field of barcode recognition, and in particular, to a barcode recognition method, a terminal device, and a storage medium.

Background

Bar codes are widely used in retail, payment, medical, logistics, and other industries. A one-dimensional bar code is composed of black and white stripes (bars and spaces) that represent different characters. Bar code identification equipment can capture one-dimensional bar code images through equipment such as a CMOS camera, and decode the bar code content according to the proportional relation of the bar and space widths in the one-dimensional bar code.

The prior-art bar code decoding method mainly comprises the steps of obtaining a binary image of an already located bar code image by an image processing method, outputting digital character information by photoelectric conversion and analog-to-digital conversion, and finishing decoding by template matching and comparison. This decoding mode has low recognition and decoding efficiency, and once the bar code is stained, it cannot be accurately judged and decoded.

Disclosure of Invention

The application provides a barcode identification method, terminal equipment and storage medium.

The technical scheme adopted by the application is to provide a bar code identification method, which comprises the following steps:

acquiring a bar code image to be identified;

inputting the bar code image to be identified into a Transformer model to obtain an identification result, wherein the identification result is a character sequence corresponding to the bar code to be identified.

Optionally, the Transformer model includes an encoding network and a decoding network connected in sequence, and inputting the barcode image to be identified into the Transformer model to obtain an identification result includes:

inputting the bar code image to be identified into a coding network for feature coding to obtain coding features;

and inputting the coding features into a decoding network for decoding to obtain a recognition result.

Optionally, the encoding network includes a downsampling layer, and the inputting the barcode image to be identified into the encoding network for feature encoding to obtain encoding features includes:

carrying out vector embedding and local feature extraction on the bar code image to be identified to obtain the corresponding features of the bar code image to be identified;

inputting the characteristics into a downsampling layer for downsampling to obtain downsampled characteristics;

and carrying out layer normalization on the downsampled features to obtain coding features.

Optionally, the convolution kernel in the downsampling layer is 3*3, the vertical step size is 2, and the horizontal step size is 1.

Optionally, the encoding network and the decoding network contain a self-attention mechanism.

Optionally, inputting the encoded feature into a decoding network for decoding to obtain the identification result, including:

performing feature decoding on the encoded features to obtain at least one local self-attention decoding information and at least one global self-attention decoding information, respectively;

respectively carrying out feature height reduction, channel number adjustment and feature compression on the local self-attention decoding information and the global self-attention decoding information to obtain a feature sequence;

and carrying out normalization processing on the characteristic sequence to obtain a recognition result.

Optionally, the step of acquiring the barcode image to be identified includes:

acquiring an original bar code image to be identified;

and inputting the original barcode image to be identified into a spatial transformer network for image correction so as to obtain the barcode image to be identified.

Optionally, the barcode image to be identified includes a labeling result, and after the step of obtaining the barcode image to be identified, the method further includes:

determining the CTC loss of the identification result relative to the labeling result according to the identification result;

adjusting parameters of the Transformer model using the CTC loss.

Another technical scheme adopted by the application is to provide a terminal device, wherein the terminal device comprises a memory and a processor connected with the memory;

the memory is used for storing program data, and the processor is used for executing the program data to realize the identification method of the bar code.

Another technical solution adopted in the present application is to provide a computer storage medium, where the computer storage medium is used to store program data, and the program data is used to implement the barcode identification method when executed by a computer.

The beneficial effects of this application are: a bar code image to be identified is acquired, and the bar code image to be identified is input into a Transformer model to obtain an identification result. Since the character value corresponding to a bar code is determined by the bar and space widths, the recognition of the bar code can be treated as a character sequence recognition problem. The bar code image to be identified can therefore be recognized with a Transformer model of the kind widely used for character sequence recognition, which further improves the accuracy of bar code recognition.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of an embodiment of a bar code identification method provided herein;

FIG. 2 is a schematic illustration of a portion of an original barcode image to be identified;

FIG. 3 is a flow chart illustrating another embodiment of a bar code identification method provided herein;

FIG. 4 is a flow chart of another embodiment of a bar code identification method provided herein;

FIG. 5 is a flow chart of another embodiment of a bar code identification method provided herein;

FIG. 6 is a flow chart of another embodiment of a bar code identification method provided herein;

FIG. 7 is a flow chart of another embodiment of a bar code identification method provided herein;

FIG. 8 is a schematic structural diagram of an embodiment of a terminal device provided herein;

FIG. 9 is a schematic structural diagram of an embodiment of a computer storage medium provided herein.

Detailed Description

The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

It is noted that the method for identifying the bar code provided by the application can only be used for identifying one-dimensional bar codes, but cannot be used for identifying other bar codes (such as two-dimensional bar codes).

The inventor of the application finds that the most common bar code recognition method at present first obtains a located bar code image, binarizes it to obtain a binary image containing the bar code, determines digital character information by photoelectric conversion and analog-to-digital conversion, and then completes bar code recognition by template matching and comparison. This method has low recognition efficiency, and if the bar code is stained, it cannot accurately recognize the information of the bar code.

Based on this, the application mainly designs an end-to-end bar code recognition method based on deep learning. Unlike the traditional method, it takes the particular shape of the bar code into account and recognizes the bar code to be identified with a Transformer model, which can further improve the recognition accuracy.

Referring to fig. 1 specifically, fig. 1 is a schematic flow chart of an embodiment of a barcode recognition method provided in the present application.

As shown in fig. 1, the barcode identification method in the embodiment of the present application may specifically include the following steps:

S1, acquiring a bar code image to be identified.

The bar code recognition method provided by the application is mainly executed by a bar code recognition device. In some application scenarios, the barcode recognition device is the camera itself. In some application scenarios, the barcode recognition device is a barcode recognizer. In some application scenarios, the barcode recognition device may be a device communicatively connected to a camera or a barcode recognizer, for example, the device may be any one or more of a device for monitoring an image, a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and an autopilot, a robot, a security system, glasses for augmented reality or virtual reality, a helmet. In some possible implementations, the barcode recognition method may be implemented by way of a processor invoking computer readable instructions stored in a memory.

Specifically, the barcode recognition device acquires a barcode image to be recognized.

In some embodiments, the barcode image to be identified is acquired in real time by a camera or a barcode identifier. In some embodiments, the barcode image to be identified is acquired in advance by a camera or a barcode identifier, and stored in a storage medium.

In some possible application scenarios, the barcode image to be identified is a barcode image to be trained. It can be understood that the bar code image to be trained comprises character content corresponding to the bar code, namely labeling content.

In some possible application scenarios, the barcode image to be identified may be a barcode image that is pre-positioned and extracted, i.e. the barcode image only contains the barcode.

In some embodiments, the barcode recognition device may further perform image correction on the original barcode image to be recognized before obtaining the barcode image to be recognized.

In some embodiments, the width information of the bar code is located in a horizontal direction.

S2, inputting the bar code image to be identified into a Transformer model to obtain an identification result.

The recognition result is a character sequence corresponding to the bar code to be recognized.

Specifically, the barcode recognition device inputs the barcode image to be recognized into the Transformer model to obtain the recognition result.

A common bar code symbology (encoding scheme of a bar code) is Code128. Code128 is divided into Code128A, Code128B and Code128C according to the encoding mode, and 107 different characters are defined in its codebook. Each character occupies 11 modules of bars and spaces, beginning with a bar and ending with a space. The value corresponding to a bar code character is determined by the bar and space widths; that is, it is independent of the bar code height.
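The statement that a character value depends only on bar and space widths can be illustrated with a short sketch. The pixel scale (3 pixels per module), the width pattern and the helper names `run_lengths` and `module_widths` are assumptions made for this illustration; they are not taken from the application.

```python
from itertools import groupby

def run_lengths(row):
    """Return (pixel value, run length) pairs for one horizontal scan line."""
    return [(value, len(list(group))) for value, group in groupby(row)]

def module_widths(row, modules_per_char=11):
    """Express the bar/space runs of one character in modules instead of pixels."""
    runs = run_lengths(row)
    total = sum(length for _, length in runs)
    unit = total / modules_per_char            # pixels per module
    return [round(length / unit) for _, length in runs]

# Illustrative character: bar/space widths 2-1-2-2-2-2 (11 modules in total),
# drawn at 3 pixels per module, with 1 = bar and 0 = space.
pattern = [2, 1, 2, 2, 2, 2]
row = []
for i, width in enumerate(pattern):
    row += [1 - i % 2] * (width * 3)

# Every row of an undistorted bar code carries the same runs, so the recovered
# widths do not change with the image height.
image = [row[:] for _ in range(40)]
widths_per_row = {tuple(module_widths(r)) for r in image}
```

Here `widths_per_row` contains a single entry however many rows the image has, which is the property that later justifies downsampling only in the height direction.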

The inventor of the application finds that, owing to the particular way bar codes are encoded, bar code recognition can be treated as a character sequence problem. Based on this, the inventors use a Transformer network for barcode recognition.

It will be appreciated that the barcode to be identified need not be Code128, but may use other common symbologies, such as EAN, UPC, Code 39, Interleaved 2 of 5, Code 93, or Codabar, without limitation.

In some embodiments, the barcode recognition device inputs the image to be recognized into the Transformer model for feature extraction to obtain a corresponding feature sequence, decodes the feature sequence to obtain a barcode prediction result, namely the recognition result, and obtains the confidence of the recognition result.

According to the scheme, a bar code image to be identified is acquired, and the bar code image to be identified is input into a Transformer model to obtain an identification result. Since the character value corresponding to a bar code is determined by the bar and space widths, the recognition of the bar code can be treated as a character sequence recognition problem. The bar code image to be identified can therefore be recognized with a Transformer model of the kind widely used for character sequence recognition, which further improves the accuracy of bar code recognition.

Another embodiment of the barcode identification method provided in the present application may specifically include the following steps:

S11, acquiring a bar code image to be identified.

S12, inputting the bar code image to be identified into a coding network for feature coding to obtain coding features.

The Transformer model comprises an encoding network and a decoding network which are connected in sequence.

Specifically, the bar code recognition device inputs the bar code image to be recognized into a coding network for feature coding so as to obtain coding features corresponding to the bar code image to be recognized.

S13, inputting the coding features into a decoding network for decoding to obtain a recognition result.

Specifically, the bar code recognition device inputs the coding features to a decoding network for decoding to obtain a recognition result.

The recognition result comprises a bar code prediction result output by the decoding network and a confidence corresponding to the bar code prediction result.

S21, acquiring a bar code image to be identified.

S22, acquiring a bar code image to be identified.

S23, carrying out vector embedding and local feature extraction on the bar code image to be identified to obtain the features corresponding to the bar code image to be identified.

Specifically, the bar code recognition device performs vector embedding and local feature extraction on the bar code image to be recognized to obtain the features corresponding to the bar code image to be recognized.

In some embodiments, the bar code recognition device performs vector embedding (patch embedding) on the bar code image to be recognized by applying, twice in succession, a convolution with a 3×3 kernel and a step size of 2 followed by batch normalization (Batch Normalization) and ReLU (Rectified Linear Unit) activation. This converts the image into features represented by vectors (patches) and extracts the local feature information of the bar code to be recognized.

S24, inputting the features into a downsampling layer for downsampling to obtain downsampled features.

Specifically, the bar code recognition device inputs the features to the downsampling layer for downsampling to obtain downsampled features.

In the present embodiment, the downsampling refers to downsampling in the vertical direction (height direction).

The inventor of the application finds that the reduction or expansion of the height has little influence on the identification of the bar code because the information contained in the bar code is in the width of the bar code. Based on the method, the characteristics are selected to be input into a downsampling layer for downsampling in the height direction, and the calculation efficiency of subsequent encoding and decoding operations is improved.

Note that, taking a two-dimensional code as an example, information thereof is stored in the horizontal direction (width direction) and the vertical direction (height direction), and if the two-dimensional code is subjected to the same height direction downsampling operation, information of the two-dimensional code stored in the vertical direction is lost. Therefore, the two-dimensional code is not downsampled in the vertical direction, but downsampled in the horizontal direction and the vertical direction at the same time in the same step size. Similarly, in the process of character recognition, the vertical downsampling operation is not performed on the character image to be recognized, but the horizontal downsampling operation and the vertical downsampling operation are performed on the character to be recognized at the same time in the same step length.

In some embodiments, the convolution kernel in the downsampling layer is 3*3, the vertical (height) step size is 2, and the horizontal (width) step size is 1. Alternatively, the step size of the convolution kernel may also be denoted as (2, 1).

Wherein the output of the convolution calculation satisfies the following relationship:

$$\text{output} = \left\lfloor \frac{\text{input} + 2p - k}{s} \right\rfloor + 1$$

where output is the output image size of the convolution calculation, input is the input image size of the convolution calculation, k is the convolution kernel size (kernel_size), p is the edge padding (pad), and s is the step size.

In this embodiment, if the step length in the height direction and the step length in the width direction are different, the output of the convolution calculation satisfies the following relationship:

$$\text{output}_H = \left\lfloor \frac{\text{input}_H + 2p - k}{s_H} \right\rfloor + 1, \qquad \text{output}_W = \left\lfloor \frac{\text{input}_W + 2p - k}{s_W} \right\rfloor + 1$$

where $\text{input}_H$ is the input image height, $\text{output}_H$ is the output image height, $s_H$ is the step size in the height direction, $\text{input}_W$ is the input image width, $\text{output}_W$ is the output image width, and $s_W$ is the step size in the width direction.

Take as an example a step size of 2 in the height direction, a step size of 1 in the width direction, a convolution kernel size of 3, and a pad of 1. If the input size of the image is 224×224 (width×height), the output size of the image is 224×112 (width×height), which realizes the downsampling of the image in the height direction.
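The 224×224 example can be checked with a few lines of Python. The sketch assumes the standard convolution size relation, floor((input + 2·pad − kernel) / step) + 1, applied separately to height and width; the function names are hypothetical.

```python
def conv_output_size(size, kernel, pad, stride):
    """Output length along one axis: floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

def downsample_shape(height, width, kernel=3, pad=1, stride_h=2, stride_w=1):
    """Shape after a convolution with step size (stride_h, stride_w)."""
    return (conv_output_size(height, kernel, pad, stride_h),
            conv_output_size(width, kernel, pad, stride_w))

# Kernel 3, pad 1, step size (2, 1) applied to a 224 x 224 input.
out_h, out_w = downsample_shape(224, 224)
```

With these values `out_h` is 112 and `out_w` is 224: the height is halved while the width, which carries the bar and space information, is unchanged.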

In some possible application scenarios, the downsampling operation may be used multiple times during the feature extraction, so as to achieve the purpose of reducing the calculation amount during the feature extraction by reducing the size of the feature map.

S25, carrying out layer normalization on the downsampled features to obtain coding features.

Specifically, the barcode recognition device performs layer normalization (Layer Normalization) on the downsampled features to obtain the encoded features.
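As a minimal sketch of what layer normalization computes for a single feature vector: the learnable scale and shift that usually accompany it are omitted here, and the `eps` value is an assumption rather than something stated in the application.

```python
import math

def layer_norm(vector, eps=1e-5):
    """Normalize one feature vector to zero mean and (approximately) unit variance."""
    mean = sum(vector) / len(vector)
    variance = sum((x - mean) ** 2 for x in vector) / len(vector)
    return [(x - mean) / math.sqrt(variance + eps) for x in vector]

token = [0.5, 2.0, -1.0, 3.5]      # hypothetical downsampled feature vector
normalized = layer_norm(token)
```

Each position of the downsampled feature map would be normalized in this way across its channels, independently of the other samples in a batch.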

By compressing the bar code features in the height direction without affecting the layout of the embedded vectors in the width direction, the computing cost of bar code recognition can be reduced and a hierarchical structure tailored to text can be established.

S26, inputting the coding features into a decoding network for decoding to obtain a recognition result.

S31, acquiring a bar code image to be identified.

S32, inputting the bar code image to be identified into a coding network for feature coding to obtain coding features.

In some embodiments, the encoding network and the decoding network contain a self-attention mechanism.

S33, performing feature decoding on the coding features to obtain at least one piece of local self-attention decoding information and at least one piece of global self-attention decoding information respectively.

Specifically, the bar code recognition device performs feature decoding on the encoded features to obtain at least one local self-attention decoding information and at least one global self-attention decoding information respectively.

In some embodiments, the decoding network includes at least one local self-attention decoding layer and at least one global self-attention decoding layer. The local self-attention decoding layer is different from the global self-attention decoding layer.

In some application scenarios, the receptive fields between different local self-attention decoding layers may be the same or different.
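The difference between local and global self-attention can be sketched as follows. This is only an illustration under stated assumptions: the query, key and value projections are replaced by the identity, and the receptive field of the local layer is modeled as a one-dimensional window `|i - j| <= window`; neither detail is specified in the application.

```python
import math

def self_attention(tokens, window=None):
    """Scaled dot-product self-attention without learned projections.

    window=None gives global attention; an integer restricts each position
    to neighbours whose index differs by at most `window` (local attention).
    """
    dim = len(tokens[0])
    output = []
    for i, query in enumerate(tokens):
        visible = [j for j in range(len(tokens))
                   if window is None or abs(i - j) <= window]
        scores = [sum(q * k for q, k in zip(query, tokens[j])) / math.sqrt(dim)
                  for j in visible]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        norm = sum(weights)
        output.append([sum(w / norm * tokens[j][d] for w, j in zip(weights, visible))
                       for d in range(dim)])
    return output

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # hypothetical encoded features
global_out = self_attention(tokens)               # every position sees all others
local_out = self_attention(tokens, window=0)      # smallest possible receptive field
```

Changing `window` between layers corresponds to local self-attention decoding layers with different receptive fields.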

S34, the local self-attention decoding information and the global self-attention decoding information are respectively subjected to feature height reduction, channel number adjustment and feature compression to obtain a feature sequence.

Specifically, the bar code recognition device performs feature height reduction, channel number adjustment and feature compression on the local self-attention decoding information and the global self-attention decoding information respectively to obtain a feature sequence.

In some embodiments, the barcode recognition device performs global pooling on each piece of local self-attention decoding information and each piece of global self-attention decoding information to reduce the feature height to 1, adjusts the number of network output channels to the required number by a convolution with a kernel size of 1 and a step size of 1, and then passes the resulting outputs through a fully connected layer, so that the image features are compressed into a feature sequence.
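The height reduction and channel adjustment can be sketched on a tiny feature map stored as nested lists indexed `[channel][height][width]`. Average pooling, the weight values and the function names are assumptions for illustration, and the final fully connected layer is left out.

```python
def pool_height(feature_map):
    """Average over the height axis: [C][H][W] -> [C][W], so the height becomes 1."""
    return [[sum(column) / len(column) for column in zip(*channel)]
            for channel in feature_map]

def pointwise_conv(pooled, weights):
    """Kernel-size-1, step-size-1 convolution: mixes channels at each width position."""
    width = len(pooled[0])
    return [[sum(w * pooled[c][x] for c, w in enumerate(row)) for x in range(width)]
            for row in weights]

def to_sequence(features):
    """[C][W] -> list of W feature vectors, one per horizontal position."""
    return [list(step) for step in zip(*features)]

feature_map = [
    [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],    # channel 0, height 2, width 3
    [[0.0, 1.0, 0.0], [2.0, 1.0, 4.0]],    # channel 1
]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]   # hypothetical 2 -> 3 channel mixing
sequence = to_sequence(pointwise_conv(pool_height(feature_map), weights))
```

The result is one feature vector per horizontal position, which is the form a sequence classifier expects.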

And S35, carrying out normalization processing on the characteristic sequence to obtain a recognition result.

Specifically, the bar code recognition device performs normalization processing on the feature sequence to obtain a recognition result.

In this embodiment, the recognition result is the classification probability of the bar code character at each position along the bar code: a position corresponding to a bar code character is recognized as that character, and a position corresponding to no character is recognized as a blank.
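One common way to turn such per-position probabilities into a character sequence is greedy CTC-style decoding, sketched below. The alphabet, the logits and the choice of the minimum character probability as the confidence are assumptions made for this example; the application does not state how the confidence is computed.

```python
import math

def softmax(logits):
    """Normalize one position's scores into classification probabilities."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_decode(logit_sequence, alphabet, blank=0):
    """Take the most probable class per position, merge repeats, drop blanks."""
    chars, confidences, previous = [], [], blank
    for logits in logit_sequence:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        if best != blank and best != previous:
            chars.append(alphabet[best])
            confidences.append(probs[best])
        previous = best
    confidence = min(confidences) if confidences else 0.0
    return "".join(chars), confidence

alphabet = ["<blank>", "A", "B"]                 # index 0 is the blank class
logit_sequence = [[0, 5, 0], [0, 5, 0], [5, 0, 0], [0, 0, 5], [0, 5, 0]]
text, confidence = greedy_decode(logit_sequence, alphabet)
```

The two leading positions both vote for the same character and are merged, while the blank in the third position separates it from the characters that follow.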

S41, acquiring an original bar code image to be identified.

Specifically, the bar code recognition device acquires an original bar code image to be recognized.

Alternatively, the original barcode image to be identified may be an image acquired by a camera or a barcode identifier in real time, or may be an image acquired by the camera or the barcode identifier in advance.

S42, inputting the original barcode image to be identified into a spatial transformer network for image correction so as to obtain the barcode image to be identified.

Specifically, the bar code recognition device inputs the original bar code image to be recognized into a spatial transformer network (STN) for image correction so as to obtain the bar code image to be recognized.

In some possible application scenarios, the barcode to be identified undergoes geometric transformations such as rotation, translation, scaling and distortion when it is affixed or when it is photographed by the barcode identifier or camera. In some application scenarios, the original barcode image to be identified may be blurred, small (low barcode height), reflective, defective or occluded, as shown in fig. 2.

It will be appreciated that if the above adverse conditions exist in the original barcode image to be identified, directly inputting it into the Transformer model may reduce the accuracy of the recognition result. Meanwhile, the Transformer model performs a large number of calculations such as transposed matrix multiplications during training and inference, and the larger the dimensions, the longer the calculation time. Directly using the Transformer model to identify the original barcode image to be identified therefore has the drawbacks of a large amount of calculation and a long inference time.

Based on the above, the bar code recognition device inputs the original bar code image to be recognized into a spatial transformer network (STN) for image correction. Without annotated key points, the spatial transformer network can learn the spatial transformation parameters of pictures or features for the task at hand and spatially align the bar code, thereby reducing the influence that geometric transformations such as rotation, translation, scaling and distortion, introduced when the bar code is photographed or affixed, have on subsequent recognition.
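The resampling step of such a network can be sketched as an affine warp over normalized coordinates. In an actual STN the 2×3 matrix `theta` is predicted by a learned localization network and the sampling is bilinear so that it stays differentiable; in this simplified sketch `theta` is fixed by hand and nearest-neighbour sampling is used, and all names are hypothetical.

```python
def affine_sample(image, theta, out_h, out_w, fill=255):
    """Resample `image` onto an out_h x out_w grid through a 2x3 affine matrix.

    Coordinates are normalized to [-1, 1]; theta maps each output coordinate
    to the input coordinate whose pixel is copied (nearest neighbour).
    """
    in_h, in_w = len(image), len(image[0])
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            x = 2 * (j + 0.5) / out_w - 1
            y = 2 * (i + 0.5) / out_h - 1
            src_x = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            src_y = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            col = round((src_x + 1) * in_w / 2 - 0.5)
            line = round((src_y + 1) * in_h / 2 - 0.5)
            inside = 0 <= line < in_h and 0 <= col < in_w
            row.append(image[line][col] if inside else fill)
        output.append(row)
    return output

image = [[1, 2, 3], [4, 5, 6]]
identity = [[1, 0, 0], [0, 1, 0]]
half_turn = [[-1, 0, 0], [0, -1, 0]]              # rotation by 180 degrees
same = affine_sample(image, identity, 2, 3)
rotated = affine_sample(image, half_turn, 2, 3)
```

Because `out_h` and `out_w` are free parameters, the same sampling step can also output an image smaller than its input, which is how the size transformation mentioned in the text would reduce the work left for the recognition model.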

Meanwhile, through size transformation, the spatial transformer network can reduce the amount of calculation and the inference time of the Transformer model.

Through practical tests, the inventor of the application finds that the spatial transformer network has a certain correction effect on the original barcode image to be identified. For the same Transformer model, the inference time on a single picture is shortened in the same proportion as the ratio of the STN output image size to the network input size. In the final bar code recognition test, when the bar code is upright, there is an average improvement of 1% across normal, blurred, small, reflective, defective, occluded and similar conditions. However, when the bar code inclination is relatively large, the decoding accuracy is reduced by 3%. The bar code is sensitive to the relationship between bars and spaces, and if the inclination angle is too large, the bar and space positions may shift after correction by the spatial transformer network, so that the decoding is inaccurate.

S43, inputting the bar code image to be identified into the Transformer model to obtain an identification result.

Further, the bar code recognition method provided by the application has the advantages that the recognition accuracy of the blurred bar code, the small bar code, the reflective bar code, the distorted bar code or the stained bar code is obviously improved, and compared with the traditional bar code recognition method, the recognition accuracy is higher.

Referring to fig. 3, fig. 3 is a flow chart illustrating another embodiment of the bar code identification method provided in the present application.

As shown in fig. 3, another embodiment of the barcode recognition method provided in the present application may specifically include the following steps:

S51, acquiring a bar code image to be identified.

It should be noted that the barcode image to be identified in this embodiment is an image sample for model training.

In some embodiments, the barcode image to be identified is a one-dimensional barcode picture sample already positioned and extracted, and the picture sample is manually marked with barcode information, that is, the picture sample contains marking information. In the bar code image to be identified, the width information of the bar code is positioned in the horizontal direction of the image.

In some embodiments, the barcode recognition device divides the barcode images to be recognized (image samples) into three parts according to a certain proportion: a training set, a verification set and a test set. The aspect ratios of the image samples are counted, and a size whose aspect ratio covers 80% or more of the samples is selected as the input size of the Transformer model.
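A minimal sketch of the three-way split is given below. The 8:1:1 proportion, the fixed random seed and the file names are assumptions for illustration only; the application states only that the samples are divided according to a certain proportion.

```python
import random

def split_samples(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the samples and cut them into training, verification and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical sample identifiers standing in for labeled bar code images.
samples = [f"barcode_{i:03d}.png" for i in range(100)]
train_set, val_set, test_set = split_samples(samples)
```

Shuffling before cutting keeps the three sets statistically similar, and the fixed seed makes the split reproducible between training runs.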

It can be understood that in actual recognition scenes the shooting conditions of barcode images differ, so the pictures look different and may exhibit some perspective distortion, overexposure, underexposure or noise. To address these problems, the image samples include one to two times as many generated samples as real samples.

The real sample is a bar code image shot in a real scene.

Optionally, the generated samples may be obtained by the barcode recognition device applying one or more of a perspective transformation, a brightness transformation, noise addition, a filtering operation, a rotation operation and a cropping operation, which is not limited herein.
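Two of the listed operations, brightness transformation and noise addition, can be sketched on a grayscale image stored as nested lists. The brightness factor, the Gaussian noise model and the function names are assumptions for illustration.

```python
import random

def clip(value):
    """Keep a pixel value inside the 8-bit range."""
    return min(255, max(0, round(value)))

def adjust_brightness(image, factor):
    """Scale every pixel by `factor` (below 1 darkens, above 1 brightens)."""
    return [[clip(p * factor) for p in row] for row in image]

def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [[clip(p + rng.gauss(0, sigma)) for p in row] for row in image]

rng = random.Random(0)
real = [[0, 255, 0, 255] for _ in range(4)]      # hypothetical real sample
generated = add_noise(adjust_brightness(real, 0.7), sigma=10, rng=rng)
```

Applying one or several such operations to each real sample yields the generated samples that make up the augmented part of the data set.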

In some embodiments, the positioning performed before the barcode image to be identified is input to the Transformer model does not determine the start position of the barcode. There are therefore two possible orientations in which the barcode image may be sent to the Transformer model: the leftmost pattern may be either the barcode start character or the barcode stop character. To solve this problem, the barcode recognition device rotates half of the barcode images to be recognized by 180° to eliminate this influence.
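The 180° rotation of half of the samples could look like the sketch below. Choosing the half at random with a fixed seed is an assumption, as are the function names; the application only says that half of the images are rotated.

```python
import random

def rotate_180(image):
    """Rotate an image by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in image[::-1]]

def rotate_half(images, seed=0):
    """Rotate a randomly chosen half of the images by 180 degrees."""
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(images)), len(images) // 2))
    return [rotate_180(img) if i in chosen else img
            for i, img in enumerate(images)]

# Ten copies of a small asymmetric image standing in for bar code samples.
images = [[[1, 2], [3, 4]] for _ in range(10)]
mixed = rotate_half(images)
```

After this step the training data contains both reading directions, so the model sees start and stop characters on either side of the image.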

Unlike traditional decoding, the bar code identification method provided by the application places no requirement on the bar code quiet zone; only the integrity of the bar code in the width direction needs to be ensured.

In some embodiments, the barcode recognition device constructs sample images of two kinds: pictures randomly expanded within a certain range (increased width and/or height), and pictures cropped to remove the quiet zone (only the barcode region of the barcode image to be recognized is retained). A Transformer model trained on such samples does not place excessive requirements on the barcode positioning performed before recognition, and sample fluctuation caused by barcode positioning is eliminated as far as possible.
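Random expansion of the margins might be sketched as follows. Only the width is expanded here, the margin is filled with white pixels, and the maximum margin is a hypothetical parameter; the application does not give the range it uses.

```python
import random

def expand_margins(image, max_margin, rng, white=255):
    """Pad a random number of white columns on the left and right of the image."""
    left = rng.randint(0, max_margin)
    right = rng.randint(0, max_margin)
    return [[white] * left + row + [white] * right for row in image]

rng = random.Random(0)
cropped = [[0, 255, 0, 0], [0, 255, 0, 0]]       # bar code region only, no quiet zone
expanded = expand_margins(cropped, max_margin=3, rng=rng)
```

Training on both the tightly cropped and the expanded variants exposes the model to the range of margins that an imprecise positioning step may produce.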

The quiet zone of a bar code is the blank margin left at both ends of the bar code, so that the bar code reader does not pick up information irrelevant to the bar code.

S52, inputting the bar code image to be identified into a Transformer model to obtain an identification result.

S53, determining the CTC loss of the identification result relative to the labeling result according to the identification result.

Specifically, the bar code recognition device determines CTC (Connectionist Temporal Classification) loss of the recognition result relative to the labeling result according to the recognition result.

Compared with other loss functions, the CTC loss function has certain advantages in processing sequence signals: the loss between the recognition result and the labeling result can be calculated without aligning the input with the output.
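How a loss can be computed without alignment is easiest to see in the CTC forward recursion, which sums the probability of every alignment that collapses to the label. The sketch below works on plain probabilities rather than log-probabilities for readability, so it is suitable only for short sequences; the toy probabilities are assumptions.

```python
import math

def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target`, summed over all valid alignments.

    probs[t][c] is the probability of class c at position t.
    """
    extended = [blank]
    for label in target:
        extended += [label, blank]           # blank, l1, blank, l2, ..., blank
    size = len(extended)

    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][extended[1]]

    for t in range(1, len(probs)):
        updated = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # A blank may be skipped only between two different labels.
            if s >= 2 and extended[s] != blank and extended[s] != extended[s - 2]:
                total += alpha[s - 2]
            updated[s] = total * probs[t][extended[s]]
        alpha = updated

    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood)

# Two positions, classes {blank, character 1}, uniform probabilities.
probs = [[0.5, 0.5], [0.5, 0.5]]
loss = ctc_loss(probs, target=[1])
```

For this toy input three alignments collapse to the single-character label (character-character, character-blank, blank-character), each with probability 0.25, so the likelihood is 0.75 and the loss is −ln 0.75.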

S54, parameter adjustment is performed on the transducer model by utilizing CTC loss.

Specifically, the barcode recognition device uses CTC loss to adjust parameters of the transducer model.

In some embodiments, the barcode recognition device uses CTC loss, counter-propagating gradients, and performs network weight adjustment on the transducer model.

In some application scenarios, the barcode recognition device utilizes a verification set to verify the state and convergence condition of the model in the process of training the transducer model, and finally inputs the trained model into a test set to evaluate the generalization performance of the model.

Furthermore, the bar code recognition method provided by the application can also be used to recognize bar codes of other coding schemes; the corresponding Transformer model can be obtained simply by acquiring corresponding sample data for training.

Referring to fig. 4, fig. 4 is a flowchart illustrating another embodiment of a bar code identification method provided in the present application.

As shown in fig. 4, another embodiment of the barcode recognition method provided in the present application may specifically include the following steps:

S101, preprocessing sample data.

Specifically, the bar code recognition device uses one-dimensional bar code picture samples that have already been located and extracted, and whose bar code information has been manually labeled. The sample data are picture samples.

In some embodiments, the barcode recognition device divides the sample data into three parts according to a certain proportion: a training set, a validation set, and a test set. The device also collects statistics on the aspect ratios of the samples and selects, as the size of the sample data input to the network, a size whose aspect ratio accounts for 80% or more of the samples.
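A minimal sketch of this preprocessing step follows. The 8:1:1 split ratio, the function names, and the reading of the 80% criterion as "the smallest width/height ratio that is at least as large as that of 80% of the samples" are all assumptions; the application does not give these details.

```python
import math
import random


def split_samples(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and split into training, validation and test sets.

    The 8:1:1 proportion is an assumed example of "a certain proportion".
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def covering_aspect_ratio(sizes, coverage=0.8):
    """Smallest width/height ratio not exceeded by `coverage` of the samples.

    sizes: iterable of (width, height) pairs.
    """
    ratios = sorted(w / h for w, h in sizes)
    # round() guards against floating-point noise such as 7.999999999.
    index = max(0, math.ceil(round(coverage * len(ratios), 9)) - 1)
    return ratios[index]


train_set, val_set, test_set = split_samples(range(10))
ratio = covering_aspect_ratio([(100, 10)] * 8 + [(200, 10)] * 2)
```

With eight samples of ratio 10 and two of ratio 20, the selected ratio is 10, which covers 80% of the samples; the network input width could then be derived from a chosen input height and this ratio.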

It can be understood that real samples are affected by shooting conditions, so the pictures vary in appearance and may exhibit perspective distortion, overexposure, underexposure, or noise. To address these problems, the training set may use generated samples amounting to one to two times the number of real samples. Illustratively, the barcode recognition device performs one or more transformation operations such as brightness transformation, contrast transformation, perspective transformation, and random noise addition on the real samples to obtain generated samples.
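The photometric part of this sample generation can be sketched as follows for a grayscale image stored as a list of rows. The parameter ranges, the function names, and the contrast formula centered on 128 are assumptions; the perspective transformation is omitted to keep the sketch short.

```python
import random


def clip(value):
    """Round and clamp a pixel value to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def adjust(image, brightness=0.0, contrast=1.0):
    """Apply pixel -> contrast * (pixel - 128) + 128 + brightness."""
    return [[clip(contrast * (p - 128) + 128 + brightness) for p in row]
            for row in image]


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [[clip(p + rng.gauss(0, sigma)) for p in row] for row in image]


def augment(image, rng):
    """Generate one sample from a real sample (ranges are assumed values)."""
    out = adjust(image,
                 brightness=rng.uniform(-30, 30),
                 contrast=rng.uniform(0.7, 1.3))
    if rng.random() < 0.5:
        out = add_noise(out, sigma=rng.uniform(2, 10), rng=rng)
    return out


real = [[0, 255, 0, 255], [0, 255, 0, 255]]
generated = augment(real, random.Random(0))
```

Calling `augment` once or twice per real sample would yield the one-to-two-times ratio of generated samples mentioned above.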

Further, based on the structural characteristics of the bar code, in order to avoid the problem of sample imbalance in classification, the bar code recognition device counts the occurrence probability of each class, and resets the cross entropy weight of the character classification.
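One common way to derive such weights is inverse class frequency, sketched below. The application does not state the weighting formula, so the formula and the function name `class_weights` are assumptions.

```python
from collections import Counter


def class_weights(label_sequences, num_classes):
    """Inverse-frequency weight per class: total / (num_classes * count).

    Classes that never occur receive weight 0.0 (an assumed convention).
    """
    counts = Counter(c for seq in label_sequences for c in seq)
    total = sum(counts.values())
    weights = []
    for k in range(num_classes):
        n = counts.get(k, 0)
        weights.append(total / (num_classes * n) if n else 0.0)
    return weights


# Class 0 appears three times, class 1 once, class 2 never.
weights = class_weights([[0, 0, 0], [1]], num_classes=3)
```

Rare characters receive larger weights, so errors on them contribute more to the classification loss.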

It will be appreciated that the positioning process before deep learning decoding does not determine the start position of the barcode, so there are two possibilities for the picture sent to the network for decoding: the leftmost pattern may be the barcode start character or the stop character. Therefore, in the data enhancement stage (namely the process of generating samples), the bar code identification device rotates each picture by 180 degrees with a probability of one half so as to eliminate this influence.
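The rotation can be sketched on an image stored as a list of rows. The function names are illustrative, and the label is assumed to stay unchanged when the picture is rotated.

```python
import random


def rotate_180(image):
    """Rotate by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in image[::-1]]


def random_rotate_180(image, rng, p=0.5):
    """Rotate with probability p so both reading directions are seen."""
    return rotate_180(image) if rng.random() < p else image


rotated = rotate_180([[1, 2], [3, 4]])
maybe_rotated = random_rotate_180([[1, 2], [3, 4]], random.Random(0))
```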

Unlike the traditional decoding mode, deep learning decoding places no requirement on the barcode quiet zone, provided that the barcode is complete in the width direction. In some application scenarios, the bar code recognition device uses pictures with two kinds of cropping ranges, one randomly expanded within a certain range (increased image width and/or height) and one with no quiet zone, so that the barcode positioning process before decoding need not meet excessive requirements, and sample fluctuation caused by barcode positioning is eliminated as far as possible.
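The two cropping ranges can be sketched as a function that returns either the tight barcode box or a randomly expanded one. The 10% maximum expansion, the equal probability of the two cases, and the `(x0, y0, x1, y1)` box convention are assumptions.

```python
import random


def sample_crop(box, image_size, rng, max_expand=0.1, p_tight=0.5):
    """Pick a crop box for one training sample.

    box:        (x0, y0, x1, y1) of the barcode without quiet zone.
    image_size: (width, height) of the source picture.
    """
    x0, y0, x1, y1 = box
    width, height = image_size
    if rng.random() < p_tight:
        return box  # crop with no quiet zone
    # Random expansion of width and height, clamped to the picture.
    dx = rng.uniform(0, max_expand) * (x1 - x0)
    dy = rng.uniform(0, max_expand) * (y1 - y0)
    return (max(0, int(x0 - dx)), max(0, int(y0 - dy)),
            min(width, int(x1 + dx + 0.5)), min(height, int(y1 + dy + 0.5)))


rng = random.Random(1)
crops = [sample_crop((20, 10, 220, 60), (240, 80), rng) for _ in range(100)]
```

Every sampled crop still contains the full barcode region, which preserves the integrity of the barcode in the width direction.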

S102, inputting the sample data into an STN network for image correction to obtain corrected sample data.

Specifically, the barcode recognition device uses the STN network to perform image correction on the sample data before inputting it into the training network. Without key points being labeled on the sample data, the STN network learns the spatial transformation parameters of the pictures or features according to the task and spatially aligns the bar codes, thereby reducing the influence on decoding (namely bar code identification) of geometric transformations such as rotation, translation, scaling, and distortion introduced when the bar code is photographed or affixed.

Because the Transformer model performs a large number of matrix transpose and multiplication operations during training and inference, the larger the dimensions, the longer the matrix computation takes. Applying the STN in a network based on the self-attention mechanism reduces the amount of computation and the inference time through size transformation.

The inventors of the application found through practical tests that the STN network has a certain correction effect on the image and that, for the same network input size, the proportion by which the inference time on a single picture is shortened equals the ratio of the STN output image size to the network input size. In the final decoding test, with the bar code upright, accuracy improved by about 1% on average across normal, blurred, small-code, reflective, defective, and occluded conditions, but with a strongly inclined bar code the decoding accuracy dropped by about 3%. Bar codes are sensitive to the relationship between bars and spaces; when the inclination angle is too large, the bars and spaces are offset after image correction by the STN network, which makes decoding inaccurate.

S103, inputting the corrected sample data into a Transformer model to be trained to obtain a first feature sequence.

A common bar code type is Code128. Code128 has three code sets, Code128A, Code128B, and Code128C, and 107 different characters are defined in the codebook. Each character occupies 11 modules of bars and spaces, beginning with a bar and ending with a space, and the value of a bar code character is determined by its bar and space widths. Given this coding scheme, bar code decoding can be treated as a character sequence recognition problem, and a suitable neural network structure can be constructed. Two approaches are commonly applied to character sequence recognition: the CRNN network architecture and recognition networks based on self-attention mechanisms.

The decoding network (namely the bar code recognition network) provided by the application is a Transformer-based network (namely the Transformer model above). It first performs patch embedding using a convolution with a 3×3 kernel and a stride of 2, followed in turn by BN (Batch Normalization) and ReLU (Rectified Linear Unit); this is done 2 times in total, splitting the image into patches and extracting local feature information.

Since the information contained in a bar code lies in the widths of its bars and spaces, reducing or enlarging the height has little effect on bar code decoding. Therefore, the bar code recognition device downsamples the sample image using a convolution with a 3×3 kernel and a stride of (2, 1), which compresses the bar code in height, and then applies LN (Layer Normalization). This leaves the layout after patch embedding unchanged in width, reduces the computational cost, and establishes a hierarchical structure tailored to text.
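The effect of the two stride settings on the feature-map size can be checked with the usual convolution output-size formula. The padding of 1 and the 32×256 input are assumptions; the application states only the kernel size and the strides.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_shape(height, width):
    """Spatial size after patch embedding and the (2, 1) downsampling."""
    # Patch embedding: two 3x3 convolutions with stride 2 on both axes.
    for _ in range(2):
        height = conv_out(height, stride=2)
        width = conv_out(width, stride=2)
    # Downsampling: stride 2 in height, stride 1 in width.
    height = conv_out(height, stride=2)
    width = conv_out(width, stride=1)
    return height, width


shape = feature_shape(32, 256)
```

Under these assumptions a 32×256 picture becomes an 8×64 map after patch embedding and a 4×64 map after downsampling: the height is halved again while the width, which carries the bar and space information, is preserved.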

The decoding network provided by the application extracts, through a self-attention mechanism and a residual structure, both the dependence within the bar/space pattern of the bar code and the correlation among different characters of the whole bar code. The dependence within the bar/space pattern arises because the width occupied by a single bar or space differs from sample to sample; the local self-attention mechanism helps determine the relative widths of bars and spaces and excludes decoding effects caused by differences between samples. As for the correlation among different characters, the traditional decoding mode computes the check character of the bar code through a mathematical relation to judge whether decoding is correct, whereas the application uses a global self-attention mechanism to capture the mathematical relations among the bar code characters. Global average pooling reduces the height of the features output by the local self-attention layer and the global self-attention layer to 1, a convolution with a kernel size of 1 and a stride of 1 adjusts the number of channels output by the network to the required number, and the multiple features are concatenated through a fully connected layer, so that the image features are further compressed into a first feature sequence.
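The mathematical relation used by traditional decoders can be made concrete for Code 128, whose check character is a position-weighted sum modulo 103. The sketch below covers code set B only (start value 104, character value equal to the ASCII code minus 32); the function name is illustrative, and the network in the application learns such relations implicitly rather than computing them.

```python
START_B = 104  # value of the Code 128 "Start B" character


def code128b_check_value(text):
    """Check character value for data encoded entirely in code set B."""
    values = [ord(ch) - 32 for ch in text]
    if any(v < 0 or v > 95 for v in values):
        raise ValueError("character outside Code 128 code set B")
    # The start character has weight 1; data characters are weighted 1, 2, 3, ...
    total = START_B + sum(i * v for i, v in enumerate(values, start=1))
    return total % 103


check = code128b_check_value("PJJ123C")
```

Because every character contributes to the check value with a weight that depends on its position, verifying a decode requires information from the whole symbol, which is the kind of long-range relation a global attention layer is suited to model.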

In this embodiment, the decoding network adopts a residual structure. In the feature extraction stage (in which features pass through a local self-attention layer and a global self-attention layer, respectively), the feature map after feature extraction is added to the feature map before feature extraction to obtain a new feature map, on which feature extraction is then performed further.
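The superposition can be sketched on a flat feature vector. The `block` argument stands in for a local or global self-attention layer and is a placeholder, not the layer used in the application.

```python
def residual(block, features):
    """Return features + block(features), added element by element."""
    extracted = block(features)
    return [x + y for x, y in zip(features, extracted)]


def toy_block(features):
    """Placeholder for a feature-extraction layer (doubles each value)."""
    return [2 * v for v in features]


# Two stacked stages: each adds its output to its own input.
stage1 = residual(toy_block, [1.0, 2.0, 3.0])
stage2 = residual(toy_block, stage1)
```

The identity path lets the input of each stage reach the next one unchanged, which is the usual motivation for residual connections in deep networks.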

S104, carrying out normalization processing on the first feature sequence, and calculating CTC loss.

Specifically, in order to calculate the loss function accurately, the bar code recognition device normalizes the first feature sequence to obtain, for a given image in the sample data, a recognition result and the bar code character classification probability at each position. In the recognition result, positions of the bar code image that correspond to a character are decoded into that character, and non-character positions are output as blanks.

Further, the barcode recognition device determines the CTC loss from the recognition result and the labeling result. Compared with other loss functions, the CTC loss function has certain advantages in sequence signal processing: it computes the loss between the predicted value and the labeled value without requiring the input and the output to be aligned.

S105, judging whether the training completion condition is satisfied.

Specifically, the bar code recognition device judges whether the loss value meets the training completion condition or whether the iteration number meets the training completion condition.

In some embodiments, whether the loss value satisfies the training completion condition comprises: whether the loss value is less than or equal to a loss threshold. The loss value is the loss of the STN network and/or the loss of the Transformer model to be trained.

In some embodiments, whether the number of iterations satisfies the training completion condition comprises: whether the number of iterations is greater than or equal to an iteration threshold. The number of iterations is the number of iterations of the STN network and/or of the Transformer model to be trained.

If so, the process jumps to S107. If not, the process jumps to S106.
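The control flow of S102 to S107 can be sketched as a loop with two stopping criteria. The callables `forward` and `update` are hypothetical stand-ins for the forward pass with CTC loss and for the weight update; the threshold values are arbitrary.

```python
def train(forward, update, loss_threshold, max_iterations):
    """Run the S102-S107 loop and return (iterations, final loss)."""
    iteration = 0
    while True:
        loss = forward()            # S102-S104: correction, features, CTC loss
        iteration += 1
        # S105: stop when either completion condition is met.
        if loss <= loss_threshold or iteration >= max_iterations:
            return iteration, loss  # S107: training completed
        update(loss)                # S106: update weights, then back to S102


# Toy run: the loss falls below the threshold on the third iteration.
losses = iter([1.0, 0.5, 0.05])
updates = []
result = train(lambda: next(losses), updates.append, 0.1, 10)
```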

S106, updating the weights of the STN network and the Transformer model to be trained by using the CTC loss.

Specifically, the bar code recognition device updates the weights of the STN network and the Transformer model to be trained by using the CTC loss.

The bar code recognition device back-propagates the gradient of the CTC loss to update the weights of the STN network and the Transformer model to be trained.

In some embodiments, the barcode recognition device uses the validation set to check the state and convergence of the STN network and the Transformer model to be trained during training, and finally uses the test set to evaluate their generalization performance.

Jump to S102.

S107, the training of the STN network and the Transformer model is completed.

Referring to fig. 5, fig. 5 is a flowchart illustrating another embodiment of a bar code identification method provided in the present application.

As shown in fig. 5, another embodiment of the barcode recognition method provided in the present application may specifically include the following steps:

S111, acquiring an original barcode image to be identified containing a barcode area.

Specifically, the bar code recognition device acquires an original bar code image to be identified that contains a bar code area. The original bar code image to be identified has undergone positioning and extraction operations, so that the image contains only the bar code area; further, the width information of the bar code lies along the horizontal direction of the image.

S112, inputting the original barcode image to be identified into the STN network for image correction to obtain the barcode image to be identified.

Specifically, the bar code recognition device inputs an original bar code image to be recognized into the STN network for image correction to obtain the bar code image to be recognized.

S113, inputting the bar code image to be identified into a Transformer model to obtain a second feature sequence.

Specifically, the bar code recognition device inputs the bar code image to be identified into the Transformer model, and obtains, through feature extraction, a second feature sequence corresponding to the bar code image to be identified.

S114, carrying out normalization processing on the second feature sequence to obtain a decoding result.

Specifically, the bar code recognition device performs normalization processing on the second feature sequence to obtain a decoding result.

The decoding result comprises a predicted result of the bar code and a confidence level corresponding to the decoding result.
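A minimal sketch of turning a feature sequence into a prediction and a confidence is given below, using softmax normalization followed by greedy CTC decoding. The application does not specify how the confidence is computed; taking the minimum per-character probability, using index 0 as the blank, and the toy alphabet are assumptions.

```python
import math


def softmax(logits):
    """Normalize one position of the feature sequence to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(feature_sequence, alphabet, blank=0):
    """Collapse repeats, drop blanks, and report a confidence score."""
    chars, char_probs, prev = [], [], blank
    for logits in feature_sequence:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        if best != blank and best != prev:
            chars.append(alphabet[best])
            char_probs.append(probs[best])
        prev = best
    # Assumed confidence: the weakest emitted character decides.
    confidence = min(char_probs) if char_probs else 0.0
    return "".join(chars), confidence


# Index 0 is the blank; frames read: A, A (repeat), blank, A, B.
frames = [[0, 5, 0], [0, 5, 0], [5, 0, 0], [0, 5, 0], [0, 0, 5]]
text, confidence = greedy_decode(frames, ["", "A", "B"])
```

The repeated "A" in the first two frames collapses to one character, while the blank in the third frame allows a genuine second "A" to be emitted, giving "AAB".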

Furthermore, the bar code recognition method provided by the application carries out the decoding step (bar code recognition) independently, which makes it easier to adapt to different requirements.

Referring to fig. 6, fig. 6 is a flowchart illustrating another embodiment of a bar code identification method provided in the present application.

As shown in fig. 6, another embodiment of the barcode recognition method provided in the present application may specifically include the following steps:

S121, preprocessing sample data.

S121 corresponds to S101 above and is not described again here.

S122, inputting the sample data into a Transformer model to be trained for training, and obtaining a third feature sequence.

In this embodiment, the Transformer model to be trained contains deformable convolution layers.

The inventors of the application found that the STN network degrades recognition in certain individual cases, and that the computational characteristics of the Transformer itself make inference time-consuming on certain hardware.

To solve the above problems, the inventors of the application propose adding a deformable convolution layer to the Transformer model, so as to increase the inference speed without degrading the recognition effect.

Specifically, a deformable convolution layer is added to the Transformer model, wherein the Transformer model does not perform geometric transformation operations. The deformable convolution layer samples features from the input feature map on a grid and aggregates the sampled features by a weighted sum using convolution. On the basis of a CNN, a learnable offset is added so that important regions are selected adaptively and the effective receptive field of the CNN is enlarged.

In some embodiments, in the feature extraction step, the deformable convolution layer replaces the local self-attention layer while the global self-attention layer is retained. The deformable convolution layer perceives the local feature information, so that deformed bar codes can be effectively identified to a certain extent.

In this embodiment, the Transformer model also adopts a residual structure: the feature map before feature extraction and the feature map after feature extraction (via the deformable convolution layer and the global self-attention layer) are added together, and feature extraction is then performed further.

In general, the kernel shape of a convolution layer is fixed as a rectangle, which limits the ability of convolution to perceive complex shapes. Although stacking convolution layers gradually enlarges the receptive field corresponding to each pixel of the deep feature map, that receptive field is still a fixed rectangle. For this reason, the deformable convolution layer adds an extra convolution layer at the original convolution position, allowing the sampling positions of the original kernel to be adjusted further in the horizontal and vertical directions. The adjustment is made by adding an offset to each kernel position; the offsets are the output of the additional convolution layer and are obtained through learning.
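The offset mechanism can be sketched for a single output position of a 3×3 kernel. In a real layer the offsets come from the additional convolution layer; here they are passed in directly, and the function names and zero padding outside the map are assumptions.

```python
import math


def bilinear(feature, y, x):
    """Sample a 2-D feature map at a fractional location (zero outside)."""
    height, width = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < height and 0 <= xx < width:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feature[yy][xx]
    return value


def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """Output at (cy, cx); offsets[i][j] = (dy, dx) shifts kernel tap (i, j)."""
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            out += kernel[i][j] * bilinear(feature, cy + i - 1 + dy, cx + j - 1 + dx)
    return out


feature = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ones = [[1.0] * 3 for _ in range(3)]
zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
plain = deformable_conv_at(feature, ones, zero_offsets, 1, 1)
half_step = bilinear(feature, 0, 0.5)
```

With all offsets at zero the result equals an ordinary 3×3 convolution (45 for this map); non-zero, fractional offsets move each tap off the fixed rectangular grid, and bilinear interpolation keeps the sampled value well defined.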

S123, carrying out normalization processing on the third feature sequence, and calculating CTC loss.

S123 corresponds to S104, and will not be described herein.

S124, judging whether the training completion condition is satisfied.

S124 corresponds to S105, and is not described herein.

If so, the process goes to S126. If not, the process goes to S125.

S125, updating the weights of the Transformer model to be trained by using the CTC loss.

Specifically, the bar code recognition device updates the weights of the Transformer model to be trained by using the CTC loss.

The bar code recognition device back-propagates the gradient of the CTC loss to update the weights of the Transformer model to be trained.

In some embodiments, the barcode recognition device uses the validation set to check the state and convergence of the Transformer model to be trained during training, and finally uses the test set to evaluate the generalization performance of the Transformer model to be trained.

Jump to S122.

S126, the training of the Transformer model is completed.

Referring to fig. 7, fig. 7 is a flowchart illustrating another embodiment of a bar code identification method provided in the present application.

As shown in fig. 7, another embodiment of the barcode recognition method provided in the present application may specifically include the following steps:

S131, obtaining a bar code image to be identified, wherein the bar code image comprises a bar code area.

S131 corresponds to S111, and will not be described herein.

S132, inputting the bar code image to be identified into a Transformer model to obtain a fourth feature sequence.

Specifically, the bar code recognition device inputs the bar code image to be identified into the Transformer model to obtain a fourth feature sequence.

S134, carrying out normalization processing on the fourth feature sequence to obtain a decoding result.

Specifically, the bar code recognition device performs normalization processing on the fourth feature sequence to obtain a decoding result.

Likewise, the decoding result includes a prediction result for the barcode, and a confidence level corresponding to the decoding result.

With continued reference to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of a terminal device provided in the present application. The terminal device 500 of the embodiment of the present application includes a processor 51 and a memory 52.

The processor 51 and the memory 52 are connected to a bus; the memory 52 stores program data, and the processor 51 is configured to execute the program data to implement the barcode recognition method according to the above embodiment.

In the present embodiment, the processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capabilities. The processor 51 may also be a general purpose processor, a digital signal processor (DSP, Digital Signal Processor), an application specific integrated circuit (ASIC, Application Specific Integrated Circuit), a field programmable gate array (FPGA, Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general purpose processor may be a microprocessor, or the processor 51 may be any conventional processor or the like.

Still further, referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of the computer storage medium provided in the present application. The computer storage medium 600 stores program data 61, and the program data 61, when executed by a processor, implements the barcode recognition method of the above embodiment.

When the embodiments of the present application are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

The foregoing is merely an embodiment of the present application and does not limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the description and drawings of the present application, or any direct or indirect application in other related technical fields, likewise falls within the patent protection scope of the present application.

Claims

1. A method of identifying a barcode, the method comprising:

acquiring a bar code image to be identified;

2. The method according to claim 1, wherein the Transformer model includes an encoding network and a decoding network connected in sequence, and the inputting the barcode image to be identified into the Transformer model to obtain the identification result includes:

inputting the bar code image to be identified into the encoding network for feature encoding to obtain encoded features;

and inputting the encoded features into the decoding network for decoding so as to obtain the identification result.

3. The method according to claim 2, wherein the encoding network includes a downsampling layer, the inputting the barcode image to be identified into the encoding network for feature encoding to obtain encoded features, comprising:

performing vector embedding and local feature extraction on the bar code image to be identified to obtain features corresponding to the bar code image to be identified;

inputting the characteristics into the downsampling layer for downsampling so as to obtain downsampled characteristics;

and carrying out layer normalization on the downsampled features to obtain the encoded features.

4. The method according to claim 3, characterized in that the convolution kernel in the downsampling layer is 3×3, the vertical stride is 2, and the horizontal stride is 1.

5. The method of claim 2, wherein the encoding network and the decoding network contain a self-attention mechanism.

6. The method of claim 5, wherein inputting the encoded features into the decoding network for decoding to obtain the recognition result comprises:

performing feature decoding on the encoded features to obtain at least one piece of local self-attention decoding information and at least one piece of global self-attention decoding information, respectively;

and carrying out normalization processing on the characteristic sequence to obtain the identification result.

7. The method of claim 1, wherein the step of acquiring the barcode image to be identified comprises:

acquiring an original bar code image to be identified;

8. The method according to claim 1, wherein the barcode image to be identified contains labeling results, and the step of acquiring the barcode image to be identified further comprises:

and utilizing the CTC loss to carry out parameter adjustment on the Transformer model.

9. Terminal equipment, characterized in that it comprises a processor, a memory connected to the processor, wherein,

the memory stores program instructions;

the processor is configured to execute program instructions stored in the memory to implement the method of any one of claims 1 to 8.

10. A computer readable storage medium, characterized in that the storage medium stores program instructions which, when executed, implement the method of any one of claims 1 to 8.