CN110147788B - Feature enhancement CRNN-based metal plate strip product label character recognition method - Google Patents


Info

Publication number
CN110147788B
Authority
CN
China
Prior art keywords
convolution
neural network
feature
training
layer
Prior art date
Legal status
Active
Application number
CN201910448218.6A
Other languages
Chinese (zh)
Other versions
CN110147788A (en)
Inventor
刘士新
郭文瑞
陈大力
赖峰
Current Assignee
Northeastern University China
Original Assignee
Northeastern University China
Priority date
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201910448218.6A priority Critical patent/CN110147788B/en
Publication of CN110147788A publication Critical patent/CN110147788A/en
Application granted granted Critical
Publication of CN110147788B publication Critical patent/CN110147788B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/14 Image acquisition
    • G06V 30/148 Segmentation of character regions
    • G06V 30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a feature-enhanced CRNN-based metal plate strip product label character recognition method, which comprises the following steps: preparing a picture database; preparing a recognition dictionary; preprocessing and expanding a training library; designing and establishing a feature-enhanced deep convolutional recurrent neural network for the label characters of metal plate strip products in steel-industry applications; training the network multiple times with the training example pictures in the training library; and recognizing the characters on a metal plate strip product label based on the output value of the last-stage neural network architecture in the trained model. Through analysis of a large number of metal plate strip product labels photographed on steel-industry sites, the invention enhances the features of the original CRNN character recognition network from a practical standpoint so that it learns more accurate features, and its recognition results in real scenes are highly reliable.

Description

Feature enhancement CRNN-based metal plate strip product label character recognition method
Technical Field
The invention relates to the technical field of image processing and deep learning, in particular to a feature-enhanced CRNN-based metal plate strip product label character recognition method.
Background
Compared with conventional applications, steel-industry applications suffer from a harsh industrial field environment, and character recognition is very sensitive to external conditions. Product label pictures shot in industrial scenes exhibit complex backgrounds, artistic fonts, low resolution, non-uniform illumination, image degradation, character deformation, mixed languages, and complex text layouts, under which recognition accuracy falls far short of application expectations. Deep-learning-based character recognition offers a new opportunity for industrial applications, but no recognition method targets the specific background of metal plate strip product labels. Because of severe label-quality problems, such as closely spaced letters, blurred characters, and poor distinguishability between the digit 1, the lowercase letter l, and the uppercase letter I, existing character recognition techniques struggle to predict and distinguish such characters, yielding low recognition precision and poor reliability. With existing methods performing poorly and remaining unadopted, a new technique is urgently needed to fill this gap.
Disclosure of Invention
To address the problems in the prior art, the invention discloses a feature-enhanced CRNN-based metal plate strip product label character recognition method. The technical solution adopted by the invention is as follows:
a feature enhancement CRNN-based metal plate strip product label character recognition method comprises the following steps:
S1, preparing a picture database, wherein pictures in the picture database are derived from metal plate strip product label pictures shot in an industrial field;
cutting the regions containing characters out of the photographed metal plate strip product label pictures to obtain a plurality of small pictures, with the character rows in each small picture oriented horizontally;
each small picture corresponds to a txt file of the same name that stores the character information in the small picture;
each small picture together with its corresponding txt file is called one item of training data, and all the training data form a database;
S2, preparing a recognition dictionary: traversing each character in each txt file in the database and adding it to the original recognition dictionary, so that every character in the training data can be recognized; the recognition dictionary is obtained after de-duplication processing;
the original recognition dictionary contains 1050 characters, mainly English letters, Chinese and English punctuation, and some common Chinese characters.
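As an illustration only, a minimal Python sketch of this dictionary-preparation step follows; the directory layout, the file encoding, and the base-dictionary file are assumptions of this sketch, not details given by the patent.

```python
# Sketch of step S2: traverse the txt label files, extend the original
# 1050-character dictionary with any unseen characters, and de-duplicate.
# `db_dir` and `base_dict_path` are hypothetical placeholders.
from pathlib import Path

def build_recognition_dictionary(db_dir: str, base_dict_path: str) -> list:
    chars = list(Path(base_dict_path).read_text(encoding="utf-8").strip())
    seen = set(chars)
    for txt_file in sorted(Path(db_dir).glob("*.txt")):
        for ch in txt_file.read_text(encoding="utf-8").strip():
            if ch not in seen:          # de-duplication
                seen.add(ch)
                chars.append(ch)
    return chars                        # the index doubles as the class id
```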
S3, preprocessing each small picture in the database to obtain training example pictures, which form a training library;
after preprocessing, each training example picture further undergoes contrast adjustment, brightness adjustment, length stretching, and similar transformations to expand the training library;
the preprocessing comprises the following steps:
processing each small picture in the database into a single-channel grayscale image;
forcibly rescaling the height of the single-channel grayscale image to 32 pixels, with the width scaled freely by the same ratio as the height;
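For concreteness, a hedged OpenCV sketch of the preprocessing and expansion steps is given below; the augmentation ranges are illustrative assumptions, since the patent does not specify them.

```python
# Sketch of step S3: grayscale conversion, forced height-32 rescaling with
# proportional width, plus contrast / brightness / length-stretch expansion.
import cv2
import numpy as np

def preprocess(img_bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)   # single channel
    h, w = gray.shape
    new_w = max(1, int(round(w * 32.0 / h)))           # keep aspect ratio
    return cv2.resize(gray, (new_w, 32))               # height forced to 32

def expand(gray: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    alpha = rng.uniform(0.8, 1.2)                      # contrast (assumed range)
    beta = rng.uniform(-20.0, 20.0)                    # brightness (assumed range)
    out = cv2.convertScaleAbs(gray, alpha=alpha, beta=beta)
    stretch = rng.uniform(0.9, 1.3)                    # length stretch (assumed)
    new_w = max(1, int(round(out.shape[1] * stretch)))
    return cv2.resize(out, (new_w, out.shape[0]))
```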
S4, designing and establishing a feature-enhanced deep convolutional recurrent neural network (CRNN) for the label characters of metal plate strip products in steel-industry applications, wherein the feature-enhanced deep convolutional recurrent neural network comprises a multi-stage neural network architecture in which two special neural network architectures are arranged to realize multi-scale feature enhancement;
S5: training the feature-enhanced deep convolutional recurrent neural network multiple times with the training example pictures in the training library, and adjusting the parameters of the multi-stage neural network architecture according to a set learning rate during training, so as to obtain a deep convolutional recurrent neural network model for recognizing metal plate strip product label characters in steel-industry applications;
S6: recognizing the characters on a metal plate strip product label in a steel-industry application based on the output value of the final-stage neural network architecture of the trained deep convolutional recurrent neural network model obtained in step S5.
The multi-stage neural network architecture comprises 10 modules: modules 1 to 7 are conventional convolution modules, with a maximum pooling (MaxPool) operation added in modules 1, 2, 4, and 6 and a Batch Normalization (BN) operation added in modules 3, 5, and 7;
the two special neural network architectures are modules 8 and 9, which are regional feature enhancement convolution modules (EFEM), called the EFEM_a module and the EFEM_b module respectively;
and module 10 is the result output layer, consisting of a bidirectional recurrent neural network.
The EFEM_a module consists of a deformable convolution layer, a ReLU activation layer, and a maximum pooling layer; the convolution process applied inside the EFEM_a module to the features passed from the previous layer is as follows:
first, the features passed from the previous layer undergo feature extraction by a deformable convolution kernel of size 3 × 3, and the output values are then fed into 4 parallel branches and a residual branch for relearning;
the first of the 4 parallel branches consists of convolution layers with kernel sizes 1 × 1 and 3 × 3; the second branch consists of convolution layers with kernel sizes 1 × 1 and 3 × 1 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the third branch consists of convolution layers with kernel sizes 1 × 1 and 1 × 3 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the fourth branch consists of convolution layers with kernel sizes 1 × 1, 1 × 3, and 3 × 1 followed by a dilated convolution with dilation rate 5 and kernel size 3 × 3; the outputs of the 4 parallel branches are concatenated and fed into a convolution layer with kernel size 3 × 3 for feature refinement, then passed through a deformable convolution layer with kernel size 3 × 3 and a convolution layer with kernel size 1 × 1, and the output is x0;
the residual branch uses a convolution layer with kernel size 1 × 1, and its output is x1; finally, the output x0 of the 4 parallel branches and the output x1 of the residual branch are added according to the ratio scale1 to obtain x, which satisfies the following formula:
x = x0 · scale1 + x1
x then undergoes further feature extraction through a convolution layer with kernel size 1 × 1, a ReLU activation layer, and a maximum pooling layer.
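A PyTorch sketch of the EFEM_a topology follows. Only the branch structure is taken from the text; the channel width c, the paddings, and the 2 × 2 pooling are assumptions, and the deformable convolution comes from torchvision.ops, with its offsets predicted by a zero-initialized convolution consistent with the initialization described later.

```python
# Sketch of the EFEM_a module; channel count c and pooling size are assumed.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformConv3x3(nn.Module):
    """3x3 deformable conv; offsets come from a zero-initialized 3x3 conv."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.offset = nn.Conv2d(c_in, 18, 3, padding=1)  # 2*3*3 offset maps
        nn.init.zeros_(self.offset.weight)
        nn.init.zeros_(self.offset.bias)
        self.conv = DeformConv2d(c_in, c_out, 3, padding=1)

    def forward(self, x):
        return self.conv(x, self.offset(x))

def conv(c_in, c_out, k, dilation=1):
    """'Same'-padded conv for square or 1xN/Nx1 kernels, optionally dilated."""
    kh, kw = (k, k) if isinstance(k, int) else k
    pad = (dilation * (kh - 1) // 2, dilation * (kw - 1) // 2)
    return nn.Conv2d(c_in, c_out, (kh, kw), padding=pad, dilation=dilation)

class EFEM_a(nn.Module):
    def __init__(self, c, scale1=0.3):
        super().__init__()
        self.scale1 = scale1
        self.entry = DeformConv3x3(c, c)                 # deformable 3x3
        self.b1 = nn.Sequential(conv(c, c, 1), conv(c, c, 3))
        self.b2 = nn.Sequential(conv(c, c, 1), conv(c, c, (3, 1)),
                                conv(c, c, 3, dilation=3))
        self.b3 = nn.Sequential(conv(c, c, 1), conv(c, c, (1, 3)),
                                conv(c, c, 3, dilation=3))
        self.b4 = nn.Sequential(conv(c, c, 1), conv(c, c, (1, 3)),
                                conv(c, c, (3, 1)), conv(c, c, 3, dilation=5))
        self.refine = nn.Sequential(conv(4 * c, c, 3),   # feature refinement
                                    DeformConv3x3(c, c), conv(c, c, 1))
        self.residual = conv(c, c, 1)
        self.out = nn.Sequential(conv(c, c, 1), nn.ReLU(inplace=True),
                                 nn.MaxPool2d(2))

    def forward(self, f):
        t = self.entry(f)
        x0 = self.refine(torch.cat([self.b1(t), self.b2(t),
                                    self.b3(t), self.b4(t)], dim=1))
        x1 = self.residual(t)
        return self.out(x0 * self.scale1 + x1)           # x = x0*scale1 + x1
```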
The EFEM_b module consists of a convolution layer, a ReLU activation layer, and a maximum pooling layer; the convolution process applied inside the EFEM_b module to the features passed from the previous layer is as follows:
the features passed from the previous layer serve as the input of the EFEM_b module and are fed into 3 parallel branches and a residual branch for relearning;
the first of the 3 parallel branches consists of convolution layers with kernel sizes 1 × 1 and 3 × 3; the second branch consists of convolution layers with kernel sizes 1 × 1 and 3 × 3 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the third branch consists of a convolution layer with kernel size 1 × 1, two convolution layers with kernel size 3 × 3, and a dilated convolution with dilation rate 3 and kernel size 3 × 3; the outputs of the 3 branches are concatenated and fed into a convolution layer with kernel size 1 × 1 for feature refinement, and the output is x2;
the residual branch uses a convolution layer with kernel size 1 × 1, and its output is x3; finally, the output x2 of the 3 parallel branches and the output x3 of the residual branch are added according to the ratio scale2 to obtain x, which satisfies the following formula:
x = x2 · scale2 + x3
x then undergoes feature extraction through a ReLU activation layer and a maximum pooling layer.
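A companion sketch of EFEM_b follows, reusing the conv helper from the EFEM_a sketch above; the same caveats about the assumed channel width and pooling size apply.

```python
# Sketch of the EFEM_b module (conv helper as defined in the EFEM_a sketch).
import torch
import torch.nn as nn

class EFEM_b(nn.Module):
    def __init__(self, c, scale2=0.3):
        super().__init__()
        self.scale2 = scale2
        self.b1 = nn.Sequential(conv(c, c, 1), conv(c, c, 3))
        self.b2 = nn.Sequential(conv(c, c, 1), conv(c, c, 3),
                                conv(c, c, 3, dilation=3))
        self.b3 = nn.Sequential(conv(c, c, 1), conv(c, c, 3), conv(c, c, 3),
                                conv(c, c, 3, dilation=3))
        self.refine = conv(3 * c, c, 1)                  # 1x1 refinement
        self.residual = conv(c, c, 1)
        self.out = nn.Sequential(nn.ReLU(inplace=True), nn.MaxPool2d(2))

    def forward(self, f):
        x2 = self.refine(torch.cat([self.b1(f), self.b2(f), self.b3(f)], 1))
        x3 = self.residual(f)
        return self.out(x2 * self.scale2 + x3)           # x = x2*scale2 + x3
```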
In step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, Connectionist Temporal Classification (CTC) is used as the loss function and the Adam algorithm is used as the learning algorithm.
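A minimal sketch of one CTC/Adam training step is shown below; it assumes a model that emits per-frame scores of shape (T, N, num_classes) with class 0 reserved as the CTC blank, which is a common convention, not something the patent states.

```python
# One training step with CTC loss and the Adam optimizer (PyTorch).
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, targets, target_lengths):
    optimizer.zero_grad()
    log_probs = model(images).log_softmax(2)       # (T, N, C) log-probabilities
    T, N, _ = log_probs.shape
    input_lengths = torch.full((N,), T, dtype=torch.long)
    loss = F.ctc_loss(log_probs, targets, input_lengths,
                      target_lengths, blank=0)
    loss.backward()
    optimizer.step()
    return loss.item()

# e.g. optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
#                                   betas=(0.9, 0.99))
```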
In step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, the learning rate of each training round is less than or equal to that of the previous round.
In step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, the weights of the network are initialized with the "Xavier" method, and the offset layer values of all deformable convolution layers are initialized to 0.
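The initialization scheme can be sketched as follows; detecting the offset layers by an "offset" attribute name matches the EFEM_a sketch above and is an assumption of this illustration.

```python
# Xavier initialization for conv/linear weights; zero initialization for the
# offset-prediction layers of all deformable convolutions.
import torch.nn as nn

def init_weights(net: nn.Module):
    for name, m in net.named_modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            if "offset" in name:                 # deformable-conv offset layer
                nn.init.zeros_(m.weight)
            else:
                nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)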
In step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, the output of each convolution layer in the network is passed through the ReLU activation function before being transmitted to the next layer of neurons.
Each character in the recognition dictionary appears with equal probability in the training example library, which is split in a 9:1 ratio into a training set and a validation set for the feature-enhanced deep convolutional recurrent neural network.
Compared with the prior art, the invention has the beneficial effects that:
according to the method, through analysis of a large number of metal plate strip product labels shot in a steel industry field, more accurate feature learning is realized through feature enhancement of an original character recognition network CRNN from the practical standpoint, the method has very excellent prediction capability on adjacent characters, fuzzy characters and characters with high similarity and extremely poor recognition, and the recognition result in a real scene has very high reliability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a feature-enhanced CRNN-based metal plate strip product label character recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic view of an original metal plate strip product label before text-region cutting according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a cut thumbnail and text information in a txt file according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating pre-and post-processing comparison of small pictures in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an EFEM _ a module according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an EFEM _ b module in accordance with an embodiment of the present invention;
FIG. 7 is a schematic diagram of a feature enhanced deep convolutional recurrent neural network architecture used in an embodiment of the present invention;
FIG. 8 is a diagram illustrating the loss and accuracy of the training process according to an embodiment of the present invention;
fig. 9 is a schematic diagram illustrating comparison of recognition effects of characters with low recognition degree according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, a feature-enhanced CRNN-based metal plate strip product label character recognition method is characterized in that: preparing a large number of product label training pictures according to actual conditions in an industrial environment; establishing a feature enhancement CRNN neural network; training a neural network based on a large number of prepared pictures; performing character recognition by using the trained feature enhancement CRNN model;
the identification method comprises the following steps:
S1, preparing a picture database, wherein pictures in the picture database are derived from metal plate strip product label pictures shot in an industrial field;
cutting the regions containing characters out of a photographed metal plate strip product label picture (shown in fig. 2) to obtain a plurality of small pictures, with the character rows in each small picture oriented horizontally;
each small picture corresponds to a txt file of the same name that stores the character information in the small picture;
each small picture together with its corresponding txt file is called one item of training data (as shown in fig. 3); all the training data form the database, which finally contains 17386 items;
S2, preparing a recognition dictionary: traversing each character in each txt file in the database and adding it to the original recognition dictionary; the recognition dictionary is obtained after de-duplication processing;
the original recognition dictionary contains 1050 characters, mainly English letters, Chinese and English punctuation, and some common Chinese characters;
S3, preprocessing each small picture in the database to obtain training example pictures, which form a training library;
after preprocessing, each training example picture undergoes contrast adjustment, brightness adjustment, and length stretching to expand the training library, which is then split in a 9:1 ratio into a training set and a validation set for the feature-enhanced deep convolutional recurrent neural network;
the pretreatment comprises the following steps:
processing each small picture in the database into a single-channel gray-scale image;
the height of the single-channel grayscale image is forcibly rescaled to 32 pixels, with the width scaled freely by the same ratio as the height, as shown in fig. 4;
S4, designing and establishing a feature-enhanced deep convolutional recurrent neural network (CRNN) for the label characters of metal plate strip products in steel-industry applications; the network comprises a multi-stage neural network architecture. In view of the closely spaced printed characters in steel-industry applications and the poor quality of pictures shot in the actual environment, special convolution modules are added to the general feature extraction operations for feature enhancement; that is, two special neural network architectures are arranged in the multi-stage architecture to realize multi-scale feature enhancement;
two special neural network architectures (EFEM _ a and EFEM _ b) feature enhancements:
EFEM_a module: the values input to the module first undergo feature extraction by a deformable convolution kernel of size 3 × 3, and the output values are then fed into 4 parallel branches and one residual branch for relearning. The first of the 4 parallel branches consists of convolution layers with kernel sizes 1 × 1 and 3 × 3; the second branch consists of convolution layers with kernel sizes 1 × 1 and 3 × 1 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the third branch consists of convolution layers with kernel sizes 1 × 1 and 1 × 3 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the fourth branch consists of convolution layers with kernel sizes 1 × 1, 1 × 3, and 3 × 1 followed by a dilated convolution with dilation rate 5 and kernel size 3 × 3. The outputs of the 4 parallel branches are concatenated, fed into a convolution layer with kernel size 3 × 3 for feature refinement, then passed through a deformable convolution layer with kernel size 3 × 3 and a convolution layer with kernel size 1 × 1 to give output x0. The residual branch uses a convolution layer with kernel size 1 × 1 and outputs x1. Finally, x0 and x1 are added according to the ratio scale1 of 0.3 to obtain x (x = x0 · scale1 + x1), which then passes through a convolution layer with kernel size 1 × 1, a ReLU activation layer, and a maximum pooling layer. FIG. 5 is a schematic structural diagram of the EFEM_a module.
EFEM_b module: the values input to the module are fed into 3 parallel branches and one residual branch for relearning. The first of the 3 parallel branches consists of convolution layers with kernel sizes 1 × 1 and 3 × 3; the second branch consists of convolution layers with kernel sizes 1 × 1 and 3 × 3 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the third branch consists of a convolution layer with kernel size 1 × 1, two convolution layers with kernel size 3 × 3, and a dilated convolution with dilation rate 3 and kernel size 3 × 3. The outputs of the 3 branches are concatenated, fed into a convolution layer with kernel size 1 × 1 for feature refinement, and the output is x2. The residual branch uses a convolution layer of size 1 × 1 and outputs x3. Finally, x2 and x3 are added according to the ratio scale2 of 0.3 to obtain x (x = x2 · scale2 + x3), which then undergoes feature extraction through a ReLU activation layer and maximum pooling. FIG. 6 is a schematic structural diagram of the EFEM_b module.
As shown in fig. 7, the network structure of the feature-enhanced CRNN used is as follows:
an input grayscale picture with a fixed height of 32 pixels first passes through a convolution kernel of size 3 × 3 with 64 output channels, then through ReLU activation into a maximum pooling layer. The result is split into two branches: the first branch is sent to the first regional feature enhancement module EFEM_a to obtain feature map x4, while the second branch enters a convolution layer with kernel size 3 × 3 to obtain a 128-channel feature map, which is activated and fed into a maximum pooling layer. The resulting feature map is again split into two branches for feature extraction: the first branch is sent to the second regional feature enhancement module EFEM_b to obtain feature map x5, while the second branch enters a convolution layer with kernel size 3 × 3 to obtain a 256-channel feature map, which after activation undergoes Batch Normalization (BN) to give feature map x6. The features are then added according to the ratio scale3, i.e. x = x4 · scale3 + x6. The summed features are sent into a convolution layer with kernel size 3 × 3 and 256 output channels, then through ReLU activation into a maximum pooling layer to obtain feature map x7, and the features are added according to the ratio scale4, i.e. x = x5 · scale4 + x7. The summed feature map then passes in turn through 3 convolution layers with 3 × 3 kernels and 512 channels to obtain a feature sequence, which passes through two bidirectional recurrent neural network layers to predict the character corresponding to each frame of the feature sequence.
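A condensed sketch of this trunk is given below, building on the EFEM_a and EFEM_b sketches above. The patent does not spell out how channel and spatial sizes are reconciled at the two fusion points, so the 1 × 1 adapter convolutions and the pooling placements here are assumptions made purely so the shapes line up; the bidirectional recurrent head is omitted.

```python
# Sketch of the feature-enhanced CRNN trunk with its two scaled fusions.
import torch
import torch.nn as nn

class FeatureEnhancedTrunk(nn.Module):
    def __init__(self, scale3=0.6, scale4=0.6):
        super().__init__()
        self.scale3, self.scale4 = scale3, scale4
        self.stem = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1),
                                  nn.ReLU(True), nn.MaxPool2d(2))
        self.efem_a = EFEM_a(64)                    # first enhancement branch
        self.adapt_a = nn.Conv2d(64, 256, 1)        # assumed channel adapter
        self.conv2 = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1),
                                   nn.ReLU(True), nn.MaxPool2d(2))
        self.efem_b = EFEM_b(128)                   # second enhancement branch
        self.adapt_b = nn.Conv2d(128, 256, 1)       # assumed channel adapter
        self.conv3 = nn.Sequential(nn.Conv2d(128, 256, 3, padding=1),
                                   nn.ReLU(True), nn.BatchNorm2d(256))
        self.conv4 = nn.Sequential(nn.Conv2d(256, 256, 3, padding=1),
                                   nn.ReLU(True), nn.MaxPool2d(2))
        self.conv5 = nn.Sequential(                 # three 512-channel convs
            nn.Conv2d(256, 512, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(512, 512, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(512, 512, 3, padding=1), nn.ReLU(True))

    def forward(self, img):                         # img: (N, 1, 32, W)
        s = self.stem(img)
        x4 = self.adapt_a(self.efem_a(s))           # enhanced branch 1
        t = self.conv2(s)
        x5 = self.adapt_b(self.efem_b(t))           # enhanced branch 2
        x6 = self.conv3(t)
        x = x4 * self.scale3 + x6                   # x = x4*scale3 + x6
        x7 = self.conv4(x)
        x = x5 * self.scale4 + x7                   # x = x5*scale4 + x7
        return self.conv5(x)                        # feature-sequence source
```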
S5: the network structure of the feature-enhanced CRNN is trained multiple times using the training example pictures in the training library. In this embodiment, the training is divided into three stages, with a different learning rate set in each stage, until the loss and accuracy remain unchanged. The following settings are adopted when training the neural network architectures of the feature-enhanced CRNN:
A. Connectionist Temporal Classification (CTC) is used as the loss function;
B. the Adam algorithm is used as the optimization algorithm of the model;
C. the first stage adjusts the parameters of the multi-stage neural network architecture with a learning rate of 0.001, the exponential-decay rates (betas) of the Adam optimizer set to [0.9, 0.99], and scale1, scale2, scale3, and scale4 all set to 0.1, training until the network's training loss and the validation-set accuracy hold steady; the second stage adjusts the parameters with a learning rate of 0.0001 and the Adam betas set to [0.89, 0.99], with scale1 and scale2 unchanged and scale3 and scale4 increased, training again until the training loss and validation accuracy hold steady, by which point the validation accuracy has improved greatly and the training loss has dropped greatly; the third stage fine-tunes the multi-stage architecture with a learning rate of 0.00001 and Adam betas in the range [0.88, 0.99], with scale1 and scale2 increased to 0.3 and scale3 and scale4 increased to 0.6, iterating to obtain the final model (the three stages are summarized in the sketch after this list);
D. after the 3rd, 5th, and 7th convolution layers, the values output by the convolution are normalized using Batch Normalization (BN), which reduces the risk of overfitting and speeds up network training;
E. the network weights are initialized with the "Xavier" method, which derives from the 2010 paper 'Understanding the difficulty of training deep feedforward neural networks'; this weight initialization helps accelerate the convergence of network training. Meanwhile, the offset layer values of all deformable convolutions are initialized to 0;
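The staged settings in item C can be captured as a small table of hyper-parameters; the stage-two value for scale3/scale4 is not stated in the text and is marked as an assumption.

```python
# Three-stage training schedule as described in item C above.
STAGES = [
    dict(lr=1e-3, betas=(0.90, 0.99), scale12=0.1, scale34=0.1),
    dict(lr=1e-4, betas=(0.89, 0.99), scale12=0.1, scale34=0.3),  # 0.3 assumed
    dict(lr=1e-5, betas=(0.88, 0.99), scale12=0.3, scale34=0.6),
]
```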
through the training process, a deep convolution cyclic neural network model for carrying out character recognition on the metal plate strip product label characters in the application of the steel industry is obtained;
S6: characters on the metal plate strip product label in the steel-industry application are recognized based on the output value of the final-stage neural network architecture of the trained deep convolutional recurrent neural network model obtained in step S5, as shown in fig. 8.
This scheme provides a data-driven, feature-enhanced CRNN recognition method for metal plate strip product label characters. A database obtained by processing field-photographed metal plate strip product label pictures serves as the training set, and the final model is trained with Xavier weight initialization, the CTC loss function, and the Adam optimization algorithm; accuracy can be further improved by acquiring more training data.
Through deep learning, the method automatically updates the network parameters at each iteration, autonomously learning the required features from the training data and completing character recognition; accuracy keeps improving as the amount of training data grows, greatly improving recognition accuracy and reliability in industrial application scenes. In tests, the model's recognition accuracy on 1500 pictures reaches about 95%, roughly 10% higher than the previous recognition network, with particularly good support for closely spaced characters, blurred characters, and highly similar, hard-to-distinguish characters (as shown in fig. 9). This deep-learning-based character recognition method can quickly and accurately recognize metal plate strip product label characters in the steel industry and can be applied in the steel-industry field and beyond.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A feature enhancement CRNN-based metal plate strip product label character recognition method is characterized by comprising the following steps:
S1, preparing a picture database, wherein pictures in the picture database are derived from metal plate strip product label pictures shot in an industrial field;
cutting the regions containing characters out of the photographed metal plate strip product label pictures to obtain a plurality of small pictures, with the character rows in each small picture oriented horizontally;
each small picture corresponds to a txt file of the same name that stores the character information in the small picture;
each small picture together with its corresponding txt file is called one item of training data, and all the training data form a database;
S2, preparing a recognition dictionary: traversing each character in each txt file in the database and adding it to the original recognition dictionary; the recognition dictionary is obtained after de-duplication processing;
S3, preprocessing each small picture in the database to obtain training example pictures, which form a training library;
after preprocessing, each training example picture undergoes contrast adjustment, brightness adjustment, and length stretching to expand the training library;
the preprocessing comprises the following steps:
processing each small picture in the database into a single-channel gray-scale image;
forcibly rescaling the height of the single-channel grayscale image to 32 pixels, with the width scaled freely by the same ratio as the height;
S4, designing and establishing a feature-enhanced deep convolutional recurrent neural network for the label characters of metal plate strip products in steel-industry applications, wherein the feature-enhanced deep convolutional recurrent neural network comprises a multi-stage neural network architecture in which two special neural network architectures are arranged to realize multi-scale feature enhancement;
S5: training the feature-enhanced deep convolutional recurrent neural network multiple times with the training example pictures in the training library, and adjusting the parameters of the multi-stage neural network architecture according to a set learning rate during training, so as to obtain a deep convolutional recurrent neural network model for recognizing metal plate strip product label characters in steel-industry applications;
S6: recognizing the characters on a metal plate strip product label in a steel-industry application based on the output value of the final-stage neural network architecture of the trained deep convolutional recurrent neural network model obtained in step S5;
the multi-stage neural network architecture comprises 10 modules: modules 1 to 7 are conventional convolution modules, with a maximum pooling operation added in modules 1, 2, 4, and 6 and a batch normalization operation added in modules 3, 5, and 7;
the two special neural network architectures are modules 8 and 9, which are regional feature enhancement convolution modules called the EFEM_a module and the EFEM_b module respectively;
and module 10 is the result output layer, consisting of a bidirectional recurrent neural network.
2. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 1, wherein
the EFEM_a module consists of a deformable convolution layer, a ReLU activation layer, and a maximum pooling layer, and the convolution process applied inside the EFEM_a module to the features passed from the previous layer is as follows:
first, the features passed from the previous layer undergo feature extraction by a deformable convolution kernel of size 3 × 3, and the output values are then fed into 4 parallel branches and a residual branch for relearning;
the first of the 4 parallel branches consists of convolution layers with kernel sizes 1 × 1 and 3 × 3; the second branch consists of convolution layers with kernel sizes 1 × 1 and 3 × 1 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the third branch consists of convolution layers with kernel sizes 1 × 1 and 1 × 3 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the fourth branch consists of convolution layers with kernel sizes 1 × 1, 1 × 3, and 3 × 1 followed by a dilated convolution with dilation rate 5 and kernel size 3 × 3; the outputs of the 4 parallel branches are concatenated and fed into a convolution layer with kernel size 3 × 3 for feature refinement, then passed through a deformable convolution layer with kernel size 3 × 3 and a convolution layer with kernel size 1 × 1, and the output is x0;
the residual branch uses a convolution layer with kernel size 1 × 1, and its output is x1; finally, the output x0 of the 4 parallel branches and the output x1 of the residual branch are added according to the ratio scale1 to obtain x, which satisfies the following formula:
x = x0 · scale1 + x1
x then undergoes further feature extraction through a convolution layer with kernel size 1 × 1, a ReLU activation layer, and a maximum pooling layer.
3. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 2, wherein
the EFEM_b module consists of a convolution layer, a ReLU activation layer, and a maximum pooling layer, and the convolution process applied inside the EFEM_b module to the features passed from the previous layer is as follows:
the features passed from the previous layer serve as the input of the EFEM_b module and are fed into 3 parallel branches and a residual branch for relearning;
the first of the 3 parallel branches consists of convolution layers with kernel sizes 1 × 1 and 3 × 3; the second branch consists of convolution layers with kernel sizes 1 × 1 and 3 × 3 followed by a dilated convolution with dilation rate 3 and kernel size 3 × 3; the third branch consists of a convolution layer with kernel size 1 × 1, two convolution layers with kernel size 3 × 3, and a dilated convolution with dilation rate 3 and kernel size 3 × 3; the outputs of the 3 branches are concatenated and fed into a convolution layer with kernel size 1 × 1 for feature refinement, and the output is x2;
the residual branch uses a convolution layer with kernel size 1 × 1, and its output is x3; finally, the output x2 of the 3 parallel branches and the output x3 of the residual branch are added according to the ratio scale2 to obtain x, which satisfies the following formula:
x = x2 · scale2 + x3
x then undergoes feature extraction through a ReLU activation layer and a maximum pooling layer.
4. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 3, wherein,
in step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, Connectionist Temporal Classification (CTC) is used as the loss function and the Adam algorithm is used as the learning algorithm.
5. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 3, wherein,
in step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, the learning rate of each training round is less than or equal to that of the previous round.
6. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 3, wherein,
in step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, the weights of the network are initialized with the "Xavier" method, and the offset layer values of all deformable convolution layers are initialized to 0.
7. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 3, wherein,
in step S5, when training the multi-stage neural network architecture of the feature-enhanced deep convolutional recurrent neural network, the output of each convolution layer in the network is passed through the ReLU activation function before being transmitted to the next layer of neurons.
8. The feature-enhanced CRNN-based metal plate strip product label character recognition method of claim 3, wherein,
each character in the recognition dictionary appears with equal probability in the training example library, which is split in a 9:1 ratio into a training set and a validation set for the feature-enhanced deep convolutional recurrent neural network.
CN201910448218.6A 2019-05-27 2019-05-27 Feature enhancement CRNN-based metal plate strip product label character recognition method Active CN110147788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910448218.6A CN110147788B (en) 2019-05-27 2019-05-27 Feature enhancement CRNN-based metal plate strip product label character recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910448218.6A CN110147788B (en) 2019-05-27 2019-05-27 Feature enhancement CRNN-based metal plate strip product label character recognition method

Publications (2)

Publication Number Publication Date
CN110147788A CN110147788A (en) 2019-08-20
CN110147788B true CN110147788B (en) 2021-09-21

Family

ID=67593348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910448218.6A Active CN110147788B (en) 2019-05-27 2019-05-27 Feature enhancement CRNN-based metal plate strip product label character recognition method

Country Status (1)

Country Link
CN (1) CN110147788B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027562B (en) * 2019-12-06 2023-07-18 中电健康云科技有限公司 Optical character recognition method based on multiscale CNN and RNN combined with attention mechanism
CN111414906B (en) * 2020-03-05 2024-05-24 北京交通大学 Data synthesis and text recognition method for paper bill pictures
CN111414908B (en) * 2020-03-16 2023-08-29 湖南快乐阳光互动娱乐传媒有限公司 Method and device for recognizing caption characters in video
CN111652108B (en) * 2020-05-28 2020-12-29 中国人民解放军32802部队 Anti-interference signal identification method and device, computer equipment and storage medium
CN112464845B (en) * 2020-12-04 2022-09-16 山东产研鲲云人工智能研究院有限公司 Bill recognition method, equipment and computer storage medium
CN112744439A (en) * 2021-01-15 2021-05-04 湖南镭目科技有限公司 Remote scrap steel monitoring system based on deep learning technology
TWI786946B (en) * 2021-11-15 2022-12-11 國立雲林科技大學 Method for detection and recognition of characters on the surface of metal
CN115661828B (en) * 2022-12-08 2023-10-20 中化现代农业有限公司 Character direction recognition method based on dynamic hierarchical nested residual error network

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739829B (en) * 2009-12-03 2014-04-23 北京中星微电子有限公司 Video-based vehicle overspeed monitoring method and system
CN104952448A (en) * 2015-05-04 2015-09-30 张爱英 Method and system for enhancing features by aid of bidirectional long-term and short-term memory recurrent neural networks
CN104966517B (en) * 2015-06-02 2019-02-01 华为技术有限公司 A kind of audio signal Enhancement Method and device
CN105005764B (en) * 2015-06-29 2018-02-13 东南大学 The multi-direction Method for text detection of natural scene
CN105956469B (en) * 2016-04-27 2019-04-26 百度在线网络技术(北京)有限公司 File security recognition methods and device
WO2018067603A1 (en) * 2016-10-04 2018-04-12 Magic Leap, Inc. Efficient data layouts for convolutional neural networks
CN106709532B (en) * 2017-01-25 2020-03-10 京东方科技集团股份有限公司 Image processing method and device
US10163022B1 (en) * 2017-06-22 2018-12-25 StradVision, Inc. Method for learning text recognition, method for recognizing text using the same, and apparatus for learning text recognition, apparatus for recognizing text using the same
CN109214250A (en) * 2017-07-05 2019-01-15 中南大学 A kind of static gesture identification method based on multiple dimensioned convolutional neural networks
CN107292291B (en) * 2017-07-19 2020-04-03 北京智芯原动科技有限公司 Vehicle identification method and system
CN107886967B (en) * 2017-11-18 2018-11-13 中国人民解放军陆军工程大学 A kind of bone conduction sound enhancement method of depth bidirectional gate recurrent neural network
CN108510502B (en) * 2018-03-08 2020-09-22 华南理工大学 Melanoma image tissue segmentation method and system based on deep neural network
CN108648748B (en) * 2018-03-30 2021-07-13 沈阳工业大学 Acoustic event detection method under hospital noise environment
CN108549893B (en) * 2018-04-04 2020-03-31 华中科技大学 End-to-end identification method for scene text with any shape
CN108664996B (en) * 2018-04-19 2020-12-22 厦门大学 Ancient character recognition method and system based on deep learning
CN109086700B (en) * 2018-07-20 2021-08-13 杭州电子科技大学 Radar one-dimensional range profile target identification method based on deep convolutional neural network
CN109413411B (en) * 2018-09-06 2020-08-11 腾讯数码(天津)有限公司 Black screen identification method and device of monitoring line and server
CN109165697B (en) * 2018-10-12 2021-11-30 福州大学 Natural scene character detection method based on attention mechanism convolutional neural network
CN109460761A (en) * 2018-10-17 2019-03-12 福州大学 Bank card number detection and recognition methods based on dimension cluster and multi-scale prediction
CN109389091B (en) * 2018-10-22 2022-05-03 重庆邮电大学 Character recognition system and method based on combination of neural network and attention mechanism
CN109508655B (en) * 2018-10-28 2023-04-25 北京化工大学 SAR target recognition method based on incomplete training set of twin network

Also Published As

Publication number Publication date
CN110147788A (en) 2019-08-20

Similar Documents

Publication Publication Date Title
CN110147788B (en) Feature enhancement CRNN-based metal plate strip product label character recognition method
CN109299274B (en) Natural scene text detection method based on full convolution neural network
CN109948714B (en) Chinese scene text line identification method based on residual convolution and recurrent neural network
CN112070768B (en) Anchor-Free based real-time instance segmentation method
CN109740679B (en) Target identification method based on convolutional neural network and naive Bayes
CN110674777A (en) Optical character recognition method in patent text scene
CN111860683B (en) Target detection method based on feature fusion
Cayamcela et al. Fine-tuning a pre-trained convolutional neural network model to translate American sign language in real-time
CN107564007B (en) Scene segmentation correction method and system fusing global information
CN113052775B (en) Image shadow removing method and device
CN111666937A (en) Method and system for recognizing text in image
CN114898472A (en) Signature identification method and system based on twin vision Transformer network
CN115205521A (en) Kitchen waste detection method based on neural network
CN110991515B (en) Image description method fusing visual context
Sethy et al. Off-line Odia handwritten numeral recognition using neural network: a comparative analysis
CN110826534B (en) Face key point detection method and system based on local principal component analysis
CN112818949A (en) Method and system for identifying delivery certificate characters
CN110503090B (en) Character detection network training method based on limited attention model, character detection method and character detector
CN112364883A (en) American license plate recognition method based on single-stage target detection and deptext recognition network
KR20200068073A (en) Improvement of Character Recognition for Parts Book Using Pre-processing of Deep Learning
CN113159071B (en) Cross-modal image-text association anomaly detection method
Goud et al. Text localization and recognition from natural scene images using ai
US11341758B1 (en) Image processing method and system
Ashiquzzaman et al. Applying data augmentation to handwritten arabic numeral recognition using deep learning neural networks
Chen et al. Design and Implementation of Second-generation ID Card Number Identification Model based on TensorFlow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant