CN114330646A - Character recognition method based on compressed dense neural network - Google Patents

Character recognition method based on compressed dense neural network Download PDF

Info

Publication number
CN114330646A
Authority
CN
China
Prior art keywords
dense
features
neural network
character recognition
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111483929.0A
Other languages
Chinese (zh)
Inventor
张召
郑欢
洪日昌
汪萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202111483929.0A priority Critical patent/CN114330646A/en
Publication of CN114330646A publication Critical patent/CN114330646A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a character recognition method based on a compressed dense neural network. The way internal features are combined within a dense block is redesigned, so that the proposed lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. Based on the designed lightweight dense block, the invention provides a compressed dense neural network for character recognition. The network comprises three main modules: the input character image data first passes through a feature coding module to obtain dense coding features; the dense coding features are processed by an up-sampling module to obtain up-sampling features; and the up-sampling features are input into a transcription module to obtain the final character recognition result. Simulation experiments verify that the proposed method effectively improves character recognition capability.

Description

Character recognition method based on compressed dense neural network
Technical Field
The invention relates to the field of character recognition methods, in particular to a character recognition method based on a compression dense neural network.
Background
Text and images are the two most common types of visual data in computer vision. In practice, text is often embedded in images, so accurately detecting and recognizing text or characters in an image with a learning algorithm remains a challenging and important topic in vision and pattern recognition, namely Optical Character Recognition (OCR). OCR is a long-standing problem, but it remains very challenging due to complex backgrounds and complex image content. In recent years, computer vision and deep learning have developed rapidly and achieved continuous breakthroughs, and many advanced end-to-end deep learning methods have been proposed.
For OCR, the two key subtasks are text line extraction and text line recognition. The first task extracts the text regions in an image, and the second recognizes the text content of the extracted regions. There are currently two mainstream frameworks for OCR. The first trains an end-to-end network that jointly addresses text line extraction and recognition, such as the Arbitrary Orientation Network (AON). The other popular scheme is a two-stage approach that trains a separate network for each subtask, such as the Convolutional Recurrent Neural Network (CRNN). In general, unified models are more adaptive and faster but yield slightly lower accuracy, while two-stage models are more accurate but less efficient. CRNN combines the advantages of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Since CRNN only recognizes text, text lines must first be extracted from the image, for example with a connectionist text proposal network. Recent work has shown that even without the recurrent layers, a simplified model can still achieve good results with greater efficiency. Thus, a CNN + CTC framework is a viable and efficient solution. For the convolutional feature extractor, existing networks such as the dense convolutional network (DenseNet) and the residual network (ResNet) can be used; removing the recurrent layers for efficiency yields a new character recognition model.
It is therefore an object of the present invention to provide a character recognition method based on a compressed dense neural network to solve the above problems.
Disclosure of Invention
The present invention aims to provide a character recognition method based on a compressed dense neural network to solve the problems mentioned in the background art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
A character recognition method based on a compressed dense neural network reduces the computation and weight size by redesigning the way internal features of a dense block are combined, and comprises the following steps:
The method of the present invention proposes a new type of compressed dense block, called the lightweight dense block, and then a compressed dense convolutional network built on this block. In particular, the invention redesigns the way the internal features in the dense block are combined, achieving a more efficient module through weight compression. The lightweight dense block uses both summation and concatenation operations to connect the internal features of each dense block, which reduces the computational cost and weight size to between 1/L and 2/L of the original dense block, where L is the number of internal layers in the block. The lightweight dense block retains the character recognition capability of the model while reducing the weight size.
Based on the proposed lightweight dense block, the method of the invention constructs a compressed dense neural network. To recover feature information and extend the depth of the network, several lightweight dense blocks, convolutional layers, and deconvolution operations are used to define three important modules of the compressed dense neural network: a feature coding module, an up-sampling module, and a transcription module.
The compressed dense neural network built on the lightweight dense block can effectively improve character recognition capability.
The invention relates to a character recognition method based on a compressed dense neural network, which comprises the following steps:
In step 1, the coding module in the compressed dense neural network processes the input character image data to obtain dense coding features. The method first defines the lightweight dense block, then builds a feature coding module from it. In general, the feature coding module comprises one convolutional layer and three lightweight dense blocks. The input character image data first passes through the convolutional layer to obtain shallow features; the shallow features are then fed into three lightweight dense blocks connected in series, where the output of each block is the input of the next, yielding the dense coding features.
Lightweight dense block: the lightweight dense block uses the summation operation of the residual block and the concatenation operation of the dense block. The difference is that a residual dense block mainly uses summation outside the dense block to define a residual and improve feature representation, whereas the lightweight dense block uses summation to change the feature fusion inside the dense block, reducing the computational cost and significantly shrinking the block's weights. In particular, the invention redesigns the way internal features in the dense block are combined, achieving a more efficient module through weight compression. The lightweight dense block uses both summation and concatenation to connect the internal features of each block, which reduces the computational cost and weight size to between 1/L and 2/L of the original dense block, where L is the number of internal layers. The lightweight dense block retains the character recognition capability of the model while reducing the weight size. The execution process is as follows:
X_1 = H_1(X_0)
X_i = H_i([X_0, X_1 + X_2 + ... + X_(i-1)]), i = 2, ..., L
where X_0 denotes the input features of the lightweight dense block, X_i denotes the features at layer i in the block, H_i(.) denotes the operation of the i-th convolutional layer, and [.,.] denotes concatenation along the channel dimension. (The equations appear as images in the original document and are reconstructed here from the surrounding description.)
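The combination rule just described can be sketched numerically. Since the patent's equations are rendered as images, the sketch below assumes the form suggested by the text: each internal layer sees the block input X0 concatenated with the running sum of all previous internal outputs. Each H_i is modeled as a simple channel-mixing multiply; both choices are illustrative, not the patent's exact layers.

```python
import numpy as np

def conv_layer(x, out_ch, rng):
    # Stand-in for H_i: a 1x1 convolution modeled as a channel-mixing
    # matrix multiply (hypothetical; the patent's H_i is a real conv layer).
    w = rng.standard_normal((x.shape[0], out_ch)) * 0.01
    return np.einsum('chw,co->ohw', x, w)

def lightweight_dense_block(x0, num_layers, growth, rng):
    # Assumed lightweight wiring: each layer sees the concatenation of the
    # block input X0 and the SUM of all previous internal outputs, so the
    # input channel count stays constant as the block deepens.
    running_sum = None
    for _ in range(num_layers):
        if running_sum is None:
            inp = x0
        else:
            inp = np.concatenate([x0, running_sum], axis=0)
        xi = conv_layer(inp, growth, rng)
        running_sum = xi if running_sum is None else running_sum + xi
    return np.concatenate([x0, running_sum], axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))   # C x H x W input features
out = lightweight_dense_block(x, num_layers=4, growth=16, rng=rng)
print(out.shape)                       # (32, 8, 8): channels = C_in + growth
```

Note how, unlike a standard dense block, the channel count of the layer inputs never exceeds C_in + growth regardless of the number of internal layers.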
In step 2, the up-sampling module in the compressed dense neural network processes the input dense coding features to obtain the up-sampling features, recovering information lost during coding and making full use of feature information from different layers. The up-sampling module comprises a convolutional layer, a deconvolutional layer, and two lightweight dense blocks. The input dense coding features first pass through the deconvolutional layer, then through two lightweight dense blocks connected in series, and finally through the convolutional layer to obtain the up-sampling features. Deconvolution is used in the up-sampling module because it helps extend the depth of the network and recover lost feature information to some extent.
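As a small aside on the deconvolution step, the spatial size produced by a transposed convolution follows a standard rule. The kernel, stride, and padding values below are illustrative assumptions, since the patent does not state its layer hyper-parameters.

```python
def deconv_output_size(size_in, kernel, stride, padding):
    # Standard transposed-convolution ("deconvolution") output-size rule.
    return (size_in - 1) * stride - 2 * padding + kernel

# Doubling spatial resolution with a 4x4 kernel, stride 2, padding 1:
print(deconv_output_size(8, kernel=4, stride=2, padding=1))   # 16
```

With these (assumed) settings each deconvolutional layer doubles the spatial resolution, which is one common way an up-sampling module recovers resolution lost during encoding.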
In step 3, the up-sampling features are input into the softmax classifier of the transcription module to obtain probability results, and the transcription layer finally outputs the character recognition result.
Compared with the prior art, the character recognition method based on the compressed dense neural network has the following advantages: a new lightweight dense block is proposed to reduce the computational cost and weight size. In particular, the invention redesigns the way the internal features of the dense block are combined, so that the lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. In addition, the invention designs two different convolution operation blocks to further compress the weights. Based on the designed lightweight dense block, the method provides a compressed dense neural network for character recognition, which effectively improves feature extraction and character recognition for character images.
Drawings
Fig. 1 is a flowchart of a character recognition method based on a compression-dense neural network according to an embodiment of the present invention.
Fig. 2 is a structural diagram of a character recognition method based on a compression-dense neural network according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of character recognition prediction according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention discloses a character recognition method based on a compressed dense neural network. Models based on dense neural networks use dense blocks as their core module, but the internal features are combined by concatenation, so the number of channels of the combined input features, and with it the associated computational cost, grows as the dense block deepens. This requires more computation and more space to store the weights, which limits the depth of the dense block. Therefore, the method proposes a new lightweight dense block to reduce the computational cost and weight size. In particular, the invention redefines and redesigns the way features within the dense block are combined, so that the lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. Based on the designed lightweight dense block, the invention provides a compressed dense neural network for character recognition. Simulation experiments verify that the proposed method effectively improves character recognition capability.
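To make the channel-growth argument concrete, the toy count below compares 3x3-convolution weight counts in a standard dense block (layer i sees c0 + (i-1)*k input channels) with the assumed lightweight wiring, in which every layer after the first sees only c0 + k channels. The lightweight wiring is an assumption consistent with the description above; the counts ignore batch norm, biases, and transition layers.

```python
def dense_block_params(c0, growth, num_layers, ksize=3):
    # Standard dense block: layer i takes c0 + (i-1)*growth input channels
    # because all previous outputs are concatenated.
    return sum((c0 + i * growth) * growth * ksize * ksize
               for i in range(num_layers))

def lightweight_block_params(c0, growth, num_layers, ksize=3):
    # Assumed lightweight wiring: layer 1 sees c0 channels; every later
    # layer sees c0 + growth (block input plus a running sum), so the
    # per-layer weight count no longer grows with depth.
    first = c0 * growth * ksize * ksize
    rest = (num_layers - 1) * (c0 + growth) * growth * ksize * ksize
    return first + rest

c0, k, L = 32, 32, 16
std = dense_block_params(c0, k, L)
lite = lightweight_block_params(c0, k, L)
print(std, lite, round(lite / std, 3))
```

With these illustrative settings the lightweight block stores roughly a quarter of the standard block's convolution weights; the exact ratio depends on c0, k, and L.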
The invention is tested on three image datasets: two handwritten image datasets (HASY and MNIST) and one string image dataset (the Synthetic Chinese String Dataset). MNIST is a widely used handwritten digit dataset whose goal is to classify 28 x 28 pixel images into one of 10 digit classes; it has 60,000 training samples and 10,000 test samples. HASY is a public dataset of single symbols that is more challenging than MNIST because it has more classes, 369 classes over 168,233 instances, and many of the classes are visually similar. The Synthetic Chinese String Dataset is generated from a Chinese corpus including news and classical Chinese; the lexicon of approximately 5,990 characters (Chinese characters, punctuation, English letters, and digits) is rendered with varying font, size, grayscale, blur, perspective, and stretch. Each sample is fixed to 10 characters, randomly cropped from the corpus; the images have a resolution of 280 x 32, and 3.6 million images are obtained in total. Because these databases were collected from many sources, the test results are broadly illustrative.
Referring to fig. 1, a flow chart of a character recognition method based on a compressed dense neural network is shown. The embodiment of the invention discloses a character recognition method based on a compression dense neural network, which comprises the following specific implementation steps:
step 101: and the coding module in the compression dense neural network is used for processing the input character image data to obtain dense coding characteristics. The inventive method proposes a module called lightweight dense block, then a feature coding module based on the lightweight dense block. In general, the signature coding module comprises a convolutional layer and three lightweight dense blocks. Firstly, the input character image data passes through a convolution layer to obtain shallow layer characteristics, then the shallow layer characteristics are input into three series-connected light-weight dense blocks, the output of the former light-weight dense block is the input of the latter light-weight dense block, and the dense coding characteristics are obtained.
Light-weight dense block: the lightweight dense block uses a summation operation in the residual block and a concatenation operation in the dense block. But the difference is that the residual dense block mainly uses summation operation to define the residual outside the dense block to improve the feature representation capability, and the light-weight dense block mainly uses summation operation to change the feature fusion mode inside the dense block, thereby reducing the calculation cost and obviously reducing the weight of the dense block. In particular, the present invention redesigns the way in which the internal features in dense blocks are combined to achieve a more efficient module through weight compression. Lightweight dense blocks use both summing and concatenation operations to connect internal features in each dense block, which can reduce computational cost and weight size to (1/L, 2/L), where L is the number of internal layers in the block, compared to the original dense blocks. The lightweight dense block retains the character recognition capabilities of the model while reducing the weight size. The execution process is as follows:
X_1 = H_1(X_0)
X_i = H_i([X_0, X_1 + X_2 + ... + X_(i-1)]), i = 2, ..., L
where X_0 denotes the input features of the lightweight dense block, X_i denotes the features at layer i in the block, H_i(.) denotes the operation of the i-th convolutional layer, and [.,.] denotes concatenation along the channel dimension. (The equations appear as images in the original document and are reconstructed here from the surrounding description.)
Step 102: the up-sampling module in the compressed dense neural network processes the input dense coding features to obtain the up-sampling features, recovering information lost during coding and making full use of feature information from different layers. The up-sampling module comprises a convolutional layer, a deconvolutional layer, and two lightweight dense blocks. The input dense coding features first pass through the deconvolutional layer, then through two lightweight dense blocks connected in series, and finally through the convolutional layer to obtain the up-sampling features. Deconvolution helps extend the depth of the network and recover lost feature information to some extent.
Step 103: the up-sampling features are input into a softmax classifier to obtain probability results, and the transcription layer finally outputs the character recognition result. Specifically, the transcription layer converts the per-frame predictions into the final label sequence using softmax and CTC: the softmax function outputs a prediction over the features learned by the convolutional part, and CTC converts these predictions into the final character recognition result.
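The CTC conversion mentioned above can be illustrated with best-path (greedy) decoding: take the argmax class per frame, collapse consecutive repeats, then remove blanks. This is a generic sketch of the standard CTC decoding rule, not the patent's specific implementation; the class indices and per-frame scores below are made up for illustration.

```python
import numpy as np

def ctc_greedy_decode(logits, blank=0):
    # Greedy (best-path) CTC decoding: argmax class per frame, collapse
    # consecutive repeats, then drop the blank symbol.
    best = np.argmax(logits, axis=1)
    out = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(int(idx))
        prev = idx
    return out

# 6 frames, 4 classes (0 = blank); scores stand in for softmax output
logits = np.array([
    [0.9, 0.0, 0.1, 0.0],   # blank
    [0.1, 0.8, 0.1, 0.0],   # class 1
    [0.1, 0.8, 0.1, 0.0],   # class 1 (repeat, collapsed)
    [0.9, 0.0, 0.1, 0.0],   # blank
    [0.0, 0.1, 0.1, 0.8],   # class 3
    [0.0, 0.1, 0.1, 0.8],   # class 3 (repeat, collapsed)
])
print(ctc_greedy_decode(logits))   # [1, 3]
```

Full CTC training instead maximizes the total probability over all frame alignments that collapse to the target sequence, but greedy decoding is the simplest way to turn per-frame softmax outputs into a label sequence at inference time.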
The method has been described in detail in the embodiments disclosed above. Since the method can be implemented with various types of systems, the invention also discloses a system, described in the following specific embodiments.
Referring to fig. 2, a structure diagram of the character recognition method based on a compressed dense neural network according to an embodiment of the present invention is shown. The system specifically comprises:
a feature coding module 201, configured to encode input character image data to obtain dense coding features; an up-sampling module 202, configured to up-sample the dense coding features to obtain up-sampling features; and a transcription module 203, which makes predictions from the input up-sampling features and converts the prediction results into characters for output. First, the neural network is trained end-to-end on training samples to obtain the converged weights; test samples are then input into the network to obtain the character recognition results.
The method is mainly compared with the recognition results of Random Forest, the multilayer perceptron (MLP), Linear Discriminant Analysis (LDA), a three-layer convolutional neural network (CNN-3), four-layer convolutional neural networks (CNN-4 and CNN-4a), and the same networks augmented with displacement features (CNN-3+displacement features, CNN-4+displacement features, and CNN-4a+displacement features). Tables 1-3 show the accuracy of each algorithm on the MNIST, HASY, and Synthetic Chinese String datasets, respectively.
Table 1: the invention and each algorithm based on MNIST data set character recognition comparison result
[Table 1 is rendered as an image in the original document and is not reproduced here.]
Table 2: the invention and each algorithm are based on HASY data set character recognition comparison result
Evaluated Frameworks    Accuracy (%)
Random Forest 62.4%
MLP 62.2%
LDA 46.8%
CNN-3 78.4%
CNN-4 80.5%
CNN-4a 81.0%
CNN-3+displacement features 78.8%
CNN-4+displacement features 81.4%
CNN-4a+displacement features 82.3%
CDenseNet-U(ours) 84.8%
Table 3: the invention and each algorithm are based on the character recognition comparison result of the synthetic Chinese string dataset
[Table 3 is rendered as an image in the original document and is not reproduced here.]
Experimental results on the real datasets show that the method can be used effectively for recognition across various types of datasets and achieves better accuracy.
Fig. 3 is a schematic diagram of feature extraction and recognition according to an embodiment of the present invention.
The experimental results show that the feature extraction and recognition performance of the invention is clearly superior to that of Random Forest, the multilayer perceptron (MLP), Linear Discriminant Analysis (LDA), the three-layer convolutional neural network (CNN-3), the four-layer convolutional neural networks (CNN-4 and CNN-4a), and their variants with displacement features (CNN-3+displacement features, CNN-4+displacement features, and CNN-4a+displacement features); the invention also shows stronger stability and has clear advantages.
In summary: the invention discloses a character recognition method based on a compressed dense neural network. Models based on dense neural networks use dense blocks as their core module, but the internal features are combined by concatenation, so the number of channels of the combined input features, and with it the associated computational cost, grows as the dense block deepens. More computation and more weight storage are required, which limits the depth of the dense block. The method therefore proposes a new lightweight dense block to reduce the computational cost and weight size: the invention redesigns the way the internal features in the dense block are combined, so that the lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. Based on the designed lightweight dense block, the invention provides a compressed dense neural network for character recognition. Simulation experiments verify that the proposed method effectively improves character recognition capability.
For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. A character recognition method based on a compression dense neural network is characterized by comprising the following steps:
step 1, inputting an original character image into a compressed dense neural network, and coding feature information by using a feature coding module to obtain dense coding features;
step 2, the dense coding features obtained in the step 1 are used as input of an up-sampling module in the compressed dense neural network, and up-sampling is carried out on the dense coding features to obtain up-sampling features;
and 3, inputting the up-sampling features extracted in the step 2 into a transcription module in the compressed dense neural network, firstly obtaining a probability prediction result through a softmax classifier, and converting a numerical value result into characters through a transcription layer to be output to obtain a final recognition result.
2. The character recognition method based on the compression dense neural network as claimed in claim 1, wherein in the step 1, the coding module in the compression dense neural network is used for processing the input character image data to obtain dense coding features; firstly, input character image data passes through a convolutional layer to obtain shallow layer characteristics, and then the shallow layer characteristics are input into three light-weight dense blocks which are connected in series, wherein the output of the previous light-weight dense block is the input of the next light-weight dense block to obtain dense coding characteristics;
the lightweight dense block redesigns the way internal features in the dense block are combined, realizing a more efficient module through weight compression; compared with the original dense block, the lightweight dense block uses both summation and concatenation operations to connect the internal features of each block, reducing the computational cost and weight size to between 1/L and 2/L, where L is the number of internal layers in the block; the lightweight dense block reduces the weight size while retaining the character recognition capability of the model, and the implementation process is as follows:
X_1 = H_1(X_0)
X_i = H_i([X_0, X_1 + X_2 + ... + X_(i-1)]), i = 2, ..., L
where X_0 denotes the input features of the lightweight dense block, X_i denotes the features at layer i in the block, H_i(.) denotes the operation of the i-th convolutional layer, and [.,.] denotes concatenation along the channel dimension. (The equations appear as images in the original document and are reconstructed here from the surrounding description.)
3. The character recognition method based on the compressed dense neural network as claimed in claim 1, wherein in step 2, the up-sampling module in the compressed dense neural network processes the input dense coding features to obtain up-sampling features; the up-sampling module comprises a convolutional layer, a deconvolutional layer, and two lightweight dense blocks; the input dense coding features first pass through the deconvolutional layer, then through two lightweight dense blocks connected in series, and finally through the convolutional layer to obtain the up-sampling features.
4. The method of claim 1, wherein in step 3, the upsampled features are input into a softmax classifier to obtain a probability prediction result, and finally a character recognition result is output through a transcription layer.
CN202111483929.0A 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network Pending CN114330646A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111483929.0A CN114330646A (en) 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111483929.0A CN114330646A (en) 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network

Publications (1)

Publication Number Publication Date
CN114330646A true CN114330646A (en) 2022-04-12

Family

ID=81048302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111483929.0A Pending CN114330646A (en) 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network

Country Status (1)

Country Link
CN (1) CN114330646A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642480A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device, equipment and storage medium
CN113642477A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device and equipment and readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642480A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device, equipment and storage medium
CN113642477A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device and equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN110334705B (en) Language identification method of scene text image combining global and local information
CN109726657B (en) Deep learning scene text sequence recognition method
CN111428718B (en) Natural scene text recognition method based on image enhancement
US20180137349A1 (en) System and method of character recognition using fully convolutional neural networks
CN107330127B (en) Similar text detection method based on text picture retrieval
CN111738169B (en) Handwriting formula recognition method based on end-to-end network model
CN110114776B (en) System and method for character recognition using a fully convolutional neural network
CN112633431B (en) Tibetan-Chinese bilingual scene character recognition method based on CRNN and CTC
CN113961736B (en) Method, apparatus, computer device and storage medium for text generation image
CN111898461B (en) Time sequence behavior segment generation method
CN115438215A (en) Image-text bidirectional search and matching model training method, device, equipment and medium
CN114596566A (en) Text recognition method and related device
CN113642477A (en) Character recognition method, device and equipment and readable storage medium
CN113642480A (en) Character recognition method, device, equipment and storage medium
Valy et al. Data augmentation and text recognition on Khmer historical manuscripts
Hemanth et al. CNN-RNN BASED HANDWRITTEN TEXT RECOGNITION.
Al Ghamdi A novel approach to printed Arabic optical character recognition
CN111753714B (en) Multidirectional natural scene text detection method based on character segmentation
CN112036290B (en) Complex scene text recognition method and system based on class mark coding representation
CN114694133B (en) Text recognition method based on combination of image processing and deep learning
CN110555462A (en) non-fixed multi-character verification code identification method based on convolutional neural network
CN113837157B (en) Topic type identification method, system and storage medium
CN114330646A (en) Character recognition method based on compressed dense neural network
CN113901913A (en) Convolution network for ancient book document image binaryzation
KR102331440B1 (en) System for text recognition using neural network and its method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination