CN114330646A - Character recognition method based on compressed dense neural network - Google Patents

Character recognition method based on compressed dense neural network Download PDF

Info

Publication number
CN114330646A
Authority
CN
China
Prior art keywords
dense
features
neural network
character recognition
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111483929.0A
Other languages
Chinese (zh)
Inventor
张召
郑欢
洪日昌
汪萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202111483929.0A priority Critical patent/CN114330646A/en
Publication of CN114330646A publication Critical patent/CN114330646A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a character recognition method based on a compressed dense neural network. The way internal features are combined within a dense block is redesigned, so that the proposed lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. Based on the designed lightweight dense block, the invention provides a compressed dense neural network for character recognition. The network comprises three main modules: the input character image data first passes through a feature coding module to obtain dense coding features; the dense coding features are processed by an up-sampling module to obtain up-sampling features; and the up-sampling features are input into a transcription module to obtain the final character recognition result. Simulation experiments verify that the proposed method effectively improves character recognition capability.

Description

Character recognition method based on compressed dense neural network
Technical Field
The invention relates to the field of character recognition methods, in particular to a character recognition method based on a compression dense neural network.
Background
Text and images are the two most common types of visual data in computer vision. In practice, text is often embedded in images, so accurately detecting and recognizing text or characters in an image with a learning algorithm remains a challenging and important topic in vision and pattern recognition, namely Optical Character Recognition (OCR). OCR is a long-standing problem, but it remains very challenging due to complex backgrounds and complex image content. In recent years, computer vision and deep learning have developed rapidly and achieved continuous breakthroughs, and many advanced end-to-end deep learning methods have been proposed.
For OCR, the two key subtasks are text line extraction and text line recognition. The first task extracts the text regions in an image, and the second recognizes the text content of the extracted regions. There are currently two mainstream frameworks for OCR. The first trains an end-to-end network that jointly addresses text line extraction and recognition, such as the Arbitrary Orientation Network (AON). The other popular scheme is a two-stage approach that trains a separate network for each subtask, such as the Convolutional Recurrent Neural Network (CRNN). In general, unified models are more adaptive and faster but yield slightly lower accuracy, while two-stage models are more accurate but less efficient. CRNN combines the advantages of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Since CRNN only recognizes text, text lines must first be extracted from the image, for example with a connectionist text proposal network. Recent work has shown that even without the recurrent layers, a simplified model can still achieve good results with greater efficiency. Thus, a CNN + CTC framework is a viable and efficient solution. For the convolutional feature extractor, existing networks such as the dense convolutional network (DenseNet) and the residual network (ResNet) can be used; removing the recurrent layers for efficiency yields a new character recognition model.
It is therefore an object of the present invention to provide a character recognition method based on a compressed dense neural network to solve the above problems.
Disclosure of Invention
The present invention aims to provide a character recognition method based on a compressed dense neural network to solve the problems mentioned in the background art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
A character recognition method based on a compressed dense neural network reduces the computation and weight size by redesigning the way internal features of a dense block are combined, and comprises the following steps:
The method of the present invention proposes a new type of compressed dense block, called the lightweight dense block, and then a compressed dense convolutional network built on this block. In particular, the invention redesigns the way the internal features in the dense block are combined, achieving a more efficient module through weight compression. The lightweight dense block uses both summation and concatenation operations to connect the internal features of each dense block, which reduces the computational cost and weight size to between 1/L and 2/L of the original dense block, where L is the number of internal layers in the block. The lightweight dense block retains the character recognition capability of the model while reducing the weight size.
Based on the proposed lightweight dense block, the method of the invention constructs a compressed dense neural network. To recover feature information and extend the depth of the network, several lightweight dense blocks, convolutional layers, and deconvolution operations are used to define three important modules of the compressed dense neural network: a feature coding module, an up-sampling module, and a transcription module.
The compressed dense neural network built on the lightweight dense block can effectively improve character recognition capability.
The invention relates to a character recognition method based on a compressed dense neural network, which comprises the following steps:
In step 1, the coding module in the compressed dense neural network processes the input character image data to obtain dense coding features. The method first defines the lightweight dense block, then builds a feature coding module from it. In general, the feature coding module comprises one convolutional layer and three lightweight dense blocks. The input character image data first passes through the convolutional layer to obtain shallow features; the shallow features are then fed into three lightweight dense blocks connected in series, where the output of each block is the input of the next, yielding the dense coding features.
Lightweight dense block: the lightweight dense block uses the summation operation of the residual block and the concatenation operation of the dense block. The difference is that a residual dense block mainly uses summation outside the dense block to define a residual and improve feature representation, whereas the lightweight dense block uses summation to change the feature fusion inside the dense block, reducing the computational cost and significantly shrinking the block's weights. In particular, the invention redesigns the way internal features in the dense block are combined, achieving a more efficient module through weight compression. The lightweight dense block uses both summation and concatenation to connect the internal features of each block, which reduces the computational cost and weight size to between 1/L and 2/L of the original dense block, where L is the number of internal layers. The lightweight dense block retains the character recognition capability of the model while reducing the weight size. The execution process is as follows:
X_1 = H_1(X_0)
X_i = H_i([X_0, X_1 + X_2 + ... + X_(i-1)]), i = 2, ..., L
where X_0 denotes the input features of the lightweight dense block, X_i denotes the features at layer i in the block, H_i(.) denotes the operation of the i-th convolutional layer, and [.,.] denotes concatenation along the channel dimension. (The equations appear as images in the original document and are reconstructed here from the surrounding description.)
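The combination rule just described can be sketched numerically. Since the patent's equations are rendered as images, the sketch below assumes the form suggested by the text: each internal layer sees the block input X0 concatenated with the running sum of all previous internal outputs. Each H_i is modeled as a simple channel-mixing multiply; both choices are illustrative, not the patent's exact layers.

```python
import numpy as np

def conv_layer(x, out_ch, rng):
    # Stand-in for H_i: a 1x1 convolution modeled as a channel-mixing
    # matrix multiply (hypothetical; the patent's H_i is a real conv layer).
    w = rng.standard_normal((x.shape[0], out_ch)) * 0.01
    return np.einsum('chw,co->ohw', x, w)

def lightweight_dense_block(x0, num_layers, growth, rng):
    # Assumed lightweight wiring: each layer sees the concatenation of the
    # block input X0 and the SUM of all previous internal outputs, so the
    # input channel count stays constant as the block deepens.
    running_sum = None
    for _ in range(num_layers):
        if running_sum is None:
            inp = x0
        else:
            inp = np.concatenate([x0, running_sum], axis=0)
        xi = conv_layer(inp, growth, rng)
        running_sum = xi if running_sum is None else running_sum + xi
    return np.concatenate([x0, running_sum], axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))   # C x H x W input features
out = lightweight_dense_block(x, num_layers=4, growth=16, rng=rng)
print(out.shape)                       # (32, 8, 8): channels = C_in + growth
```

Note how, unlike a standard dense block, the channel count of the layer inputs never exceeds C_in + growth regardless of the number of internal layers.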
In step 2, the up-sampling module in the compressed dense neural network processes the input dense coding features to obtain the up-sampling features, recovering information lost during coding and making full use of feature information from different layers. The up-sampling module comprises a convolutional layer, a deconvolutional layer, and two lightweight dense blocks. The input dense coding features first pass through the deconvolutional layer, then through two lightweight dense blocks connected in series, and finally through the convolutional layer to obtain the up-sampling features. Deconvolution is used in the up-sampling module because it helps extend the depth of the network and recover lost feature information to some extent.
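As a small aside on the deconvolution step, the spatial size produced by a transposed convolution follows a standard rule. The kernel, stride, and padding values below are illustrative assumptions, since the patent does not state its layer hyper-parameters.

```python
def deconv_output_size(size_in, kernel, stride, padding):
    # Standard transposed-convolution ("deconvolution") output-size rule.
    return (size_in - 1) * stride - 2 * padding + kernel

# Doubling spatial resolution with a 4x4 kernel, stride 2, padding 1:
print(deconv_output_size(8, kernel=4, stride=2, padding=1))   # 16
```

With these (assumed) settings each deconvolutional layer doubles the spatial resolution, which is one common way an up-sampling module recovers resolution lost during encoding.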
In step 3, the up-sampling features are input into the softmax classifier of the transcription module to obtain probability results, and the transcription layer finally outputs the character recognition result.
Compared with the prior art, the character recognition method based on the compressed dense neural network has the following advantages: a new lightweight dense block is proposed to reduce the computational cost and weight size. In particular, the invention redesigns the way the internal features of the dense block are combined, so that the lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. In addition, the invention designs two different convolution operation blocks to further compress the weights. Based on the designed lightweight dense block, the method provides a compressed dense neural network for character recognition, which effectively improves feature extraction and character recognition for character images.
Drawings
Fig. 1 is a flowchart of a character recognition method based on a compression-dense neural network according to an embodiment of the present invention.
Fig. 2 is a structural diagram of a character recognition method based on a compression-dense neural network according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of character recognition prediction according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention discloses a character recognition method based on a compressed dense neural network. Models based on dense neural networks use dense blocks as their core module, but the internal features are combined by concatenation, so the number of channels of the combined input features, and with it the associated computational cost, grows as the dense block deepens. This requires more computation and more space to store the weights, which limits the depth of the dense block. Therefore, the method proposes a new lightweight dense block to reduce the computational cost and weight size. In particular, the invention redefines and redesigns the way features within the dense block are combined, so that the lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. Based on the designed lightweight dense block, the invention provides a compressed dense neural network for character recognition. Simulation experiments verify that the proposed method effectively improves character recognition capability.
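To make the channel-growth argument concrete, the toy count below compares 3x3-convolution weight counts in a standard dense block (layer i sees c0 + (i-1)*k input channels) with the assumed lightweight wiring, in which every layer after the first sees only c0 + k channels. The lightweight wiring is an assumption consistent with the description above; the counts ignore batch norm, biases, and transition layers.

```python
def dense_block_params(c0, growth, num_layers, ksize=3):
    # Standard dense block: layer i takes c0 + (i-1)*growth input channels
    # because all previous outputs are concatenated.
    return sum((c0 + i * growth) * growth * ksize * ksize
               for i in range(num_layers))

def lightweight_block_params(c0, growth, num_layers, ksize=3):
    # Assumed lightweight wiring: layer 1 sees c0 channels; every later
    # layer sees c0 + growth (block input plus a running sum), so the
    # per-layer weight count no longer grows with depth.
    first = c0 * growth * ksize * ksize
    rest = (num_layers - 1) * (c0 + growth) * growth * ksize * ksize
    return first + rest

c0, k, L = 32, 32, 16
std = dense_block_params(c0, k, L)
lite = lightweight_block_params(c0, k, L)
print(std, lite, round(lite / std, 3))
```

With these illustrative settings the lightweight block stores roughly a quarter of the standard block's convolution weights; the exact ratio depends on c0, k, and L.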
The invention is tested on three image datasets: two handwritten image datasets (HASY and MNIST) and one string image dataset (the Synthetic Chinese String Dataset). MNIST is a widely used handwritten digit dataset whose goal is to classify 28 x 28 pixel images into one of 10 digit classes; it has 60,000 training samples and 10,000 test samples. HASY is a public dataset of single symbols that is more challenging than MNIST because it has more classes, 369 classes over 168,233 instances, and many of the classes are visually similar. The Synthetic Chinese String Dataset is generated from a Chinese corpus including news and classical Chinese; the lexicon of approximately 5,990 characters (Chinese characters, punctuation, English letters, and digits) is rendered with varying font, size, grayscale, blur, perspective, and stretch. Each sample is fixed to 10 characters, randomly cropped from the corpus; the images have a resolution of 280 x 32, and 3.6 million images are obtained in total. Because these databases were collected from many sources, the test results are broadly illustrative.
Referring to fig. 1, a flow chart of a character recognition method based on a compressed dense neural network is shown. The embodiment of the invention discloses a character recognition method based on a compression dense neural network, which comprises the following specific implementation steps:
step 101: and the coding module in the compression dense neural network is used for processing the input character image data to obtain dense coding characteristics. The inventive method proposes a module called lightweight dense block, then a feature coding module based on the lightweight dense block. In general, the signature coding module comprises a convolutional layer and three lightweight dense blocks. Firstly, the input character image data passes through a convolution layer to obtain shallow layer characteristics, then the shallow layer characteristics are input into three series-connected light-weight dense blocks, the output of the former light-weight dense block is the input of the latter light-weight dense block, and the dense coding characteristics are obtained.
Light-weight dense block: the lightweight dense block uses a summation operation in the residual block and a concatenation operation in the dense block. But the difference is that the residual dense block mainly uses summation operation to define the residual outside the dense block to improve the feature representation capability, and the light-weight dense block mainly uses summation operation to change the feature fusion mode inside the dense block, thereby reducing the calculation cost and obviously reducing the weight of the dense block. In particular, the present invention redesigns the way in which the internal features in dense blocks are combined to achieve a more efficient module through weight compression. Lightweight dense blocks use both summing and concatenation operations to connect internal features in each dense block, which can reduce computational cost and weight size to (1/L, 2/L), where L is the number of internal layers in the block, compared to the original dense blocks. The lightweight dense block retains the character recognition capabilities of the model while reducing the weight size. The execution process is as follows:
X_1 = H_1(X_0)
X_i = H_i([X_0, X_1 + X_2 + ... + X_(i-1)]), i = 2, ..., L
where X_0 denotes the input features of the lightweight dense block, X_i denotes the features at layer i in the block, H_i(.) denotes the operation of the i-th convolutional layer, and [.,.] denotes concatenation along the channel dimension. (The equations appear as images in the original document and are reconstructed here from the surrounding description.)
Step 102: the up-sampling module in the compressed dense neural network processes the input dense coding features to obtain the up-sampling features, recovering information lost during coding and making full use of feature information from different layers. The up-sampling module comprises a convolutional layer, a deconvolutional layer, and two lightweight dense blocks. The input dense coding features first pass through the deconvolutional layer, then through two lightweight dense blocks connected in series, and finally through the convolutional layer to obtain the up-sampling features. Deconvolution helps extend the depth of the network and recover lost feature information to some extent.
Step 103: the up-sampling features are input into a softmax classifier to obtain probability results, and the transcription layer finally outputs the character recognition result. Specifically, the transcription layer converts the per-frame predictions into the final label sequence using softmax and CTC: the softmax function outputs a prediction over the features learned by the convolutional part, and CTC converts these predictions into the final character recognition result.
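The CTC conversion mentioned above can be illustrated with best-path (greedy) decoding: take the argmax class per frame, collapse consecutive repeats, then remove blanks. This is a generic sketch of the standard CTC decoding rule, not the patent's specific implementation; the class indices and per-frame scores below are made up for illustration.

```python
import numpy as np

def ctc_greedy_decode(logits, blank=0):
    # Greedy (best-path) CTC decoding: argmax class per frame, collapse
    # consecutive repeats, then drop the blank symbol.
    best = np.argmax(logits, axis=1)
    out = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(int(idx))
        prev = idx
    return out

# 6 frames, 4 classes (0 = blank); scores stand in for softmax output
logits = np.array([
    [0.9, 0.0, 0.1, 0.0],   # blank
    [0.1, 0.8, 0.1, 0.0],   # class 1
    [0.1, 0.8, 0.1, 0.0],   # class 1 (repeat, collapsed)
    [0.9, 0.0, 0.1, 0.0],   # blank
    [0.0, 0.1, 0.1, 0.8],   # class 3
    [0.0, 0.1, 0.1, 0.8],   # class 3 (repeat, collapsed)
])
print(ctc_greedy_decode(logits))   # [1, 3]
```

Full CTC training instead maximizes the total probability over all frame alignments that collapse to the target sequence, but greedy decoding is the simplest way to turn per-frame softmax outputs into a label sequence at inference time.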
The method has been described in detail in the embodiments disclosed above. Since the method can be implemented with various types of systems, the invention also discloses a system, described in the following specific embodiments.
Referring to fig. 2, a structure diagram of the character recognition method based on a compressed dense neural network according to an embodiment of the present invention is shown. The system specifically comprises:
a feature coding module 201, configured to encode input character image data to obtain dense coding features; an up-sampling module 202, configured to up-sample the dense coding features to obtain up-sampling features; and a transcription module 203, which makes predictions from the input up-sampling features and converts the prediction results into characters for output. First, the neural network is trained end-to-end on training samples to obtain the converged weights; test samples are then input into the network to obtain the character recognition results.
The method is mainly compared with the recognition results of Random Forest, the multilayer perceptron (MLP), Linear Discriminant Analysis (LDA), a three-layer convolutional neural network (CNN-3), four-layer convolutional neural networks (CNN-4 and CNN-4a), and the same networks augmented with displacement features (CNN-3+displacement features, CNN-4+displacement features, and CNN-4a+displacement features). Tables 1-3 show the accuracy of each algorithm on the MNIST, HASY, and Synthetic Chinese String datasets, respectively.
Table 1: the invention and each algorithm based on MNIST data set character recognition comparison result
[Table 1 is rendered as an image in the original document and is not reproduced here.]
Table 2: the invention and each algorithm are based on HASY data set character recognition comparison result
Evaluated Frameworks    Accuracy (%)
Random Forest 62.4%
MLP 62.2%
LDA 46.8%
CNN-3 78.4%
CNN-4 80.5%
CNN-4a 81.0%
CNN-3+displacement features 78.8%
CNN-4+displacement features 81.4%
CNN-4a+displacement features 82.3%
CDenseNet-U(ours) 84.8%
Table 3: the invention and each algorithm are based on the character recognition comparison result of the synthetic Chinese string dataset
[Table 3 is rendered as an image in the original document and is not reproduced here.]
Experimental results on the real datasets show that the method can be used effectively for recognition across various types of datasets and achieves better accuracy.
Fig. 3 is a schematic diagram of feature extraction and recognition according to an embodiment of the present invention.
The experimental results show that the feature extraction and recognition performance of the invention is clearly superior to that of Random Forest, the multilayer perceptron (MLP), Linear Discriminant Analysis (LDA), the three-layer convolutional neural network (CNN-3), the four-layer convolutional neural networks (CNN-4 and CNN-4a), and their variants with displacement features (CNN-3+displacement features, CNN-4+displacement features, and CNN-4a+displacement features); the invention also shows stronger stability and has clear advantages.
In summary: the invention discloses a character recognition method based on a compressed dense neural network. Models based on dense neural networks use dense blocks as their core module, but the internal features are combined by concatenation, so the number of channels of the combined input features, and with it the associated computational cost, grows as the dense block deepens. More computation and more weight storage are required, which limits the depth of the dense block. The method therefore proposes a new lightweight dense block to reduce the computational cost and weight size: the invention redesigns the way the internal features in the dense block are combined, so that the lightweight dense block reduces the computational cost and weight size to between 1/L and 2/L of the original model, where L is the number of internal layers in the block. Based on the designed lightweight dense block, the invention provides a compressed dense neural network for character recognition. Simulation experiments verify that the proposed method effectively improves character recognition capability.
For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. A character recognition method based on a compression dense neural network is characterized by comprising the following steps:
step 1, inputting an original character image into a compressed dense neural network, and coding feature information by using a feature coding module to obtain dense coding features;
step 2, the dense coding features obtained in the step 1 are used as input of an up-sampling module in the compressed dense neural network, and up-sampling is carried out on the dense coding features to obtain up-sampling features;
and 3, inputting the up-sampling features extracted in the step 2 into a transcription module in the compressed dense neural network, firstly obtaining a probability prediction result through a softmax classifier, and converting a numerical value result into characters through a transcription layer to be output to obtain a final recognition result.
2. The character recognition method based on the compression dense neural network as claimed in claim 1, wherein in the step 1, the coding module in the compression dense neural network is used for processing the input character image data to obtain dense coding features; firstly, input character image data passes through a convolutional layer to obtain shallow layer characteristics, and then the shallow layer characteristics are input into three light-weight dense blocks which are connected in series, wherein the output of the previous light-weight dense block is the input of the next light-weight dense block to obtain dense coding characteristics;
the lightweight dense block redesigns the way internal features in the dense block are combined, realizing a more efficient module through weight compression; compared with the original dense block, the lightweight dense block uses both summation and concatenation operations to connect the internal features of each block, reducing the computational cost and weight size to between 1/L and 2/L, where L is the number of internal layers in the block; the lightweight dense block reduces the weight size while retaining the character recognition capability of the model, and the implementation process is as follows:
X_1 = H_1(X_0)
X_i = H_i([X_0, X_1 + X_2 + ... + X_(i-1)]), i = 2, ..., L
where X_0 denotes the input features of the lightweight dense block, X_i denotes the features at layer i in the block, H_i(.) denotes the operation of the i-th convolutional layer, and [.,.] denotes concatenation along the channel dimension. (The equations appear as images in the original document and are reconstructed here from the surrounding description.)
3. The character recognition method based on the compressed dense neural network as claimed in claim 1, wherein in step 2, the up-sampling module in the compressed dense neural network processes the input dense coding features to obtain up-sampling features; the up-sampling module comprises a convolutional layer, a deconvolutional layer, and two lightweight dense blocks; the input dense coding features first pass through the deconvolutional layer, then through two lightweight dense blocks connected in series, and finally through the convolutional layer to obtain the up-sampling features.
4. The method of claim 1, wherein in step 3, the upsampled features are input into a softmax classifier to obtain a probability prediction result, and finally a character recognition result is output through a transcription layer.
CN202111483929.0A 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network Pending CN114330646A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111483929.0A CN114330646A (en) 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111483929.0A CN114330646A (en) 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network

Publications (1)

Publication Number Publication Date
CN114330646A true CN114330646A (en) 2022-04-12

Family

ID=81048302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111483929.0A Pending CN114330646A (en) 2021-12-07 2021-12-07 Character recognition method based on compressed dense neural network

Country Status (1)

Country Link
CN (1) CN114330646A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642480A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device, equipment and storage medium
CN113642477A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device and equipment and readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642480A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device, equipment and storage medium
CN113642477A (en) * 2021-08-17 2021-11-12 苏州大学 Character recognition method, device and equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN110334705B (en) Language identification method of scene text image combining global and local information
CN109726657B (en) Deep learning scene text sequence recognition method
CN111428718B (en) Natural scene text recognition method based on image enhancement
US20180137349A1 (en) System and method of character recognition using fully convolutional neural networks
CN107330127B (en) Similar text detection method based on text picture retrieval
CN111738169B (en) Handwriting formula recognition method based on end-to-end network model
CN110114776B (en) System and method for character recognition using a fully convolutional neural network
CN112633431B (en) Tibetan-Chinese bilingual scene character recognition method based on CRNN and CTC
CN113961736B (en) Method, apparatus, computer device and storage medium for text generation image
CN111898461B (en) Time sequence behavior segment generation method
CN115438215A (en) Image-text bidirectional search and matching model training method, device, equipment and medium
CN114596566A (en) Text recognition method and related device
CN113642477A (en) Character recognition method, device and equipment and readable storage medium
CN113642480A (en) Character recognition method, device, equipment and storage medium
Valy et al. Data augmentation and text recognition on Khmer historical manuscripts
Hemanth et al. CNN-RNN BASED HANDWRITTEN TEXT RECOGNITION.
Al Ghamdi A novel approach to printed Arabic optical character recognition
CN111753714B (en) Multidirectional natural scene text detection method based on character segmentation
CN112036290B (en) Complex scene text recognition method and system based on class mark coding representation
CN114694133B (en) Text recognition method based on combination of image processing and deep learning
CN110555462A (en) non-fixed multi-character verification code identification method based on convolutional neural network
CN113837157B (en) Topic type identification method, system and storage medium
CN114330646A (en) Character recognition method based on compressed dense neural network
CN113901913A (en) Convolution network for ancient book document image binaryzation
KR102331440B1 (en) System for text recognition using neural network and its method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination