CN113850157A - Character recognition method based on neural network - Google Patents
- Publication number
- CN113850157A (publication) · CN202111046315.6A (application)
- Authority
- CN
- China
- Prior art keywords
- image
- result
- text
- features
- characters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
A character recognition method based on a neural network comprises the following steps. Step one: load a trained model. Step two: input an image containing characters. Step three: locate and label each text segment with the DBNet algorithm. Step four: extract the text segments to obtain cropped text images. Step five: recognize the content of each cropped text segment with the CRNN algorithm. Step six: check whether the output contains unknown characters that are not in the dictionary; if so, add those characters to the dictionary, add the image to the training database, retrain, and save the model; otherwise judge whether the result is correct, output it if correct, and if incorrect add the image to the database for retraining and save the model. The method has a wide range of application scenarios, adapts especially well to scenes with uneven illumination or large variations in brightness, and its large number of network parameters ensures stable recognition in such scenes.
Description
Technical Field
The invention relates to the field of character recognition, and in particular to a character recognition method based on a neural network.
Background
With the continuous advance of industrial inspection, the demand for character recognition keeps growing. When large numbers of characters in different fonts must be recognized, traditional character recognition schemes struggle to cope. A recognition approach based on a neural network not only reaches an accuracy far beyond traditional character recognition, it can also be customized flexibly to customer requirements, with different models trained for different users. The technique first locates the characters and then recognizes them, which has the advantage that each segment can be user-defined; for example, the content of an inspected food label can be divided into an ingredients segment, a production-date segment, and so on.
Traditional character recognition falls into two categories: template matching and structural analysis. Template matching generally requires building a template library and then manually extracting features from the standard characters in that library to compare against the features of the characters under test. Structural analysis classifies the detected characters by extracting combinations of features such as "points", "lines" and "crossings" that appear within a single glyph. Both methods have serious drawbacks. First, feature extraction is performed manually, which narrows the range of application and costs time and labor. Second, both methods can only detect fonts close to a standard (or similar to a template). Third, when the background is complex the characters are hard to segment, and the traditional methods cannot be used at all.
Disclosure of Invention
The invention aims to provide a character recognition method based on a neural network that is widely applicable, highly adaptable and stable in recognition.
This aim is achieved as follows: a character recognition method based on a neural network, comprising the following steps:
step one: loading a trained model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains unknown characters that are not in the dictionary; if so, adding the unknown characters to the dictionary, adding the image to the database for retraining, and saving the trained model; if not, judging whether the result is correct: outputting the result if it is correct, and if it is incorrect adding the image to the database for retraining and saving the trained model.
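The step-six feedback loop can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: `feedback_step`, the set-based dictionary, the list-based database stand-in and the `is_correct` callback are all hypothetical names.

```python
def feedback_step(result, dictionary, database, is_correct=lambda r: True):
    """Step-six feedback: unknown characters trigger dictionary growth and
    retraining; otherwise the result is output only if judged correct."""
    unknown = [ch for ch in result if ch not in dictionary]
    if unknown:
        dictionary.update(unknown)   # add the strange characters to the dictionary
        database.append(result)      # stand-in for queueing the image for retraining
        return None                  # retrain and save the model before output
    if is_correct(result):
        return result                # correct result is output directly
    database.append(result)          # incorrect: queue the image for retraining
    return None
```

Returning `None` here marks "retrain before output"; the real system would then rerun training on the enlarged database and save the updated model file.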
Preferably, the specific method of the third step is as follows:
s1: feeding the image containing text into a feature pyramid with 4×, 8×, 16× and 32× downsampling to extract the overall features of the picture;
s2: upsampling the deeper feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: fusing the features and feeding them into a prediction network as the features of the picture; the prediction network generates two types of image, the first a binary map and the second a probability map corresponding to each binary map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the filtered binary map.
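A minimal numpy sketch of the S1–S2 idea — downsampling a map by 4×/8×/16×/32×, upsampling everything back to the 4× scale and stacking — may help. All function names here are hypothetical, and the average-pool / nearest-neighbour choices are stand-ins for the real pyramid's learned convolutions:

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a (H, W) feature map by an integer factor."""
    h, w = x.shape[0] // factor, x.shape[1] // factor
    return x[:h * factor, :w * factor].reshape(h, factor, w, factor).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsample by an integer factor."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def pyramid_features(img):
    """Build 4x/8x/16x/32x maps, upsample all to the 4x scale, stack them."""
    factors = (4, 8, 16, 32)
    maps = [downsample(img, f) for f in factors]
    target = maps[0].shape
    fused = [upsample(m, f // 4)[:target[0], :target[1]]
             for m, f in zip(maps, factors)]
    return np.stack(fused)  # shape: (4, H/4, W/4)

img = np.random.rand(64, 64)
feats = pyramid_features(img)  # feats.shape == (4, 16, 16)
```

The stacked maps play the role of the concatenated pyramid features that are then passed to the prediction network.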
Preferably, the specific method of the fifth step is as follows:
s1: dividing the input text segment column-wise into several equal-width grids;
s2: extracting features from each grid as one basic time step, then feeding the grids into a sequence network in order;
s3: finally, passing the features output by the sequence through a transcription layer, which converts them into characters.
Preferably, the time series network uses a bi-directional LSTM based architecture.
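The column-slicing in S1–S2 can be illustrated with numpy: the text strip is cut into equal-width column grids, each flattened into one time-step feature vector before it would be fed to the bidirectional LSTM. A hypothetical sketch (no actual LSTM is run here):

```python
import numpy as np

def text_to_sequence(strip, n_steps):
    """Split a (H, W) text strip into n_steps equal-width column grids and
    flatten each grid into one time-step feature vector, left to right."""
    h, w = strip.shape
    step = w // n_steps
    return np.stack([strip[:, i * step:(i + 1) * step].ravel()
                     for i in range(n_steps)])

strip = np.arange(32 * 128, dtype=float).reshape(32, 128)
seq = text_to_sequence(strip, 16)  # 16 time steps, each of length 32*8 = 256
```

Each row of `seq` corresponds to one column grid; the sequence network then reads these rows in order, and a transcription layer (e.g. CTC-style decoding) would map its outputs to characters.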
Preferably, before loading the trained model, a usable model file needs to be trained, as follows:
s1: preparing image data; for a single scene with no more than 100 character classes, at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
Compared with the prior art, the invention has the advantages that:
1. The traditional method generally needs two stages: one segments the individual characters in an image and the other recognizes each character. The invention differs in that its first stage locates a paragraph of text rather than a single character, and its second stage recognizes the whole paragraph; both stages are learned automatically, which greatly improves noise resistance and accuracy.
2. The large number of network parameters obtained when training the model file ensures its recognition stability, and the method supports GPU acceleration: a 1.3-megapixel image can be recognized in 0.1 seconds, which is well suited to scenarios such as automated factories.
3. The operation process is simple: once an engineer has trained a model and fixed its parameters, the method can be applied directly without additional settings.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
As shown in FIG. 1, a character recognition method based on a neural network comprises the following steps:
step one: loading a trained model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains unknown characters that are not in the dictionary; if so, adding the unknown characters to the dictionary, adding the image to the database for retraining, and saving the trained model; if not, judging whether the result is correct: outputting the result if it is correct, and if it is incorrect adding the image to the database for retraining and saving the trained model.
Preferably, the specific method of the third step is as follows:
s1: feeding the image containing text into a feature pyramid with 4×, 8×, 16× and 32× downsampling to extract the overall features of the picture;
s2: upsampling the deeper feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: fusing the features and feeding them into a prediction network as the features of the picture; the prediction network generates two types of image, the first a binary map and the second a probability map corresponding to each binary map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the filtered binary map.
Preferably, the specific method of the fifth step is as follows:
s1: dividing the input text segment column-wise into several equal-width grids;
s2: extracting features from each grid as one basic time step, then feeding the grids into a sequence network in order;
s3: finally, passing the features output by the sequence through a transcription layer, which converts them into characters.
Preferably, the time series network uses a bi-directional LSTM based architecture.
Preferably, before loading the trained model, a usable model file needs to be trained, as follows:
s1: preparing image data; for a single scene with no more than 100 character classes, at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
The present invention is not limited to the above-mentioned embodiments, and based on the technical solutions disclosed in the present invention, those skilled in the art can make some substitutions and modifications to some technical features without creative efforts according to the disclosed technical contents, and these substitutions and modifications are all within the protection scope of the present invention.
Claims (5)
1. A method of character recognition based on a neural network, comprising the steps of:
step one: loading a trained model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains unknown characters that are not in the dictionary; if so, adding the unknown characters to the dictionary, adding the image to the database for retraining, and saving the trained model; if not, judging whether the result is correct: outputting the result if it is correct, and if it is incorrect adding the image to the database for retraining and saving the trained model.
2. The method for character recognition based on neural network as claimed in claim 1, wherein the specific method of step three is:
s1: feeding the image containing text into a feature pyramid with 4×, 8×, 16× and 32× downsampling to extract the overall features of the picture;
s2: upsampling the deeper feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: fusing the features and feeding them into a prediction network as the features of the picture; the prediction network generates two types of image, the first a binary map and the second a probability map corresponding to each binary map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the filtered binary map.
3. The method for character recognition based on neural network as claimed in claim 1, wherein the concrete method of the fifth step is:
s1: dividing the input text segment column-wise into several equal-width grids;
s2: extracting features from each grid as one basic time step, then feeding the grids into a sequence network in order;
s3: finally, passing the features output by the sequence through a transcription layer, which converts them into characters.
4. The method of claim 3, wherein the time-series network uses a bi-directional LSTM based structure.
5. The method of claim 1, wherein a usable model file is trained before loading the trained model, as follows:
s1: preparing image data; for a single scene with no more than 100 character classes, at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111046315.6A CN113850157A (en) | 2021-09-08 | 2021-09-08 | Character recognition method based on neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113850157A true CN113850157A (en) | 2021-12-28 |
Family
ID=78973356
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111046315.6A Pending CN113850157A (en) | 2021-09-08 | 2021-09-08 | Character recognition method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113850157A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114495106A (en) * | 2022-04-18 | 2022-05-13 | 电子科技大学 | MOCR (metal-oxide-semiconductor resistor) deep learning method applied to DFB (distributed feedback) laser chip |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||