CN113850157A - Character recognition method based on neural network - Google Patents

Character recognition method based on neural network

Info

Publication number
CN113850157A
CN113850157A (application CN202111046315.6A)
Authority
CN
China
Prior art keywords
image
result
text
features
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111046315.6A
Other languages
Chinese (zh)
Inventor
孔庆杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingrui Vision Intelligent Technology Shanghai Co ltd
Original Assignee
Jingrui Vision Intelligent Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingrui Vision Intelligent Technology Shanghai Co ltd filed Critical Jingrui Vision Intelligent Technology Shanghai Co ltd
Priority to CN202111046315.6A priority Critical patent/CN113850157A/en
Publication of CN113850157A publication Critical patent/CN113850157A/en
Pending legal-status Critical Current

Landscapes

  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

A character recognition method based on a neural network comprises the following steps. Step one: loading a training model. Step two: inputting an image containing characters. Step three: labeling each text segment using the DBNet algorithm. Step four: extracting the text segments to obtain cut text images. Step five: recognizing the content of each text segment of the cut text images using the CRNN algorithm. Step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: a correct result is output, while for an incorrect one the image is added to the database for retraining and the training model is saved. The method applies to a wide range of scenes, adapts particularly well to scenes with uneven illumination or large light-dark variation, and its large number of network parameters ensures stable recognition in such scenes.

Description

Character recognition method based on neural network
Technical Field
The invention relates to the field of character recognition, and in particular to a character recognition method based on a neural network.
Background
With the continuous advance of industrial inspection, requirements for character recognition keep growing. When large numbers of characters in varied fonts must be recognized, traditional character recognition schemes struggle, whereas neural-network-based recognition not only reaches an accuracy far beyond traditional methods but can also be flexibly customized to customer needs, training different models for different users. The present technique first locates characters and then recognizes them; its advantage is that each segment can be defined by the user, for example the content of an inspected food label can be divided into an ingredients segment, a production-date segment and so on.
Traditional character recognition falls into two categories: template matching and structural analysis. Template matching generally requires building a template library and then manually extracting features of the standard characters in the library to compare against features of the characters under test. Structural analysis classifies detected characters by extracting combinations of features such as 'points', 'lines' and 'crossings' that appear within a single glyph. Both methods have serious drawbacks. First, feature extraction is done manually, which narrows the range of application and costs time and labor. Second, both methods can only detect fonts close to a standard (or similar to a template). Third, when the background is complex the characters are hard to segment, and the traditional methods cannot be used.
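The template-matching comparison described above can be sketched in a few lines. The 3x3 "library" and the normalized-correlation score below are invented for illustration and are not part of the invention; real systems compare far larger glyph bitmaps.

```python
# Hypothetical sketch of the traditional template-matching approach: each
# character in a template library is compared against the glyph under test,
# here via normalized correlation of same-size binary bitmaps.
import numpy as np

def match_score(template: np.ndarray, glyph: np.ndarray) -> float:
    """Normalized correlation between two same-size glyph images."""
    t = template.astype(float) - template.mean()
    g = glyph.astype(float) - glyph.mean()
    denom = np.linalg.norm(t) * np.linalg.norm(g)
    return float((t * g).sum() / denom) if denom else 0.0

def classify(glyph: np.ndarray, library: dict) -> str:
    """Return the label of the best-matching template in the library."""
    return max(library, key=lambda label: match_score(library[label], glyph))

# Toy 3x3 "library": a vertical bar vs. a horizontal bar.
library = {
    "I": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
}
print(classify(np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]), library))  # -> I
```

The manual part criticized in the text is exactly the construction of `library` and the choice of `match_score`: both are hand-designed, which is why the method generalizes poorly to new fonts.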
Disclosure of Invention
The invention aims to provide a character recognition method based on a neural network, which has wide application, strong adaptability and high recognition stability.
The purpose of the invention is achieved as follows: a character recognition method based on a neural network, comprising the following steps:
step one: loading a training model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: if correct, the result is output; if incorrect, the image is added to the database for retraining and the training model is saved.
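The dictionary check and feedback loop of step six can be sketched as plain control flow. `post_process` and its arguments are hypothetical names; the actual retraining and model saving are abstracted into a list that simply queues images for a later training run.

```python
# A minimal control-flow sketch of step six. The database is modeled as a
# list of images queued for retraining; dictionary growth and queuing are
# the two feedback paths described in the method.
def post_process(result: str, dictionary: set, image, db: list):
    """Add unseen characters to the dictionary and queue hard images
    for retraining, mirroring the feedback loop of step six."""
    unknown = [ch for ch in result if ch not in dictionary]
    if unknown:
        dictionary.update(unknown)   # learn the strange characters
        db.append(image)             # retrain on this image later
        return None                  # no confident output yet
    return result                    # all characters known: emit the result
```

Note that the correctness check of the second branch (output if correct, retrain if incorrect) needs an external judgment, e.g. operator feedback, and is therefore not modeled here.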
Preferably, the specific method of the third step is as follows:
s1: feeding the image containing text into a feature pyramid with 4x, 8x, 16x and 32x downsampling to extract the global features of the picture;
s2: upsampling the feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: inputting the merged features into a prediction network as the features of the picture; the prediction network generates two kinds of maps, the first being binarization maps and the second a probability map corresponding to each binarization map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the screened binarization maps.
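The "learnable binarization threshold" of s3 can be illustrated with the differentiable binarization formula from the DBNet literature, where the hard step function is replaced by a steep sigmoid so the threshold map can receive gradients. The toy maps below are invented; the steepness constant k is commonly set to 50 in published DBNet implementations.

```python
# Illustrative sketch (not the patented implementation) of differentiable
# binarization: binary ~= sigmoid(k * (probability - threshold)), which
# approaches a hard threshold as k grows while remaining differentiable.
import numpy as np

def differentiable_binarize(prob_map: np.ndarray,
                            thresh_map: np.ndarray,
                            k: float = 50.0) -> np.ndarray:
    """Approximate step(prob > thresh) with a steep, trainable sigmoid."""
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))

prob = np.array([[0.9, 0.2], [0.6, 0.4]])     # toy probability map
thresh = np.full_like(prob, 0.5)              # toy learnable threshold map
binary = differentiable_binarize(prob, thresh)
# entries well above the threshold approach 1, well below approach 0
```

Because the output is smooth in `thresh_map`, the threshold can be trained by back-propagation exactly as the text states.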
Preferably, the specific method of the fifth step is as follows:
s1: dividing the input text segment column-wise into several grids of equal width;
s2: taking each grid as a basic unit of the time sequence, extracting its features, and feeding the grids into a sequence network in order;
s3: finally, converting the features output by the sequence into characters through a transcription layer.
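The column slicing of s1 and the transcription of s3 can be sketched as follows. The per-column class scores are invented stand-ins for the sequence network's output, and the greedy collapse shown is one common CTC decoding scheme, not necessarily the exact transcription used by the invention.

```python
# Schematic sketch of the CRNN pipeline: the text image is cut into
# equal-width column grids, each column yields per-character scores, and a
# CTC-style greedy decode collapses repeats and blanks into the string.
import numpy as np

def slice_columns(img: np.ndarray, width: int):
    """Split an H x W image into equal-width column grids (step s1)."""
    return [img[:, i:i + width] for i in range(0, img.shape[1], width)]

def ctc_greedy_decode(scores: np.ndarray, alphabet: str, blank: int = 0) -> str:
    """Transcription layer (step s3): argmax per column, then collapse
    consecutive repeats and drop the blank symbol."""
    best = scores.argmax(axis=1)
    out, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:
            out.append(alphabet[idx - 1])
        prev = idx
    return "".join(out)

alphabet = "AB"
# 6 columns x 3 classes (blank, 'A', 'B'); argmax path: A A blank B B blank
scores = np.array([[.1, .8, .1], [.1, .8, .1], [.9, .05, .05],
                   [.1, .1, .8], [.1, .1, .8], [.9, .05, .05]])
print(ctc_greedy_decode(scores, alphabet))  # -> AB
```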
Preferably, the time series network uses a bi-directional LSTM based architecture.
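Why bidirectional? Each time step's output combines context from both reading directions, so a column's features can depend on neighbors to its left and right. The toy below replaces the LSTM with a cumulative sum purely to show the forward/backward concatenation shape; it is not an LSTM.

```python
# Toy, framework-free illustration of bidirectional sequence processing:
# concatenate a forward pass over the columns with a backward pass, so each
# step sees context on both sides. The "RNN" is a cumulative-sum stand-in.
import numpy as np

def bidirectional(seq: np.ndarray) -> np.ndarray:
    """Concatenate forward and backward cumulative context per time step."""
    fwd = np.cumsum(seq, axis=0)                 # left-to-right context
    bwd = np.cumsum(seq[::-1], axis=0)[::-1]     # right-to-left context
    return np.concatenate([fwd, bwd], axis=1)

seq = np.arange(6, dtype=float).reshape(3, 2)    # 3 time steps, 2 features
out = bidirectional(seq)
# output has shape (3, 4): forward and backward context side by side
```

A real bidirectional LSTM has the same output shape property: per step, the forward and backward hidden states are concatenated, doubling the feature dimension.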
Preferably, before the training model is loaded, a usable model file must first be trained, as follows:
s1: preparing image data, where for a single scene with no more than 100 character classes at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
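The training procedure of s1-s3 can be sketched at the level of its control flow. The linear model and gradient step below are placeholders for the actual network, since the patent fixes no framework; only the shape mirrors the steps above: prepare the data, train by feeding back the output error so parameters adapt, then store the parameters as a model file.

```python
# Hedged sketch of the training workflow; the "network" is a linear model
# and the loss is squared error, chosen only to keep the example runnable.
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))           # s1: at least 2000 prepared samples
w_true = rng.normal(size=8)
y = X @ w_true                           # synthetic labels for illustration

w = np.zeros(8)                          # s2: model parameters to train
for _ in range(200):                     # s3: feed back the output error so
    grad = X.T @ (X @ w - y) / len(X)    #     the parameters adapt
    w -= 0.1 * grad                      #     (plain gradient descent step)

path = os.path.join(tempfile.gettempdir(), "model.npy")
np.save(path, w)                         # store the parameters as a model file
```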
Compared with the prior art, the invention has the following advantages:
1. Traditional methods generally need two stages, first separating out single characters in an image and then recognizing them one by one. The invention differs in that its first stage locates whole text paragraphs rather than single characters and its second stage recognizes the whole paragraph; both stages learn automatically, which greatly improves noise resistance and accuracy.
2. The large number of network parameters obtained when training the model file ensures stable recognition. The method also supports GPU acceleration: a 1.3-megapixel image can be recognized in as little as 0.1 second, making it well suited to scenes such as automated factories.
3. Operation is simple: once an engineer has trained a model and fixed its parameters, the method can be applied directly with no further setup.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
As shown in FIG. 1, a character recognition method based on a neural network comprises the following steps:
step one: loading a training model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: if correct, the result is output; if incorrect, the image is added to the database for retraining and the training model is saved.
Preferably, the specific method of the third step is as follows:
s1: feeding the image containing text into a feature pyramid with 4x, 8x, 16x and 32x downsampling to extract the global features of the picture;
s2: upsampling the feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: inputting the merged features into a prediction network as the features of the picture; the prediction network generates two kinds of maps, the first being binarization maps and the second a probability map corresponding to each binarization map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the screened binarization maps.
Preferably, the specific method of the fifth step is as follows:
s1: dividing the input text segment column-wise into several grids of equal width;
s2: taking each grid as a basic unit of the time sequence, extracting its features, and feeding the grids into a sequence network in order;
s3: finally, converting the features output by the sequence into characters through a transcription layer.
Preferably, the time series network uses a bi-directional LSTM based architecture.
Preferably, before the training model is loaded, a usable model file must first be trained, as follows:
s1: preparing image data, where for a single scene with no more than 100 character classes at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
The present invention is not limited to the above embodiments. Based on the technical solutions disclosed herein, those skilled in the art may, without creative effort, substitute or modify some technical features in light of the disclosed content; such substitutions and modifications all fall within the protection scope of the invention.

Claims (5)

1. A method of character recognition based on a neural network, comprising the steps of:
step one: loading a training model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: if correct, the result is output; if incorrect, the image is added to the database for retraining and the training model is saved.
2. The method for character recognition based on a neural network as claimed in claim 1, wherein the specific method of step three is:
s1: feeding the image containing text into a feature pyramid with 4x, 8x, 16x and 32x downsampling to extract the global features of the picture;
s2: upsampling the feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: inputting the merged features into a prediction network as the features of the picture; the prediction network generates two kinds of maps, the first being binarization maps and the second a probability map corresponding to each binarization map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the screened binarization maps.
3. The method for character recognition based on a neural network as claimed in claim 1, wherein the specific method of step five is:
s1: dividing the input text segment column-wise into several grids of equal width;
s2: taking each grid as a basic unit of the time sequence, extracting its features, and feeding the grids into a sequence network in order;
s3: finally, converting the features output by the sequence into characters through a transcription layer.
4. The method of claim 3, wherein the time-series network uses a bi-directional LSTM based structure.
5. The method of claim 1, wherein a usable model file is trained before the training model is loaded, specifically:
s1: preparing image data, where for a single scene with no more than 100 character classes at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
CN202111046315.6A 2021-09-08 2021-09-08 Character recognition method based on neural network Pending CN113850157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111046315.6A CN113850157A (en) 2021-09-08 2021-09-08 Character recognition method based on neural network


Publications (1)

Publication Number Publication Date
CN113850157A true CN113850157A (en) 2021-12-28

Family

ID=78973356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111046315.6A Pending CN113850157A (en) 2021-09-08 2021-09-08 Character recognition method based on neural network

Country Status (1)

Country Link
CN (1) CN113850157A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495106A (en) * 2022-04-18 2022-05-13 电子科技大学 MOCR (metal-oxide-semiconductor resistor) deep learning method applied to DFB (distributed feedback) laser chip



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination