CN113850157A - Character recognition method based on neural network - Google Patents

Character recognition method based on neural network

Info

Publication number
CN113850157A
CN113850157A (application CN202111046315.6A)
Authority
CN
China
Prior art keywords
image
result
text
features
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111046315.6A
Other languages
Chinese (zh)
Inventor
孔庆杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingrui Vision Intelligent Technology Shanghai Co ltd
Original Assignee
Jingrui Vision Intelligent Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingrui Vision Intelligent Technology Shanghai Co ltd filed Critical Jingrui Vision Intelligent Technology Shanghai Co ltd
Priority to CN202111046315.6A priority Critical patent/CN113850157A/en
Publication of CN113850157A publication Critical patent/CN113850157A/en
Pending legal-status Critical Current

Landscapes

  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

A character recognition method based on a neural network comprises the following steps. Step one: loading a training model. Step two: inputting an image containing characters. Step three: labeling each text segment using the DBNet algorithm. Step four: extracting the text segments to obtain cut text images. Step five: recognizing the content of each text segment of the cut text images using the CRNN algorithm. Step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: a correct result is output, while for an incorrect one the image is added to the database for retraining and the training model is saved. The method applies to a wide range of scenes, adapts particularly well to scenes with uneven illumination or large light-dark variation, and its large number of network parameters ensures stable recognition in such scenes.

Description

Character recognition method based on neural network
Technical Field
The invention relates to the field of character recognition, and in particular to a character recognition method based on a neural network.
Background
With the continuous advance of industrial inspection, requirements for character recognition keep growing. When large numbers of characters in varied fonts must be recognized, traditional character recognition schemes struggle, whereas neural-network-based recognition not only reaches an accuracy far beyond traditional methods but can also be flexibly customized to customer needs, training different models for different users. The present technique first locates characters and then recognizes them; its advantage is that each segment can be defined by the user, for example the content of an inspected food label can be divided into an ingredients segment, a production-date segment and so on.
Traditional character recognition falls into two categories: template matching and structural analysis. Template matching generally requires building a template library and then manually extracting features of the standard characters in the library to compare against features of the characters under test. Structural analysis classifies detected characters by extracting combinations of features such as 'points', 'lines' and 'crossings' that appear within a single glyph. Both methods have serious drawbacks. First, feature extraction is done manually, which narrows the range of application and costs time and labor. Second, both methods can only detect fonts close to a standard (or similar to a template). Third, when the background is complex the characters are hard to segment, and the traditional methods cannot be used.
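The template-matching comparison described above can be sketched in a few lines. The 3x3 "library" and the normalized-correlation score below are invented for illustration and are not part of the invention; real systems compare far larger glyph bitmaps.

```python
# Hypothetical sketch of the traditional template-matching approach: each
# character in a template library is compared against the glyph under test,
# here via normalized correlation of same-size binary bitmaps.
import numpy as np

def match_score(template: np.ndarray, glyph: np.ndarray) -> float:
    """Normalized correlation between two same-size glyph images."""
    t = template.astype(float) - template.mean()
    g = glyph.astype(float) - glyph.mean()
    denom = np.linalg.norm(t) * np.linalg.norm(g)
    return float((t * g).sum() / denom) if denom else 0.0

def classify(glyph: np.ndarray, library: dict) -> str:
    """Return the label of the best-matching template in the library."""
    return max(library, key=lambda label: match_score(library[label], glyph))

# Toy 3x3 "library": a vertical bar vs. a horizontal bar.
library = {
    "I": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
}
print(classify(np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]), library))  # -> I
```

The manual part criticized in the text is exactly the construction of `library` and the choice of `match_score`: both are hand-designed, which is why the method generalizes poorly to new fonts.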
Disclosure of Invention
The invention aims to provide a character recognition method based on a neural network, which has wide application, strong adaptability and high recognition stability.
The purpose of the invention is achieved as follows: a character recognition method based on a neural network, comprising the following steps:
step one: loading a training model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: if correct, the result is output; if incorrect, the image is added to the database for retraining and the training model is saved.
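The dictionary check and feedback loop of step six can be sketched as plain control flow. `post_process` and its arguments are hypothetical names; the actual retraining and model saving are abstracted into a list that simply queues images for a later training run.

```python
# A minimal control-flow sketch of step six. The database is modeled as a
# list of images queued for retraining; dictionary growth and queuing are
# the two feedback paths described in the method.
def post_process(result: str, dictionary: set, image, db: list):
    """Add unseen characters to the dictionary and queue hard images
    for retraining, mirroring the feedback loop of step six."""
    unknown = [ch for ch in result if ch not in dictionary]
    if unknown:
        dictionary.update(unknown)   # learn the strange characters
        db.append(image)             # retrain on this image later
        return None                  # no confident output yet
    return result                    # all characters known: emit the result
```

Note that the correctness check of the second branch (output if correct, retrain if incorrect) needs an external judgment, e.g. operator feedback, and is therefore not modeled here.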
Preferably, the specific method of the third step is as follows:
s1: feeding the image containing text into a feature pyramid with 4x, 8x, 16x and 32x downsampling to extract the global features of the picture;
s2: upsampling the feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: inputting the merged features into a prediction network as the features of the picture; the prediction network generates two kinds of maps, the first being binarization maps and the second a probability map corresponding to each binarization map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the screened binarization maps.
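The "learnable binarization threshold" of s3 can be illustrated with the differentiable binarization formula from the DBNet literature, where the hard step function is replaced by a steep sigmoid so the threshold map can receive gradients. The toy maps below are invented; the steepness constant k is commonly set to 50 in published DBNet implementations.

```python
# Illustrative sketch (not the patented implementation) of differentiable
# binarization: binary ~= sigmoid(k * (probability - threshold)), which
# approaches a hard threshold as k grows while remaining differentiable.
import numpy as np

def differentiable_binarize(prob_map: np.ndarray,
                            thresh_map: np.ndarray,
                            k: float = 50.0) -> np.ndarray:
    """Approximate step(prob > thresh) with a steep, trainable sigmoid."""
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))

prob = np.array([[0.9, 0.2], [0.6, 0.4]])     # toy probability map
thresh = np.full_like(prob, 0.5)              # toy learnable threshold map
binary = differentiable_binarize(prob, thresh)
# entries well above the threshold approach 1, well below approach 0
```

Because the output is smooth in `thresh_map`, the threshold can be trained by back-propagation exactly as the text states.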
Preferably, the specific method of the fifth step is as follows:
s1: dividing the input text segment column-wise into several grids of equal width;
s2: taking each grid as a basic unit of the time sequence, extracting its features, and feeding the grids into a sequence network in order;
s3: finally, converting the features output by the sequence into characters through a transcription layer.
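The column slicing of s1 and the transcription of s3 can be sketched as follows. The per-column class scores are invented stand-ins for the sequence network's output, and the greedy collapse shown is one common CTC decoding scheme, not necessarily the exact transcription used by the invention.

```python
# Schematic sketch of the CRNN pipeline: the text image is cut into
# equal-width column grids, each column yields per-character scores, and a
# CTC-style greedy decode collapses repeats and blanks into the string.
import numpy as np

def slice_columns(img: np.ndarray, width: int):
    """Split an H x W image into equal-width column grids (step s1)."""
    return [img[:, i:i + width] for i in range(0, img.shape[1], width)]

def ctc_greedy_decode(scores: np.ndarray, alphabet: str, blank: int = 0) -> str:
    """Transcription layer (step s3): argmax per column, then collapse
    consecutive repeats and drop the blank symbol."""
    best = scores.argmax(axis=1)
    out, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:
            out.append(alphabet[idx - 1])
        prev = idx
    return "".join(out)

alphabet = "AB"
# 6 columns x 3 classes (blank, 'A', 'B'); argmax path: A A blank B B blank
scores = np.array([[.1, .8, .1], [.1, .8, .1], [.9, .05, .05],
                   [.1, .1, .8], [.1, .1, .8], [.9, .05, .05]])
print(ctc_greedy_decode(scores, alphabet))  # -> AB
```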
Preferably, the time series network uses a bi-directional LSTM based architecture.
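Why bidirectional? Each time step's output combines context from both reading directions, so a column's features can depend on neighbors to its left and right. The toy below replaces the LSTM with a cumulative sum purely to show the forward/backward concatenation shape; it is not an LSTM.

```python
# Toy, framework-free illustration of bidirectional sequence processing:
# concatenate a forward pass over the columns with a backward pass, so each
# step sees context on both sides. The "RNN" is a cumulative-sum stand-in.
import numpy as np

def bidirectional(seq: np.ndarray) -> np.ndarray:
    """Concatenate forward and backward cumulative context per time step."""
    fwd = np.cumsum(seq, axis=0)                 # left-to-right context
    bwd = np.cumsum(seq[::-1], axis=0)[::-1]     # right-to-left context
    return np.concatenate([fwd, bwd], axis=1)

seq = np.arange(6, dtype=float).reshape(3, 2)    # 3 time steps, 2 features
out = bidirectional(seq)
# output has shape (3, 4): forward and backward context side by side
```

A real bidirectional LSTM has the same output shape property: per step, the forward and backward hidden states are concatenated, doubling the feature dimension.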
Preferably, before the training model is loaded, a usable model file must first be trained, as follows:
s1: preparing image data, where for a single scene with no more than 100 character classes at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
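The training procedure of s1-s3 can be sketched at the level of its control flow. The linear model and gradient step below are placeholders for the actual network, since the patent fixes no framework; only the shape mirrors the steps above: prepare the data, train by feeding back the output error so parameters adapt, then store the parameters as a model file.

```python
# Hedged sketch of the training workflow; the "network" is a linear model
# and the loss is squared error, chosen only to keep the example runnable.
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))           # s1: at least 2000 prepared samples
w_true = rng.normal(size=8)
y = X @ w_true                           # synthetic labels for illustration

w = np.zeros(8)                          # s2: model parameters to train
for _ in range(200):                     # s3: feed back the output error so
    grad = X.T @ (X @ w - y) / len(X)    #     the parameters adapt
    w -= 0.1 * grad                      #     (plain gradient descent step)

path = os.path.join(tempfile.gettempdir(), "model.npy")
np.save(path, w)                         # store the parameters as a model file
```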
Compared with the prior art, the invention has the following advantages:
1. Traditional methods generally need two stages, first separating out single characters in an image and then recognizing them one by one. The invention differs in that its first stage locates whole text paragraphs rather than single characters and its second stage recognizes the whole paragraph; both stages learn automatically, which greatly improves noise resistance and accuracy.
2. The large number of network parameters obtained when training the model file ensures stable recognition. The method also supports GPU acceleration: a 1.3-megapixel image can be recognized in as little as 0.1 second, making it well suited to scenes such as automated factories.
3. Operation is simple: once an engineer has trained a model and fixed its parameters, the method can be applied directly with no further setup.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
As shown in FIG. 1, a character recognition method based on a neural network comprises the following steps:
step one: loading a training model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: if correct, the result is output; if incorrect, the image is added to the database for retraining and the training model is saved.
Preferably, the specific method of the third step is as follows:
s1: feeding the image containing text into a feature pyramid with 4x, 8x, 16x and 32x downsampling to extract the global features of the picture;
s2: upsampling the feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: inputting the merged features into a prediction network as the features of the picture; the prediction network generates two kinds of maps, the first being binarization maps and the second a probability map corresponding to each binarization map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the screened binarization maps.
Preferably, the specific method of the fifth step is as follows:
s1: dividing the input text segment column-wise into several grids of equal width;
s2: taking each grid as a basic unit of the time sequence, extracting its features, and feeding the grids into a sequence network in order;
s3: finally, converting the features output by the sequence into characters through a transcription layer.
Preferably, the time series network uses a bi-directional LSTM based architecture.
Preferably, before the training model is loaded, a usable model file must first be trained, as follows:
s1: preparing image data, where for a single scene with no more than 100 character classes at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
The present invention is not limited to the above embodiments. Based on the technical solutions disclosed herein, those skilled in the art may, without creative effort, substitute or modify some technical features in light of the disclosed content; such substitutions and modifications all fall within the protection scope of the invention.

Claims (5)

1. A method of character recognition based on a neural network, comprising the steps of:
step one: loading a training model;
step two: inputting an image containing characters;
step three: labeling each text segment by using a DBNet algorithm;
step four: extracting text segments to obtain a cut text image;
step five: identifying the content of each text segment of the cut text image by using a CRNN algorithm;
step six: judging whether the output result contains strange characters that are not in the dictionary; if so, the strange characters are added to the dictionary, the image is added to a database for training, and the training model is saved; if not, the result is judged for correctness: if correct, the result is output; if incorrect, the image is added to the database for retraining and the training model is saved.
2. The method for character recognition based on a neural network as claimed in claim 1, wherein the specific method of step three is:
s1: feeding the image containing text into a feature pyramid with 4x, 8x, 16x and 32x downsampling to extract the global features of the picture;
s2: upsampling the feature maps to the same size and concatenating them with the earlier features to extract local features;
s3: inputting the merged features into a prediction network as the features of the picture; the prediction network generates two kinds of maps, the first being binarization maps and the second a probability map corresponding to each binarization map; the binarization threshold is trained as a learnable parameter, and finally text regions are located on the screened binarization maps.
3. The method for character recognition based on a neural network as claimed in claim 1, wherein the specific method of step five is:
s1: dividing the input text segment column-wise into several grids of equal width;
s2: taking each grid as a basic unit of the time sequence, extracting its features, and feeding the grids into a sequence network in order;
s3: finally, converting the features output by the sequence into characters through a transcription layer.
4. The method of claim 3, wherein the time-series network uses a bi-directional LSTM based structure.
5. The method of claim 1, wherein a usable model file is trained before the training model is loaded, specifically:
s1: preparing image data, where for a single scene with no more than 100 character classes at least 2000 images are required;
s2: inputting the image data into a neural network for model training;
s3: feeding back the output result so that the network adapts its parameters, and saving the parameters as a model file.
CN202111046315.6A 2021-09-08 2021-09-08 Character recognition method based on neural network Pending CN113850157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111046315.6A CN113850157A (en) 2021-09-08 2021-09-08 Character recognition method based on neural network


Publications (1)

Publication Number Publication Date
CN113850157A true CN113850157A (en) 2021-12-28

Family

ID=78973356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111046315.6A Pending CN113850157A (en) 2021-09-08 2021-09-08 Character recognition method based on neural network

Country Status (1)

Country Link
CN (1) CN113850157A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495106A (en) * 2022-04-18 2022-05-13 电子科技大学 MOCR (metal-oxide-semiconductor resistor) deep learning method applied to DFB (distributed feedback) laser chip



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination