CN108427953A - A kind of character recognition method and device - Google Patents

Character recognition method and device

Info

Publication number
CN108427953A
Authority
CN
China
Prior art keywords
neural network
picture
recognized
layer
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810162541.2A
Other languages
Chinese (zh)
Inventor
袁飞
华仁红
刘洋
陈德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yida Turing Technology Co Ltd
Original Assignee
Beijing Yida Turing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yida Turing Technology Co Ltd
Priority to CN201810162541.2A
Publication of CN108427953A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention provide a character recognition method and device. The method includes: acquiring a picture to be recognized, the picture containing text information to be recognized; and using the picture as the input of a pre-created target neural network model, which performs character recognition on the picture to obtain the text information in it. The device is used to execute the method. By inputting the picture to be recognized into the target neural network model and having the model perform character recognition on it, the embodiments obtain the text information in the picture and improve the efficiency and accuracy of character recognition.

Description

Character recognition method and device
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to a character recognition method and device.
Background
Automatic character recognition by computer is an important application area of pattern recognition. People need to process large volumes of text, reports, and documents in production and daily life. To reduce manual labor and improve processing efficiency, work on general character recognition methods began in the 1950s, and optical character readers were developed. In the 1960s, practical machines using magnetic ink and special fonts were introduced. In the late 1960s, recognition machines for multiple fonts and for handwritten characters appeared, with recognition accuracy and machine performance that basically met requirements; examples include handwritten digit recognition machines and printed English letter and digit recognition machines used for letter sorting. In the 1970s, work focused mainly on the basic theory of character recognition and on the development of high-performance character recognition machines, and research on character recognition received particular emphasis.
Character recognition can be applied in many fields, such as reading, translation, retrieval of document data, letter and parcel sorting, manuscript editing and proofreading, gathering and analysis of large numbers of statistical reports and cards, bank check processing, statistical gathering of commodity invoices, commodity code recognition, commodity warehouse management, automatic processing of large numbers of credit cards in fee collection services such as water, electricity, gas, rent, and personal insurance, and partial automation of office typing work.
In the process of implementing the embodiment of the present invention, the inventor finds that, in the prior art, when identifying characters in an image, single characters in the image need to be matched one by one, then the single characters are synthesized, and finally text information in the image is obtained.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the invention provides a character recognition method and a character recognition device.
In a first aspect, an embodiment of the present invention provides a text recognition method, including:
acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
Optionally, the method further comprises:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, which includes defining its number of layers, the size of the convolution kernels, and the number of convolution kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Optionally, the multi-layer convolutional neural network includes a residual layer, a BN layer, an excitation layer, and an LSTM layer, and the training the multi-layer convolutional neural network using an error back propagation algorithm includes:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
Optionally, the objective function of the target neural network model is a CTC loss function.
In a second aspect, an embodiment of the present invention provides a text recognition apparatus, including:
an acquisition module, configured to acquire a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and the recognition module is used for taking the picture to be recognized as the input of a target neural network model and performing character recognition on the picture to be recognized through the pre-established target neural network model so as to obtain the character information in the picture to be recognized.
Optionally, the apparatus further includes a model creation module configured to:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, which includes defining its number of layers, the size of the convolution kernels, and the number of convolution kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Optionally, the multi-layer convolutional neural network includes a residual layer, a BN layer, an excitation layer, and an LSTM layer, and the model creation module is specifically configured to:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
Optionally, the objective function of the target neural network model is a CTC loss function.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor being capable of performing the method steps of the first aspect when invoked by the program instructions.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, including:
the non-transitory computer readable storage medium stores computer instructions that cause the computer to perform the method steps of the first aspect.
According to the character recognition method and device provided by the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a character recognition method according to an embodiment of the present invention;
fig. 2 is a screenshot of a substation device indicator in an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a character recognition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a text recognition method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
Step 101: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
Specifically, the recognition device acquires a picture to be recognized. The picture may show a substation equipment sign, a safety warning sign, a road sign, or the like, and it contains text information to be recognized.
Step 102: and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
Specifically, after receiving the picture to be recognized, the recognition device inputs the picture to be recognized into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized.
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
On the basis of the above embodiment, the method further includes:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, which includes defining its number of layers, the size of the convolution kernels, and the number of convolution kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Specifically, the target neural network model is created in advance as follows. First, a large number of pictures containing character information are collected and manually labeled in advance to obtain the characters on each picture. A text file can be generated for each picture; its content is the character content of the picture, and its name is the name of the picture. The collected pictures are then divided into a training set and a verification set according to a preset proportion. For example, if 1000 pictures are collected, 800 pictures and their corresponding text information are randomly selected as the training set, and the remaining 200 pictures and their corresponding text information serve as the verification set.
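The split described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the `.txt` extension for label files, the fixed random seed, and the `sign_0000.jpg` file names are all assumptions introduced here.

```python
import random
from pathlib import Path


def split_dataset(picture_names, train_ratio=0.8, seed=0):
    """Randomly split picture names into (training set, verification set)."""
    names = sorted(picture_names)            # fixed order so the shuffle is reproducible
    random.Random(seed).shuffle(names)
    n_train = round(len(names) * train_ratio)
    return names[:n_train], names[n_train:]


def label_path(picture_name):
    """Label file that shares the picture's name (the .txt extension is assumed)."""
    return Path(picture_name).with_suffix(".txt")


if __name__ == "__main__":
    pictures = [f"sign_{i:04d}.jpg" for i in range(1000)]   # hypothetical file names
    train, verification = split_dataset(pictures)
    print(len(train), len(verification))
```

With 1000 pictures and a ratio of 0.8, the split has the 800/200 sizes used in the example above.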
A multilayer convolutional neural network is then constructed. Constructing it includes defining the number of layers, the size of the convolution kernels, and the number of convolution kernels, all of which are initial values set in advance according to experience. The network is trained with the pictures in the training set as input and the character information corresponding to the pictures as output, using an error back propagation algorithm. The objective function is a CTC (connectionist temporal classification) loss function, and the loss value it produces is used to judge whether the structure of the current multilayer convolutional neural network needs to be adjusted. The result is the target multilayer convolutional neural network.
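The patent names the CTC loss function but does not write it out. As a hedged illustration, the sketch below computes the standard CTC negative log-likelihood with the forward algorithm in plain Python. The function name, the choice of index 0 as the blank label, and the input layout (one probability list per frame) are assumptions made here; deep learning frameworks normally provide this loss and compute it in log space for numerical stability.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """Return -ln p(target | probs) using the CTC forward algorithm.

    probs:  list of T per-frame probability lists (each row sums to 1).
    target: list of label indices, none of them equal to `blank`.
    """
    # Extended label sequence with blanks interleaved: [b, l1, b, l2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # Paths may start in the leading blank or in the first label.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        prev, alpha = alpha, [0.0] * S
        for s in range(S):
            total = prev[s]
            if s >= 1:
                total += prev[s - 1]
            # The blank between two labels may be skipped only if the labels differ.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # Paths may end in the last label or in the trailing blank.
    p = alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
    return -math.log(p) if p > 0 else math.inf
```

For two frames with uniform probabilities over {blank, label 1} and the target `[1]`, three of the four possible paths collapse to the target, so the function returns `-ln(0.75)`.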
While the target multilayer convolutional neural network is being trained, the error back propagation algorithm adjusts its parameters, and each adjustment yields an intermediate neural network model with its own loss value. The loss values of all intermediate neural network models are sorted from small to large, and the models in the first few positions of the ordering are selected. Each selected model is verified on the verification set to obtain its accuracy, and the intermediate neural network model with the highest accuracy is taken as the target neural network model.
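The selection procedure in this paragraph can be sketched as below. The sketch assumes each saved intermediate model is represented as a `(loss_value, predict_fn)` pair and that accuracy means exact match of the whole recognized string; the patent does not define the accuracy metric, and all names here are hypothetical.

```python
def exact_match_accuracy(predict, samples):
    """Fraction of (picture, text) pairs whose predicted text equals the label."""
    if not samples:
        return 0.0
    correct = sum(1 for picture, text in samples if predict(picture) == text)
    return correct / len(samples)


def select_target_model(checkpoints, verification_set, top_k=3):
    """Pick the target model from (loss_value, predict_fn) checkpoints."""
    # Sort by loss from small to large and keep the first few ...
    candidates = sorted(checkpoints, key=lambda c: c[0])[:top_k]
    # ... then keep the candidate with the highest verification accuracy.
    return max(candidates,
               key=lambda c: exact_match_accuracy(c[1], verification_set))
```

Sorting by loss first keeps verification cheap, since only `top_k` models are evaluated on the verification set.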
According to the embodiment of the invention, the multilayer convolutional neural network is created in advance, a large number of collected pictures are used as input, character information corresponding to the pictures is used as output for training, the target multilayer convolutional neural network is obtained, then the intermediate neural network model generated in the process of training the target multilayer convolutional neural network is obtained, and the target neural network model is determined from the intermediate neural network model through the accuracy rate, so that the accuracy rate of character recognition on the pictures is improved.
On the basis of the above embodiment, the multi-layer convolutional neural network includes a residual layer, a batch normalization (BN) layer, an excitation layer, and a long short-term memory (LSTM) layer, and the training of the multi-layer convolutional neural network by using an error back propagation algorithm includes:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
Specifically, the constructed multilayer convolutional neural network comprises a residual layer, a batch normalization (BN) layer, an excitation layer, and a long short-term memory (LSTM) layer. Each type of network layer has a corresponding number of layers, and the ordering of the network layers can be preset according to actual conditions. When the network is trained with the pictures in the training set and the text information corresponding to each picture, the number of layers of one or more layer types can be increased or reduced. For example, one residual layer and one LSTM layer can be added to obtain a multilayer convolutional neural network with a deeper structure. It can be understood that the effect differs depending on where in the original network the residual layer and the LSTM layer are added, and the position can be adjusted according to actual conditions or the loss value after training. After the residual layer and the LSTM layer are added, the new multilayer convolutional neural network is trained again. If its loss value is not less than the preset threshold, the structure of the network continues to be adjusted until the loss value of the adjusted network is less than the preset threshold; the network obtained at that point is the target multilayer convolutional neural network.
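The adjust-and-retrain loop can be sketched as follows. This is an illustration under assumptions: the structure is reduced to a dictionary of layer counts, `train_and_get_loss` is a hypothetical callable standing in for a full training run, and the adjustment policy (one more residual, BN, and excitation layer per round) is only one of the adjustments the text allows.

```python
def grow_until_below(initial, train_and_get_loss, threshold, max_rounds=10):
    """Retrain progressively deeper structures until the loss is below `threshold`.

    initial: dict with layer counts, e.g. {"residual": 5, "bn": 5,
             "excitation": 5, "lstm": 1} (keys are hypothetical).
    """
    structure = dict(initial)                 # do not mutate the caller's dict
    loss = train_and_get_loss(structure)
    rounds = 0
    while loss >= threshold and rounds < max_rounds:
        # Assumed policy: deepen the feature extractor by one block per round.
        for name in ("residual", "bn", "excitation"):
            structure[name] += 1
        loss = train_and_get_loss(structure)
        rounds += 1
    return structure, loss
```

The `max_rounds` cap is an addition made here so the loop terminates even if the threshold is never reached.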
The embodiment of the invention obtains the final target multilayer convolutional neural network by adjusting the residual layer, the BN layer, the excitation layer and the LSTM layer, and treats character recognition as an end-to-end problem.
The key steps of the method of the invention are described in detail below, taking an equipment sign in a transformer substation as an example. Fig. 2 shows a substation equipment sign provided by the embodiment of the present invention. As shown in fig. 2:
Step one: collect a large number of nameplate pictures and manually label the character information on each nameplate. A text file is generated for each nameplate; its content is the character information on the nameplate, and its name is the name of the nameplate picture. After a large number of nameplate pictures have been labeled, they are divided into a training set and a verification set according to the preset proportion.
Step two: construct a multilayer convolutional neural network, which includes defining the number of layers, the size of the convolution kernels, the number of convolution kernels, and so on. Because residual neural networks perform well at image feature extraction, a residual neural network is used to extract features. The feature maps extracted by several residual layers are fed to the LSTM layer for feature learning, the output of the LSTM layer serves as the input of the CTC loss function, and the character recognition result is finally output. Preferably, a multilayer convolutional neural network with 5 residual layers, 5 BN layers, and 1 LSTM layer may be used. Designing the convolutional layers in this way allows features to be extracted quickly and efficiently.
Step three: train the network with an error back propagation algorithm, using the CTC loss function as the objective function of the multilayer convolutional neural network.
Step four: add a residual layer and an LSTM layer to design a deeper network structure, so that the deeper network can learn better features.
Step five: repeat steps three and four, that is, add a residual layer each time to obtain a new network structure, and then train the new structure. When the loss value no longer decreases after a network layer is added, or does not increase and is smaller than the set threshold, the current network structure is taken as the target multilayer convolutional neural network. Through repeated trials, a neural network model consisting of 8 residual layers, 8 BN layers, 8 excitation layers, and 2 LSTM layers was finally determined, which keeps the loss value sufficiently small while keeping the computation time short.
Step six: from the intermediate neural network models saved while training the final network structure, select several with smaller loss values, verify them on the verification set, and take the one with the highest accuracy as the final target neural network model.
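Step two states that the LSTM output feeds the CTC loss function and that a character recognition result is finally output, but the patent does not say how the per-frame outputs are turned into a string. One common choice for CTC-trained models, shown here purely as an assumption, is greedy decoding: take the best class in each frame, merge consecutive repeats, and drop the blank label. The function name, the alphabet, and the blank index 0 are hypothetical.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best class per frame, merge consecutive repeats, then drop blanks.

    frame_scores: list of per-frame score lists, one score per class.
    alphabet:     list mapping class index to character; alphabet[blank] is unused.
    """
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    chars = []
    previous = None
    for index in best:
        # Emit a character only when the class changes and is not the blank.
        if index != previous and index != blank:
            chars.append(alphabet[index])
        previous = index
    return "".join(chars)


if __name__ == "__main__":
    alphabet = ["", "1", "0", "k", "V"]          # index 0 reserved for the blank
    path = [1, 1, 0, 1, 2, 2, 0, 3, 4]           # per-frame best classes
    scores = [[1.0 if i == p else 0.0 for i in range(5)] for p in path]
    print(ctc_greedy_decode(scores, alphabet))
```

The blank between the two `1` frames is what lets the decoder keep a doubled character instead of merging it.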
(1) In July 2017, an experiment with the invention was carried out at No. 1 Anzhenmenwai Street, Chaoyang District, Beijing. The experiment targeted substation equipment signs, and the characters on the signs were recognized with the method of the invention. A large number of sign pictures were first collected and their contents manually labeled, and a network was trained to obtain an equipment sign character recognition model; a picture was then input, and the sign characters were output directly.
(2) In November 2017, an experiment with the invention was carried out at No. 1 Anzhenmenwai Street, Chaoyang District, Beijing. The experiment targeted safety warning signs, and the characters on the warning signs were recognized with the method of the invention. A large number of sign pictures were first collected and their contents manually labeled, and a network was trained to obtain an equipment sign character recognition model; a picture was then input, and the warning sign characters were output directly.
(3) In November 2017, an experiment with the invention was carried out at No. 1 Anzhenmenwai Street, Chaoyang District, Beijing. The experiment targeted equipment warning marks, and the characters on the marks were recognized with the method of the invention. A large number of equipment warning mark pictures were first collected and their contents manually labeled, and a network was trained to obtain an equipment warning mark character recognition model; a picture was then input, and the equipment warning mark characters were output directly. The whole recognition process was fast and accurate, and it passed testing by the software evaluation center of the national information center.
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
Fig. 3 is a schematic structural diagram of a character recognition device according to an embodiment of the present invention. As shown in fig. 3, the character recognition device includes an acquisition module 301 and a recognition module 302, wherein:
the acquisition module 301 is configured to acquire a picture to be recognized, where the picture to be recognized includes text information to be recognized; the recognition module 302 is configured to use the picture to be recognized as an input of a target neural network model, and perform character recognition on the picture to be recognized through the pre-created target neural network model to obtain the text information in the picture to be recognized.
Specifically, the acquisition module 301 acquires a picture to be recognized. The picture may show a substation equipment sign, a safety warning sign, a road sign, or the like, and it contains text information to be recognized. After receiving the picture to be recognized, the recognition module 302 inputs it into the target neural network model, and the target neural network model performs character recognition on it to obtain the character information in the picture. It should be noted that the target neural network model is created and trained in advance, and can output the corresponding character information according to the picture to be recognized input by the user.
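The two-module structure can be sketched as below. This is only an illustration of the division of responsibilities: the class and method names are invented here, the picture is handled as raw bytes read from a file path, and the model is any callable from picture bytes to text, standing in for the pre-created target neural network model.

```python
class AcquisitionModule:
    """Stands in for module 301: obtains the picture to be recognized."""

    def acquire(self, path):
        # Assumption: the picture is a file on disk, returned as raw bytes.
        with open(path, "rb") as f:
            return f.read()


class RecognitionModule:
    """Stands in for module 302: wraps a pre-created, pre-trained model."""

    def __init__(self, model):
        self.model = model            # any callable: picture bytes -> text

    def recognize(self, picture):
        return self.model(picture)


class CharacterRecognitionDevice:
    """Chains acquisition and recognition, as in fig. 3."""

    def __init__(self, model):
        self.acquisition = AcquisitionModule()
        self.recognition = RecognitionModule(model)

    def run(self, path):
        picture = self.acquisition.acquire(path)
        return self.recognition.recognize(picture)
```

Passing the model into the device keeps model creation (the optional model creation module) separate from recognition.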
The embodiment of the apparatus provided in the present invention may be specifically configured to execute the processing flows of the above method embodiments, and the functions of the apparatus are not described herein again, and refer to the detailed description of the above method embodiments.
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
On the basis of the above embodiment, the apparatus further includes a model creation module configured to:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, which includes defining its number of layers, the size of the convolution kernels, and the number of convolution kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Specifically, the steps of creating the target neural network model are the same as in the above embodiments and are not described again here.
On the basis of the above embodiment, the multilayer convolutional neural network includes a residual layer, a batch normalization (BN) layer, an activation layer, and a long short-term memory (LSTM) recurrent neural network layer, and the model creation module is specifically configured to:
adjust the number of layers of the residual layer, the BN layer, the activation layer and the LSTM layer respectively until the loss value of the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
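The adjustment loop can be pictured with the following sketch. It assumes that candidate layer-count combinations are tried one after another and that a caller-supplied function trains the network for a given combination and returns its loss; the names `LayerCounts`, `adjust_layer_counts`, and `train_and_get_loss` are hypothetical and the patent does not state how candidates are generated.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class LayerCounts:
    """Number of layers of each type in one candidate network."""
    residual: int
    batch_norm: int
    activation: int
    lstm: int


def adjust_layer_counts(candidates: Iterable[LayerCounts],
                        train_and_get_loss: Callable[[LayerCounts], float],
                        loss_threshold: float) -> Optional[LayerCounts]:
    """Try candidates in order and stop at the first one whose loss
    is smaller than the preset threshold; return None if none qualifies."""
    for counts in candidates:
        loss = train_and_get_loss(counts)
        if loss < loss_threshold:
            return counts
    return None
```

In practice `train_and_get_loss` would build and train the network for the given counts, which is the expensive step; the loop itself only encodes the stopping rule.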
On the basis of the above embodiment, the objective function of the target neural network model is a connectionist temporal classification (CTC) loss function.
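For reference, the CTC loss of one sample can be computed with the standard forward (alpha) recursion. The sketch below is a plain-Python illustration, not the implementation used in the embodiment: it assumes the network output is already a per-frame probability distribution over classes, that class index 0 is the blank symbol, and it works in the probability domain, which is adequate for short sequences but would underflow on long ones (a practical implementation would work in log space).

```python
import math
from typing import List, Sequence


def ctc_loss(probs: Sequence[Sequence[float]], target: Sequence[int],
             blank: int = 0) -> float:
    """Negative log-likelihood of `target` given probs[t][k], the
    probability of class k at frame t."""
    if len(probs) == 0:
        raise ValueError("at least one frame is required")
    # Extended label sequence: blank, l1, blank, l2, ..., blank.
    ext: List[int] = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: total probability of all frame prefixes ending in ext[s].
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        prev = alpha
        alpha = [0.0] * size
        for s in range(size):
            total = prev[s]                      # stay on the same symbol
            if s >= 1:
                total += prev[s - 1]             # advance by one position
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return math.inf if likelihood <= 0.0 else -math.log(likelihood)
```

The loss sums over every frame-level alignment that collapses to the target text, which is why training needs only the text of each picture and no per-character position labels.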
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
Fig. 4 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 4, the electronic device includes a processor 401, a memory 402, and a bus 404, wherein:
the processor 401 and the memory 402 communicate with each other through the bus 404;
the processor 401 is configured to call the program instructions in the memory 402 to execute the methods provided by the above method embodiments, for example: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized; and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized; and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized; and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical disks.
The above-described embodiments of the apparatuses and the like are merely illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for recognizing a character, comprising:
acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
2. The method of claim 1, further comprising:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein the parameters of the multilayer convolutional neural network include the number of layers, the size of the convolution kernels and the number of convolution kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
3. The method of claim 2, wherein the multilayer convolutional neural network comprises a residual layer, a batch normalization (BN) layer, an activation layer, and a long short-term memory (LSTM) recurrent neural network layer, and wherein the training the multilayer convolutional neural network using an error back propagation algorithm comprises:
adjusting the number of layers of the residual layer, the BN layer, the activation layer and the LSTM layer respectively until the loss value of the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
4. The method of any one of claims 1-3, wherein the objective function of the target neural network model is a CTC loss function.
5. A character recognition apparatus, comprising:
an acquisition module, configured to acquire a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and a recognition module, configured to take the picture to be recognized as the input of a target neural network model and perform character recognition on the picture to be recognized through the pre-established target neural network model, so as to obtain the character information in the picture to be recognized.
6. The apparatus of claim 5, further comprising a model creation module to:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein the parameters of the multilayer convolutional neural network include the number of layers, the size of the convolution kernels and the number of convolution kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
7. The apparatus of claim 6, wherein the multilayer convolutional neural network comprises a residual layer, a batch normalization (BN) layer, an activation layer, and a long short-term memory (LSTM) recurrent neural network layer, and wherein the model creation module is specifically configured to:
adjust the number of layers of the residual layer, the BN layer, the activation layer and the LSTM layer respectively until the loss value of the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
8. The apparatus of any one of claims 5-7, wherein the objective function of the target neural network model is a CTC loss function.
9. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory communicate with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any one of claims 1-4.
10. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1-4.
CN201810162541.2A 2018-02-26 2018-02-26 A kind of character recognition method and device Pending CN108427953A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810162541.2A CN108427953A (en) 2018-02-26 2018-02-26 A kind of character recognition method and device

Publications (1)

Publication Number Publication Date
CN108427953A true CN108427953A (en) 2018-08-21

Family

ID=63157317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810162541.2A Pending CN108427953A (en) 2018-02-26 2018-02-26 A kind of character recognition method and device

Country Status (1)

Country Link
CN (1) CN108427953A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016197381A1 (en) * 2015-06-12 2016-12-15 Sensetime Group Limited Methods and apparatus for recognizing text in an image
CN106447707A (en) * 2016-09-08 2017-02-22 华中科技大学 Image real-time registration method and system
CN107239733A (en) * 2017-04-19 2017-10-10 上海嵩恒网络科技有限公司 Continuous hand-written character recognizing method and system
JP2017215859A (en) * 2016-06-01 2017-12-07 日本電信電話株式会社 Character string recognition device, method and program
CN107590774A (en) * 2017-09-18 2018-01-16 北京邮电大学 A kind of car plate clarification method and device based on generation confrontation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180821