CN108427953A - A kind of character recognition method and device - Google Patents
- Publication number
- CN108427953A (application CN201810162541.2A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- picture
- recognized
- layer
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
An embodiment of the present invention provides a character recognition method and device. The method includes: acquiring a picture to be recognized, the picture containing text information to be recognized; and taking the picture to be recognized as the input of a target neural network model created in advance, which performs character recognition on the picture to obtain the text information in it. The device is configured to execute the method. By inputting the picture to be recognized into the target neural network model, which performs character recognition on it to obtain the text information in the picture, the embodiment of the present invention improves the efficiency and accuracy of character recognition.
Description
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to a character recognition method and device.
Background
Automatic character recognition by computer is an important field of pattern recognition applications. In production and daily life, people must process large volumes of documents, reports and text. To reduce this labor and improve processing efficiency, general character recognition methods began to be explored in the 1950s, and optical character readers were developed. In the 1960s, practical machines using magnetic ink and special fonts appeared. In the late 1960s, recognition machines for many typefaces and for handwritten characters emerged, whose recognition accuracy and performance could basically meet requirements, such as handwritten-digit recognition machines and printed English-and-digit recognition machines for letter sorting. In the 1970s, work focused mainly on the basic theory of character recognition and the development of high-performance recognition machines, and character recognition research received increased emphasis.
Character recognition can be applied in many fields, such as reading, translation and retrieval of document data; sorting of letters and parcels; editing and proofreading of manuscripts; collection and analysis of large numbers of statistical reports and cards; bank check processing; statistical collection of commodity invoices; commodity code recognition; commodity warehouse management; automatic processing of the large volumes of bills arising in fee-collection services such as water, electricity, gas, rent and personal insurance; and partial automation of office typing.
In the process of implementing the embodiment of the present invention, the inventor found that in the prior art, when recognizing characters in an image, each individual character in the image must be matched one by one, the matched characters are then assembled, and only then is the text information in the image obtained; this character-by-character matching is inefficient and prone to error.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the invention provides a character recognition method and a character recognition device.
In a first aspect, an embodiment of the present invention provides a character recognition method, including:
acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
Optionally, the method further comprises:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein constructing the network comprises specifying its number of layers, convolutional kernel size, and number of convolutional kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Optionally, the multi-layer convolutional neural network includes a residual layer, a BN layer, an excitation layer, and an LSTM layer, and the training the multi-layer convolutional neural network using an error back propagation algorithm includes:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
Optionally, the objective function of the target neural network model is a connectionist temporal classification (CTC) loss function.
In a second aspect, an embodiment of the present invention provides a character recognition apparatus, including:
the device comprises an acquisition module and a recognition module, wherein the acquisition module is used for acquiring a picture to be recognized, and the picture to be recognized comprises character information to be recognized;
and the recognition module is used for taking the picture to be recognized as the input of a target neural network model and performing character recognition on the picture to be recognized through the pre-established target neural network model so as to obtain the character information in the picture to be recognized.
Optionally, the apparatus further includes a model creation module configured to:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein constructing the network comprises specifying its number of layers, convolutional kernel size, and number of convolutional kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Optionally, the multi-layer convolutional neural network includes a residual layer, a BN layer, an excitation layer, and an LSTM layer, and the model creation module is specifically configured to:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
Optionally, the objective function of the target neural network model is a CTC loss function.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, and the processor, when invoking the program instructions, is capable of performing the method steps of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium, including:
the non-transitory computer readable storage medium stores computer instructions that cause the computer to perform the method steps of the first aspect.
According to the character recognition method and device provided by the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a character recognition method according to an embodiment of the present invention;
fig. 2 is a picture of a substation equipment identification sign used in an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a character recognition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a text recognition method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step 101: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
specifically, the identification device acquires a picture to be identified, wherein the picture to be identified can be a substation equipment indicator, a safety warning board, a road sign indicator and the like, and text information to be identified is arranged on the picture to be identified.
Step 102: and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
Specifically, after receiving the picture to be recognized, the recognition device inputs the picture to be recognized into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized.
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
On the basis of the above embodiment, the method further includes:
collecting a plurality of pictures with text information, acquiring the text information corresponding to the pictures, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein constructing the network comprises specifying its number of layers, convolutional kernel size, and number of convolutional kernels;
training the multilayer convolutional neural network by adopting an error back propagation algorithm by taking the pictures in the training set as input and the character information corresponding to the pictures as output to obtain a target multilayer convolutional neural network;
acquiring intermediate neural network models of which the loss values generated in the training process of the target multilayer convolutional neural network meet preset conditions, and verifying each intermediate neural network model through the verification set to obtain the accuracy rate corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Specifically, the target neural network model is created in advance as follows. First, a large number of pictures bearing text information are collected and manually labelled, so that the characters on each picture are known; for each picture, a text file may be generated whose content is the picture's text and whose name is the picture's name. The collected pictures are then divided into a training set and a verification set at a preset ratio, for example: of 1000 collected pictures, 800 pictures and their corresponding text are randomly selected as the training set, and the remaining 200 pictures and their text serve as the verification set.
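The labeling and splitting step above can be sketched as follows. The 80/20 ratio and the picture-name-to-text mapping follow the example in the text; the file names and helper names are illustrative assumptions.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Split labelled pictures into a training set and a verification set.

    `samples` maps each picture name to its manually labelled text content,
    mirroring the per-picture text files described above.
    """
    names = sorted(samples)
    random.Random(seed).shuffle(names)  # random selection, reproducible here
    cut = int(len(names) * train_ratio)
    train = {n: samples[n] for n in names[:cut]}
    valid = {n: samples[n] for n in names[cut:]}
    return train, valid

# e.g. 1000 labelled pictures -> 800 for training, 200 for verification
samples = {f"plate_{i:04d}.jpg": f"text_{i}" for i in range(1000)}
train, valid = split_dataset(samples)
```

The split is disjoint by construction, so verification accuracy is measured on pictures the network never saw during training.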
A multilayer convolutional neural network is then constructed by specifying its number of layers, convolutional kernel size and number of convolutional kernels, all set in advance to empirical initial values. With the training-set pictures as input and the corresponding text as output, the network is trained with the error back-propagation algorithm; a connectionist temporal classification (CTC) loss function serves as the objective function, and the loss value it yields determines whether the structure of the current network needs to be adjusted, finally producing the target multilayer convolutional neural network.
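The convolutional-network-plus-LSTM-plus-CTC arrangement described above can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions: the layer counts, kernel sizes and channel widths are illustrative initial values of my own choosing, not the patent's final architecture.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Minimal CNN + LSTM sketch of the kind of model described in the text."""
    def __init__(self, num_classes, img_height=32, channels=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),          # halve height and width
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),
        )
        feat_h = img_height // 4
        self.lstm = nn.LSTM(channels * feat_h, 128, batch_first=True)
        self.fc = nn.Linear(128, num_classes)  # classes include the CTC blank

    def forward(self, x):                 # x: (N, 1, H, W)
        f = self.cnn(x)                   # (N, C, H/4, W/4)
        n, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(n, w, c * h)  # one step per column
        out, _ = self.lstm(f)
        return self.fc(out).log_softmax(2)  # (N, T, num_classes)

model = CRNN(num_classes=11)              # e.g. 10 digits + CTC blank
logits = model(torch.randn(2, 1, 32, 100))
ctc = nn.CTCLoss(blank=0)                 # CTC as the training objective
```

Each column of the convolutional feature map becomes one time step for the LSTM, so the per-frame class scores can be fed to the CTC loss without segmenting the image into single characters.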
While the target multilayer convolutional neural network is being trained, its parameters are adjusted by the error back-propagation algorithm, and each adjustment yields an intermediate neural network model with an associated loss value. The intermediate models are sorted by loss value in ascending order, the first few models in this ordering are selected, and each selected model is verified against the verification set to obtain its accuracy; the intermediate model with the highest accuracy is taken as the target neural network model.
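The checkpoint-selection logic above — keep the intermediate models with the smallest losses, then pick the one with the best verification accuracy — can be sketched in a few lines. The checkpoint identifiers, loss figures and accuracy figures below are made up for illustration.

```python
def select_target_model(checkpoints, evaluate, top_k=3):
    """Pick the final model among intermediate checkpoints.

    `checkpoints` is a list of (model_id, training_loss) pairs saved during
    training; `evaluate` returns a checkpoint's accuracy on the verification
    set. Both are placeholders for what a real training loop would produce.
    """
    # keep the few checkpoints with the smallest loss values...
    best_by_loss = sorted(checkpoints, key=lambda c: c[1])[:top_k]
    # ...then choose, among those, the one with the highest accuracy
    return max(best_by_loss, key=lambda c: evaluate(c[0]))[0]

# toy example with hypothetical loss and accuracy figures
checkpoints = [("ckpt_a", 0.9), ("ckpt_b", 0.2), ("ckpt_c", 0.3), ("ckpt_d", 0.25)]
accuracy = {"ckpt_a": 0.80, "ckpt_b": 0.91, "ckpt_c": 0.95, "ckpt_d": 0.93}
target = select_target_model(checkpoints, accuracy.get)  # -> "ckpt_c"
```

Note that the lowest-loss checkpoint ("ckpt_b") is not necessarily the one chosen: verification accuracy, not training loss, makes the final decision.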
According to the embodiment of the invention, the multilayer convolutional neural network is created in advance, a large number of collected pictures are used as input, character information corresponding to the pictures is used as output for training, the target multilayer convolutional neural network is obtained, then the intermediate neural network model generated in the process of training the target multilayer convolutional neural network is obtained, and the target neural network model is determined from the intermediate neural network model through the accuracy rate, so that the accuracy rate of character recognition on the pictures is improved.
On the basis of the above embodiment, the multilayer convolutional neural network includes a residual layer, a batch normalization (BN) layer, an excitation layer, and a long short-term memory (LSTM) layer, and training the multilayer convolutional neural network with the error back propagation algorithm includes:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
Specifically, the constructed multilayer convolutional neural network comprises residual layers, batch normalization (BN) layers, excitation layers and LSTM layers, each present in some number, and the ordering of the network layers can be preset according to the actual situation. While training the network on the training-set pictures and the text corresponding to each picture, the number of one or more kinds of layer can be increased or decreased; for example, one residual layer and one LSTM layer can be added to obtain a multilayer convolutional neural network with a deeper structure. It should be understood that adding the residual layer and the LSTM layer at different positions of the original network has different effects, and the placement can be adjusted according to the actual situation or the loss value after training. After adding a residual layer and an LSTM layer, the new network is trained again; if its loss value is not smaller than the preset threshold, the structure of the network continues to be adjusted until the loss value of the adjusted network is smaller than the preset threshold, at which point the resulting network is the target multilayer convolutional neural network.
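The grow-until-the-loss-is-small-enough procedure above can be sketched as a loop. The retraining call is a stand-in: `train_and_get_loss` and the mock loss figures below are hypothetical, since the real criterion and depth step are implementation choices.

```python
def grow_until_converged(train_and_get_loss, threshold, max_rounds=10):
    """Add depth (e.g. one residual layer and one LSTM layer per round)
    until the training loss drops below the preset threshold.

    `train_and_get_loss(extra_layers)` stands in for retraining the
    adjusted network and returning its loss value.
    """
    for extra_layers in range(max_rounds + 1):
        loss = train_and_get_loss(extra_layers)
        if loss < threshold:
            return extra_layers, loss
    raise RuntimeError("loss never fell below the threshold")

# hypothetical losses: deeper networks fit better until they plateau
mock_losses = [1.4, 0.9, 0.6, 0.35, 0.28, 0.27]
extra, loss = grow_until_converged(lambda k: mock_losses[min(k, 5)],
                                   threshold=0.3)
```

With these made-up figures the loop stops after four depth increases, when the loss first falls below the 0.3 threshold.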
The embodiment of the invention obtains the final target multilayer convolutional neural network by adjusting the residual layers, BN layers, excitation layers and LSTM layers, thereby treating character recognition as an end-to-end problem.
The following takes an equipment identification sign in a transformer substation as an example. Fig. 2 shows such a substation equipment sign provided by the embodiment of the present invention, and with reference to fig. 2 the key steps of the method are described in detail:
the method comprises the following steps of firstly, collecting a large number of nameplate pictures, manually marking character information on each nameplate, generating a text file for each nameplate, wherein the content of the text file is the character information on the nameplate, and the name of the text file is the name of the nameplate picture. And marking a large number of nameplate pictures, and dividing the nameplate pictures into a training set and a verification set according to the proportion.
And step two, constructing a multilayer convolutional neural network, including defining its number of layers, convolutional kernel size, number of convolutional kernels and so on. Considering the excellent performance of residual neural networks in image feature extraction, residual layers are adopted to extract features; the feature maps extracted by several residual layers are fed to the LSTM layer to learn sequence features, the output of the LSTM layer serves as the input of the CTC loss function, and the character recognition result is finally output. Preferably, a multilayer convolutional neural network of 5 residual layers, 5 BN layers and 1 LSTM layer may be employed. The convolutional layers are designed so that features can be extracted quickly and efficiently.
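One residual layer with its BN and excitation (ReLU) layers can be sketched in PyTorch as below; a stack such as the 5-residual / 5-BN / 1-LSTM network above would chain several of these before the LSTM layer. The channel count and kernel size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual layer paired with BN and ReLU excitation layers."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        y = self.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return self.relu(x + y)   # skip connection keeps gradients flowing

block = ResidualBlock(64)
out = block(torch.randn(1, 64, 32, 100))
```

The skip connection is what allows the depth adjustments of steps four and five: extra residual blocks can be inserted without hampering gradient flow through the deeper network.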
And step three, training the network by adopting an error back propagation algorithm. And adopting the CTC loss function as an objective function of the multilayer convolutional neural network.
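At inference time, a CTC-trained network's per-frame outputs must be collapsed into a label sequence. A minimal greedy decoder — best class per frame, merge repeats, drop blanks — is sketched below in pure Python; the per-frame scores are made up for illustration.

```python
def ctc_greedy_decode(frame_probs, blank=0):
    """Greedy CTC decoding: take the best class per frame, merge
    consecutive repeats, and drop blank symbols.

    `frame_probs` is a T x C list of per-frame class scores, as produced
    by the network's final softmax layer.
    """
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded, prev = [], blank
    for cls in best:
        if cls != blank and cls != prev:
            decoded.append(cls)
        prev = cls
    return decoded

# frames predicting "-11-2" (with "-" the blank) collapse to [1, 2]
probs = [
    [0.9, 0.05, 0.05],   # blank
    [0.1, 0.8, 0.1],     # class 1
    [0.1, 0.8, 0.1],     # class 1 (repeat, merged)
    [0.9, 0.05, 0.05],   # blank
    [0.1, 0.1, 0.8],     # class 2
]
labels = ctc_greedy_decode(probs)
```

This collapsing rule is what lets CTC training use unsegmented text labels: the network never needs per-character position annotations.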
And step four, adding a residual error layer and an LSTM layer, designing a deeper network structure, and enabling the deeper network to learn better characteristics.
And step five, repeating step three and step four, that is, adding one residual layer at a time to obtain a new network structure and then training the new structure. When adding a network layer no longer reduces the loss value, or the loss value is already smaller than the set threshold, the current network structure is taken as the target multilayer convolutional neural network. Through repeated trials, a neural network model composed of 8 residual layers, 8 BN layers, 8 excitation layers and 2 LSTM layers was finally determined, which ensures both that the loss value is small enough and that the computation time is short.
And step six, selecting a plurality of intermediate neural network models with smaller loss values from the intermediate neural network models stored during the final network structure training, verifying the models by using a verification set, and selecting one intermediate neural network model with the highest accuracy as a final target neural network model.
(1) In July 2017, an experiment of the invention was carried out at No. 1 Anzhen Outer Street, Chaoyang District, Beijing; the experiment recognized substation equipment signs, and the characters on the signs were recognized by the method of the invention. A large number of sign pictures were first collected and their contents manually labelled, the network was trained to obtain an equipment-sign character recognition model, and then, given an input picture, the sign's character result was output directly.
(2) In November 2017, an experiment of the invention was carried out at No. 1 Anzhen Outer Street, Chaoyang District, Beijing; the experiment recognized safety warning boards, and the characters on the boards were recognized by the method of the invention. A large number of warning-board pictures were first collected and their contents manually labelled, the network was trained to obtain a warning-board character recognition model, and then, given an input picture, the board's character result was output directly.
(3) In November 2017, an experiment of the invention was carried out at No. 1 Anzhen Outer Street, Chaoyang District, Beijing; the experiment recognized equipment warning signs, and the characters on the signs were recognized by the method of the invention. A large number of equipment-warning-sign pictures were first collected and their contents manually labelled, the network was trained to obtain an equipment-warning-sign character recognition model, and then, given an input picture, the sign's character result was output directly. The whole recognition process was rapid and accurate, and successfully passed the test of the national information center software evaluation center.
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
Fig. 3 is a schematic structural diagram of a character recognition device according to an embodiment of the present invention, as shown in fig. 3, the character recognition device includes: an obtaining module 301 and an identifying module 302, wherein:
the acquiring module 301 is configured to acquire a picture to be identified, where the picture to be identified includes text information to be identified; the recognition module 302 is configured to use the picture to be recognized as an input of a target neural network model, and perform text recognition on the picture to be recognized through the pre-created target neural network model to obtain the text information in the picture to be recognized.
Specifically, the obtaining module 301 obtains a picture to be recognized; the picture may show a substation equipment sign, a safety warning sign, a road sign, and the like, and carries the text information to be recognized. After receiving the picture to be recognized, the recognition module 302 inputs it into the target neural network model, which performs character recognition on it to obtain the text information in the picture. It should be noted that the target neural network model is created and trained in advance, and can output the corresponding text information for the picture to be recognized input by the user.
The embodiment of the apparatus provided in the present invention may be specifically configured to execute the processing flows of the above method embodiments, and the functions of the apparatus are not described herein again, and refer to the detailed description of the above method embodiments.
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, and the target neural network model performs character recognition on the picture to be recognized to obtain character information in the picture to be recognized, so that the efficiency and the accuracy of character recognition are improved.
On the basis of the above embodiment, the apparatus further includes a model creation module configured to:
collecting a plurality of pictures containing character information, acquiring the character information corresponding to each picture, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein constructing the network includes specifying the number of layers, the convolution kernel size, and the number of convolution kernels;
training the multilayer convolutional neural network with the error back-propagation algorithm, taking the pictures in the training set as input and the character information corresponding to the pictures as output, to obtain a target multilayer convolutional neural network;
acquiring the intermediate neural network models whose loss values, generated during training of the target multilayer convolutional neural network, meet a preset condition, and verifying each intermediate neural network model on the verification set to obtain the accuracy corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
Specifically, the steps of creating the target neural network model are consistent with the above embodiments and are not repeated here.
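The model-selection step above (keep the intermediate models whose training loss meets the preset condition, validate each on the verification set, and take the most accurate one) can be sketched as follows. The `Checkpoint` structure, the threshold, and the numeric values are illustrative assumptions, not the patent's implementation.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    epoch: int
    train_loss: float
    val_accuracy: float  # accuracy measured on the verification set


LOSS_THRESHOLD = 0.05  # stand-in for the "preset condition" on the loss value


def select_target_model(checkpoints):
    # Keep only intermediate models whose loss meets the preset condition...
    candidates = [c for c in checkpoints if c.train_loss < LOSS_THRESHOLD]
    if not candidates:
        return None
    # ...then take the one with the highest verification-set accuracy.
    return max(candidates, key=lambda c: c.val_accuracy)


checkpoints = [
    Checkpoint(epoch=10, train_loss=0.120, val_accuracy=0.81),  # loss too high
    Checkpoint(epoch=20, train_loss=0.040, val_accuracy=0.92),
    Checkpoint(epoch=30, train_loss=0.030, val_accuracy=0.90),  # overfitting
]
best = select_target_model(checkpoints)
print(best.epoch)  # 20: meets the loss condition and has the best accuracy
```

Validating every qualifying checkpoint, rather than simply taking the last one, guards against choosing a model that has overfit the training set.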
On the basis of the above embodiment, the multilayer convolutional neural network includes a residual layer, a batch normalization (BN) layer, an excitation layer, and a long short-term memory (LSTM) recurrent layer, and the model creation module is specifically configured to:
adjust the number of layers of the residual layer, the BN layer, the excitation layer, and the LSTM layer, respectively, until the loss value of the adjusted multilayer convolutional neural network is smaller than a preset threshold.
On the basis of the above embodiment, the objective function of the target neural network model is a CTC loss function.
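A network trained with the CTC loss emits, at each time step, a distribution over the character set plus a reserved "blank" symbol; the simplest (greedy) decoding collapses consecutive repeated symbols and drops blanks. The following is a minimal illustration of that collapse rule only; the alphabet and the best-path indices are invented examples, not the patent's decoder.

```python
BLANK = 0  # index reserved for the CTC blank symbol


def ctc_greedy_collapse(path, alphabet):
    """Collapse a per-timestep best-path index sequence into an output string."""
    out = []
    prev = None
    for idx in path:
        # Emit a character only when it is not blank and differs from
        # the previous index (CTC merges consecutive repeats).
        if idx != BLANK and idx != prev:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)


alphabet = {1: "c", 2: "a", 3: "t"}
# Best class index at each time step: c c <blank> a a t
path = [1, 1, 0, 2, 2, 3]
print(ctc_greedy_collapse(path, alphabet))  # "cat"
```

The blank symbol is what lets CTC represent genuinely doubled characters: "oo" would appear in the path as `o <blank> o` rather than `o o`, which would otherwise be merged.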
According to the embodiment of the invention, the picture to be recognized is input into the target neural network model, which performs character recognition on the picture to obtain the character information in it, thereby improving the efficiency and accuracy of character recognition.
Fig. 4 is a schematic structural diagram of an electronic device entity according to an embodiment of the present invention. As shown in Fig. 4, the electronic device includes a processor 401, a memory 402, and a bus 404, wherein:
the processor 401 and the memory 402 communicate with each other via the bus 404;
the processor 401 is configured to call the program instructions in the memory 402 to execute the methods provided by the above-mentioned method embodiments, for example, including: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized; and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized; and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the methods provided by the above method embodiments, for example, including: acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized; and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
Those of ordinary skill in the art will understand that all or part of the steps for implementing the above method embodiments may be performed by hardware instructed by a program; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, or a magnetic or optical disk.
The above-described embodiments of the apparatuses and the like are merely illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for recognizing a character, comprising:
acquiring a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and taking the picture to be recognized as the input of a target neural network model, and performing character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
2. The method of claim 1, further comprising:
collecting a plurality of pictures containing character information, acquiring the character information corresponding to each picture, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein constructing the network includes specifying the number of layers, the convolution kernel size, and the number of convolution kernels;
training the multilayer convolutional neural network with the error back-propagation algorithm, taking the pictures in the training set as input and the character information corresponding to the pictures as output, to obtain a target multilayer convolutional neural network;
acquiring the intermediate neural network models whose loss values, generated during training of the target multilayer convolutional neural network, meet a preset condition, and verifying each intermediate neural network model on the verification set to obtain the accuracy corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
3. The method of claim 2, wherein the multilayer convolutional neural network comprises a residual layer, a batch normalization (BN) layer, an excitation layer, and a long short-term memory (LSTM) recurrent layer, and wherein training the multilayer convolutional neural network using an error back-propagation algorithm comprises:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
4. The method of any one of claims 1-3, wherein the objective function of the target neural network model is a CTC loss function.
5. A character recognition apparatus, comprising:
an acquisition module, configured to acquire a picture to be recognized, wherein the picture to be recognized comprises character information to be recognized;
and a recognition module, configured to take the picture to be recognized as the input of a target neural network model and perform character recognition on the picture to be recognized through the pre-established target neural network model to obtain the character information in the picture to be recognized.
6. The apparatus of claim 5, further comprising a model creation module to:
collecting a plurality of pictures containing character information, acquiring the character information corresponding to each picture, and dividing the pictures into a training set and a verification set according to a preset proportion;
constructing a multilayer convolutional neural network, wherein constructing the network includes specifying the number of layers, the convolution kernel size, and the number of convolution kernels;
training the multilayer convolutional neural network with the error back-propagation algorithm, taking the pictures in the training set as input and the character information corresponding to the pictures as output, to obtain a target multilayer convolutional neural network;
acquiring the intermediate neural network models whose loss values, generated during training of the target multilayer convolutional neural network, meet a preset condition, and verifying each intermediate neural network model on the verification set to obtain the accuracy corresponding to each intermediate neural network model;
and taking the intermediate neural network model with the highest accuracy as the target neural network model.
7. The apparatus of claim 6, wherein the multilayer convolutional neural network comprises a residual layer, a batch normalization (BN) layer, an excitation layer, and a long short-term memory (LSTM) recurrent layer, and wherein the model creation module is specifically configured to:
and adjusting the number of layers corresponding to the residual layer, the BN layer, the excitation layer and the LSTM layer respectively until the loss value corresponding to the adjusted multilayer convolutional neural network is smaller than a preset threshold value.
8. The apparatus of any one of claims 5-7, wherein the objective function of the target neural network model is a CTC loss function.
9. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory communicate with each other via the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any one of claims 1-4.
10. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810162541.2A CN108427953A (en) | 2018-02-26 | 2018-02-26 | A kind of character recognition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810162541.2A CN108427953A (en) | 2018-02-26 | 2018-02-26 | A kind of character recognition method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108427953A true CN108427953A (en) | 2018-08-21 |
Family
ID=63157317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810162541.2A Pending CN108427953A (en) | 2018-02-26 | 2018-02-26 | A kind of character recognition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108427953A (en) |
- 2018-02-26: CN application CN201810162541.2A filed (publication CN108427953A), status: Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016197381A1 (en) * | 2015-06-12 | 2016-12-15 | Sensetime Group Limited | Methods and apparatus for recognizing text in an image |
JP2017215859A (en) * | 2016-06-01 | 2017-12-07 | Nippon Telegraph and Telephone Corporation | Character string recognition device, method and program |
CN106447707A (en) * | 2016-09-08 | 2017-02-22 | Huazhong University of Science and Technology | Image real-time registration method and system |
CN107239733A (en) * | 2017-04-19 | 2017-10-10 | Shanghai Songheng Network Technology Co., Ltd. | Continuous hand-written character recognizing method and system |
CN107590774A (en) * | 2017-09-18 | 2018-01-16 | Beijing University of Posts and Telecommunications | A kind of car plate clarification method and device based on generation confrontation network |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110895924A (en) * | 2018-08-23 | 2020-03-20 | 珠海金山办公软件有限公司 | Document content reading method and device, electronic equipment and readable storage medium |
CN109389084A (en) * | 2018-10-09 | 2019-02-26 | 郑州云海信息技术有限公司 | A kind of method and device handling image information |
CN111027345A (en) * | 2018-10-09 | 2020-04-17 | 北京金山办公软件股份有限公司 | Font identification method and apparatus |
CN109598270A (en) * | 2018-12-04 | 2019-04-09 | 龙马智芯(珠海横琴)科技有限公司 | Distort recognition methods and the device, storage medium and processor of text |
CN109598270B (en) * | 2018-12-04 | 2020-05-05 | 龙马智芯(珠海横琴)科技有限公司 | Method and device for identifying distorted characters, storage medium and processor |
CN109657683A (en) * | 2018-12-19 | 2019-04-19 | 北京像素软件科技股份有限公司 | Text region modeling method and device, character recognition method and electronic equipment |
CN109740336A (en) * | 2018-12-28 | 2019-05-10 | 北京云测信息技术有限公司 | Recognition methods, device and the electronic equipment of a kind of verification information in picture |
CN109740336B (en) * | 2018-12-28 | 2020-08-18 | 北京云测信息技术有限公司 | Method and device for identifying verification information in picture and electronic equipment |
CN109902678A (en) * | 2019-02-12 | 2019-06-18 | 北京奇艺世纪科技有限公司 | Model training method, character recognition method, device, electronic equipment and computer-readable medium |
CN109993164A (en) * | 2019-03-20 | 2019-07-09 | 上海电力学院 | A kind of natural scene character recognition method based on RCRNN neural network |
CN109977950A (en) * | 2019-03-22 | 2019-07-05 | 上海电力学院 | A kind of character recognition method based on mixing CNN-LSTM network |
CN110008961A (en) * | 2019-04-01 | 2019-07-12 | 深圳市华付信息技术有限公司 | Text real-time identification method, device, computer equipment and storage medium |
CN110046574A (en) * | 2019-04-15 | 2019-07-23 | 北京易达图灵科技有限公司 | Safety cap based on deep learning wears recognition methods and equipment |
CN110059742A (en) * | 2019-04-15 | 2019-07-26 | 北京易达图灵科技有限公司 | Safety protector wearing recognition methods and equipment based on deep learning |
CN110059677A (en) * | 2019-04-15 | 2019-07-26 | 北京易达图灵科技有限公司 | Digital table recognition methods and equipment based on deep learning |
CN110070029A (en) * | 2019-04-17 | 2019-07-30 | 北京易达图灵科技有限公司 | A kind of gait recognition method and device |
CN110059617A (en) * | 2019-04-17 | 2019-07-26 | 北京易达图灵科技有限公司 | A kind of recognition methods of target object and device |
CN110070029B (en) * | 2019-04-17 | 2021-07-16 | 北京易达图灵科技有限公司 | Gait recognition method and device |
CN110070042A (en) * | 2019-04-23 | 2019-07-30 | 北京字节跳动网络技术有限公司 | Character recognition method, device and electronic equipment |
CN110110777A (en) * | 2019-04-28 | 2019-08-09 | 网易有道信息技术(北京)有限公司 | Image processing method and training method and device, medium and calculating equipment |
CN110147791A (en) * | 2019-05-20 | 2019-08-20 | 上海联影医疗科技有限公司 | Character recognition method, device, equipment and storage medium |
CN110321892A (en) * | 2019-06-04 | 2019-10-11 | 腾讯科技(深圳)有限公司 | A kind of picture screening technique, device and electronic equipment |
CN110321892B (en) * | 2019-06-04 | 2022-12-13 | 腾讯科技(深圳)有限公司 | Picture screening method and device and electronic equipment |
CN110321884A (en) * | 2019-06-13 | 2019-10-11 | 贝式计算(天津)信息技术有限公司 | Method and device for identifying serial number |
CN110598686A (en) * | 2019-09-17 | 2019-12-20 | 携程计算机技术(上海)有限公司 | Invoice identification method, system, electronic equipment and medium |
CN115004247A (en) * | 2019-12-02 | 2022-09-02 | 尤帕斯公司 | Training optical character detection and recognition models for robotic process automation |
CN111008624A (en) * | 2019-12-05 | 2020-04-14 | 嘉兴太美医疗科技有限公司 | Optical character recognition method and method for generating training sample for optical character recognition |
CN113377644A (en) * | 2020-02-25 | 2021-09-10 | 福建天泉教育科技有限公司 | Test method based on front-end multi-system multi-language internationalized translation |
CN113377644B (en) * | 2020-02-25 | 2023-09-15 | 福建天泉教育科技有限公司 | Testing method for multi-language internationalization translation based on front-end multi-system |
CN114821029A (en) * | 2022-05-16 | 2022-07-29 | 广东电网有限责任公司广州供电局 | OCR technology-based distribution network operation security ring identification method and system |
CN118122723A (en) * | 2024-05-07 | 2024-06-04 | 温州电力建设有限公司 | Pipeline cleaning method, device and equipment based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180821 |