CN111046859B - Character recognition method and device - Google Patents

Character recognition method and device

Info

Publication number
CN111046859B
Authority
CN
China
Prior art keywords
character
network
image
character recognition
recognized
Prior art date
Legal status
Active
Application number
CN201811184618.2A
Other languages
Chinese (zh)
Other versions
CN111046859A (en)
Inventor
朱尧 (Zhu Yao)
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
2018-10-11
Filing date
2018-10-11
Publication date
2023-09-29
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201811184618.2A
Publication of CN111046859A
Application granted
Publication of CN111046859B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The application provides a character recognition method and device. The method comprises: inputting an image to be recognized into a character recognition model; locating, by a character locating network in the model, character key points in the image to be recognized and outputting them to a character correction network in the model; determining, by the character correction network and using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized; and outputting the rectified image to a character recognition network in the model, which recognizes the characters in it. Because the character correction network can rectify images affected by inclination, rotation, deformation and similar problems, the recognition result is stable and the recognition accuracy is high. Moreover, the model obtains its result simply by locating character key points with the locating network and passing them through the correction and recognition networks; it does not need to detect precise character boxes in the image or perform segmentation, which further improves recognition accuracy.

Description

Character recognition method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for character recognition.
Background
Current character recognition technology generally comprises two stages: character region location and character segmentation. In deep learning approaches, character recognition is implemented with several separate deep learning models: the image is input into a feature extraction model to extract image features; the features output by the feature extraction model are input into a target detection model to detect character boxes; and finally the character boxes output by the target detection model, together with the features output by the feature extraction model, are input into a character segmentation model to segment the characters.
However, because these deep learning models exist independently and each of them exchanges data with an external platform, redundant computation arises and memory is occupied, so character recognition is slow.
Disclosure of Invention
In view of the above, the present application provides a character recognition method and apparatus to solve the problem of low recognition speed in related-art character recognition methods.
According to a first aspect of an embodiment of the present application, there is provided a character recognition method, the method including:
inputting an image to be recognized into a trained character recognition model, so that the character recognition model locates character key points in the image to be recognized through a character locating network and outputs them to a character correction network in the model; determining, by the character correction network and using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized; outputting the rectified image to a character recognition network in the model to recognize the characters in the rectified image; and acquiring the character recognition result output by the character recognition model.
According to a second aspect of an embodiment of the present application, there is provided a character recognition apparatus, the apparatus including:
a character recognition module, configured to input an image to be recognized into a trained character recognition model, so that the model locates character key points in the image through a character locating network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized, and outputs the rectified image to a character recognition network in the model to recognize the characters in it; and an acquisition module, configured to acquire the character recognition result output by the character recognition model.
According to a third aspect of embodiments of the present application, there is provided an electronic device comprising a readable storage medium and a processor;
wherein the readable storage medium is for storing machine executable instructions;
the processor is configured to read the machine-executable instructions on the readable storage medium and execute the instructions to implement the steps of the character recognition method described above.
As described above, the whole recognition process takes place inside a single character recognition model, so there is no data interaction between multiple models and an external platform; this improves recognition speed and reduces maintenance effort. The character correction network in the model can rectify images affected by inclination, rotation, deformation and similar problems, so the recognition result of the model is stable and accurate. After an image is input, the model directly outputs a character recognition result, achieving truly end-to-end character recognition. Furthermore, the model only needs to locate character key points through the character locating network and pass them through the correction and recognition networks to obtain a result; it neither detects precise character boxes in the image nor performs segmentation, which further improves recognition accuracy.
Drawings
FIG. 1 is a block diagram of a character recognition model according to an exemplary embodiment of the present application;
FIG. 2A is a flow chart illustrating an embodiment of a method of character recognition according to an exemplary embodiment of the present application;
FIG. 2B is a schematic diagram of a located character keypoint according to the embodiment of the application shown in FIG. 2A;
FIG. 2C is a schematic view of a preset position point according to the embodiment of FIG. 2A;
FIG. 2D is a schematic illustration of a rectified image according to the embodiment of FIG. 2A;
FIG. 3 is a flow chart illustrating another character recognition method according to an exemplary embodiment of the present application;
FIG. 4 is a hardware architecture diagram of an electronic device according to an exemplary embodiment of the application;
FIG. 5 is a block diagram of an embodiment of a character recognition apparatus according to an exemplary embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may also be referred to as first information, without departing from the scope of the application. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
The character recognition technology of the related art, implemented with several deep learning models (a feature extraction model, a target detection model, and a character segmentation model), has the following problems. 1. The models exist independently and each exchanges data with an external platform, so redundant computation arises, memory is occupied, and character recognition is slow. 2. If the image suffers from inclination, deformation, or similar problems, a recognition result may not be obtained, so the technology is unstable. 3. The accuracy of character segmentation by the segmentation model depends on the accuracy of character box detection by the target detection model; the segmentation task is therefore strongly coupled to detection, and an insufficiently accurate character box easily leads to segmentation errors.
Based on this, FIG. 1 shows a character recognition model according to an exemplary embodiment of the present application. As shown in FIG. 1, an image to be recognized is input into a trained character recognition model. First, the model locates character key points in the image through a character locating network and outputs them to a character correction network in the model. The character correction network then determines, in the image to be recognized, a rectified image corresponding to the character region, using the correspondence between the character key points and preset position points, and outputs the rectified image to the character recognition network in the model, which recognizes the characters in it. The character recognition result output by the model can then be obtained.
As described above, the whole recognition process takes place inside a single character recognition model, so there is no data interaction between multiple models and an external platform; this improves recognition speed and reduces maintenance effort. The character correction network in the model can rectify images affected by inclination, deformation and similar problems, so the recognition result of the model is stable and accurate. After an image is input, the model directly outputs a character recognition result, achieving truly end-to-end character recognition. Furthermore, the model only needs to locate character key points through the character locating network and pass them through the correction and recognition networks to obtain a result; it neither detects precise character boxes in the image nor performs segmentation, which further improves recognition accuracy.
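As a high-level illustration, the following PyTorch sketch composes the three sub-networks into a single model. The class names, constructor arguments, and interfaces are illustrative assumptions rather than structures fixed by the application; concrete sketches of each sub-network are given in the embodiments below.

import torch.nn as nn

class CharacterRecognitionModel(nn.Module):
    """Single end-to-end model built from the three sub-networks described
    above; loss on the final output trains all three jointly."""
    def __init__(self, locator, rectifier, recognizer):
        super().__init__()
        self.locator = locator          # character locating network
        self.rectifier = rectifier      # character correction network (TPS)
        self.recognizer = recognizer    # character recognition network

    def forward(self, image):
        keypoints = self.locator(image)               # locate character key points
        rectified = self.rectifier(image, keypoints)  # rectify the character region
        return self.recognizer(rectified)             # recognise the characters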
The technical scheme of the application is described in detail below by using specific examples.
FIG. 2A is a flowchart of an embodiment of a character recognition method according to an exemplary embodiment of the present application. In combination with the character recognition model structure shown in FIG. 1, the character recognition model is trained in advance and may include a character locating network, a character correction network, and a character recognition network. As shown in FIG. 2A, the character recognition method includes the following steps:
step 201: inputting an image to be recognized into a trained character recognition model, positioning character key points in the image to be recognized through a character positioning network by the character recognition model, outputting the character key points to a character correction network in the character recognition model, determining a correction image corresponding to a character area in the image to be recognized by the character correction network in the image to be recognized by utilizing the corresponding relation between the character key points and preset position points, and outputting the correction image to the character recognition network in the character recognition model to recognize characters in the correction image.
In an embodiment, the character locating network may locate the character key points as follows: a feature extraction network in the character locating network extracts the features of the image to be recognized and outputs them to a key point regression network in the character locating network, and the key point regression network extracts the character key points from the extracted features.
The image to be recognized may be a grey-scale image from a natural scene (such as shop sign or billboard recognition) or from a specific scene (such as license plate, business card, or certificate recognition). The feature extraction network may comprise several convolutional layers and pooling layers, with at least one convolution before each pooling. The key point regression network may comprise a fully connected layer and several regression layers. The extracted character key points may be edge points of the character region in the image to be recognized, and their number can be set according to actual requirements; in FIG. 2B, each "+" indicates a character key point, and there are 16 of them.
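As a concrete illustration, the following PyTorch sketch shows one possible shape of the character locating network just described, assuming a grey-scale input and 16 key points; the channel widths, layer counts, and pooled size are illustrative assumptions, not values fixed by the application.

import torch
import torch.nn as nn

class KeypointLocator(nn.Module):
    """Feature extraction network (conv + pooling, with at least one
    convolution before each pooling) followed by a key point regression
    network (a fully connected layer plus a regression layer)."""
    def __init__(self, num_keypoints=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.regressor = nn.Sequential(
            nn.AdaptiveAvgPool2d((4, 8)),
            nn.Flatten(),
            nn.Linear(128 * 4 * 8, 256), nn.ReLU(),  # fully connected layer
            nn.Linear(256, num_keypoints * 2),       # regression: (x, y) per point
        )

    def forward(self, x):                   # x: (B, 1, H, W) grey-scale image
        pts = self.regressor(self.features(x))
        return pts.view(x.size(0), -1, 2)   # (B, num_keypoints, 2)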
In an embodiment, the character correction network may determine the rectified image as follows: a thin plate spline (TPS) transformation matrix is determined from the correspondence between the character key points and the preset position points; a blank rectified image is created; and, for each position point in the rectified image, the TPS transformation matrix is used to determine the corresponding coordinate in the image to be recognized, a rectified pixel value is obtained by interpolating the pixel values of the pixels near that coordinate, and the rectified pixel value is filled into that position point of the rectified image.
The number of character key points equals the number of preset position points. The preset position points can be laid out according to a rule; assuming n character key points, one such rule is: the preset position points form two parallel rows a preset height h apart, each row has a preset length w, adjacent points within a row are equally spaced, and the n points are arranged accordingly. The created rectified image may then be of size w × h. The pixels near a coordinate may be the four pixels surrounding it; for example, if a position point of the rectified image maps to the coordinate (100.5, 2.6) in the image to be recognized, the nearby pixels are (100, 2), (101, 2), (100, 3), and (101, 3).
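The following NumPy sketch generates preset position points according to the rule above; splitting the n points evenly between the two rows is an assumption consistent with FIG. 2C.

import numpy as np

def preset_points(n, w, h):
    """Two parallel rows of evenly spaced position points: the top row along
    y = 0 and the bottom row along y = h - 1, each spanning the preset
    length w. Assumes n is even, so each row holds n / 2 points."""
    xs = np.linspace(0.0, w - 1.0, n // 2)
    top = np.stack([xs, np.zeros(n // 2)], axis=1)
    bottom = np.stack([xs, np.full(n // 2, h - 1.0)], axis=1)
    return np.concatenate([top, bottom])   # (n, 2) array of (x, y) points

# e.g. 16 preset points for a 100 x 32 rectified image
pts = preset_points(16, w=100, h=32)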
It should be noted that, since the transformation matrix is obtained from the correspondence between the character key points and the preset position points, the coordinates in the image to be recognized that correspond to the position points of the blank rectified image all fall within the character region; the image formed by interpolating the pixels near each such coordinate is therefore the rectified image corresponding to the character region.
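A minimal NumPy sketch of this rectification step follows. It assumes the standard thin plate spline formulation with radial basis U(r) = r^2 log r^2, fits the mapping from the preset position points back to the character key points (backward warping), and fills every position point of the blank rectified image by bilinear interpolation of the four surrounding source pixels. It can be combined with the preset_points helper sketched above.

import numpy as np

def tps_kernel(d2):
    # U(r) = r^2 log(r^2), with U(0) = 0 taken by continuity
    return np.where(d2 == 0.0, 0.0, d2 * np.log(np.maximum(d2, 1e-12)))

def fit_tps(src, dst):
    """Solve for TPS parameters mapping src control points onto dst points."""
    n = len(src)
    K = tps_kernel(np.sum((src[:, None] - src[None, :]) ** 2, axis=-1))
    P = np.hstack([np.ones((n, 1)), src])
    L = np.zeros((n + 3, n + 3))
    L[:n, :n], L[:n, n:], L[n:, :n] = K, P, P.T
    Y = np.zeros((n + 3, 2))
    Y[:n] = dst
    return np.linalg.solve(L, Y)        # (n + 3, 2): n weights + affine part

def apply_tps(params, ctrl, pts):
    d2 = np.sum((pts[:, None] - ctrl[None, :]) ** 2, axis=-1)
    affine = np.hstack([np.ones((len(pts), 1)), pts])
    return tps_kernel(d2) @ params[:-3] + affine @ params[-3:]

def rectify(image, keypoints, preset, w, h):
    """Backward-warp: map each position point of the blank w x h rectified
    image into the image to be recognised, then bilinearly interpolate the
    four surrounding pixels to obtain the rectified pixel value."""
    params = fit_tps(preset, keypoints)   # rectified coords -> source coords
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    src = apply_tps(params, preset, grid)
    x0 = np.clip(np.floor(src[:, 0]).astype(int), 0, image.shape[1] - 2)
    y0 = np.clip(np.floor(src[:, 1]).astype(int), 0, image.shape[0] - 2)
    fx, fy = src[:, 0] - x0, src[:, 1] - y0
    out = (image[y0, x0] * (1 - fx) * (1 - fy) + image[y0, x0 + 1] * fx * (1 - fy)
           + image[y0 + 1, x0] * (1 - fx) * fy + image[y0 + 1, x0 + 1] * fx * fy)
    return out.reshape(h, w)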
In an exemplary scenario, character key points are located in the image to be recognized through the character locating network, giving the key points shown in FIG. 2B, denoted P = {P1, P2, ..., P16}. The preset position points, shown in FIG. 2C, are denoted P' = {P'1, P'2, ..., P'16}. A TPS transformation matrix is determined from the correspondence between P and P', a blank w × h rectified image is created, the coordinate in the image to be recognized corresponding to each position point of the rectified image is determined through the TPS transformation matrix, a rectified pixel value is obtained by interpolating the pixel values near that coordinate, and the rectified pixel value is filled into the corresponding position point of the rectified image, yielding the rectified image shown in FIG. 2D.
In an embodiment, the character recognition network may recognize the characters in the rectified image as follows: features of the rectified image are extracted by a convolutional neural network in the character recognition network and output to a recurrent neural network in the character recognition network; the recurrent neural network weights and encodes the features and outputs them to a decoding network in the character recognition network; the decoding network decodes the weighted, encoded features into at least one feature sequence and outputs it to a classification layer in the character recognition network; and the classification layer classifies each feature sequence to obtain the character corresponding to it.
Here, the convolutional neural network may be based on a ResNet (deep residual network) structure, and the decoding network may be based on an attention model structure.
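To make the data flow concrete, here is a PyTorch sketch of such a recognition network. The plain convolutional trunk merely stands in for the ResNet-style CNN, the decoder performs one additive-attention step per output character, and all layer sizes, the fixed decoding length, and the class inventory are illustrative assumptions.

import torch
import torch.nn as nn

class Recognizer(nn.Module):
    """CNN feature extractor -> recurrent encoder (weighted encoding) ->
    attention-based decoder (one feature vector per step) -> classification
    layer mapping each feature vector to a character class."""
    def __init__(self, num_classes, max_len=32, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(                    # stand-in for a ResNet trunk
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
            nn.AdaptiveAvgPool2d((1, None)),         # collapse height into a sequence
        )
        self.encoder = nn.LSTM(128, hidden, batch_first=True, bidirectional=True)
        self.attn_enc = nn.Linear(2 * hidden, hidden)
        self.attn_dec = nn.Linear(hidden, hidden)
        self.attn_v = nn.Linear(hidden, 1)
        self.decoder = nn.LSTMCell(2 * hidden, hidden)
        self.classifier = nn.Linear(hidden, num_classes)
        self.max_len, self.hidden = max_len, hidden

    def forward(self, x):                            # x: (B, 1, H, W) rectified image
        feats = self.cnn(x).squeeze(2).permute(0, 2, 1)  # (B, T, 128) feature columns
        enc, _ = self.encoder(feats)                     # (B, T, 2 * hidden)
        h = x.new_zeros(x.size(0), self.hidden)
        c = x.new_zeros(x.size(0), self.hidden)
        logits = []
        for _ in range(self.max_len):
            score = self.attn_v(torch.tanh(self.attn_enc(enc)
                                           + self.attn_dec(h).unsqueeze(1)))
            ctx = (torch.softmax(score, dim=1) * enc).sum(dim=1)  # weighted feature
            h, c = self.decoder(ctx, (h, c))
            logits.append(self.classifier(h))        # classify this step's feature
        return torch.stack(logits, dim=1)            # (B, max_len, num_classes)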
Step 202: and acquiring a character recognition result output by the character recognition model.
Based on the above scenario, after the rectified image of FIG. 2D is input into the character recognition network, the character recognition result "GIORDANO" is obtained.
In the embodiment of the application, the image to be recognized can be input into the trained character recognition model; the model locates character key points in the image through the character locating network and outputs them to the character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized and outputs it to the character recognition network in the model to recognize the characters in it; and the character recognition result output by the model is thereby obtained.
As described above, the whole recognition process takes place inside a single character recognition model, so there is no data interaction between multiple models and an external platform; this improves recognition speed and reduces maintenance effort. The character correction network in the model can rectify images affected by inclination, rotation, deformation and similar problems, so the recognition result of the model is stable and accurate. After an image is input, the model directly outputs a character recognition result, achieving truly end-to-end character recognition. Furthermore, the model only needs to locate character key points through the character locating network and pass them through the correction and recognition networks to obtain a result; it neither detects precise character boxes in the image nor performs segmentation, which further improves recognition accuracy.
FIG. 3 is a flowchart of another embodiment of a character recognition method according to an exemplary embodiment of the present application. Building on the embodiment shown in FIG. 2A, this embodiment illustrates how the character recognition model is trained. As shown in FIG. 3, the flow of training the character recognition model may include:
step 301: a training sample containing characters is obtained.
In an embodiment, images of various natural scenes or specific scenes can be obtained, and characters contained in the images are labeled, so that training samples are obtained.
Wherein the number of training samples may be set according to practical experience.
Step 302: and training the character recognition model end to end by using training samples until the training times reach the preset times, and stopping training.
In an embodiment, in the training process, parameters in the character recognition model can be adjusted by calculating the loss value of the character recognition result output by the character recognition model each time relative to the labeled character until the training times reach the preset times, and the training is stopped. The training times can be set according to practical experience.
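A minimal sketch of such an end-to-end training loop follows, assuming model is the whole character recognition model, loader yields image batches with padded character-index labels, and cross-entropy against the labelled characters serves as the loss; the optimiser choice, learning rate, and padding convention are assumptions.

import torch
import torch.nn as nn

def train_end_to_end(model, loader, preset_iters, pad_idx=0, lr=1e-4):
    """Adjust all parameters of the single model from the final recognition
    loss, so gradients flow through the recognition, correction and locating
    networks jointly; stop once the preset iteration count is reached."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss(ignore_index=pad_idx)  # skip padded positions
    done = 0
    while done < preset_iters:
        for images, labels in loader:       # labels: (B, max_len) char indices
            logits = model(images)          # (B, max_len, num_classes)
            loss = criterion(logits.flatten(0, 1), labels.flatten())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            done += 1
            if done >= preset_iters:
                break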
This completes the flow of FIG. 3. Through it, the character recognition model is trained as a single model: the neural networks inside it need not be trained independently and separately, which avoids the error propagation that separate training would cause.
FIG. 4 is a hardware configuration diagram of an electronic device according to an exemplary embodiment of the present application. The electronic device includes a communication interface 401, a processor 402, a machine-readable storage medium 403, and a bus 404; the communication interface 401, the processor 402, and the machine-readable storage medium 403 communicate with each other via the bus 404. The processor 402 can perform the character recognition method described above by reading and executing, from the machine-readable storage medium 403, the machine-executable instructions corresponding to the control logic of the method; for details, refer to the above embodiments, which are not elaborated here.
The machine-readable storage medium 403 referred to in this disclosure may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions or data. For example, a machine-readable storage medium may be volatile memory, non-volatile memory, or a similar storage medium. In particular, the machine-readable storage medium 403 may be RAM (Random Access Memory), flash memory, a storage drive (e.g., a hard drive), any type of storage disk (e.g., an optical disk or DVD), a similar storage medium, or a combination thereof.
Fig. 5 is a block diagram showing an embodiment of a character recognition apparatus according to an exemplary embodiment of the present application, and as shown in fig. 5, the character recognition apparatus includes:
a character recognition module 510, configured to input an image to be recognized into a trained character recognition model, so that the model locates character key points in the image through a character locating network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized, and outputs the rectified image to a character recognition network in the model to recognize the characters in it;
and the obtaining module 520 is configured to obtain a character recognition result output by the character recognition model.
In an optional implementation, the character recognition module 510 is specifically configured so that, when the character locating network locates the character key points in the image to be recognized, a feature extraction network in the character locating network extracts the features of the image to be recognized and outputs them to a key point regression network in the character locating network, and the key point regression network extracts the character key points from the extracted features.
In an optional implementation, the character recognition module 510 is specifically configured so that, when the character correction network determines the rectified image corresponding to the character region in the image to be recognized, a corresponding thin plate spline (TPS) transformation matrix is determined from the correspondence between the character key points and the preset position points, the number of character key points being equal to the number of preset position points; a blank rectified image is created; and, for each position point in the rectified image, the TPS transformation matrix is used to determine the corresponding coordinate in the image to be recognized, a rectified pixel value is obtained by interpolating the pixel values of the pixels near that coordinate, and the rectified pixel value is filled into that position point of the rectified image.
In an optional implementation, the character recognition module 510 is specifically configured so that, when the character recognition network recognizes the characters in the rectified image, a convolutional neural network in the character recognition network extracts the features of the rectified image and outputs them to a recurrent neural network in the character recognition network; the recurrent neural network weights and encodes the features and outputs them to a decoding network in the character recognition network; the decoding network decodes the weighted, encoded features into at least one feature sequence and outputs it to a classification layer in the character recognition network; and the classification layer classifies each feature sequence to obtain the character content corresponding to it.
In an alternative implementation, the apparatus further comprises (not shown in fig. 5):
a training module, configured to acquire training samples containing characters, and to train the character recognition model end to end with the training samples, stopping training when the number of training iterations reaches a preset count.
For details of how the functions and roles of each unit in the above apparatus are implemented, refer to the implementation of the corresponding steps in the above method; they are not described here again.
Since the apparatus embodiments essentially correspond to the method embodiments, reference may be made to the description of the method embodiments for relevant details. The apparatus embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over several network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Persons of ordinary skill in the art can understand and implement the application without inventive effort.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a(n) ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely a description of preferred embodiments of the application and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the application shall fall within its scope of protection.

Claims (9)

1. A method of character recognition, the method comprising:
inputting an image to be recognized into a trained character recognition model, so that the character recognition model locates character key points in the image to be recognized through a character locating network and outputs them to a character correction network in the model; determining, by the character correction network and using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized; and outputting the rectified image to a character recognition network in the model to recognize the characters in the rectified image;
the character recognition network recognizes characters in the rectified image, comprising: extracting the characteristics of the corrected image through a convolutional neural network in the character recognition network, and outputting the characteristics to a cyclic neural network in the character recognition network; the cyclic neural network performs weighted coding on the characteristics and outputs the weighted coded characteristics to a decoding network in the character recognition network; the decoding network decodes the weighted and coded features to obtain at least one feature sequence, and outputs the at least one feature sequence to a classification layer in the character recognition network; the classifying layer classifies each characteristic sequence to obtain character content corresponding to each characteristic sequence;
and acquiring a character recognition result output by the character recognition model.
2. The method of claim 1, wherein the character locating network locating character key points in the image to be recognized comprises:
extracting the features of the image to be recognized through a feature extraction network in the character locating network and outputting them to a key point regression network in the character locating network;
and the key point regression network extracting the character key points from the extracted features.
3. The method of claim 1, wherein the character correction network determining, in the image to be recognized, the rectified image corresponding to the character region by using the correspondence between the character key points and preset position points comprises:
determining a corresponding thin plate spline (TPS) transformation matrix according to the correspondence between the character key points and the preset position points, wherein the number of character key points is equal to the number of preset position points;
creating a blank rectified image;
and determining, by using the TPS transformation matrix, the coordinate in the image to be recognized corresponding to each position point in the rectified image, interpolating the pixel values of the pixels near that coordinate to obtain a rectified pixel value, and filling the rectified pixel value into that position point of the rectified image.
4. The method of claim 1, wherein the character recognition model is trained by:
acquiring a training sample containing characters;
and performing end-to-end training on the character recognition model by using the training samples, stopping training when the number of training iterations reaches a preset count.
5. A character recognition apparatus, the apparatus comprising:
a character recognition module, configured to input an image to be recognized into a trained character recognition model, so that the model locates character key points in the image through a character locating network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a rectified image corresponding to the character region in the image to be recognized, and outputs the rectified image to a character recognition network in the model to recognize the characters in the rectified image;
wherein the character recognition module is specifically configured so that, when the character recognition network recognizes the characters in the rectified image, a convolutional neural network in the character recognition network extracts the features of the rectified image and outputs them to a recurrent neural network in the character recognition network; the recurrent neural network weights and encodes the features and outputs them to a decoding network in the character recognition network; the decoding network decodes the weighted, encoded features into at least one feature sequence and outputs it to a classification layer in the character recognition network; and the classification layer classifies each feature sequence to obtain the character content corresponding to each feature sequence;
and the acquisition module is used for acquiring the character recognition result output by the character recognition model.
6. The apparatus of claim 5, wherein the character recognition module is specifically configured so that, when the character locating network locates the character key points in the image to be recognized, a feature extraction network in the character locating network extracts the features of the image to be recognized and outputs them to a key point regression network in the character locating network, and the key point regression network extracts the character key points from the extracted features.
7. The apparatus of claim 5, wherein the character recognition module is specifically configured so that, when the character correction network determines the rectified image corresponding to the character region in the image to be recognized by using the correspondence between the character key points and the preset position points, a corresponding thin plate spline (TPS) transformation matrix is determined according to that correspondence, the number of character key points being equal to the number of preset position points; a blank rectified image is created; and, for each position point in the rectified image, the TPS transformation matrix is used to determine the corresponding coordinate in the image to be recognized, a rectified pixel value is obtained by interpolating the pixel values of the pixels near that coordinate, and the rectified pixel value is filled into that position point of the rectified image.
8. The apparatus of claim 5, wherein the apparatus further comprises:
a training module, configured to acquire training samples containing characters, and to perform end-to-end training on the character recognition model by using the training samples, stopping training when the number of training iterations reaches a preset count.
9. An electronic device comprising a readable storage medium and a processor;
wherein the readable storage medium is for storing machine executable instructions;
the processor is configured to read the machine-executable instructions on the readable storage medium and execute the instructions to implement the steps of the method of any of claims 1-4.
CN201811184618.2A 2018-10-11 2018-10-11 Character recognition method and device Active CN111046859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811184618.2A CN111046859B (en) 2018-10-11 2018-10-11 Character recognition method and device

Publications (2)

Publication Number Publication Date
CN111046859A CN111046859A (en) 2020-04-21
CN111046859B (en) 2023-09-29

Family

ID=70229220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811184618.2A Active CN111046859B (en) 2018-10-11 2018-10-11 Character recognition method and device

Country Status (1)

Country Link
CN (1) CN111046859B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767754B (en) * 2020-06-30 2024-05-07 创新奇智(北京)科技有限公司 Identification code identification method and device, electronic equipment and storage medium
CN112132139A (en) * 2020-09-22 2020-12-25 深兰科技(上海)有限公司 Character recognition method and device
CN112464798A (en) * 2020-11-24 2021-03-09 创新奇智(合肥)科技有限公司 Text recognition method and device, electronic equipment and storage medium
CN112508003B (en) * 2020-12-18 2023-10-13 北京百度网讯科技有限公司 Character recognition processing method and device
CN112597940B (en) * 2020-12-29 2022-08-23 苏州科达科技股份有限公司 Certificate image recognition method and device and storage medium
CN115690803A (en) * 2022-10-31 2023-02-03 中电金信软件(上海)有限公司 Digital image recognition method and device, electronic equipment and readable storage medium
CN116434234B (en) * 2023-05-25 2023-10-17 珠海亿智电子科技有限公司 Method, device, equipment and storage medium for detecting and identifying casting blank characters

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599900B (en) * 2015-10-20 2020-04-21 华中科技大学 Method and device for recognizing character strings in image
US9990564B2 (en) * 2016-03-29 2018-06-05 Wipro Limited System and method for optical character recognition
CN106407976B (en) * 2016-08-30 2019-11-05 百度在线网络技术(北京)有限公司 The generation of image character identification model and perpendicular column character picture recognition methods and device
TWI607387B (en) * 2016-11-25 2017-12-01 財團法人工業技術研究院 Character recognition systems and character recognition methods thereof

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678293A (en) * 2015-12-30 2016-06-15 成都数联铭品科技有限公司 Complex image and text sequence identification method based on CNN-RNN
CN106407971A (en) * 2016-09-14 2017-02-15 北京小米移动软件有限公司 Text recognition method and device
CN108121984A (en) * 2016-11-30 2018-06-05 杭州海康威视数字技术股份有限公司 A kind of character identifying method and device
WO2018099194A1 (en) * 2016-11-30 2018-06-07 杭州海康威视数字技术股份有限公司 Character identification method and device
WO2018166114A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Picture identification method and system, electronic device, and medium
CN107798327A (en) * 2017-10-31 2018-03-13 北京小米移动软件有限公司 Character identifying method and device
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice
CN108334499A (en) * 2018-02-08 2018-07-27 海南云江科技有限公司 A kind of text label tagging equipment, method and computing device
CN108399408A (en) * 2018-03-06 2018-08-14 李子衿 A kind of deformed characters antidote based on deep space converting network

Also Published As

Publication number Publication date
CN111046859A (en) 2020-04-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant