CN111046859A - Character recognition method and device - Google Patents

Character recognition method and device

Info

Publication number
CN111046859A
Authority
CN
China
Prior art keywords
character
network
image
character recognition
recognized
Prior art date
Legal status
Granted
Application number
CN201811184618.2A
Other languages
Chinese (zh)
Other versions
CN111046859B (en)
Inventor
朱尧
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201811184618.2A
Publication of CN111046859A
Application granted
Publication of CN111046859B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/146 - Aligning or centring of the image pick-up or image-field
    • G06V30/1475 - Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 - Inclination or skew detection or correction of characters or of character lines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The application provides a character recognition method and device. In the method, an image to be recognized is input into a character recognition model. A character positioning network in the model locates character key points in the image and outputs them to a character correction network in the model. Using the correspondence between the character key points and preset position points, the character correction network determines a corrected image corresponding to the character area of the image to be recognized and outputs it to a character recognition network in the model, which recognizes the characters in the corrected image. Because the character correction network can correct images affected by inclination, rotation, deformation, and similar problems, the recognition result is stable and the recognition accuracy is high. Moreover, the character recognition model obtains its result through the character positioning network (which locates the character key points), the character correction network, and the character recognition network; it needs neither to detect a precise character frame in the image nor to perform segmentation, which further improves recognition accuracy.

Description

Character recognition method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a character recognition method and apparatus.
Background
Current character recognition technology generally comprises two modules: character area positioning and character segmentation. Deep-learning-based approaches implement this with several deep learning models: the image is first input into a feature extraction model to extract image features; the features output by the feature extraction model are then input into a target detection model to detect a character frame; finally, the character frame output by the target detection model and the features output by the feature extraction model are input into a character segmentation model to perform character segmentation.
However, these deep learning models exist independently and must exchange data with one another, which causes redundant computation, occupies memory, and makes character recognition slow.
Disclosure of Invention
In view of the above, the present application provides a character recognition method and device to address the low recognition speed of character recognition methods in the related art.
According to a first aspect of embodiments of the present application, there is provided a character recognition method, the method including:
inputting an image to be recognized into a trained character recognition model, where the character recognition model locates character key points in the image to be recognized through a character positioning network and outputs them to a character correction network in the character recognition model; the character correction network determines, according to the correspondence between the character key points and preset position points, a corrected image corresponding to the character area in the image to be recognized, and outputs the corrected image to a character recognition network in the character recognition model, which recognizes the characters in the corrected image; and acquiring a character recognition result output by the character recognition model.
According to a second aspect of embodiments of the present application, there is provided a character recognition apparatus, the apparatus including:
a character recognition module, configured to input an image to be recognized into a trained character recognition model, where the character recognition model locates character key points in the image to be recognized through a character positioning network and outputs them to a character correction network in the character recognition model; the character correction network determines, using the correspondence between the character key points and preset position points, a corrected image corresponding to the character area in the image to be recognized, and outputs the corrected image to a character recognition network in the character recognition model, which recognizes the characters in the corrected image; and an acquisition module, configured to acquire the character recognition result output by the character recognition model.
According to a third aspect of embodiments of the present application, there is provided an electronic device comprising a readable storage medium and a processor;
wherein the readable storage medium is configured to store machine executable instructions;
the processor is used for reading the machine executable instructions on the readable storage medium and executing the instructions to realize the steps of the character recognition method.
According to a fourth aspect of embodiments of the present application, there is provided a chip comprising a readable storage medium and a processor;
wherein the readable storage medium is configured to store machine executable instructions;
the processor is used for reading the machine executable instructions on the readable storage medium and executing the instructions to realize the steps of the character recognition method.
By applying the embodiments of the application, an image to be recognized can be input into a trained character recognition model; the character recognition model locates character key points in the image through a character positioning network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a corrected image corresponding to the character area in the image, and outputs the corrected image to a character recognition network in the model, which recognizes the characters in the corrected image; the character recognition result output by the character recognition model can then be obtained.
Based on this description, the whole recognition process is carried out within the character recognition model, with no data interaction between multiple models and an external platform, so the recognition speed can be improved and the maintenance difficulty reduced. Because images affected by inclination, rotation, deformation, and similar problems can be corrected by the character correction network inside the model, the character recognition result is stable and the recognition accuracy is high. After an image is input into the character recognition model, the model directly outputs the character recognition result, so genuinely end-to-end character recognition is achieved. In addition, the character recognition model obtains its result simply by locating character key points through the character positioning network and passing them through the character correction network and the character recognition network; it needs neither to detect a precise character frame in the image nor to perform segmentation, so the recognition accuracy can be further improved.
Drawings
FIG. 1 is a block diagram illustrating a character recognition model according to an exemplary embodiment of the present application;
FIG. 2A is a flow chart illustrating an embodiment of a character recognition method according to an exemplary embodiment of the present application;
FIG. 2B is a schematic diagram of a located key point of a character according to the embodiment shown in FIG. 2A;
FIG. 2C is a schematic diagram of a default location point according to the embodiment of FIG. 2A;
FIG. 2D is a schematic diagram of a corrected image according to the embodiment of FIG. 2A;
FIG. 3 is a flow diagram illustrating an embodiment of another method for character recognition according to an illustrative embodiment of the present application;
FIG. 4 is a diagram illustrating a hardware configuration of an electronic device according to an exemplary embodiment of the present application;
FIG. 5 is a block diagram illustrating an embodiment of a character recognition apparatus according to an exemplary embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may be referred to as first information, without departing from the scope of the present application. Depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".
The character recognition technology of the related art, implemented with several deep learning models (a feature extraction model, a target detection model, and a character segmentation model), has the following problems: 1. The deep learning models exist independently, and each exchanges data with an external platform, so redundant computation exists, memory is occupied, and character recognition is slow. 2. If the image is inclined, deformed, or otherwise distorted, a recognition result may not be obtained, so the stability of the technology is poor. 3. The accuracy of character segmentation by the character segmentation model depends on the accuracy of character frame detection by the target detection model, so the segmentation task is strongly dependent on detection; if the detected character frame is not accurate enough, segmentation errors occur easily.
Based on this, fig. 1 is a structural diagram of a character recognition model according to an exemplary embodiment of the present application. As shown in fig. 1, an image to be recognized is input into a trained character recognition model. The model first locates character key points in the image through a character positioning network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a corrected image corresponding to the character area in the image, and outputs the corrected image to a character recognition network in the model, which recognizes the characters in the corrected image; the character recognition result output by the model is thereby obtained.
Based on this description, the whole recognition process is carried out within the character recognition model, with no data interaction between multiple models and an external platform, so the recognition speed can be improved and the maintenance difficulty reduced. Because images affected by inclination, deformation, and similar problems can be corrected by the character correction network inside the model, the character recognition result is stable and the recognition accuracy is high. After an image is input into the character recognition model, the model directly outputs the character recognition result, so genuinely end-to-end character recognition is achieved. In addition, the character recognition model obtains its result simply by locating character key points through the character positioning network and passing them through the character correction network and the character recognition network; it needs neither to detect a precise character frame in the image nor to perform segmentation, so the recognition accuracy can be further improved.
The technical solution of the present application is explained in detail by the following specific examples.
Fig. 2A is a flowchart of an embodiment of a character recognition method according to an exemplary embodiment of the present application. In combination with the character recognition model structure shown in fig. 1, the character recognition model is trained in advance and may include a character positioning network, a character correction network, and a character recognition network. As shown in fig. 2A, the character recognition method includes the following steps:
step 201: inputting an image to be recognized into a trained character recognition model, positioning character key points in the image to be recognized through a character positioning network by the character recognition model, outputting the character key points to a character correction network in the character recognition model, determining a corrected image corresponding to a character area in the image to be recognized by the character correction network in the image to be recognized according to the corresponding relation between the character key points and preset position points, and outputting the corrected image to the character recognition network in the character recognition model to recognize characters in the corrected image.
In an embodiment, in the process of locating character key points in the image to be recognized, the character positioning network extracts features of the image through a feature extraction network within it and outputs them to a key point regression network within it; the key point regression network then locates the character key points from the extracted features.
The image to be recognized may be a gray-scale image from a natural scene (such as a shop-sign recognition scene or a billboard recognition scene) or from a specific scene (such as a license plate, business card, or certificate recognition scene). The feature extraction network may contain multiple convolutional and pooling layers, with at least one convolution before each pooling layer. The key point regression network may include a fully connected layer and a plurality of regression layers. The located character key points may be edge points of the character area in the image to be recognized, and their number may be set according to actual requirements. As shown in fig. 2B, each "+" represents a character key point, and the number of located character key points is 16.
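The constraint that each pooling layer be preceded by at least one convolution can be checked mechanically. The sketch below (layer names are hypothetical placeholders, not the patent's architecture) validates a layer stack against that rule:

```python
# Check the stated structural rule for the feature extraction network:
# at least one convolution must occur before every pooling layer,
# counting from the start of the stack or from the previous pooling.

def conv_before_each_pool(layers):
    """Return True if every 'pool' is preceded by at least one 'conv'."""
    convs_since_pool = 0
    for layer in layers:
        if layer == "conv":
            convs_since_pool += 1
        elif layer == "pool":
            if convs_since_pool == 0:
                return False          # a pool with no preceding conv
            convs_since_pool = 0      # restart the count after each pool
    return True

valid_stack = ["conv", "conv", "pool", "conv", "pool"]
invalid_stack = ["conv", "pool", "pool"]  # second pool has no conv before it
print(conv_before_each_pool(valid_stack))    # True
print(conv_before_each_pool(invalid_stack))  # False
```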
In an embodiment, in the process by which the character correction network determines, using the correspondence between the character key points and the preset position points, a corrected image corresponding to the character area in the image to be recognized, a corresponding TPS (Thin Plate Spline) transformation matrix may first be determined from that correspondence. After a blank corrected image is created, for each position point in the corrected image, the TPS transformation matrix is used to determine the corresponding coordinate point in the image to be recognized, a corrected pixel value is obtained by interpolating the pixel values of the pixels near that coordinate point, and the corrected pixel value is filled into the position point in the corrected image.
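The core of this step is backward warping: every position in the blank corrected image is mapped into the source image and its pixel value is interpolated. The sketch below illustrates only that mechanism; as an assumption, a trivial half-pixel shift stands in for the TPS transform, whose solve is omitted here:

```python
# Sketch of the backward-warp-and-interpolate step. The `transform`
# argument stands in for the TPS mapping; here it is a toy half-pixel shift.

def bilinear(img, x, y):
    """Interpolate img (list of rows) at fractional (x, y) from the four
    surrounding pixels, as described for coordinate points like (100.5, 2.6)."""
    x0, y0 = int(x), int(y)
    x1, y1 = x0 + 1, y0 + 1
    fx, fy = x - x0, y - y0
    return (img[y0][x0] * (1 - fx) * (1 - fy) +
            img[y0][x1] * fx * (1 - fy) +
            img[y1][x0] * (1 - fx) * fy +
            img[y1][x1] * fx * fy)

def rectify(img, w, h, transform):
    """Fill a blank w-by-h corrected image: map each position point back into
    the source image via `transform` and interpolate its pixel value there."""
    out = [[0.0] * w for _ in range(h)]
    for ty in range(h):
        for tx in range(w):
            sx, sy = transform(tx, ty)
            out[ty][tx] = bilinear(img, sx, sy)
    return out

# 4x4 gradient image; the stand-in transform shifts by half a pixel.
src = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
rect = rectify(src, 2, 2, lambda x, y: (x + 0.5, y + 0.5))
print(rect[0][0])  # 2.5 (average of source pixels 0, 1, 4, 5)
```

In the patent's setting the transform would be the TPS mapping fitted from the key-point correspondences rather than a fixed shift.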
The number of character key points equals the number of preset position points, and the position points can be arranged in advance according to a rule. Assuming the number of character key points is n, the rule for the preset position points may be: the preset position points form two parallel rows separated by a preset height h; each row has a preset length w, with equal spacing between adjacent points in a row; n position points are arranged in total according to this rule. The created corrected image may be of size w x h. The pixels near a coordinate point may be the four pixels surrounding it; for example, if the coordinate point in the image to be recognized corresponding to a certain position point in the corrected image is (100.5, 2.6), the nearby pixels are (100, 2), (101, 2), (100, 3), and (101, 3).
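The preset-position-point rule above is simple enough to generate directly. A minimal sketch, assuming n is even so the points split into two equal rows (the concrete n, w, h values are illustrative only):

```python
def preset_points(n, w, h):
    """Generate n preset position points per the stated rule: two parallel
    rows of n//2 points each, separated by height h; each row spans length w
    with equal spacing between adjacent points."""
    assert n % 2 == 0, "n must split evenly into two rows"
    per_row = n // 2
    step = w / (per_row - 1)                       # equal spacing along a row
    top = [(i * step, 0.0) for i in range(per_row)]
    bottom = [(i * step, float(h)) for i in range(per_row)]
    return top + bottom

pts = preset_points(16, 70, 20)  # 16 points, matching the count in Fig. 2B/2C
print(len(pts))                  # 16
print(pts[0], pts[7], pts[8])    # (0.0, 0.0) (70.0, 0.0) (0.0, 20.0)
```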
It should be noted that, because the transformation matrix is obtained from the correspondence between the character key points and the preset position points, the region formed in the image to be recognized by the coordinate points corresponding to the position points of the blank corrected image is exactly the character region; the image composed of the corrected pixel values obtained by interpolating the pixels near each coordinate point is therefore the corrected image corresponding to the character region.
In an exemplary scenario, referring again to fig. 2B, the character key points are located in the image to be recognized through the character positioning network, giving the key points shown in fig. 2B, denoted P = {P1, P2, …, P16}. As shown in fig. 2C, the preset position points are denoted P' = {P'1, P'2, …, P'16}. A TPS transformation matrix is determined from the correspondence between P and P', and a blank corrected image of size w x h is created. The coordinate point in the image to be recognized corresponding to each position point in the corrected image is determined through the TPS transformation matrix, interpolation over the pixel values of the pixels near each coordinate point yields the corrected pixel values, and the corresponding position points in the corrected image are filled with these values, producing the corrected image shown in FIG. 2D.
In an embodiment, in the process of recognizing the characters in the corrected image, the character recognition network may extract features of the corrected image through a convolutional neural network and output them to a recurrent neural network in the character recognition network; the recurrent neural network performs weighted encoding on the features and outputs the weighted-encoded features to a decoding network in the character recognition network; the decoding network decodes them into at least one feature sequence and outputs the sequences to a classification layer in the character recognition network; the classification layer classifies each feature sequence to obtain the character corresponding to it.
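The final classification step maps each feature sequence to one character. A hedged sketch of just that step, assuming the decoding network has already produced one class-score vector per sequence (the alphabet and score values below are invented placeholders):

```python
# Sketch of the classification layer: take the highest-scoring class of each
# feature sequence's score vector as that sequence's character.

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def classify(sequence_scores):
    """Map each per-sequence class-score vector to its character."""
    chars = []
    for scores in sequence_scores:
        best = max(range(len(scores)), key=lambda i: scores[i])
        chars.append(ALPHABET[best])
    return "".join(chars)

def one_hot(i, n=26, peak=0.9):
    """Toy score vector peaking at class i (placeholder for real scores)."""
    return [peak if j == i else (1 - peak) / (n - 1) for j in range(n)]

# Two toy score vectors peaking at 'O' (index 14) and 'K' (index 10).
print(classify([one_hot(14), one_hot(10)]))  # OK
```

In the patent's setting the score vectors would come from the attention-based decoding network rather than being constructed by hand.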
Here, the Convolutional Neural Network (CNN) may be a network based on the ResNet (Residual Neural Network) structure, and the decoding network may be a network based on the attention model architecture.
Step 202: and acquiring a character recognition result output by the character recognition model.
Based on the above-described scene, after the corrected image in fig. 2D is input to the character recognition network, the character recognition result of "GIORDANO" can be obtained.
In the embodiments of the application, an image to be recognized can be input into a trained character recognition model; the character recognition model locates character key points in the image through a character positioning network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a corrected image corresponding to the character area in the image, and outputs the corrected image to a character recognition network in the model, which recognizes the characters in the corrected image; the character recognition result output by the character recognition model can then be obtained.
Based on this description, the whole recognition process is carried out within the character recognition model, with no data interaction between multiple models and an external platform, so the recognition speed can be improved and the maintenance difficulty reduced. Because images affected by inclination, rotation, deformation, and similar problems can be corrected by the character correction network inside the model, the character recognition result is stable and the recognition accuracy is high. After an image is input into the character recognition model, the model directly outputs the character recognition result, so genuinely end-to-end character recognition is achieved. In addition, the character recognition model obtains its result simply by locating character key points through the character positioning network and passing them through the character correction network and the character recognition network; it needs neither to detect a precise character frame in the image nor to perform segmentation, so the recognition accuracy can be further improved.
Fig. 3 is a flowchart of another embodiment of a character recognition method according to an exemplary embodiment of the present application. Building on the embodiment shown in fig. 2A, this embodiment illustrates how to train the character recognition model. As shown in fig. 3, the flow of training the character recognition model may include:
step 301: training samples containing characters are obtained.
In an embodiment, images of various natural or specific scenes can be acquired and the characters they contain labeled, thereby obtaining training samples.
The number of training samples can be set according to practical experience.
Step 302: and training the character recognition model end to end by using the training sample until the training times reach the preset times, and stopping training.
In an embodiment, during training, the parameters of the character recognition model can be adjusted by computing, for each output of the model, the loss of the character recognition result relative to the labeled characters; training stops when the number of iterations reaches the preset number. The preset number of iterations can be set according to practical experience.
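The training schedule described above — adjust parameters from the loss on each iteration, stop at a preset iteration count — can be sketched with a toy scalar model; this is an illustration of the loop structure only, not the actual recognition model or its loss:

```python
# Minimal sketch of the stated training schedule: compute a loss gradient,
# adjust the parameter, and stop when the preset iteration count is reached.
# The "model" is a toy scalar fit with squared-error loss.

def train(target, lr=0.1, preset_iterations=100):
    param = 0.0
    for _ in range(preset_iterations):      # stop at the preset count
        loss_grad = 2 * (param - target)    # d/dparam of (param - target)**2
        param -= lr * loss_grad             # adjust parameters from the loss
    return param

print(round(train(3.0), 4))  # converges toward the target, 3.0
```

In the patent's setting the loss would compare the model's character output against the labeled characters, and the update would flow end to end through all three sub-networks.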
This completes the flow shown in fig. 3. Through it, a single character recognition model can be trained as a whole; the neural networks inside the model need not be trained separately, which avoids the error propagation produced by separate training.
Fig. 4 is a hardware block diagram of an electronic device according to an exemplary embodiment of the present application. The electronic device includes: a communication interface 401, a processor 402, a machine-readable storage medium 403, and a bus 404, where the communication interface 401, the processor 402, and the machine-readable storage medium 403 communicate with one another via the bus 404. The processor 402 can execute the character recognition method described above by reading and executing, from the machine-readable storage medium 403, the machine-executable instructions corresponding to the control logic of the character recognition method; the details of the method are described in the embodiments above and are not repeated here.
The machine-readable storage medium 403 referred to herein may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, it may be volatile memory, non-volatile memory, or a similar storage medium. In particular, the machine-readable storage medium 403 may be RAM (Random Access Memory), flash memory, a storage drive (such as a hard disk drive), any type of storage disk (such as an optical disc or DVD), a similar storage medium, or a combination thereof.
Fig. 5 is a block diagram illustrating an embodiment of a character recognition apparatus according to an exemplary embodiment of the present application, and as shown in fig. 5, the character recognition apparatus includes:
a character recognition module 510, configured to input an image to be recognized into a trained character recognition model, where the character recognition model locates character key points in the image to be recognized through a character positioning network and outputs them to a character correction network in the model; the character correction network determines, using the correspondence between the character key points and preset position points, a corrected image corresponding to the character area in the image to be recognized, and outputs the corrected image to a character recognition network in the model to recognize the characters in the corrected image;
an obtaining module 520, configured to obtain a character recognition result output by the character recognition model.
In an optional implementation, the character recognition module 510 is specifically configured so that, in the process of locating character key points in the image to be recognized, the character positioning network extracts features of the image through a feature extraction network within it and outputs them to a key point regression network within it, and the key point regression network locates the character key points from the extracted features.
In an optional implementation, the character recognition module 510 is specifically configured so that, in the process by which the character correction network determines, using the correspondence between the character key points and preset position points, the corrected image corresponding to the character area in the image to be recognized, a corresponding thin-plate spline (TPS) transformation matrix is determined from that correspondence, the number of character key points being equal to the number of preset position points; a blank corrected image is created; and, for each position point in the corrected image, the TPS transformation matrix is used to determine the corresponding coordinate point in the image to be recognized, the pixel values of the pixels near the coordinate point are interpolated to obtain a corrected pixel value, and the corrected pixel value is filled into the position point in the corrected image.
In an optional implementation, the character recognition module 510 is specifically configured to, in the process of the character recognition network recognizing characters in the corrected image: extract features of the corrected image through a convolutional neural network in the character recognition network, and output the features to a recurrent neural network in the character recognition network; the recurrent neural network performs weighted encoding on the features and outputs the weighted-encoded features to a decoding network in the character recognition network; the decoding network decodes the weighted-encoded features to obtain at least one feature sequence and outputs the at least one feature sequence to a classification layer in the character recognition network; and the classification layer classifies each feature sequence to obtain the character content corresponding to each feature sequence.
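The data flow of this recognition head (CNN features → weighted encoding → per-step feature sequences → classification) can be sketched as follows. This is a simplified numpy illustration, not the patent's network: the recurrent encoder and decoder are collapsed into a single attention-weighting step, and all shapes, weight matrices, and names are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def encode_weighted(features, attn_w):
    """Stand-in for the recurrent encoder/decoder: attention-weight the
    T column features and emit one context vector per decoding step."""
    scores = features @ attn_w           # (T, steps)
    alpha = softmax(scores, axis=0)      # attention weights over time
    return alpha.T @ features            # (steps, D) feature sequences

def classify(contexts, cls_w):
    """Classification layer: softmax over the character set per step,
    returning one class index per feature sequence."""
    probs = softmax(contexts @ cls_w)
    return probs.argmax(axis=-1)

rng = np.random.default_rng(1)
T, D, steps, vocab = 12, 16, 5, 10
features = rng.standard_normal((T, D))   # CNN column features
attn_w = rng.standard_normal((D, steps))
cls_w = rng.standard_normal((D, vocab))
contexts = encode_weighted(features, attn_w)
labels = classify(contexts, cls_w)
print(labels.shape)                      # (5,) one class index per step
```

In a trained model each class index would map to a character in the character set, so the five indices here stand in for a five-character recognition result.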
In an alternative implementation, the apparatus further comprises (not shown in Fig. 5):
the training module is used to acquire a training sample containing characters, and to perform end-to-end training on the character recognition model by using the training sample, stopping the training when the number of training iterations reaches a preset number.
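The end-to-end training procedure above (joint updates to all parameters, stopping at a preset iteration count) can be sketched as follows. This is a toy illustration with the whole model collapsed into a single linear layer; the function name, learning rate, and loss are assumptions, not details from the patent.

```python
import numpy as np

def train_end_to_end(model_params, samples, preset_iters, lr=0.1):
    """Toy end-to-end loop: update all parameters jointly from each
    labelled sample and stop once the preset iteration count is reached."""
    w = model_params.copy()
    it = 0
    while it < preset_iters:             # stop at the preset count
        x, y = samples[it % len(samples)]
        pred = w @ x                     # the whole "model" as one layer
        grad = 2 * (pred - y) * x        # squared-error gradient
        w = w - lr * grad
        it += 1
    return w, it

rng = np.random.default_rng(2)
samples = [(rng.standard_normal(3), 1.0) for _ in range(4)]
w0 = np.zeros(3)
w, n = train_end_to_end(w0, samples, preset_iters=50)
print(n)                                 # 50
```

In the patent's setting the jointly updated parameters would span the positioning, correction, and recognition networks, but the control flow (iterate over samples, update, stop at the preset count) is the same.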
The implementation of the functions and effects of the units in the above apparatus is described in detail in the implementation of the corresponding steps of the above method, and is not repeated here.
For the apparatus embodiments, since they substantially correspond to the method embodiments, reference may be made to the relevant descriptions in the method embodiments. The apparatus embodiments described above are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present application. A person of ordinary skill in the art can understand and implement the solution without inventive effort.
The application also provides a chip, which comprises a readable storage medium and a processor, wherein the readable storage medium is used for storing machine executable instructions, and the processor is used for reading the machine executable instructions and executing the instructions to realize the steps in the character recognition method embodiment.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (12)

1. A method of character recognition, the method comprising:
inputting an image to be recognized into a trained character recognition model, the character recognition model locating character key points in the image to be recognized through a character positioning network and outputting the character key points to a character correction network in the character recognition model, the character correction network determining, in the image to be recognized, a corrected image corresponding to a character area by using a correspondence between the character key points and preset position points, and outputting the corrected image to a character recognition network in the character recognition model to recognize characters in the corrected image;
and acquiring a character recognition result output by the character recognition model.
2. The method of claim 1, wherein the character positioning network locating character key points in the image to be recognized comprises:
extracting features of the image to be recognized through a feature extraction network in the character positioning network, and outputting the features to a key point regression network in the character positioning network;
and the key point regression network extracting the character key points by using the extracted features.
3. The method according to claim 1, wherein the character correction network determining, in the image to be recognized, the corrected image corresponding to the character area by using the correspondence between the character key points and the preset position points comprises:
determining a corresponding thin-plate spline (TPS) transformation matrix according to the correspondence between the character key points and the preset position points, wherein the number of character key points is equal to the number of preset position points;
creating a blank corrected image;
and for each position point in the corrected image, determining a corresponding coordinate point of the position point in the image to be recognized by using the TPS transformation matrix, interpolating the pixel values of pixels near the coordinate point to obtain a corrected pixel value, and filling the corrected pixel value into the position point in the corrected image.
4. The method of claim 1, wherein the character recognition network recognizing characters in the corrected image comprises:
extracting features of the corrected image through a convolutional neural network in the character recognition network, and outputting the features to a recurrent neural network in the character recognition network;
the recurrent neural network performing weighted encoding on the features and outputting the weighted-encoded features to a decoding network in the character recognition network;
the decoding network decoding the weighted-encoded features to obtain at least one feature sequence and outputting the at least one feature sequence to a classification layer in the character recognition network;
and the classification layer classifying each feature sequence to obtain the character content corresponding to each feature sequence.
5. The method of claim 1, wherein the character recognition model is trained by:
acquiring a training sample containing characters;
and performing end-to-end training on the character recognition model by using the training sample, and stopping the training when the number of training iterations reaches a preset number.
6. An apparatus for character recognition, the apparatus comprising:
the character recognition module, configured to input an image to be recognized into a trained character recognition model, where the character recognition model locates character key points in the image to be recognized through a character positioning network and outputs the character key points to a character correction network in the character recognition model; the character correction network determines, in the image to be recognized, a corrected image corresponding to the character area by using the correspondence between the character key points and preset position points, and outputs the corrected image to a character recognition network in the character recognition model to recognize characters in the corrected image;
and the acquisition module is used for acquiring the character recognition result output by the character recognition model.
7. The apparatus according to claim 6, wherein the character recognition module is specifically configured to, in the process of the character positioning network locating character key points in the image to be recognized: extract features of the image to be recognized through a feature extraction network in the character positioning network, and output the features to a key point regression network in the character positioning network; the key point regression network then extracts the character key points by using the extracted features.
8. The apparatus according to claim 6, wherein the character recognition module is specifically configured to, in the process of the character correction network determining, in the image to be recognized, the corrected image corresponding to the character area by using the correspondence between the character key points and the preset position points: determine a corresponding thin-plate spline (TPS) transformation matrix according to the correspondence between the character key points and the preset position points, wherein the number of character key points is equal to the number of preset position points; create a blank corrected image; and, for each position point in the corrected image, determine the corresponding coordinate point of the position point in the image to be recognized by using the TPS transformation matrix, interpolate the pixel values of pixels near the coordinate point to obtain a corrected pixel value, and fill the corrected pixel value into the position point in the corrected image.
9. The apparatus according to claim 6, wherein the character recognition module is specifically configured to, in the process of the character recognition network recognizing characters in the corrected image: extract features of the corrected image through a convolutional neural network in the character recognition network, and output the features to a recurrent neural network in the character recognition network; the recurrent neural network performs weighted encoding on the features and outputs the weighted-encoded features to a decoding network in the character recognition network; the decoding network decodes the weighted-encoded features to obtain at least one feature sequence and outputs the at least one feature sequence to a classification layer in the character recognition network; and the classification layer classifies each feature sequence to obtain the character content corresponding to each feature sequence.
10. The apparatus of claim 6, further comprising:
the training module, configured to acquire a training sample containing characters, and to perform end-to-end training on the character recognition model by using the training sample, stopping the training when the number of training iterations reaches a preset number.
11. An electronic device comprising a readable storage medium and a processor;
wherein the readable storage medium is configured to store machine executable instructions;
the processor configured to read the machine executable instructions on the readable storage medium and execute the instructions to implement the steps of the method of any one of claims 1-5.
12. A chip comprising a readable storage medium and a processor;
wherein the readable storage medium is configured to store machine executable instructions;
the processor configured to read the machine executable instructions on the readable storage medium and execute the instructions to implement the steps of the method of any one of claims 1-5.
CN201811184618.2A 2018-10-11 2018-10-11 Character recognition method and device Active CN111046859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811184618.2A CN111046859B (en) 2018-10-11 2018-10-11 Character recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811184618.2A CN111046859B (en) 2018-10-11 2018-10-11 Character recognition method and device

Publications (2)

Publication Number Publication Date
CN111046859A true CN111046859A (en) 2020-04-21
CN111046859B CN111046859B (en) 2023-09-29

Family

ID=70229220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811184618.2A Active CN111046859B (en) 2018-10-11 2018-10-11 Character recognition method and device

Country Status (1)

Country Link
CN (1) CN111046859B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767754A (en) * 2020-06-30 2020-10-13 创新奇智(北京)科技有限公司 Identification code identification method and device, electronic equipment and storage medium
CN112132139A (en) * 2020-09-22 2020-12-25 深兰科技(上海)有限公司 Character recognition method and device
CN112464798A (en) * 2020-11-24 2021-03-09 创新奇智(合肥)科技有限公司 Text recognition method and device, electronic equipment and storage medium
CN112508003A (en) * 2020-12-18 2021-03-16 北京百度网讯科技有限公司 Character recognition processing method and device
CN112597940A (en) * 2020-12-29 2021-04-02 苏州科达科技股份有限公司 Certificate image recognition method and device and storage medium
CN115690803A (en) * 2022-10-31 2023-02-03 中电金信软件(上海)有限公司 Digital image recognition method and device, electronic equipment and readable storage medium
CN116434234A (en) * 2023-05-25 2023-07-14 珠海亿智电子科技有限公司 Method, device, equipment and storage medium for detecting and identifying casting blank characters

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678293A (en) * 2015-12-30 2016-06-15 成都数联铭品科技有限公司 Complex image and text sequence identification method based on CNN-RNN
CN106407971A (en) * 2016-09-14 2017-02-15 北京小米移动软件有限公司 Text recognition method and device
US20170286803A1 (en) * 2016-03-29 2017-10-05 Wipro Limited System and method for optical character recognition
US20180025256A1 (en) * 2015-10-20 2018-01-25 Tencent Technology (Shenzhen) Company Limited Method and apparatus for recognizing character string in image
US20180060704A1 (en) * 2016-08-30 2018-03-01 Baidu Online Network Technology (Beijing) Co., Ltd. Method And Apparatus For Image Character Recognition Model Generation, And Vertically-Oriented Character Image Recognition
CN107798327A (en) * 2017-10-31 2018-03-13 北京小米移动软件有限公司 Character identifying method and device
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice
US20180150956A1 (en) * 2016-11-25 2018-05-31 Industrial Technology Research Institute Character recognition systems and character recognition methods thereof using convolutional neural network
CN108121984A (en) * 2016-11-30 2018-06-05 杭州海康威视数字技术股份有限公司 A kind of character identifying method and device
CN108334499A (en) * 2018-02-08 2018-07-27 海南云江科技有限公司 A kind of text label tagging equipment, method and computing device
CN108399408A (en) * 2018-03-06 2018-08-14 李子衿 A kind of deformed characters antidote based on deep space converting network
WO2018166114A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Picture identification method and system, electronic device, and medium

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180025256A1 (en) * 2015-10-20 2018-01-25 Tencent Technology (Shenzhen) Company Limited Method and apparatus for recognizing character string in image
CN105678293A (en) * 2015-12-30 2016-06-15 成都数联铭品科技有限公司 Complex image and text sequence identification method based on CNN-RNN
US20170286803A1 (en) * 2016-03-29 2017-10-05 Wipro Limited System and method for optical character recognition
US20180060704A1 (en) * 2016-08-30 2018-03-01 Baidu Online Network Technology (Beijing) Co., Ltd. Method And Apparatus For Image Character Recognition Model Generation, And Vertically-Oriented Character Image Recognition
CN106407971A (en) * 2016-09-14 2017-02-15 北京小米移动软件有限公司 Text recognition method and device
US20180150956A1 (en) * 2016-11-25 2018-05-31 Industrial Technology Research Institute Character recognition systems and character recognition methods thereof using convolutional neural network
WO2018099194A1 (en) * 2016-11-30 2018-06-07 杭州海康威视数字技术股份有限公司 Character identification method and device
CN108121984A (en) * 2016-11-30 2018-06-05 杭州海康威视数字技术股份有限公司 A kind of character identifying method and device
WO2018166114A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Picture identification method and system, electronic device, and medium
CN107798327A (en) * 2017-10-31 2018-03-13 北京小米移动软件有限公司 Character identifying method and device
CN107977665A (en) * 2017-12-15 2018-05-01 北京科摩仕捷科技有限公司 The recognition methods of key message and computing device in a kind of invoice
CN108334499A (en) * 2018-02-08 2018-07-27 海南云江科技有限公司 A kind of text label tagging equipment, method and computing device
CN108399408A (en) * 2018-03-06 2018-08-14 李子衿 A kind of deformed characters antidote based on deep space converting network

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767754A (en) * 2020-06-30 2020-10-13 创新奇智(北京)科技有限公司 Identification code identification method and device, electronic equipment and storage medium
CN111767754B (en) * 2020-06-30 2024-05-07 创新奇智(北京)科技有限公司 Identification code identification method and device, electronic equipment and storage medium
CN112132139A (en) * 2020-09-22 2020-12-25 深兰科技(上海)有限公司 Character recognition method and device
CN112464798A (en) * 2020-11-24 2021-03-09 创新奇智(合肥)科技有限公司 Text recognition method and device, electronic equipment and storage medium
CN112508003A (en) * 2020-12-18 2021-03-16 北京百度网讯科技有限公司 Character recognition processing method and device
CN112508003B (en) * 2020-12-18 2023-10-13 北京百度网讯科技有限公司 Character recognition processing method and device
CN112597940A (en) * 2020-12-29 2021-04-02 苏州科达科技股份有限公司 Certificate image recognition method and device and storage medium
CN115690803A (en) * 2022-10-31 2023-02-03 中电金信软件(上海)有限公司 Digital image recognition method and device, electronic equipment and readable storage medium
CN116434234A (en) * 2023-05-25 2023-07-14 珠海亿智电子科技有限公司 Method, device, equipment and storage medium for detecting and identifying casting blank characters
CN116434234B (en) * 2023-05-25 2023-10-17 珠海亿智电子科技有限公司 Method, device, equipment and storage medium for detecting and identifying casting blank characters

Also Published As

Publication number Publication date
CN111046859B (en) 2023-09-29

Similar Documents

Publication Publication Date Title
CN111046859B (en) Character recognition method and device
CN109117848B (en) Text line character recognition method, device, medium and electronic equipment
CN107239786B (en) Character recognition method and device
CN110827247B (en) Label identification method and device
JP6952094B2 (en) Image processing device and image processing method
CN101504719B (en) Image processing apparatus and method
CN111931664A (en) Mixed note image processing method and device, computer equipment and storage medium
CN104217203B (en) Complex background card face information identifying method and system
CN109858476B (en) Tag expansion method and electronic equipment
JP7026165B2 (en) Text recognition method and text recognition device, electronic equipment, storage medium
CN111191649A (en) Method and equipment for identifying bent multi-line text image
CN110942057A (en) Container number identification method and device and computer equipment
CN110879972B (en) Face detection method and device
AU2017380263B2 (en) Method for detecting and recognising long-range high-density visual markers
CN107403179B (en) Registration method and device for article packaging information
CN114092949A (en) Method and device for training class prediction model and identifying interface element class
KR20220122458A (en) Method for de-identifying text plate contained in video data, and device performing the same
CN113378852A (en) Key point detection method and device, electronic equipment and storage medium
US20230110558A1 (en) Systems and methods for detecting objects
CN110705554A (en) Image processing method and device
CN110942073A (en) Container trailer number identification method and device and computer equipment
CN111325194B (en) Character recognition method, device and equipment and storage medium
KR20190093752A (en) Method and system for scene text detection using deep learning
CN113743146A (en) Deep learning-based batch graphic code identification method and device and storage medium
CN114120005A (en) Image processing method, neural network model training method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant