CN112257719A - Character recognition method, system and storage medium - Google Patents


Info

Publication number
CN112257719A
Authority
CN
China
Prior art keywords
character
character recognition
image
recognition
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011099422.0A
Other languages
Chinese (zh)
Inventor
吴中山
黎维春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tianwei Big Data Technology Co ltd
Original Assignee
Shenzhen Tianwei Big Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tianwei Big Data Technology Co ltd filed Critical Shenzhen Tianwei Big Data Technology Co ltd
Priority to CN202011099422.0A priority Critical patent/CN112257719A/en
Publication of CN112257719A publication Critical patent/CN112257719A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/148: Segmentation of character regions
    • G06V30/153: Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a character recognition method, a character recognition system and a storage medium. The character recognition method comprises the following steps: acquiring a character database and establishing a character recognition model according to the character database; acquiring and sending a character image to be recognized; dividing the character image to be recognized into regions to obtain one or more character recognition region images; importing each character recognition region image into the character recognition model in sequence for character recognition, and generating and sending one or more character recognition results; and generating and sending a character recognition report according to the character recognition results. The invention improves character recognition efficiency while ensuring the accuracy of character recognition.

Description

Character recognition method, system and storage medium
Technical Field
The invention relates to the technical field of character recognition, in particular to a character recognition method, a character recognition system and a storage medium.
Background
With the rapid development of science and technology, character recognition technology has also developed rapidly and is widely applied across industries. People often need to process the characters in pictures during their work, and because the characters in a picture cannot be edited directly, they must first be recognized.
Existing character recognition methods generally import the whole image into a character recognition model to recognize the characters in it, and the recognition accuracy of this approach is low. When the image is too large, recognizing it as a whole greatly increases the operating load of the system, which lowers operating efficiency and, in turn, character recognition efficiency.
Disclosure of Invention
To overcome the above problems, or at least partially solve them, embodiments of the present invention provide a character recognition method, system and storage medium that can improve character recognition efficiency while ensuring the accuracy of character recognition.
The embodiments of the invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a character recognition method, including the following steps:
acquiring a character database, and establishing a character recognition model according to the character database;
acquiring and sending a character image to be recognized;
carrying out region division on the character image to be recognized to obtain one or more character recognition region images;
sequentially importing each character recognition area image into a character recognition model for character recognition, and generating and sending one or more character recognition results;
and generating and sending a character recognition report according to the character recognition result.
When recognizing characters in an image, data in an existing character database is first acquired, and a character recognition model is established according to that data. The character recognition model is a mathematical model that, based on character images and character data, recognizes the characters in an image and converts them into character text for output. After a character recognition request is received, the character image to be recognized is acquired and sent, and is then divided into regions to obtain one or more character recognition region images, so that each region image can subsequently be recognized quickly, improving recognition accuracy and operating efficiency. Each character recognition region image is imported into the character recognition model in sequence, the model recognizes the characters in each region image, and one or more character recognition results are generated and sent. A character recognition report is then generated and sent according to the character recognition results; the report includes character text information, text paragraph information, character text image information, character type information and the like.
The method divides a whole image into a plurality of regions and recognizes each region separately, thereby reducing the operating load and improving character recognition efficiency and accuracy.
Based on the first aspect, in some embodiments of the present invention, the method for performing area division on the text image to be recognized to obtain one or more text recognition area images includes the following steps:
extracting the character type of characters in the character image to be recognized;
and carrying out region division on the character image to be recognized according to the character type to obtain one or more character recognition region images.
Based on the first aspect, in some embodiments of the present invention, the method for generating and sending a text recognition report according to a text recognition result includes the following steps:
A1, judging whether only one character recognition result exists; if yes, go to step A2; if not, go to step A3;
A2, marking the character recognition result as a unique recognition result, and generating and sending a character recognition report according to the unique recognition result;
A3, integrating the multiple character recognition results according to the import sequence to obtain a complete recognition result, and generating and sending a character recognition report according to the complete recognition result.
Based on the first aspect, in some embodiments of the present invention, the text recognition method further includes the following steps:
comparing the character recognition result with the character image to be recognized, judging whether the character recognition in the character image to be recognized is complete or not, and if so, sending the character recognition result; if not, marking the unrecognized area image, and importing the unrecognized area image into a character recognition model for character recognition.
Based on the first aspect, in some embodiments of the present invention, the method for generating and sending a text recognition report according to a text recognition result includes the following steps:
performing semantic consistency matching on characters in the character recognition result to obtain a consistency text;
and generating and sending a character recognition report according to the consistency text.
Based on the first aspect, in some embodiments of the present invention, the text recognition method further includes the following steps:
and optimizing the character image to be recognized by adopting an image clear processing method to obtain a clear character image to be recognized.
In a second aspect, an embodiment of the present invention provides a text recognition system, including a model establishing module, an image obtaining module, an area dividing module, a text recognition module, and a report generating module, where:
the model establishing module is used for acquiring a character database and establishing a character recognition model according to the character database;
the image acquisition module is used for acquiring and sending character images to be recognized;
the area division module is used for carrying out area division on the character image to be recognized so as to obtain one or more character recognition area images;
the character recognition module is used for sequentially importing each character recognition area image into a character recognition model for character recognition, and generating and sending one or more character recognition results;
and the report generating module is used for generating and sending a character recognition report according to the character recognition result.
When recognizing characters in an image, the model establishing module first acquires data in an existing character database and establishes a character recognition model according to that data. The character recognition model is a mathematical model that, based on character images and character data, recognizes the characters in an image and converts them into character text for output. After a character recognition request is received, the image acquisition module acquires and sends the character image to be recognized, and the region division module then divides it into regions to obtain one or more character recognition region images, so that each region image can subsequently be recognized quickly, improving recognition accuracy and operating efficiency. The character recognition module imports each character recognition region image into the character recognition model in sequence, the model recognizes the characters in each region image, and one or more character recognition results are generated and sent. The report generation module then generates and sends a character recognition report according to the character recognition results; the report includes character text information, text paragraph information, character text image information, character type information and the like.
The system divides a whole image into a plurality of regions and recognizes each region separately, thereby reducing the operating load and improving character recognition efficiency and accuracy.
Based on the second aspect, in some embodiments of the present invention, the region dividing module includes a type sub-module and a region sub-module, where:
the type submodule is used for extracting the character type of characters in the character image to be recognized;
and the region sub-module is used for performing region division on the character image to be recognized according to the character type so as to obtain one or more character recognition region images.
Based on the second aspect, in some embodiments of the present invention, the report generating module includes a determining sub-module, an identifying sub-module, and an integrating sub-module, wherein:
the judgment submodule is used for judging whether only one character recognition result exists; if so, the identification submodule works; if not, the integration submodule works;
the identification submodule is used for marking the character identification result as a unique identification result, and generating and sending a character identification report according to the unique identification result;
and the integration submodule is used for integrating the plurality of character recognition results according to the import sequence to obtain a complete recognition result, and generating and sending a character recognition report according to the complete recognition result.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and the computer-executable instructions are configured to perform the above-mentioned character recognition method.
The embodiment of the invention at least has the following advantages or beneficial effects:
An embodiment of the invention provides a character recognition method. When recognizing characters in an image, data in an existing character database is first acquired, and a character recognition model is established according to that data. The character recognition model is a mathematical model that, based on character images and character data, recognizes the characters in an image and converts them into character text for output. After a character recognition request is received, the character image to be recognized is acquired and sent, and is then divided into regions to obtain one or more character recognition region images, so that each region image can subsequently be recognized quickly, improving recognition accuracy and operating efficiency. Each character recognition region image is imported into the character recognition model in sequence, the model recognizes the characters in each region image, and one or more character recognition results are generated and sent. A character recognition report is then generated and sent according to the character recognition results; the report includes character text information, text paragraph information, character text image information, character type information and the like. The method divides a whole image into a plurality of regions and recognizes each region separately, thereby reducing the operating load and improving character recognition efficiency and accuracy.
An embodiment of the invention also provides a character recognition system. When recognizing characters in an image, the model establishing module first acquires data in an existing character database and establishes a character recognition model according to that data. The character recognition model is a mathematical model that, based on character images and character data, recognizes the characters in an image and converts them into character text for output. After a character recognition request is received, the image acquisition module acquires and sends the character image to be recognized, and the region division module then divides it into regions to obtain one or more character recognition region images, so that each region image can subsequently be recognized quickly, improving recognition accuracy and operating efficiency. The character recognition module imports each character recognition region image into the character recognition model in sequence, the model recognizes the characters in each region image, and one or more character recognition results are generated and sent. The report generation module then generates and sends a character recognition report according to the character recognition results; the report includes character text information, text paragraph information, character text image information, character type information and the like. The system divides a whole image into a plurality of regions and recognizes each region separately, thereby reducing the operating load and improving character recognition efficiency and accuracy.
The embodiment of the invention also provides a computer readable storage medium which can store computer executable instructions for executing the character recognition method.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flow chart of a method for recognizing characters according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating report generation in a text recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic block diagram of a character recognition system according to an embodiment of the present invention.
Reference numerals: 100, model building module; 200, image acquisition module; 300, region dividing module; 310, type submodule; 320, region submodule; 400, character recognition module; 500, report generation module; 510, judgment submodule; 520, identification submodule; 530, integration submodule.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the same element.
In the description of the embodiments of the present invention, "a plurality" represents at least 2.
Examples
As shown in fig. 1, in a first aspect, an embodiment of the present invention provides a text recognition method, including the following steps:
S1, acquiring a character database, and establishing a character recognition model according to the character database;
S2, acquiring and sending a character image to be recognized;
S3, carrying out region division on the character image to be recognized to obtain one or more character recognition region images;
S4, sequentially importing each character recognition region image into the character recognition model for character recognition, and generating and sending one or more character recognition results;
S5, generating and sending a character recognition report according to the character recognition results.
When recognizing characters in an image, data in an existing character database is first acquired, and a character recognition model is established according to that data. The character recognition model is a mathematical model that, based on character images and character data, recognizes the characters in an image and converts them into character text for output. After a character recognition request is received, the character image to be recognized is acquired and sent, and is then divided into regions to obtain one or more character recognition region images, so that each region image can subsequently be recognized quickly, improving recognition accuracy and operating efficiency. Each character recognition region image is imported into the character recognition model in sequence, in top-to-bottom and left-to-right order; the model recognizes the characters in each region image, and one or more character recognition results are generated and sent. A character recognition report is then generated and sent according to the character recognition results; the report includes character text information, text paragraph information, character text image information, character type information and the like.
The method divides a whole image into a plurality of regions and recognizes each region separately, thereby reducing the operating load and improving character recognition efficiency and accuracy.
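Steps S1 to S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: an image is modelled as rows of hypothetical glyph identifiers instead of pixels, the model is a plain lookup table built from the character database, and regions are fixed-height horizontal bands taken from top to bottom. The names `build_model`, `divide_regions` and `recognize_image` are invented for this sketch.

```python
from typing import Callable, Dict, List

Glyph = str                      # hypothetical stand-in for a patch of pixels
Image = List[List[Glyph]]        # rows of glyphs, ordered top to bottom
Model = Callable[[Image], str]


def build_model(database: Dict[Glyph, str]) -> Model:
    """S1: build a recognition model from the character database (toy lookup)."""
    def recognize(region: Image) -> str:
        # Unknown glyphs are rendered as "?" rather than raising an error.
        return "\n".join("".join(database.get(g, "?") for g in row) for row in region)
    return recognize


def divide_regions(image: Image, rows_per_region: int = 2) -> List[Image]:
    """S3: split the image into horizontal bands, ordered top to bottom."""
    return [image[i:i + rows_per_region] for i in range(0, len(image), rows_per_region)]


def recognize_image(image: Image, model: Model) -> Dict[str, object]:
    """S3 to S5: divide, recognize each region in order, and build a report."""
    regions = divide_regions(image)
    results = [model(region) for region in regions]       # S4, one region at a time
    return {"text": "\n".join(results), "region_count": len(regions)}  # S5


database = {"g1": "A", "g2": "B", "g3": "C"}
model = build_model(database)                             # S1
image = [["g1", "g2"], ["g3", "g1"], ["g2", "g2"]]        # S2, a 3-row toy image
report = recognize_image(image, model)
print(report["text"])
```

Because each call to the model sees only one band, the amount of data handled per recognition call is bounded by the region size rather than by the size of the whole image.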
Based on the first aspect, in some embodiments of the present invention, the method for performing area division on the text image to be recognized to obtain one or more text recognition area images includes the following steps:
extracting the character type of characters in the character image to be recognized;
and carrying out region division on the character image to be recognized according to the character type to obtain one or more character recognition region images.
When a character image to be recognized is processed, preliminary processing is performed first, namely extracting the character types of the characters in the image. The character types include handwritten text, machine-printed text, different languages and the like. After the character types are extracted, the character image to be recognized is divided into regions according to the different character types to obtain one or more character recognition region images, so that each region image can be recognized subsequently.
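Division by character type can be sketched as below. The description does not say how character types are detected, so `classify_row` is a purely hypothetical classifier (ASCII rows count as "latin", everything else as "cjk"), and each row of text stands in for one horizontal strip of the image. Consecutive strips of the same type are grouped into one region.

```python
from itertools import groupby
from typing import Callable, List, Tuple

Row = str  # hypothetical stand-in for one horizontal strip of the image


def classify_row(row: Row) -> str:
    """Toy classifier (assumption): ASCII rows are 'latin', all others 'cjk'."""
    return "latin" if row.isascii() else "cjk"


def divide_by_type(
    rows: List[Row], classify: Callable[[Row], str] = classify_row
) -> List[Tuple[str, List[Row]]]:
    """Group consecutive rows that share a character type into one region."""
    return [(kind, list(group)) for kind, group in groupby(rows, key=classify)]


regions = divide_by_type(["Invoice", "No. 42", "发票", "Total"])
for kind, rows in regions:
    print(kind, rows)
```

Grouping only consecutive rows keeps the top-to-bottom order of the regions, which matters later when results are integrated in import order.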
Based on the first aspect, in some embodiments of the present invention, the method for generating and sending a text recognition report according to a text recognition result includes the following steps:
A1, judging whether only one character recognition result exists; if yes, go to step A2; if not, go to step A3;
A2, marking the character recognition result as a unique recognition result, and generating and sending a character recognition report according to the unique recognition result;
A3, integrating the multiple character recognition results according to the import sequence to obtain a complete recognition result, and generating and sending a character recognition report according to the complete recognition result.
When generating the final recognition report, after the character recognition results are obtained, it is first judged whether only one character recognition result exists. If so, that result is marked as the unique recognition result, and a character recognition report is generated and sent according to it. If multiple character recognition results exist, they are integrated in import order, that is, first imported, first output, to obtain a complete recognition result, and a character recognition report is generated and sent according to the complete recognition result. The character recognition report includes character text information, text paragraph information, character text image information, character type information and the like.
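The branch in steps A1 to A3 can be sketched as a single function. The report is modelled here as a small dictionary, which is an assumption of this sketch; the description lists the kinds of information a report carries but not its format.

```python
from typing import Dict, List


def generate_report(results: List[str]) -> Dict[str, str]:
    """Build a report from recognition results given in import order."""
    if not results:
        raise ValueError("at least one recognition result is required")
    if len(results) == 1:                       # A1 -> A2: unique result
        return {"kind": "unique", "text": results[0]}
    # A1 -> A3: the list is already in import order, so join it as-is.
    return {"kind": "complete", "text": "\n".join(results)}


single = generate_report(["Hello"])
merged = generate_report(["Title", "Body text"])
print(single, merged)
```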
Based on the first aspect, in some embodiments of the present invention, the text recognition method further includes the following steps:
comparing the character recognition result with the character image to be recognized, judging whether the character recognition in the character image to be recognized is complete or not, and if so, sending the character recognition result; if not, marking the unrecognized area image, and importing the unrecognized area image into a character recognition model for character recognition.
After the character recognition result is obtained, to ensure that the character image to be recognized has been completely recognized, the character recognition result is compared with the character image to be recognized to judge whether all characters in the image have been recognized. If so, the character recognition result is sent. If part or all of the image has not been recognized, the unrecognized region image is marked and imported into the character recognition model again for character recognition, and the process ends only when recognition is complete.
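The mark-and-retry loop can be sketched as follows. Two assumptions are made for illustration: a model signals an unrecognized region by returning `None` (the description instead compares the result with the image, without saying how), and a hypothetical `max_attempts` limit is added so the loop cannot run forever on a region that never succeeds.

```python
from typing import Callable, Dict, List, Optional

Region = str
Model = Callable[[Region], Optional[str]]   # returns None for an unrecognized region


def recognize_until_complete(
    regions: List[Region], model: Model, max_attempts: int = 3
) -> Dict[int, str]:
    """Re-import unrecognized regions until every region has a result."""
    results: Dict[int, str] = {}
    pending = list(range(len(regions)))
    for _ in range(max_attempts):
        still_pending = []
        for index in pending:
            text = model(regions[index])
            if text is None:
                still_pending.append(index)   # mark the unrecognized region
            else:
                results[index] = text
        pending = still_pending
        if not pending:
            break
    if pending:
        raise RuntimeError(f"regions not recognized: {pending}")
    return results


seen = set()

def flaky_model(region: Region) -> Optional[str]:
    """Toy model that fails the first time it sees the 'blurry' region."""
    if region == "blurry" and region not in seen:
        seen.add(region)
        return None
    return region.upper()


results = recognize_until_complete(["clear", "blurry"], flaky_model)
print(results)
```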
Based on the first aspect, in some embodiments of the present invention, the method for generating and sending a text recognition report according to a text recognition result includes the following steps:
performing semantic consistency matching on characters in the character recognition result to obtain a consistency text;
and generating and sending a character recognition report according to the consistency text.
After the character recognition result is obtained, to ensure the semantic continuity of the text and make it easier for users to read and check, a semantic analysis method is applied to the characters in the character recognition result, and semantic consistency matching is then performed according to the analysis result to obtain a consistency text. A character recognition report is then generated and sent according to the consistency text; the report includes character text information, text segmentation adjustment information and the like.
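The description does not name a semantic analysis method, so the sketch below substitutes a deliberately simple punctuation heuristic: a fragment is appended to the previous one when the previous fragment does not end a sentence. It only illustrates the idea of re-joining text that region division split apart; a real system would likely use a language model or parser instead. The name `merge_fragments` is hypothetical.

```python
from typing import List

# Sentence-ending punctuation for Latin and Chinese text (assumed set).
SENTENCE_END = (".", "!", "?", "。", "！", "？")


def merge_fragments(fragments: List[str]) -> List[str]:
    """Join fragments into paragraphs, continuing unfinished sentences."""
    paragraphs: List[str] = []
    for fragment in fragments:
        fragment = fragment.strip()
        if not fragment:
            continue                               # skip empty results
        if paragraphs and not paragraphs[-1].endswith(SENTENCE_END):
            paragraphs[-1] = f"{paragraphs[-1]} {fragment}"
        else:
            paragraphs.append(fragment)
    return paragraphs


paragraphs = merge_fragments(["The total is", "42 yuan.", "Thank you."])
print(paragraphs)
```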
Based on the first aspect, in some embodiments of the present invention, the text recognition method further includes the following steps:
and optimizing the character image to be recognized by adopting an image clear processing method to obtain a clear character image to be recognized.
To make subsequent recognition more efficient, after the character image to be recognized is acquired, it is optimized using an image clarity processing method, such as a pixel-level optimization method or an image restoration method, to improve its definition and obtain a clear character image to be recognized.
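As one possible example of a pixel-level optimization, the sketch below applies linear contrast stretching to a grayscale image stored as nested lists of 0 to 255 values. The choice of contrast stretching is an assumption of this sketch; the description does not specify which pixel-level or restoration technique is used.

```python
from typing import List

GrayImage = List[List[int]]  # pixel intensities in the range 0..255


def stretch_contrast(image: GrayImage) -> GrayImage:
    """Linearly rescale intensities so the darkest pixel is 0 and the brightest 255."""
    pixels = [p for row in image for p in row]
    if not pixels:
        return []
    low, high = min(pixels), max(pixels)
    if low == high:
        return [row[:] for row in image]       # flat image: nothing to stretch
    return [[round((p - low) * 255 / (high - low)) for p in row] for row in image]


enhanced = stretch_contrast([[100, 120], [140, 150]])
print(enhanced)
```

Stretching a low-contrast scan widens the gap between ink and background, which tends to make later segmentation and recognition steps easier.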
In a second aspect, an embodiment of the present invention provides a text recognition system, including a model building module 100, an image obtaining module 200, an area dividing module 300, a text recognition module 400, and a report generating module 500, where:
a model establishing module 100, configured to obtain a character database, and establish a character recognition model according to the character database;
the image acquisition module 200 is used for acquiring and sending character images to be recognized;
the region dividing module 300 is configured to perform region division on the text image to be recognized to obtain one or more text recognition region images;
the character recognition module 400 is used for sequentially importing each character recognition area image into a character recognition model for character recognition, and generating and sending one or more character recognition results;
and a report generating module 500, configured to generate and send a text recognition report according to the text recognition result.
When identifying image characters, firstly, acquiring data in an existing character database through the model establishing module 100, and establishing a character identification model according to the data in the character database, wherein the character identification model is a mathematical model for converting and identifying the image characters into character texts to be output according to character images and character data; after receiving a character recognition request, acquiring and sending a character image to be recognized through the image acquisition module 200, and then performing region division on the character image to be recognized through the region division module 300 to obtain one or more character recognition region images, so that each character recognition region image can be rapidly recognized in the following process, the recognition accuracy and efficiency are improved, and the recognition operation efficiency is improved; the character recognition module 400 sequentially imports each character recognition area image into a character recognition model for character recognition, imports the images from top to bottom and from left to right in the import process, recognizes characters in each character recognition area image through the character recognition model, generates and sends one or more character recognition results, and then generates and sends a character recognition report according to the character recognition results through the report generation module 500, wherein the character recognition report comprises character text information, text paragraph information, character text image information, character type information and the like.
The system divides a whole image into a plurality of regions and then recognizes each region separately, thereby effectively reducing the running load and improving character recognition efficiency and accuracy.
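The import order described above (top to bottom, then left to right) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `Region` record, the `import_order` and `recognize_regions` functions, and the `stub_model` stand-in are all hypothetical names introduced here, and the recognition model is replaced by a stub that simply returns the region payload.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Region:
    """Hypothetical region record: position of the crop plus its image data."""
    top: int       # y coordinate of the region's upper edge
    left: int      # x coordinate of the region's left edge
    image: object  # stand-in for the cropped pixels


def import_order(regions: List[Region]) -> List[Region]:
    """Sort regions top to bottom, then left to right."""
    return sorted(regions, key=lambda r: (r.top, r.left))


def recognize_regions(regions: List[Region],
                      model: Callable[[Region], str]) -> List[str]:
    """Feed each region to the model in import order and collect the results."""
    return [model(region) for region in import_order(regions)]


# Stub model for illustration only (an assumption): returns the payload as text.
def stub_model(region: Region) -> str:
    return str(region.image)


regions = [Region(40, 0, "third"), Region(0, 120, "second"), Region(0, 0, "first")]
results = recognize_regions(regions, stub_model)
```

Sorting on the `(top, left)` pair is only one way to realize the stated order; it assumes regions on the same row share the same `top` value, which a real layout analysis would have to tolerate with some margin.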
Based on the second aspect, in some embodiments of the present invention, the region dividing module 300 includes a type sub-module 310 and a region sub-module 320, where:
the type sub-module 310 is used for extracting the character type of the characters in the character image to be recognized;
the area sub-module 320 is configured to perform area division on the text image to be recognized according to the text type to obtain one or more text recognition area images.
When recognizing the character image to be recognized, the type sub-module 310 first performs preliminary processing on the image, that is, it extracts the character types of the characters in the image; the character types include handwritten characters, machine-printed characters, different language types and the like. After the character types have been extracted, the area sub-module 320 divides the character image to be recognized into regions according to the different character types to obtain one or more character recognition area images, so that each character recognition area image can subsequently be recognized separately.
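Division by character type can be illustrated with a short sketch. It assumes, purely for illustration, that an upstream step has already labelled each text line with a type such as "printed" or "handwritten"; the `divide_by_type` function and the label strings are hypothetical and do not come from the description. Consecutive lines of the same type are merged into one region.

```python
from itertools import groupby
from typing import List, Tuple


def divide_by_type(lines: List[Tuple[int, str]]) -> List[Tuple[str, List[int]]]:
    """Merge consecutive lines that share a character type into one region.

    Each input item is (line_index, character_type); the types are assumed
    to come from an upstream classifier that is not shown here.
    """
    regions = []
    for text_type, group in groupby(lines, key=lambda item: item[1]):
        regions.append((text_type, [index for index, _ in group]))
    return regions


lines = [(0, "printed"), (1, "printed"), (2, "handwritten"), (3, "printed")]
regions = divide_by_type(lines)
```

Because `groupby` only merges adjacent items, the two printed runs stay separate regions, which keeps the regions in reading order for the later sequential import.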
Based on the second aspect, in some embodiments of the present invention, the report generating module 500 includes a determining sub-module 510, an identifying sub-module 520, and an integrating sub-module 530, wherein:
the judgment sub-module 510 is configured to judge whether there is only one character recognition result; if so, the identification sub-module 520 operates; if not, the integration sub-module 530 operates;
the identification submodule 520 is used for marking the character identification result as a unique identification result, and generating and sending a character identification report according to the unique identification result;
and an integrating sub-module 530, configured to integrate the multiple character recognition results according to the import order to obtain a complete recognition result, and generate and send a character recognition report according to the complete recognition result.
When generating the final recognition report, after the character recognition result is obtained, the judgment sub-module 510 first judges whether there is only one character recognition result. If there is only one, the identification sub-module 520 marks it as the unique recognition result and generates and sends the character recognition report according to the unique recognition result. If there are multiple character recognition results, the integration sub-module 530 integrates them according to the import order, that is, the result imported first is output first, to obtain a complete recognition result, and generates and sends the character recognition report according to the complete recognition result. The character recognition report includes character text information, text paragraph information, character text image information, character type information, and the like.
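The branch between the unique result and the integrated result can be sketched as below. This is an illustrative assumption rather than the disclosed implementation: the `generate_report` function, the dictionary keys, and the choice to join results with newlines are all hypothetical, and the input list is assumed to already be in import order.

```python
from typing import Dict, List


def generate_report(results: List[str]) -> Dict[str, str]:
    """Build a report from recognition results given in import order."""
    if not results:
        raise ValueError("at least one recognition result is required")
    if len(results) == 1:
        # Single result: mark it as the unique recognition result.
        return {"kind": "unique", "text": results[0]}
    # Several results: integrate them, first imported first output.
    return {"kind": "complete", "text": "\n".join(results)}


single_report = generate_report(["Invoice No. 42"])
merged_report = generate_report(["Title", "First paragraph", "Second paragraph"])
```

A fuller report would also carry the paragraph, image, and character type information that the description lists; those fields are omitted here to keep the branch logic visible.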
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and the computer-executable instructions are configured to perform the above-mentioned character recognition method.
The storage medium may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program code.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. A character recognition method is characterized by comprising the following steps:
acquiring a character database, and establishing a character recognition model according to the character database;
acquiring and sending a character image to be recognized;
carrying out region division on the character image to be recognized to obtain one or more character recognition region images;
sequentially importing each character recognition area image into a character recognition model for character recognition, and generating and sending one or more character recognition results;
and generating and sending a character recognition report according to the character recognition result.
2. The method of claim 1, wherein the method of dividing the character image to be recognized into regions to obtain one or more character recognition region images comprises the following steps:
extracting the character type of characters in the character image to be recognized;
and carrying out region division on the character image to be recognized according to the character type to obtain one or more character recognition region images.
3. The method of claim 1, wherein the method of generating and sending a character recognition report according to the character recognition result comprises the following steps:
A1, judging whether only one character recognition result exists; if yes, going to step A2; if not, going to step A3;
A2, marking the character recognition result as a unique recognition result, and generating and sending a character recognition report according to the unique recognition result;
and A3, integrating the multiple character recognition results according to the import sequence to obtain a complete recognition result, and generating and sending a character recognition report according to the complete recognition result.
4. The character recognition method of claim 1, further comprising the steps of:
comparing the character recognition result with the character image to be recognized, judging whether the character recognition in the character image to be recognized is complete or not, and if so, sending the character recognition result; if not, marking the unrecognized area image, and importing the unrecognized area image into a character recognition model for character recognition.
5. The method of claim 1, wherein the method of generating and sending a character recognition report according to the character recognition result comprises the following steps:
performing semantic consistency matching on characters in the character recognition result to obtain a consistency text;
and generating and sending a character recognition report according to the consistency text.
6. The character recognition method of claim 1, further comprising the steps of:
and optimizing the character image to be recognized by adopting an image clear processing method to obtain a clear character image to be recognized.
7. A character recognition system is characterized by comprising a model establishing module, an image obtaining module, an area dividing module, a character recognition module and a report generating module, wherein:
the model establishing module is used for acquiring a character database and establishing a character recognition model according to the character database;
the image acquisition module is used for acquiring and sending character images to be recognized;
the area division module is used for carrying out area division on the character image to be recognized so as to obtain one or more character recognition area images;
the character recognition module is used for sequentially importing each character recognition area image into a character recognition model for character recognition, and generating and sending one or more character recognition results;
and the report generating module is used for generating and sending a character recognition report according to the character recognition result.
8. The character recognition system of claim 7, wherein the region division module comprises a type sub-module and a region sub-module, wherein:
the type submodule is used for extracting the character type of characters in the character image to be recognized;
and the region sub-module is used for performing region division on the character image to be recognized according to the character type so as to obtain one or more character recognition region images.
9. The character recognition system of claim 7, wherein the report generation module comprises a judgment sub-module, an identification sub-module, and an integration sub-module, wherein:
the judgment submodule is used for judging whether only one character recognition result exists; if so, the identification submodule works; if not, the integration submodule works;
the identification submodule is used for marking the character identification result as a unique identification result, and generating and sending a character identification report according to the unique identification result;
and the integration submodule is used for integrating the plurality of character recognition results according to the import sequence to obtain a complete recognition result, and generating and sending a character recognition report according to the complete recognition result.
10. A computer-readable storage medium having computer-executable instructions stored thereon for performing the character recognition method according to any one of claims 1-6.
CN202011099422.0A 2020-10-14 2020-10-14 Character recognition method, system and storage medium Pending CN112257719A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011099422.0A CN112257719A (en) 2020-10-14 2020-10-14 Character recognition method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011099422.0A CN112257719A (en) 2020-10-14 2020-10-14 Character recognition method, system and storage medium

Publications (1)

Publication Number Publication Date
CN112257719A true CN112257719A (en) 2021-01-22

Family

ID=74243369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011099422.0A Pending CN112257719A (en) 2020-10-14 2020-10-14 Character recognition method, system and storage medium

Country Status (1)

Country Link
CN (1) CN112257719A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906696A (en) * 2021-05-06 2021-06-04 北京惠朗时代科技有限公司 English image region identification method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229463A (en) * 2018-02-07 2018-06-29 众安信息技术服务有限公司 Character recognition method based on image
CN110135411A (en) * 2019-04-30 2019-08-16 北京邮电大学 Business card identification method and device
CN110390260A (en) * 2019-06-12 2019-10-29 平安科技(深圳)有限公司 Picture scanning part processing method, device, computer equipment and storage medium
CN110569830A (en) * 2019-08-01 2019-12-13 平安科技(深圳)有限公司 Multi-language text recognition method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109685052A (en) Method for processing text images, device, electronic equipment and computer-readable medium
US10062001B2 (en) Method for line and word segmentation for handwritten text images
CN107240185B (en) A kind of crown word number identification method, device, equipment and storage medium
CN112069991A (en) PDF table information extraction method and related device
CN106599001A (en) Webpage content acquisition method and system
CN106156794B (en) Character recognition method and device based on character style recognition
CN115828874A (en) Industry table digital processing method based on image recognition technology
CN114706966A (en) Voice interaction method, device and equipment based on artificial intelligence and storage medium
CN111460355A (en) Page parsing method and device
CN114359533B (en) Page number identification method based on page text and computer equipment
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN116958996A (en) OCR information extraction method, system and equipment
CN110197140B (en) Material auditing method and equipment based on character recognition
CN117496542B (en) Document information extraction method, device, electronic equipment and storage medium
CN114529933A (en) Contract data difference comparison method, device, equipment and medium
CN114758340A (en) Intelligent identification method, device and equipment for logistics address and storage medium
CN112632948B (en) Case document ordering method and related equipment
CN112257719A (en) Character recognition method, system and storage medium
Yuan et al. An opencv-based framework for table information extraction
JP2004178010A (en) Document processor, its method, and program
CN117111890A (en) Software requirement document analysis method, device and medium
Mulyana et al. Optimization of Text Mining Detection of Tajweed Reading Laws Using the Yolov8 Method on the Qur'an
CN115565193A (en) Questionnaire information input method and device, electronic equipment and storage medium
CN115033699A (en) Fund user classification method and device
CN114443834A (en) Method and device for extracting license information and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination