CN107092902B - Character string recognition method and system - Google Patents

Character string recognition method and system

Info

Publication number
CN107092902B
CN107092902B (application CN201610091505.2A)
Authority
CN
China
Prior art keywords: character, path, combinations, probability, class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610091505.2A
Other languages
Chinese (zh)
Other versions
CN107092902A (en)
Inventor
王淞 (Wang Song)
范伟 (Fan Wei)
孙俊 (Sun Jun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to CN201610091505.2A
Publication of CN107092902A
Application granted
Publication of CN107092902B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

The present disclosure provides a character string recognition method and system. The recognition method according to one embodiment of the present disclosure includes: over-segmenting the character string image into a plurality of connected regions; classifying combinations of each connected region and a predetermined number of adjacent connected regions using a two-class classifier, giving the probability that each combination is a character; performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all combinations are characters; and performing character recognition on the combinations in the selected path using a full-class classifier. Compared with the prior art, the method and system of the present disclosure achieve a higher recognition rate on handwritten Chinese character strings.

Description

Character string recognition method and system
Technical Field
The present disclosure relates to the field of character string recognition, and in particular, to a method and system for character string recognition.
Background
Compared with English characters, Chinese characters are far more numerous and structurally complex. In a conventional character string recognition method, the character string image is first over-segmented, and character recognition is then performed on the over-segmented image using a classifier, rules, and the like. However, the recognition rate achieved by such conventional methods on Chinese character strings falls short of practical requirements.
Therefore, it is desirable to provide a character string recognition method and system with a higher recognition rate.
Disclosure of Invention
The following presents a simplified summary of the disclosure in order to provide a basic understanding of some aspects of the disclosure. It should be understood that this summary is not an exhaustive overview of the disclosure. It is not intended to identify key or critical elements of the disclosure or to delineate the scope of the disclosure. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is discussed later.
To solve the above problems, the present disclosure provides a method and system for recognizing a character string.
According to an aspect of the present disclosure, there is provided a character string recognition method, including: over-segmenting the character string image into a plurality of connected regions; classifying combinations of each connected region and a predetermined number of adjacent connected regions using a two-class classifier, giving the probability that each combination is a character; performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all combinations are characters; and performing character recognition on the combinations in the selected path using a full-class classifier.
According to another aspect of the present disclosure, there is provided a character string recognition system, including: an over-segmentation device for over-segmenting the character string image into a plurality of connected regions; a two-class classifier for classifying combinations of each connected region and a predetermined number of adjacent connected regions and giving the probability that each combination is a character; a path search device for performing a path search on all paths formed by the various combinations of the plurality of connected regions and selecting the path with the highest probability that all combinations are characters; and a full-class classifier for performing character recognition on the combinations in the selected path.
Compared with the prior art, the method and the system provided by the disclosure have higher recognition rate of the character strings, especially the handwritten Chinese character strings.
The above and other advantages of the present disclosure will become more apparent from the following detailed description of the preferred embodiments of the present disclosure when taken in conjunction with the accompanying drawings.
Drawings
To further clarify the above and other advantages and features of the present disclosure, a more particular description of embodiments of the present disclosure will be rendered by reference to the appended drawings, which are incorporated in and form a part of this specification together with the detailed description that follows. Elements having the same function and structure are denoted by the same reference numerals. It should be appreciated that these drawings depict only typical examples of the disclosure and are therefore not to be considered limiting of its scope. In the drawings:
fig. 1 is a flowchart of a method of recognizing a character string according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a method of identifying a character string according to an embodiment of the present disclosure;
FIG. 3 is a diagram of various paths formed by a combination of multiple connected regions in the method shown in FIG. 2;
FIG. 4 is a flow diagram of a method of recognition of a character string according to another embodiment of the present disclosure;
FIG. 5 is a diagram of a system for recognition of character strings, according to an embodiment of the present disclosure;
FIG. 6 is a diagram of a system for recognition of character strings, according to another embodiment of the present disclosure;
FIG. 7 is a diagram of a system for recognition of character strings, according to a variant embodiment of the present disclosure;
FIG. 8 is a flow diagram of a method of training a class two classifier for character classification according to an embodiment of the present disclosure;
FIG. 9 shows a schematic block diagram of a computer that may be used to implement methods and systems in accordance with embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.
Here, it should be further noted that, in order to avoid obscuring the present disclosure with unnecessary details, only the device structures and/or processing steps closely related to the scheme according to the present disclosure are shown in the drawings, and other details not so much related to the present disclosure are omitted.
In the present disclosure, a scheme of cascading classifiers for character string recognition is proposed. The recognition method and system of various character strings and the method of training a class two classifier for character classification proposed in the present disclosure are described in detail below with reference to the accompanying drawings.
Referring first to fig. 1, fig. 1 is a flowchart of a character string recognition method according to an embodiment of the present disclosure. As shown in fig. 1, the method 1000 includes the steps of: over-segmenting the character string image into a plurality of connected regions (step 1001); classifying combinations of each connected region and a predetermined number of adjacent connected regions using a two-class classifier, giving the probability that each combination is a character (step 1002); performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all combinations are characters (step 1003); and performing character recognition on the combinations in the selected path using a full-class classifier (step 1004).
According to the method 1000, the character string image is first over-segmented (step 1001). In the present disclosure, the character string may be a handwritten character string, which may include Chinese characters, numeric characters, letters, symbols, or a combination thereof. In the following, the description takes as an example a character string that includes a handwritten Chinese character string.
Referring to fig. 2, fig. 2 is a schematic diagram of a character string recognition method according to an embodiment of the present disclosure. As shown in fig. 2, in the present embodiment, the character string includes the handwritten Chinese characters for "Fujitsu Research and Development Center" together with punctuation marks such as left and right quotation marks and a period. The character string image is over-segmented into a plurality of connected regions using an over-segmentation technique. Over-segmentation is a mature technique in this field and is not the focus of the present disclosure, so it is not described in detail here; in general, it is performed based on the inter-character spaces and the stroke features in the character string image. In fig. 2, one possible result after over-segmentation is labeled S1.
Next, combinations of each connected region with a predetermined number of adjacent connected regions are classified using a two-class classifier, giving the probability that each combination is a character (step 1002). A person skilled in the art may set the upper limit on the number of connected regions in a combination according to experience and the specific application scenario (e.g., the handwriting habits of a particular group); for example, the upper limit may be set to four, five, or another suitable value.
For example, if it is assumed that a Chinese character contains at most four connected regions, then each combination of connected regions contains at most four connected regions. In this case, a combination may be a single connected region, or a combination of two, three, or four adjacent connected regions.
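As a concrete illustration, the candidate combinations can be enumerated by sliding over the over-segmented regions. This is a minimal Python sketch of the idea; the function and variable names are ours, not the disclosure's:

```python
def enumerate_combinations(regions, max_k=4):
    """Return every run of 1..max_k adjacent connected regions as a
    (start, end) index pair, so regions[start:end] is one candidate character."""
    combos = []
    n = len(regions)
    for start in range(n):
        for k in range(1, max_k + 1):
            end = start + k
            if end > n:
                break
            combos.append((start, end))
    return combos
```

With `max_k=4` and four regions this yields ten candidates, one per possible run of adjacent regions.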
The two-class classifier used in step 1002 may be a two-class SVM (support vector machine) classifier or a two-class CNN (convolutional neural network) classifier, but is not limited thereto. In the embodiment shown in fig. 2, each combination of connected regions is classified using a two-class CNN classifier, which gives the probability that the combination is a character. Details of how the two-class classifier is trained are described later.
Then, an optimal path is selected among all possible paths formed by combinations of the connected regions (step 1003); how the optimal path is selected is explained below with reference to fig. 3. All possible paths are all ways of partitioning the over-segmented character string image into adjacent combinations of connected regions, subject to the constraint described above on the maximum number of connected regions per combination. Fig. 3 shows three paths, P1, P2, and P3, among all the paths formed by the various combinations of connected regions in the embodiment shown in fig. 2; for simplicity, only these three paths are used for illustration.
In this embodiment, the path with the highest probability that all its combinations are characters is selected among all possible paths. This probability can be computed in various ways. In a preferred embodiment, the average of the per-combination probabilities in a path is used as the probability that all combinations of the path are characters. The average may be an arithmetic average or a weighted average; the weighted average is calculated, for example, by the following formula (1):
P = (M1·P1 + M2·P2 + … + MN·PN) / (M1 + M2 + … + MN)    (1)
where Pi denotes the probability that the i-th combination of connected regions is a character, Mi denotes a preset weighting parameter for that combination, and N denotes the number of combinations of connected regions in the path.
The weighting parameters Mi are set to make the calculated average probability more accurate, for example by accounting for differences between various types of characters (such as character length or occupied area) and between the handwriting of different groups. According to actual needs, Mi may be set to the length of each combination of connected regions or to the number of black pixels in each combination.
Alternatively, if such differences between handwritten characters need not be considered, Mi may be set to 1 or to any other fixed constant.
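Formula (1) amounts to a few lines of Python. The sketch below is an illustrative helper of our own, not the patent's implementation; passing equal weights reduces it to the arithmetic average:

```python
def weighted_path_probability(probs, weights):
    """Weighted average of per-combination character probabilities,
    as in formula (1): P = sum(Mi * Pi) / sum(Mi).
    probs are the Pi values, weights the Mi values, in path order."""
    assert len(probs) == len(weights)
    return sum(m * p for m, p in zip(weights, probs)) / sum(weights)
```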
The following description will take path P3 as an example. As shown in fig. 3, the path P3 includes combinations of 12 connected regions, i.e., N has a value of 12, and the probability (Pi) that each of these combinations of connected regions is a character is, from left to right: 0.7, 0.8, 0.7, 0.6, 0.7, 0.8, 0.7, 0.6, 0.7.
In the present embodiment, the preset weighting parameter Mi is set to the length of each combination of connected regions; in the path P3, the length values of the combinations, from left to right, are: 7, 9, 6, 12, 8, 11, 6, 7, 4.
Substituting the data Pi and Mi of the path P3 into the above formula (1) gives:

P = (7 × 0.7 + 9 × 0.8 + … + 4 × 0.7) / (7 + 9 + … + 4) ≈ 0.72

Thus, the average probability of the path P3 is found to be 0.72.
Similarly, by substituting the corresponding data of the path P1 and the path P2 into the formula (1) for calculation, the average probabilities of the path P1 and the path P2 can be obtained, respectively. In the embodiment shown in fig. 3, the average probabilities of P1 and P2 were calculated to be 0.63 and 0.69, respectively. It can be seen that the average probability of path P3 is the highest, which is the best path of the three paths shown in fig. 3.
Among the large number of possible paths, the optimal path is selected by performing a path search over all of them and choosing the path whose combinations have the highest probability (e.g., the average probability described above) of all being characters. Many path-search algorithms exist; in a preferred embodiment, the path search in step 1003 of method 1000 may be performed using dynamic programming or beam search, though the disclosure is not limited thereto. Dynamic programming is generally used for problems with optimal substructure: the problem to be solved is decomposed into subproblems, the subproblems are solved first, and the solution of the original problem is then assembled from their solutions. Because the subproblems are usually not mutually independent, the answers to solved subproblems are stored and looked up when needed, which avoids a large amount of repeated computation, reduces the amount of calculation, and saves time. Beam search is a heuristic graph-search algorithm, typically used when the solution space is large: to reduce the space and time occupied by the search, at each depth of the expansion the low-quality nodes are pruned and only the high-quality nodes are kept, which reduces space consumption and improves time efficiency.
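A beam search over the segmentation paths might look like the following sketch. All names are ours: `prob(start, end)` stands in for the two-class classifier's probability that regions `start..end` form one character, and partial paths are pruned by their running average probability:

```python
def beam_search_path(n, prob, max_k=4, beam_width=3):
    """Beam search over segmentation paths of n connected regions.
    prob(start, end) is the probability that regions[start:end] form one
    character. Returns (best_spans, best_average_probability)."""
    def avg(spans):
        return sum(prob(s, e) for s, e in spans) / len(spans)

    beams = [([], 0)]          # (spans chosen so far, current end position)
    finished = []
    while beams:
        expanded = []
        for spans, pos in beams:
            for k in range(1, max_k + 1):
                end = pos + k
                if end > n:
                    break
                cand = (spans + [(pos, end)], end)
                # Complete paths are collected; partial ones compete for the beam.
                (finished if end == n else expanded).append(cand)
        # Keep only the beam_width most promising partial paths.
        expanded.sort(key=lambda c: avg(c[0]), reverse=True)
        beams = expanded[:beam_width]
    best = max(finished, key=lambda c: avg(c[0]))
    return best[0], avg(best[0])
```

In a toy lattice of three regions where merging the first two regions scores highest, the search returns the two-span path, matching the behavior described above.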
As shown in fig. 2, after the optimal path, for example, S2 (corresponding to P3 in fig. 3) is selected, character recognition is performed on the combination in the selected path S2 using a full class classifier (step 1004), resulting in a final recognition result. The full-class classifier may be, but is not limited to, a full-class SVM classifier or a full-class CNN classifier.
Referring now to fig. 4, fig. 4 is a flow chart of a method of recognition of a character string according to another embodiment of the present disclosure. As shown in fig. 4, the method 4000 for recognizing a character string includes steps 4001 to 4005, wherein the steps 4001 to 4004 are similar to the steps 1001 to 1004 of the recognition method 1000 shown in fig. 1. The recognition method 4000 further comprises the step of optimizing the result of recognition (step 4005) compared to the recognition method 1000.
Specifically, the recognition results may be optimized in step 4005 using a language model such as a unigram or bigram language model. A statistical language model represents the probability distribution over the linguistic units (characters, words, or phrases) of a language and can be regarded as a model that generates text in that language.
The n in an n-gram language model is the order of the underlying Markov process. When n = 1, the model is a unigram language model, which estimates probabilities from the occurrence frequency of each unit. When n = 2, it is a bigram language model, which estimates the relevant parameters from the co-occurrence statistics of pairs of units.
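A toy character-level bigram model with add-alpha smoothing illustrates the idea. This is a didactic sketch of our own, not the model used in the disclosure:

```python
from collections import Counter

def bigram_sentence_probability(corpus, sentence, alpha=1.0):
    """Score `sentence` under a character bigram model estimated from `corpus`,
    with add-alpha smoothing. Both arguments are plain strings."""
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    vocab_size = len(unigrams)
    p = 1.0
    for prev, cur in zip(sentence, sentence[1:]):
        # P(cur | prev) ~ (count(prev, cur) + alpha) / (count(prev) + alpha * |V|)
        p *= (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab_size)
    return p
```

Candidate recognition results whose character sequences score higher under such a model would be preferred during optimization.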
Although a separate optimization step is included in the exemplary recognition method 4000 shown in fig. 4, alternatively, a language model may be embedded in the path search step in the recognition method proposed in the present disclosure, so that optimization is performed during the path search. The optimization process can be set by those skilled in the art according to actual needs, and the disclosure is not limited thereto.
A character string recognition system according to the present disclosure is described below.
Referring to fig. 5, fig. 5 is a diagram of a character string recognition system according to an embodiment of the present disclosure. As shown in fig. 5, the character string recognition system 5000 includes an over-segmentation device 5001, a two-class classifier 5002, a path search device 5003, and a full-class classifier 5004.
The over-segmentation device 5001 is configured to over-segment the character string image into a plurality of connected regions; the two-class classifier 5002 is configured to classify combinations of each connected region and a predetermined number of adjacent connected regions and give the probability that each combination is a character; the path search device 5003 is configured to perform a path search on all paths formed by the various combinations of the plurality of connected regions and select the path with the highest probability that all combinations are characters; and the full-class classifier 5004 is configured to perform character recognition on the combinations in the selected path.
In a preferred embodiment, each combination of connected regions comprises no more than four connected regions. The present disclosure is not limited herein, and a person skilled in the art may set an upper limit value of the number of connected regions included in each combination of connected regions according to actual circumstances, for example, each combination of connected regions may be set to include not more than five connected regions.
In a preferred embodiment, the two-class classifier comprises a two-class SVM classifier or a two-class CNN classifier, and the full-class classifier comprises a full-class SVM classifier or a full-class CNN classifier.
In a preferred embodiment, the character string comprises a handwritten character string, such as a handwritten Chinese character string, and the handwritten character string comprises Chinese characters, numeric characters, letters, symbols, or a combination thereof.
The path search device 5003 may comprise a dynamic programming unit or a beam search unit. The dynamic programming unit is configured to perform dynamic programming over the paths formed by combinations of the plurality of connected regions; the beam search unit is configured to perform a beam search over those paths.
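The dynamic-programming search can be sketched for the special case where the weight Mi is the length of the combination: then the denominator sum(Mi) equals the total number of connected regions for every path, so maximizing the weighted average of formula (1) is equivalent to maximizing sum(Mi·Pi), which decomposes over prefixes. The sketch below rests on that assumption, and all names are ours:

```python
def best_path_dp(n, prob, max_k=4):
    """Dynamic-programming path search over n connected regions, with
    Mi = span length. prob(start, end) is the probability that
    regions[start:end] form one character. Returns (spans, average_prob)."""
    best = [float("-inf")] * (n + 1)  # best weighted score for prefix [0, end)
    back = [None] * (n + 1)           # where the last span of that prefix starts
    best[0] = 0.0
    for end in range(1, n + 1):
        for k in range(1, max_k + 1):
            start = end - k
            if start < 0:
                break
            score = best[start] + k * prob(start, end)
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Recover the spans by walking the back-pointers from the full string.
    spans, end = [], n
    while end > 0:
        spans.append((back[end], end))
        end = back[end]
    spans.reverse()
    # Dividing by n turns the maximized sum back into the weighted average.
    return spans, best[n] / n
```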
Referring to fig. 6, fig. 6 is a diagram of a character string recognition system according to another embodiment of the present disclosure. As shown in fig. 6, the recognition system 6000 includes an over-segmentation device 6001, a two-class classifier 6002, a path search device 6003, and a full-class classifier 6004.
Compared to the recognition system 5000 shown in fig. 5, the path search device 6003 in the recognition system 6000 further includes a calculation unit 6013; the other components are similar to those of the recognition system 5000. The calculation unit 6013 is configured to calculate the average of the probabilities of all combinations in each path, and the path search device 6003 selects the path with the highest average probability. Although in the embodiment shown in fig. 6 the calculation unit 6013 is part of the path search device 6003, in a variant embodiment a separate calculation device may be provided in the recognition system, and the disclosure is not limited thereto.
In a preferred embodiment, the calculation unit 6013 is configured to calculate the average probability by equation (1) as described above in connection with the method embodiments. Also, in formula (1), P represents the average probability, Pi represents the probability that the combination of each connected region is a character, Mi represents a preset weighting parameter for the combination of each connected region, and N represents the number of combinations of connected regions. In a further preferred embodiment, Mi comprises any one of the following: the combined length of each connected region; the number of black pixels in the combination of each connected region; and a fixed constant.
Referring now to fig. 7, fig. 7 is a diagram of a character string recognition system according to a variant embodiment of the present disclosure. As shown in fig. 7, the recognition system 7000 includes an over-segmentation device 7001, a two-class classifier 7002, a path search device 7003, a full-class classifier 7004, and an optimization device 7005. The path search device 7003 includes a calculation unit 7013. In contrast to the recognition system 6000 shown in fig. 6, the recognition system 7000 also includes the optimization device 7005; the other components are similar to those of the recognition system 6000.
The optimization device 7005 is configured to optimize the recognition result by using a language model, and similar to the above description in connection with the method embodiments, the optimization may be performed by using a language model such as a unigram language model or a bigram language model. Although a separate optimization means 7005 is included in the exemplary recognition system 7000 shown in fig. 7, alternatively, a language model may be embedded in the path search means 7003 in the recognition system proposed by the present disclosure, i.e., an optimization unit is provided in the path search means 7003 so as to perform optimization during the path search. Those skilled in the art can set up the relevant devices or components for optimization according to actual needs, and the disclosure is not limited thereto.
It is easily understood that the optimization device 7005 shown in fig. 7 can be arranged in the recognition system 5000 shown in fig. 5 according to actual needs, and can also be arranged in other variant embodiments of the recognition system.
The training of the classifiers involved in the present disclosure is described below.
First, training of the class two classifier is described. Referring to fig. 8, fig. 8 is a flowchart of a method of training a class two classifier for character classification according to an embodiment of the present disclosure. As shown in fig. 8, method 8000 includes over-segmenting the training string image into a plurality of connected regions (step 8001); marking combinations of each connected region and a predetermined number of adjacent connected regions as character combinations and non-character combinations, respectively (step 8002); and training the class two classifier with the character combinations and non-character combinations (step 8003).
The training strings may be taken from any known handwritten word stock, or more specifically, may include strings previously written by the author of the handwritten string currently to be recognized, and so on. The present disclosure is not limited thereto, and those skilled in the art can set the training character string according to actual needs. During training, each combination of connected regions preferably includes no more than four training connected regions. The disclosure is not limited herein, and those skilled in the art may set the upper limit value of the number of training connected regions included in each combination of connected regions according to practical situations, for example, each combination of connected regions may be set to include no more than five training connected regions.
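The labeling in step 8002 can be sketched as follows: given ground-truth character spans over the over-segmented training regions, every candidate combination of adjacent regions is marked 1 (a real character) or 0 (a non-character). The encoding below is our own illustration, not the patent's data format:

```python
def label_training_combinations(true_spans, n, max_k=4):
    """true_spans: set of (start, end) index pairs, each covering the
    connected regions of one real character; n: number of connected regions.
    Returns (span, label) pairs for training the two-class classifier."""
    samples = []
    for start in range(n):
        for k in range(1, max_k + 1):
            end = start + k
            if end > n:
                break
            label = 1 if (start, end) in true_spans else 0
            samples.append(((start, end), label))
    return samples
```

The positive samples are the character combinations and the negative samples the non-character combinations used in step 8003.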
In addition, the full-class classifier is trained on individual Chinese characters, for example using a single-character database such as the CASIA handwriting database; the trained full-class classifier is then used to recognize the individual characters. Since training the full-class classifier is not the key content of the present disclosure, it is not described here.
FIG. 9 shows a schematic block diagram of a computer that may be used to implement methods and systems in accordance with embodiments of the present disclosure.
In fig. 9, a Central Processing Unit (CPU)901 performs various processes in accordance with a program stored in a Read Only Memory (ROM)902 or a program loaded from a storage section 908 to a Random Access Memory (RAM) 903. In the RAM 903, data necessary when the CPU 901 executes various processes and the like is also stored as necessary. The CPU 901, ROM 902, and RAM 903 are connected to each other via a bus 904. An input/output interface 905 is also connected to bus 904.
The following components are connected to the input/output interface 905: an input section 906 (including a keyboard, a mouse, and the like), an output section 907 (including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like), a storage section 908 (including a hard disk and the like), and a communication section 909 (including a network interface card such as a LAN card, a modem, and the like). The communication section 909 performs communication processing via a network such as the Internet. A drive 910 may also be connected to the input/output interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory can be mounted on the drive 910 as needed, so that a computer program read out therefrom is installed into the storage section 908 as needed.
In the case where the series of processes described above is realized by software, a program constituting the software is installed from a network such as the internet or a storage medium such as the removable medium 911.
It will be understood by those skilled in the art that such a storage medium is not limited to the removable medium 911 shown in fig. 9 in which the program is stored, distributed separately from the apparatus to provide the program to the user. Examples of the removable medium 911 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disc-read only memory (CD-ROM) and a Digital Versatile Disc (DVD)), a magneto-optical disk (including a mini-disk (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 902, a hard disk included in the storage section 908, or the like, in which programs are stored, and which is distributed to users together with the device including them.
The present disclosure also provides a program product having machine-readable instruction code stored thereon. When the instruction code is read and executed by a machine, the methods according to the embodiments of the present disclosure described above can be performed.
Accordingly, storage media carrying the above-described program product having machine-readable instruction code stored thereon are also included within the scope of the present disclosure. Such storage media include, but are not limited to, floppy disks, optical disks, flash memories, magneto-optical disks, memory cards, memory sticks, and the like.
It is also noted that, in the apparatuses, methods, and systems of the present disclosure, components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure. Moreover, the steps of the series of processes described above may naturally be executed chronologically in the order described, but they need not necessarily be executed in that order; some steps may be performed in parallel or independently of one another.
Finally, it should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Furthermore, without further limitation, an element defined by the phrase "comprising a(n) … …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
Although the embodiments of the present disclosure have been described in detail with reference to the accompanying drawings, it should be understood that the above-described embodiments are merely illustrative of the present disclosure and are not to be construed as limiting the present disclosure. It will be apparent to those skilled in the art that various modifications and variations can be made in the above-described embodiments without departing from the spirit and scope of the disclosure. Accordingly, the scope of the disclosure is to be defined only by the claims appended hereto, and by their equivalents.
Supplementary note
Supplementary note 1. A character string recognition method, the recognition method comprising:
over-segmenting a character string image into a plurality of connected regions;
classifying combinations of each connected region and a predetermined number of adjacent connected regions by using a two-class classifier, and giving the probability that each combination is a character;
performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all of its combinations are characters; and
performing character recognition on the combinations in the selected path by using a full-class classifier.
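The four steps of supplementary note 1 can be sketched end to end. In the sketch below, the two-class and full-class classifiers are stand-in stubs, and the region labels, probability values, and helper names are illustrative assumptions rather than the patent's actual implementation:

```python
# Sketch of the recognition pipeline from supplementary note 1.
# Connected regions are plain labels; classifiers are hypothetical stubs.

MAX_SPAN = 4  # each combination covers at most four connected regions

def char_probability(combo):
    """Stub for the two-class classifier: probability that the merged
    regions form one character (made-up values keyed by span length)."""
    return {1: 0.9, 2: 0.6, 3: 0.3, 4: 0.1}[len(combo)]

def all_paths(regions, max_span=MAX_SPAN):
    """Enumerate every segmentation path: each path is a list of
    combinations (tuples of adjacent regions) covering the string."""
    if not regions:
        yield []
        return
    for span in range(1, min(max_span, len(regions)) + 1):
        head = tuple(regions[:span])
        for rest in all_paths(regions[span:], max_span):
            yield [head] + rest

def best_path(regions):
    """Select the path whose combinations have the highest average
    probability of being characters."""
    return max(all_paths(regions),
               key=lambda p: sum(char_probability(c) for c in p) / len(p))

def recognize(path):
    """Stub for the full-class classifier: label each combination in
    the selected path (placeholder characters only)."""
    return ["?" for _ in path]
```

With the stub probabilities above, single-region combinations score highest, so the best path keeps every region separate.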
Supplementary note 2. The recognition method as described in supplementary note 1, wherein the probability that all combinations in a path are characters is the average probability of the probabilities that the respective combinations in the path are characters.
Supplementary note 3. The recognition method as described in supplementary note 2, wherein the average probability is calculated by the following formula:

P̄ = (Σ_{i=1}^{N} Mi·Pi) / (Σ_{i=1}^{N} Mi)

wherein P̄ represents the average probability, Pi represents the probability that each combination is a character, Mi represents a preset weighting parameter for each combination, and N represents the number of combinations.
Supplementary note 4. The recognition method as described in supplementary note 3, wherein Mi comprises any one of the following:
the length of each combination;
the number of black pixels in each combination; and
a fixed constant.
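Assuming the weighted-average form P̄ = (Σ Mi·Pi) / (Σ Mi) for the note-3 formula (a reconstruction from the stated definitions), the three Mi choices above can be computed directly; the probabilities and weights below are made-up illustrative values. Note that a fixed-constant Mi reduces the score to the plain arithmetic mean:

```python
def weighted_average_probability(probs, weights):
    """Average probability of a path: P̄ = sum(Mi*Pi) / sum(Mi), where
    Pi is each combination's character probability and Mi its preset
    weighting parameter."""
    return sum(m * p for m, p in zip(weights, probs)) / sum(weights)

probs = [0.9, 0.6, 0.8]  # Pi for three combinations in a path (made up)

by_length = weighted_average_probability(probs, [30, 55, 28])    # Mi = length
by_pixels = weighted_average_probability(probs, [120, 260, 110]) # Mi = black pixels
uniform   = weighted_average_probability(probs, [1, 1, 1])       # Mi = fixed constant
```

Weighting by length or black-pixel count lets wide, ink-heavy combinations contribute more to the path score than small fragments.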
Supplementary note 5. The recognition method as described in supplementary note 1, wherein each combination includes no more than four connected regions.
Supplementary note 6. The recognition method of any one of supplementary notes 1-5, wherein the two-class classifier comprises a two-class SVM classifier or a two-class CNN classifier, and the full-class classifier comprises a full-class SVM classifier or a full-class CNN classifier.
Supplementary note 7. The recognition method of any one of supplementary notes 1-5, wherein the path search comprises dynamic programming or beam search.
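Of the two search strategies, beam search bounds the cost of examining all paths by keeping only the top-k partial segmentations at each position. The sketch below assumes a hypothetical per-combination scoring function and beam width:

```python
def beam_search(regions, score_combo, beam_width=3, max_span=4):
    """Keep only the best `beam_width` partial segmentations covering
    the first i regions, extending each by combinations of 1..max_span
    adjacent regions. `score_combo` maps a region tuple to a probability
    (stand-in for the two-class classifier)."""
    # beams[i] = list of (average probability, path) covering regions[:i]
    beams = {0: [(1.0, [])]}
    n = len(regions)
    for i in range(1, n + 1):
        candidates = []
        for span in range(1, min(max_span, i) + 1):
            for _, path in beams.get(i - span, []):
                combo = tuple(regions[i - span:i])
                new_path = path + [combo]
                avg = sum(score_combo(c) for c in new_path) / len(new_path)
                candidates.append((avg, new_path))
        # Prune: keep only the top beam_width partial paths.
        beams[i] = sorted(candidates, key=lambda t: t[0], reverse=True)[:beam_width]
    return beams[n][0]  # (best average probability, best path)
```

A beam width of 1 degenerates to a greedy search, while a large beam approaches the exhaustive search over all paths.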
Supplementary note 8. The recognition method of any one of supplementary notes 1-5, wherein the character string comprises a handwritten character string of Chinese characters, numeric characters, letters, symbols, or combinations thereof.
Supplementary note 9. The recognition method of any one of supplementary notes 1-5, wherein the character string is over-segmented into the plurality of connected regions by using inter-character spaces and character stroke features in the character string.
Supplementary note 10. The recognition method of any one of supplementary notes 1-5, wherein the recognition method further comprises performing optimization with a language model while searching for a path, or optimizing the recognition result with a language model.
Supplementary note 11. The recognition method as described in supplementary note 10, wherein the language model comprises a unigram language model or a bigram language model.
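A bigram model scores each character conditioned on the previous one. The sketch below combines per-character classifier confidence with bigram log-probabilities in a log-linear fashion, which is one common way to apply such a model when rescoring candidate paths; the characters, probability table, interpolation weight, and smoothing floor are all assumed values, not the patent's:

```python
import math

# Hypothetical bigram table: P(current character | previous character).
BIGRAM = {('<s>', '手'): 0.20, ('手', '写'): 0.30, ('写', '字'): 0.25}

def bigram_prob(prev, cur, floor=1e-4):
    """Bigram probability, with a small floor for unseen pairs."""
    return BIGRAM.get((prev, cur), floor)

def path_score(chars, char_probs, lm_weight=0.5):
    """Log-linear combination of per-character classifier confidence
    and bigram language-model probability along one candidate path."""
    score, prev = 0.0, '<s>'
    for ch, p in zip(chars, char_probs):
        score += (1 - lm_weight) * math.log(p)               # classifier term
        score += lm_weight * math.log(bigram_prob(prev, ch)) # LM term
        prev = ch
    return score
```

Given two candidate readings with equal classifier confidence, the language-model term favors the character sequence the bigram table has seen.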
Supplementary note 12. A character string recognition system, the recognition system comprising:
an over-segmentation device for over-segmenting a character string image into a plurality of connected regions;
a two-class classifier for classifying combinations of each connected region and a predetermined number of adjacent connected regions, and giving the probability that each combination is a character;
a path search device for performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all of its combinations are characters; and
a full-class classifier for performing character recognition on the combinations in the selected path.
Supplementary note 13. The recognition system as described in supplementary note 12, wherein the probability that all combinations in a path are characters is the average probability of the probabilities that the respective combinations in the path are characters.
Supplementary note 14. The recognition system as described in supplementary note 13, wherein the path search device includes a calculation unit that calculates the average probability by the following formula:

P̄ = (Σ_{i=1}^{N} Mi·Pi) / (Σ_{i=1}^{N} Mi)

wherein P̄ represents the average probability, Pi represents the probability that each combination is a character, Mi represents a preset weighting parameter for each combination, and N represents the number of combinations.
Supplementary note 15. The recognition system as described in supplementary note 14, wherein Mi comprises any one of the following:
the length of each combination;
the number of black pixels in each combination; and
a fixed constant.
Supplementary note 16. The recognition system as described in supplementary note 12, wherein each combination comprises no more than four connected regions.
Supplementary note 17. The recognition system as described in supplementary note 12, wherein the two-class classifier comprises a two-class SVM classifier or a two-class CNN classifier, and the full-class classifier comprises a full-class SVM classifier or a full-class CNN classifier.
Supplementary note 18. The recognition system as described in supplementary note 12, wherein the character string comprises a handwritten character string of Chinese characters, numeric characters, letters, symbols, or combinations thereof.
Supplementary note 19. The recognition system according to any one of supplementary notes 12-18, wherein the path search device comprises:
a dynamic programming unit for performing dynamic programming on the paths formed by the combinations of the plurality of connected regions; or
a beam search unit for performing a beam search on the paths formed by the combinations of the plurality of connected regions.
Supplementary note 20. The recognition system of any one of supplementary notes 12-18, further comprising:
an optimizing device for optimizing the recognition result by using a language model; or
an optimizing unit, provided in the path search device, for performing optimization with a language model while the path search is performed.

Claims (9)

1. A character string recognition method, the recognition method comprising:
over-segmenting a character string image into a plurality of connected regions;
classifying combinations of each connected region and a predetermined number of adjacent connected regions by using a two-class classifier, and giving the probability that each combination is a character;
performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all of its combinations are characters; and
performing character recognition on the combinations in the selected path by using a full-class classifier,
wherein the character string is over-segmented into the plurality of connected regions by using inter-character spaces and character stroke features in the character string, and the probability that all combinations in a path are characters is the average probability of the probabilities that the respective combinations in the path are characters.
2. The recognition method of claim 1, wherein the average probability is calculated by the following formula:

P̄ = (Σ_{i=1}^{N} Mi·Pi) / (Σ_{i=1}^{N} Mi)

wherein P̄ represents the average probability, Pi represents the probability that each combination is a character, Mi represents a preset weighting parameter for each combination, and N represents the number of combinations.
3. The recognition method of claim 2, wherein Mi comprises any one of the following:
the length of each combination;
the number of black pixels in each combination; and
a fixed constant.
4. The recognition method of claim 1, wherein each combination comprises no more than four connected regions.
5. The recognition method of any one of claims 1-4, wherein the two-class classifier comprises a two-class SVM classifier or a two-class CNN classifier, and the full-class classifier comprises a full-class SVM classifier or a full-class CNN classifier.
6. The recognition method of any one of claims 1-4, wherein the path search comprises dynamic programming or beam search.
7. The recognition method according to any one of claims 1-4, wherein the recognition method further comprises performing optimization with a language model during the path search, or optimizing the recognition result with a language model.
8. The recognition method according to claim 7, wherein the language model comprises a unigram language model or a bigram language model.
9. A character string recognition system, the recognition system comprising:
an over-segmentation device for over-segmenting a character string image into a plurality of connected regions;
a two-class classifier for classifying combinations of each connected region and a predetermined number of adjacent connected regions, and giving the probability that each combination is a character;
a path search device for performing a path search on all paths formed by the various combinations of the plurality of connected regions, and selecting the path with the highest probability that all of its combinations are characters; and
a full-class classifier for performing character recognition on the combinations in the selected path,
wherein the character string is over-segmented into the plurality of connected regions by using inter-character spaces and character stroke features in the character string, and the probability that all combinations in a path are characters is the average probability of the probabilities that the respective combinations in the path are characters.
CN201610091505.2A 2016-02-18 2016-02-18 Character string recognition method and system Active CN107092902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610091505.2A CN107092902B (en) 2016-02-18 2016-02-18 Character string recognition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610091505.2A CN107092902B (en) 2016-02-18 2016-02-18 Character string recognition method and system

Publications (2)

Publication Number Publication Date
CN107092902A CN107092902A (en) 2017-08-25
CN107092902B true CN107092902B (en) 2021-04-06

Family

ID=59648845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610091505.2A Active CN107092902B (en) 2016-02-18 2016-02-18 Character string recognition method and system

Country Status (1)

Country Link
CN (1) CN107092902B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871910B (en) * 2019-03-12 2021-06-22 成都工业学院 Handwritten character recognition method and device
CN112036221A (en) * 2019-06-04 2020-12-04 富士通株式会社 Apparatus, method and medium for processing character image
EP3772015B1 (en) * 2019-07-31 2023-11-08 MyScript Text line extraction

Citations (9)

Publication number Priority date Publication date Assignee Title
CN101520851A (en) * 2008-02-29 2009-09-02 富士通株式会社 Character information identification device and method
CN101853126A (en) * 2010-05-12 2010-10-06 中国科学院自动化研究所 Real-time identification method for on-line handwriting sentences
CN101930545A (en) * 2009-06-24 2010-12-29 夏普株式会社 Handwriting recognition method and device
CN102243708A (en) * 2011-06-29 2011-11-16 北京捷通华声语音技术有限公司 Handwriting recognition method, handwriting recognition system and handwriting recognition terminal
CN103310209A (en) * 2012-03-09 2013-09-18 富士通株式会社 Method and device for identification of character string in image
CN103324929A (en) * 2013-06-25 2013-09-25 天津师范大学 Handwritten Chinese character recognition method based on substructure learning
CN103577843A (en) * 2013-11-22 2014-02-12 中国科学院自动化研究所 Identification method for handwritten character strings in air
CN103984943A (en) * 2014-05-30 2014-08-13 厦门大学 Scene text identification method based on Bayesian probability frame
CN104573683A (en) * 2013-10-21 2015-04-29 富士通株式会社 Character string recognizing method and device

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP4181310B2 (en) * 2001-03-07 2008-11-12 昌和 鈴木 Formula recognition apparatus and formula recognition method
JP6166532B2 (en) * 2012-12-28 2017-07-19 グローリー株式会社 Character recognition method and character recognition device


Non-Patent Citations (2)

Title
A semi-incremental recognition method for on-line handwritten English text; Cuong Tuan Nguyen et al.; 2014 14th International Conference on Frontiers in Handwriting Recognition; 2014; pp. 234-239 *
Search algorithm for handwritten character string recognition; Yu Jinlun et al.; Pattern Recognition and Artificial Intelligence; Apr. 2009; Vol. 22, No. 2; pp. 182-187 *

Also Published As

Publication number Publication date
CN107092902A (en) 2017-08-25

Similar Documents

Publication Publication Date Title
US10915564B2 (en) Leveraging corporal data for data parsing and predicting
US10853576B2 (en) Efficient and accurate named entity recognition method and apparatus
Mandal et al. Supervised learning methods for bangla web document categorization
US7689531B1 (en) Automatic charset detection using support vector machines with charset grouping
US20200104359A1 (en) System and method for comparing plurality of documents
JP2011065646A (en) Apparatus and method for recognizing character string
US20150254332A1 (en) Document classification device, document classification method, and computer readable medium
Raychev et al. Language-independent sentiment analysis using subjectivity and positional information
US8560466B2 (en) Method and arrangement for automatic charset detection
Peng et al. Text classification in Asian languages without word segmentation
US11170169B2 (en) System and method for language-independent contextual embedding
US11301639B2 (en) Methods and systems for generating a reference data structure for anonymization of text data
CN110347791B (en) Topic recommendation method based on multi-label classification convolutional neural network
Brodić et al. Language discrimination by texture analysis of the image corresponding to the text
Freitag Trained named entity recognition using distributional clusters
CN107092902B (en) Character string recognition method and system
Brodić et al. Clustering documents in evolving languages by image texture analysis
Tkaczyk New methods for metadata extraction from scientific literature
CN110968693A (en) Multi-label text classification calculation method based on ensemble learning
CN110059192A (en) Character level file classification method based on five codes
CN117034948B (en) Paragraph identification method, system and storage medium based on multi-feature self-adaptive fusion
Castellucci et al. Context-aware convolutional neural networks for twitter sentiment analysis in italian
CN116822634A (en) Document visual language reasoning method based on layout perception prompt
CN111340029A (en) Device and method for identifying at least partial address in recipient address
TeCho et al. Boosting-based ensemble learning with penalty profiles for automatic Thai unknown word recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant