DE60224128T2 - Apparatus and method for recognizing characters and mathematical expressions - Google Patents

Apparatus and method for recognizing characters and mathematical expressions

Info

Publication number
DE60224128T2
DE60224128T2 DE2002624128 DE60224128T DE60224128T2 DE 60224128 T2 DE60224128 T2 DE 60224128T2 DE 2002624128 DE2002624128 DE 2002624128 DE 60224128 T DE60224128 T DE 60224128T DE 60224128 T2 DE60224128 T2 DE 60224128T2
Authority
DE
Germany
Prior art keywords
character
characters
relationship
mathematical expression
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
DE2002624128
Other languages
German (de)
Other versions
DE60224128D1 (en
Inventor
Yuko Ome-shi Eto
Masakazu Fukuoka-shi Suzuki
Kazuaki 2016 Shimmachi 9-chome Ome-shi Yokota
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to JP2001063968A (JP4181310B2)
Priority to JP2001063968
Application filed by Toshiba Corp
Application granted
Publication of DE60224128D1
Publication of DE60224128T2
Application status is Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/68 Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • G06K9/6807 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
    • G06K9/6814 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the graphical properties
    • G06K9/6835 Discrimination between machine-print, hand-print and cursive writing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/72 Methods or arrangements for recognition using electronic means using context analysis based on the provisionally recognised identity of a number of successive patterns, e.g. a word
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K2209/00 Indexing scheme relating to methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K2209/01 Character recognition

Description

  • The invention relates to a mathematical expression recognition device and a mathematical expression recognition method, and to a character recognition device and a character recognition method, that can be used for recognizing a document image containing mathematical expressions.
  • Reports on character recognition for printed mathematical expressions, and on recognition of the structures of mathematical expressions, have been published for some time, although their number is not very large. The characters to be recognized are not necessarily arranged one-dimensionally. Rather, in ordinary practice they are more often than not arranged two-dimensionally, as with subscripts (indices), superscripts (exponents), fractions and so on. It is therefore necessary to provide a means for recognizing (determining) not only the characters or symbols contained in a mathematical expression but also the structure (position information) of the expression, so as to know whether each character is arranged as a subscript, a superscript, a denominator, a numerator or something else. Accordingly, when a mathematical expression is recognized by a computer, the processing takes much longer than the processing of ordinary characters.
  • Reports of results that have made it possible to recognize the structure of a mathematical expression within a practical processing time include Documents [1], [2] and [3] listed below. According to these documents, rules for determining the positional relationships of the characters in a mathematical expression, including superscripts and subscripts, are defined, and each character is judged to be an ordinary character, a subscript, a superscript, a denominator, a numerator or something else according to its position by referring to the rules, so that the structure of the mathematical expression is recognized.
    • Document [1]: Masayuki Okamoto, Hashim Msafire Twaakyondo, "Structure Recognition of Mathematical Expressions Using Peripheral Distribution Features", Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, Vol. J78-D-II, No. 2, pp. 366-370 (1995).
    • Document [2]: Masayuki Okamoto, Hiroyuki Azuma, "Recognition of Mathematical Expressions with Emphasis on the Layout of Signs", Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, Vol. J78-D-II, No. 3, pp. 434-482 (1995).
    • Document [3]: R. J. Fateman, T. Tokuyasu, B. P. Berman and N. Mitchell, "Optical Character Recognition and Parsing of Typeset Mathematics", Journal of Visual Communication and Image Representation, Vol. 7, No. 1, pp. 2-15 (1995).
  • However, in the prior art, including the known techniques of the documents listed above, each character is judged to be an ordinary character, a subscript, a superscript, a denominator, a numerator or something else on the basis of its local position. Accordingly, if the position of a character is misjudged, all subsequent judgments are significantly and adversely affected. For example, if an ordinary character is misjudged to be a subscript, all the characters that follow it on the same level are likewise misjudged to be subscripts. In short, a local misrecognition in a mathematical expression can severely damage the recognition of its overall structure.
  • Moreover, the known techniques of the documents listed above relate only to character recognition within a mathematical expression and do not disclose any technique for detecting a mathematical expression in a text.
  • An article, "Incorporating Syntactic Constraints in Recognizing Handwritten Sentences" by Srihari et al., Center for Document Analysis and Recognition, State University of New York, discloses a method of linguistic analysis that provides candidate recognition results with probabilities and gives priority to the most probable candidates. A dictionary containing the probabilities of combinations of characters and words is used.
  • Another article, "Computing with Graphs and Graph Transformations" by Blostein et al., Software Practice & Experience, Vol. 29, No. 3, John Wiley & Sons, pp. 197-217, March 1999, discloses graph modifications including graph reduction, graph rewriting and graph transformation. The method is applied to the analysis of optical character recognition output in a mathematics recognizer. One process creates annotations that are analyzed for spatial relationships. The relationships are represented in a graph.
  • A further article, "Structure Analysis and Recognition of Mathematical Expressions" by Twaakyondo et al., Department of Information Engineering, Shinshu University, Japan, discloses the recognition of printed mathematical expressions using two-dimensional structural analysis that employs "bottom-up" and "top-down" strategies. Before processing, individual symbols are recognized and the normal size of a center symbol is estimated. The expression is then subjected to root expression analysis, superscript expression analysis and matrix expression analysis. The analysis of ordinary non-mathematical text is not considered.
  • The present invention is directed to an apparatus according to claim 1 and a method according to claim 6. Preferred embodiments are set out in the dependent claims.
  • The invention can be more fully understood from the following detailed description when taken in conjunction with the accompanying drawings, in which:
  • FIG. 1 is a block diagram of an OCR system (optical character recognition system) according to an embodiment of the present invention;
  • FIG. 2 is a flowchart of the operation of detecting a mathematical expression performed by the embodiment of FIG. 1;
  • FIG. 3 is a schematic representation of the operation of judging mathematical expression/text performed by the embodiment of FIG. 1 in order to detect a mathematical expression;
  • FIG. 4 is a schematic representation of the dictionary for judging mathematical expressions/texts used by the embodiment of FIG. 1;
  • FIG. 5 is a schematic representation of the operation of searching for an optimal path performed by the embodiment of FIG. 1 in order to detect a mathematical expression;
  • FIG. 6 is a schematic representation of the dictionary for connecting parts of speech used by the embodiment of FIG. 1;
  • FIG. 7 is a flowchart of the operation of recognizing a mathematical expression performed by the embodiment of FIG. 1;
  • FIG. 8 is a schematic representation of the operation of decomposing a mathematical expression performed by the embodiment of FIG. 1 in order to recognize the mathematical expression;
  • FIG. 9 is a schematic representation of the operation of detecting candidate characters performed by the embodiment of FIG. 1 in order to recognize a mathematical expression;
  • FIG. 10 is a schematic representation of the operation of computing the normalization size and the normalization center performed by the embodiment of FIG. 1 in order to recognize a mathematical expression;
  • FIGS. 11A, 11B, 11C and 11D are scatter plots that can be used by the embodiment of FIG. 1;
  • FIG. 12 is a schematic representation of link candidates that can be generated between two consecutive characters by the embodiment of FIG. 1;
  • FIG. 13 is a schematic representation of the operation of searching for an optimal path performed by the embodiment of FIG. 1 in order to recognize a mathematical expression; and
  • FIGS. 14A, 14B and 14C are schematic representations of the conditions to be satisfied in computing a global evaluation, which is used by the embodiment of FIG. 1 to recognize a mathematical expression.
  • An embodiment of a device and a method for recognizing mathematical expressions and of a device and a method for recognizing characters according to the present invention will now be described with reference to the accompanying drawings.
  • FIG. 1 is a block diagram of a character recognition system realized by using the embodiment of the present invention. The character recognition system (optical character recognition: OCR) 11 is designed to recognize a printed document 8 containing mathematical expressions. Such a printed document 8 is typically a scientific or technological document. The system 11 reads the printed document 8 by means of a scanner 10 and performs a processing operation to recognize each of the text regions and each of the mathematical expression regions in the document. The system 11 then outputs electronic document data containing text data and mathematical expression data as recognition result data 20. Documents that can be read by such a system include not only printed documents but also document images containing mathematical expressions that have already been converted to image data.
  • The OCR system 11 is realized as software, that is, as a computer program executed by a computer. As functional modules it typically comprises a layout analysis section 111, an ordinary character recognition section 112, a mathematical expression detection section 113, a mathematical expression recognition section 114 and an output conversion section 115, together with a dictionary for judging mathematical expressions/texts 201, a dictionary for connecting parts of speech 202, a scatter plot information storage section 203 and a global evaluation information storage section 204. The dictionaries and the information of the storage sections are stored in one or more storage media such as semiconductor memories and/or magnetic disks.
  • The processing operation for recognizing a document proceeds in the sequence of 1) scanning the document image, 2) analyzing the layout, 3) recognizing ordinary characters, 4) detecting mathematical expressions, 5) recognizing mathematical expressions and 6) converting the obtained data into electronic output data. The description below concentrates on how step 4) of detecting mathematical expressions and step 5) of recognizing mathematical expressions are carried out.
  • Before the steps of detecting mathematical expressions and recognizing mathematical expressions are described in greater detail, the flow of the processing operation is summarized below.
  • First, a page image of the printed document is obtained when the scanner 10 reads the printed document 8, which contains mathematical expressions. The layout analysis section 111 then analyzes the layout of each page image and divides the page image into one or more graphic regions, one or more table regions and one or more text regions. The image data of the graphic regions and the table regions are output without any further processing. The ordinary character recognition section 112 recognizes the ordinary characters in the text regions. This operation is realized by dividing the text into lines and cutting out characters on the basis of a histogram, and then recognizing each of the characters. The mathematical expression detection section 113 then detects each of the mathematical expressions on the basis of the result of this character recognition operation.
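  • The description says only that lines and characters are cut out "on the basis of a histogram". One common way to do this is a projection profile, sketched below for line division. This is a minimal illustration under stated assumptions, not the method of the embodiment: the function name split_lines and the representation of the image as nested lists (1 = black pixel) are hypothetical.

```python
from typing import List, Tuple


def split_lines(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, of the text lines.

    `image` is a binary text region: 1 = black pixel, 0 = white pixel.
    """
    # Horizontal projection histogram: number of black pixels in each row.
    profile = [sum(row) for row in image]
    lines = []
    start = None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # a text line begins
        elif count == 0 and start is not None:
            lines.append((start, y))       # a blank row ends the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(split_lines(page))  # [(1, 3), (4, 5)]
```

  • Applying the same idea to the columns of a single line (a vertical projection) would give candidate character boundaries.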
  • The dictionary for judging mathematical expressions/texts 201 and the dictionary for connecting parts of speech 202 are used by the mathematical expression detection section 113 to detect mathematical expressions. The dictionary for judging mathematical expressions/texts 201 defines, for each word, an evaluation score for the possibility that the word belongs to a text and an evaluation score for the possibility that it belongs to a mathematical expression, on the basis of word categories that can be specified using regular expressions. Thus, for each word, an evaluation score for the possibility of belonging to a text and an evaluation score for the possibility of belonging to a mathematical expression are determined by referring to the dictionary for judging mathematical expressions/texts 201.
  • The dictionary for connecting parts of speech 202 defines a formal grammar of texts and mathematical expressions. Specifically, it defines the rules for connecting the parts of speech of texts and mathematical expressions. Each recognized character is thus assigned to a mathematical expression region or a text region by determining an optimal connection relationship for the group of recognized characters containing the word, on the basis of the formal grammar provided by the dictionary for connecting parts of speech 202 and of the evaluation scores for the possibility of belonging to a text and for the possibility of belonging to a mathematical expression obtained by referring to the dictionary for judging mathematical expressions/texts 201.
  • All characters and symbols belonging to mathematical expression regions are sent to the mathematical expression recognition section 114 and subjected to a processing operation for recognizing the structure of each mathematical expression. In this operation, the mathematical expression is decomposed into elements, and each element is then checked as to whether a character is on the baseline, a subscript or a superscript. A plurality of character size scatter patterns stored in the scatter plot information storage section 203 and conditions for global evaluation stored in the global evaluation information storage section 204, which are described in more detail below, are used in this checking operation. Two consecutive characters may be arranged on the same baseline, or one of them may be a subscript or a superscript of the other. A character size scatter pattern provides sample information showing the distribution of the normalization sizes and of the possible center positions of consecutive characters. Thus, for any two consecutive characters in a mathematical expression, inter-character structure candidates, that is, link candidates together with their respective evaluation scores, are obtained for the purpose of determining their relationship, which may be horizontal juxtaposition (on the same baseline), subscript or superscript.
  • The conditions for global evaluation are expressed as conditional expressions so that the inter-character structure can be determined appropriately on the basis of a global evaluation of all the characters contained in a mathematical expression. By using the global evaluation, it is possible to find an optimal path that relates the characters of a mathematical expression to one another conclusively and without contradiction, by referring to the local inter-character relationships.
  • The output conversion section 115 combines the recognition results obtained for the text regions and the mathematical expression regions and outputs them as recognition result data 20.
  • (Math expression detecting method)
  • A specific method for detecting a mathematical expression is described below.
  • In this embodiment, as shown in FIG. 2, a mathematical expression region is detected by a mathematical expression detection method comprising two steps, step A1 and step A2. This detection method is basically designed to detect mathematical expressions in a document written in English.
  • <step A1: Evaluation of Mathematical Expression / Text>
  • In step A1, each word is evaluated as a mathematical expression (Math) and as text (Text) on the basis of the result of ordinary character recognition. A word as used herein refers to a character string that is separated from other characters by blanks and is obtained as a result of character recognition. FIG. 3 shows this procedure.
  • Referring to FIG. 3, the first line shows an image (original image) input to the system 11. The second line shows the recognition result obtained by the ordinary character recognition operation of the ordinary character recognition section 112. Since the function of recognizing a mathematical expression cannot be used in the operation of recognizing ordinary characters, in this embodiment the ordinary character recognition section 112 recognizes a mathematical expression merely as an unexpected string of symbols. In step A1, the recognition result is input and each of its words is evaluated as a mathematical expression and as text. The values shown to the right of the headings Math and Text are the evaluation scores given to the word. In this embodiment, this processing operation is carried out by referring to the above-described dictionary for judging mathematical expressions/texts 201. FIG. 4 shows examples of data that can be stored in the dictionary for judging mathematical expressions/texts 201.
  • Referring to FIG. 4, the row with line number 1 shows that the word "with" is a preposition (PP) and has an evaluation score of 0 as Math (mathematical expression) and 100 as Text (text). The row with line number 2 shows that the word "where" is a pronoun (PN) and likewise has an evaluation score of 0 as Math and 100 as Text. The row with line number 3 shows that the word "is" is a verb (V) and has an evaluation score of 70 as Math and 70 as Text. The row with line number 4 shows that the word "a" is an article (ART) and has an evaluation score of 90 as Math and 90 as Text. In this way, the dictionary for judging mathematical expressions/texts 201 stores, for a broad range of words that could appear in scientific and technological documents, their spellings (arrangements of character codes), their parts of speech and the evaluation scores given to them as mathematical expressions and as text.
  • In addition, this embodiment is designed to handle a wide variety of symbol strings flexibly by using regular expressions, in view of the fact that mathematical expressions are recognized as unexpected strings of symbols. Regular expressions are used in character search systems to express the spelling of words in a flexible manner. The meanings of some of the symbols used in regular expressions are shown below.
    • . represents any character.
    • * represents zero or more repetitions of the immediately preceding character (for example, .* represents any character string).
    • [] represents any one of the characters specified in the square brackets (for example, [a-z] represents any alphabetic character from "a" to "z").
    • ^ represents any character other than those specified after it (for example, [^a-z] represents any character other than "a" to "z").
  • Accordingly, the row with line number 5 in FIG. 4 shows that the word ".*[^a-z].*" contains one character other than "a" to "z", that is, a symbol. It is a noun (N) and has an evaluation score of 100 as Math and 70 as Text. Similarly, the row with line number 6 shows that the word ".*[^a-z].*[^a-z].*" contains two characters other than "a" to "z" (symbols). It is a noun (N) and has an evaluation score of 100 as Math and 40 as Text. The row with line number 7 shows that the word ".*[^a-z].*[^a-z].*[^a-z].*" contains three characters other than "a" to "z" (symbols). It is a noun (N) and has an evaluation score of 100 as Math and 20 as Text. The row with line number 8 shows that the word ".*" is a word formed by a single alphabetic character selected from "a" to "z". It is a noun (N) and has an evaluation score of 90 as Math and 40 as Text. Note that noun (N) indicates the part of speech of the word when it is regarded as text.
  • Thus, the part of speech and the evaluation scores as Math and as Text of each word obtained as a result of the character recognition operation can be determined row by row by referring to the dictionary for judging mathematical expressions/texts 201, as shown in FIG. 4.
  • Specifically, as shown in FIG. 3, an evaluation score of 0 as Math and 100 as Text is given to the word "with" on the basis of the knowledge obtained by referring to line number 1 of FIG. 4. An evaluation score of 90 as Math and 40 as Text is given to the word "f" on the basis of line number 8 of FIG. 4. An evaluation score of 100 as Math and 20 as Text is given to the three-character word "(,\" on the basis of line number 7 of FIG. 4. An evaluation score of 100 as Math and 20 as Text is given to the four-character word ")=,\" on the basis of knowledge obtained by referring to the dictionary for judging mathematical expressions/texts 201, although such an arrangement of characters is not shown in FIG. 4. Similarly, an evaluation score of 0 as Math and 100 as Text is given to the word "where" on the basis of line number 2 of FIG. 4. An evaluation score of 90 as Math and 40 as Text is given to the word "U" on the basis of line number 8 of FIG. 4. An evaluation score of 70 as Math and 70 as Text is given to the word "is" on the basis of line number 3 of FIG. 4. An evaluation score of 90 as Math and 90 as Text is given to the last word "a" on the basis of line number 4 of FIG. 4, because the knowledge of line number 4 takes priority over that of line number 8.
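  • The dictionary lookup of step A1 can be illustrated with a short Python sketch. The scores and parts of speech are those quoted above from FIG. 4. Everything else is an assumption made for illustration: the names Entry, DICTIONARY and look_up are hypothetical, the entries are ordered so that the first match wins (exact words first, then the most specific symbol pattern), matching ignores case so that "U" is treated as a single letter, the single-letter row is written as "[a-z]", and the four-symbol word falls through to the three-symbol pattern.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    pattern: str   # regular expression matched against the whole word
    pos: str       # part of speech
    math: int      # evaluation score as a mathematical expression
    text: int      # evaluation score as text


# Scores follow the rows quoted from FIG. 4. The ORDER (first match wins) and
# the single-letter pattern "[a-z]" are assumptions, not taken from the source.
DICTIONARY = [
    Entry(r"with", "PP", 0, 100),
    Entry(r"where", "PN", 0, 100),
    Entry(r"is", "V", 70, 70),
    Entry(r"a", "ART", 90, 90),
    Entry(r".*[^a-z].*[^a-z].*[^a-z].*", "N", 100, 20),
    Entry(r".*[^a-z].*[^a-z].*", "N", 100, 40),
    Entry(r".*[^a-z].*", "N", 100, 70),
    Entry(r"[a-z]", "N", 90, 40),
]


def look_up(word: str) -> Optional[Entry]:
    """Return the first dictionary entry whose pattern matches the word."""
    key = word.lower()  # assumption: matching ignores case
    for entry in DICTIONARY:
        if re.fullmatch(entry.pattern, key):
            return entry
    return None


for w in ["with", "f", "(,\\", ")=,\\", "where", "U", "is", "a"]:
    e = look_up(w)
    print(w, e.pos, e.math, e.text)
```

  • With this ordering the word "a" is resolved by the article entry before the single-letter entry is reached, which mirrors the priority of line number 4 over line number 8 described above.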
  • <step A2: Search for optimal path>
  • In the next step, step A2, a processing operation is performed to find an optimal path on the basis of the evaluation scores and the connections between the words. FIG. 5 schematically illustrates this operation. In step A2, the dictionary for connecting parts of speech 202 is used, because it shows which part of speech can be connected to which part of speech within a text and which part of speech in a text can be connected to a mathematical expression. FIG. 6 shows examples of data that can be stored in the dictionary for connecting parts of speech 202.
  • Referring to FIG. 6, "Text PP → Math" in the first line shows that a preposition (PP) in a text can be connected to an immediately following mathematical expression. "Math → Math" in the second line shows that two or more mathematical expressions can be connected to one another. "Math → Text PN" in the third line shows that a mathematical expression can be connected to an immediately following pronoun (PN) in a text. "Text PN → Math" in the fourth line shows that a pronoun (PN) in a text can be connected to an immediately following mathematical expression. "Text ART → Text N" in the fifth line shows that an article (ART) in a text can be connected to an immediately following noun (N) in the text.
  • The dictionary for connecting parts of speech 202 stores all the combinations that are permitted for connection. No other combinations are permitted.
  • In the operation of searching for an optimal path, each word is tentatively treated as either a mathematical expression or text with reference to its evaluation scores, and only the links permitted by the rules of the formal grammar defined in the dictionary for connecting parts of speech 202 are followed between words. Then, from all possible connection paths, the path with the largest sum of the evaluation scores given to the words as mathematical expressions or text is selected. For example, the word "with" in FIG. 5 can be linked to the immediately following word "f" both when "with" is Math and "f" is either Math or Text, and when "with" is Text and "f" is either Math or Text. However, the path from Text "with" to Math "f" is selected because the sum of the evaluation scores of that combination is the largest. FIG. 5 shows that an optimal path of Text, Math, Math, Math, Text, Math, Text and Text is selected for connecting the eight words from the first word "with" to the last word "a".
  • This search algorithm can be realized by means of a beam search technique (also referred to as a prioritized search). Beam search is a well-known algorithm in the field of dynamic programming. It is designed to discard paths judged unlikely to be the optimal path, so as to reduce the size of the search space and hence both the amount of computation and the storage capacity required to search for an optimal path.
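  • A beam search over the eight example words might be sketched as follows. The scores are those of the worked example and the first five connection rules are those quoted from FIG. 6. The remaining rules, the beam width of 3, the tie-break in favor of fewer Math labels and all names (WORDS, ALLOWED, node, beam_search) are assumptions introduced only so that the example runs end to end; the actual contents of dictionary 202 are not given in full in the text.

```python
from typing import List, Tuple

# (word, part of speech, score as Math, score as Text), as in the worked example.
WORDS = [
    ("with", "PP", 0, 100), ("f", "N", 90, 40), ("(,\\", "N", 100, 20),
    (")=,\\", "N", 100, 20), ("where", "PN", 0, 100), ("U", "N", 90, 40),
    ("is", "V", 70, 70), ("a", "ART", 90, 90),
]

# Permitted links. The first five are the rules quoted from FIG. 6; the others
# are assumptions added so that the example sentence can be connected.
ALLOWED = {
    ("Text:PP", "Math"), ("Math", "Math"), ("Math", "Text:PN"),
    ("Text:PN", "Math"), ("Text:ART", "Text:N"),
    ("Text:PP", "Text:N"), ("Math", "Text:N"),      # assumed
    ("Math", "Text:V"), ("Text:V", "Text:ART"),     # assumed
}


def node(label: str, pos: str) -> str:
    """Grammar symbol for a word: Math ignores the part of speech."""
    return "Math" if label == "Math" else f"Text:{pos}"


def beam_search(words, beam_width: int = 3) -> Tuple[int, List[str]]:
    beam = [(0, [])]  # partial paths as (total score, labels so far)
    for i, (_, pos, math, text) in enumerate(words):
        candidates = []
        for total, labels in beam:
            for label, score in (("Math", math), ("Text", text)):
                if labels:
                    prev = node(labels[-1], words[i - 1][1])
                    if (prev, node(label, pos)) not in ALLOWED:
                        continue  # link forbidden by the grammar
                candidates.append((total + score, labels + [label]))
        # Highest total first; ties prefer fewer Math labels (an assumption).
        candidates.sort(key=lambda c: (-c[0], c[1].count("Math")))
        beam = candidates[:beam_width]  # prune the less promising paths
    return beam[0]


score, labels = beam_search(WORDS)
print(score, labels)
```

  • Under these assumptions the search returns a total of 740 with the labels Text, Math, Math, Math, Text, Math, Text, Text, which is the path described for FIG. 5. On these scores alone the last two words tie between Math and Text, so the result for them depends on the assumed tie-break and rules.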
  • As a result of the search operation described above, each word is judged to belong to either a mathematical expression or a text, and mathematical expression regions and text regions are detected. In the case of FIG. 5, the four words "f", "(,\", ")=,\" and "U" are judged to belong to a mathematical expression, whereas all the other words are judged to belong to a text. The regions of the image data corresponding to the words judged to belong to mathematical expressions are mathematical expression regions, whereas the regions of the image data corresponding to the words judged to belong to texts are text regions.
  • It should be noted that a more sophisticated formal grammar, such as a context-free grammar, may be defined to describe the connection relationships and to check the connections of parts of speech for the purpose of this invention. Such a grammar corresponds to a regular grammar but is more highly developed than the latter.
  • Conventional systems of the type under consideration use simple rules for detecting mathematical expressions. For example, when one or more parentheses and/or italic (slanted) letters are recognized in a group of words, they typically determine that the group of words is a mathematical expression. Such systems usually cannot handle the wide variety of symbols contained in mathematical expressions. If a word "a" is found in a text, such a system cannot determine whether it is the English indefinite article or a character in a mathematical expression. In contrast, this embodiment can determine more accurately whether each word belongs to a mathematical expression or to a text by referring to the evaluation scores of the word in the manner described above. In addition, since the embodiment uses a formal grammar to check each word, it can determine that a word "a" not immediately followed by a noun belongs to a mathematical expression, on the basis of the rule that only a noun can directly follow "a" when the latter is the English article.
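  • Turning the per-word judgments into regions can be pictured as grouping runs of consecutive words that carry the same label. The sketch below works on the word strings only; merging adjacent words into a single region, the function name regions and the use of itertools.groupby are assumptions for illustration, and the embodiment works on the corresponding image regions.

```python
from itertools import groupby

words = ["with", "f", "(,\\", ")=,\\", "where", "U", "is", "a"]
labels = ["Text", "Math", "Math", "Math", "Text", "Math", "Text", "Text"]


def regions(words, labels):
    """Group consecutive words with the same label into (label, [words]) runs."""
    pairs = zip(labels, words)
    return [(label, [w for _, w in run])
            for label, run in groupby(pairs, key=lambda p: p[0])]


for label, run in regions(words, labels):
    print(label, run)
```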
  • (Math expression recognizing method)
  • To recognize a mathematical expression, a technique for checking the structure of the expression in terms of subscripts, superscripts, denominators/numerators and so on is necessary in addition to the recognition of the characters. In this sense, the recognition of a mathematical expression is more complicated than the recognition of ordinary characters. In this embodiment, characters are recognized using a known character recognition technique, whereas the structure of a mathematical expression is examined by a method comprising four steps, step B1, step B2, step B3 and step B4, which are described below with reference to FIG. 7.
  • <step B1: acquiring the structure in terms of numerator / denominator, indices, stress marks, Root signs, points and so on>
  • In Step B1 becomes break lines, root signs and so on Math expression regions are recorded and denominators and counters and words in root signs are isolated. Similarly, if there are any indices, stress marks, root signs, points and so forth, these are captured and from the image data of the math expression regions deleted.
  • For example, if an in 8th is shown in a mathematical expression region detected in the manner described above, it is decomposed into four mathematical expression components represented by broken lines in FIG 8th are surrounded. Then, the left index is deleted from the math expression component ( 3 a → a) and the emphasis symbol "^" or "~" is deleted from the math expression component (xdx ^ → xdx). Although in 8th not shown, root characters and points, if any, are deleted (√a + b → a + b) (x → x).
  • mathematical Elements like denominator / counter, Indices, stress marks, root signs and points can be relative exactly in agreement with the above Documents [1], [2] and [3]. In many cases they can be captured by a simple procedure that focuses on local Position relationships relates to such elements. Therefore, the subsequent steps of Step B2 to step B4 are limited to subscriptions and citations (exponents) using the simple detection method of the Capture other mathematical elements as listed above have been for the to reduce the time required.
  • <step B2: Character recognition>
  • In the subsequent steps B2 to B4, the math expression components from which fraction bars, accent marks, indices, root signs and dots have been removed in step B1 are processed.
  • In step B2, connected black components are extracted from the image data of the math expression components obtained in step B1 and are subjected to a character recognition process. As a result, the candidate characters shown in the lower half of FIG. 9 are obtained for the math expression component of the denominator in FIG. 8. FIG. 9 shows a math expression component cx²y³ whose image data is subjected to character recognition. Each character (connected black component) can be a capital letter, a lowercase letter, or a digit.
  • <step B3: Creating Link Candidates>
  • In step B3, the possibility of linking any two characters is examined for all the candidate characters obtained, using the relationship shown in FIG. 10. FIG. 10 shows the values used for determining the positional relationship of any two consecutive characters (the normalization size and the normalization center). The two characters may be arranged horizontally (on the same baseline), or one of them may be a subscript or a superscript of the other. Referring to FIG. 10, the values h1 and h2 are the normalization sizes (heights) of the characters under consideration. If two characters are arranged on the same line, the normalization size refers to the corrected size that makes them have the same size (height).
  • Note that the normalization size here refers to the height from the highest point of letters with an ascender (e.g. the highest point of the letter "d") to the lowest point of letters with a descender (e.g. the lowest point of the letter "y"). In other words, the value of h1 is the height of the two letters "d" and "y" typed one over the other. "d" is a character whose connected black component extends to the highest point of a capital letter, and "y" is a character whose connected black component extends to the lowest point of a lowercase letter. For example, the letter "x" in FIG. 10 has a height smaller than that of the letter "d" or "y". Therefore, "x" can be made to show the normalization size h1, which is equal to the height of the two letters "d" and "y" typed one over the other, by multiplying its actual height by a given multiplier. The value of the multiplier used to make each character show the normalization size is defined in advance. Thus, each character is made to provide the normalization size by multiplying its actual height by the multiplier defined in advance. For example, the height of the lowercase letter "c" is extended both upward and downward, whereas the height of the capital letter "C" is extended only downward.
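  • The extension of a character rectangle to the common "d over y" height can be sketched as follows. The embodiment only states that a multiplier is defined in advance for each character; the proportions `ASCENDER_ZONE` and `DESCENDER_ZONE`, the three shape classes and the function name `normalize_rectangle` are assumptions made purely for illustration.

```python
# Hypothetical proportions, expressed as fractions of the x-height.
ASCENDER_ZONE = 0.5   # room above the x-height reached by "d" or by capitals
DESCENDER_ZONE = 0.4  # room below the baseline reached by "y"


def normalize_rectangle(top: float, bottom: float, shape: str):
    """Return (normalization size, normalization center) of one character.

    top and bottom are y-coordinates of the actual character rectangle, with y
    increasing downward. shape is "x" (like x or c), "ascender" (like d or C)
    or "descender" (like y).
    """
    height = bottom - top
    if shape == "x":
        # Extended both upward and downward, like the lowercase "c".
        x_height = height
        top -= ASCENDER_ZONE * x_height
        bottom += DESCENDER_ZONE * x_height
    elif shape == "ascender":
        # Extended only downward, like the capital "C".
        x_height = height / (1 + ASCENDER_ZONE)
        bottom += DESCENDER_ZONE * x_height
    elif shape == "descender":
        # Extended only upward.
        x_height = height / (1 + DESCENDER_ZONE)
        top -= ASCENDER_ZONE * x_height
    else:
        raise ValueError(f"unknown shape: {shape}")
    return bottom - top, (top + bottom) / 2
```

  • Under these assumed proportions, an "x" occupying y = 50..60 and a "d" occupying y = 45..60 on the same baseline both normalize to the same size and the same center, which is the property the normalization is meant to provide.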
  • In a similar way, the actual size of the character "2", which is located in an index region, is multiplied by a multiplier defined specifically for indices to give the normalization size h2. Because characters located in index regions have a small actual size, the normalization size h2 used for the "2" in the index region is made smaller than the normalization size h1 used for the character "x" on the baseline.
  • In FIG. 10, c1 and c2 denote the respective normalization centers. The normalization center is used to make all the characters on the same line show the same middle position in terms of height. Here, the normalization center is defined as the center, in terms of the y-coordinate, of the rectangle that circumscribes the normalized character (hereinafter referred to as a character rectangle). If the normalization sizes and the normalization centers of two adjacent characters are h1 and c1, and h2 and c2, respectively, the scatter plots shown in FIGS. 11A to 11D are obtained by plotting the ratio of the normalization sizes H = (h2 / h1) × 1000 against the difference of the normalization centers D = {(c1 - c2) / h1} × 1000.
  • The four scatter plots (sample information) of FIGS. 11A to 11D were obtained by observing character pairs located on the same horizontal baseline, character pairs in which one character is a superscript of the other, and character pairs in which one character is a subscript of the other, and they show the relationships of the normalization sizes H and the normalization centers D for different character types. Specifically, FIG. 11A shows a scatter plot obtained when both of two consecutive characters are alphabetic characters. As used herein, alphabetic characters refer to ordinary alphabetic characters, Greek characters and numerals. FIG. 11B shows a scatter plot obtained when each pair of consecutively arranged characters includes an alphabetic character and an operator. FIG. 11C shows a scatter plot obtained when each pair of consecutively arranged characters includes an integral sign and an alphabetic character. FIG. 11D shows a scatter plot obtained when each pair of consecutively arranged characters includes a Σ character and an alphabetic character.
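  • The two quantities H and D defined above can be computed directly from the normalization sizes and centers. The following is a minimal sketch; the function name `size_and_center_relation` is hypothetical, and the comment on the sign of D assumes image coordinates in which y increases downward, which the text does not state.

```python
def size_and_center_relation(h1: float, c1: float, h2: float, c2: float):
    """Return (H, D) for a parent character (h1, c1) and the next one (h2, c2).

    H = (h2 / h1) * 1000 compares the normalization sizes.
    D = ((c1 - c2) / h1) * 1000 compares the normalization centers.
    If y increases downward (an assumption), D > 0 means that the second
    character sits higher than the first.
    """
    if h1 <= 0:
        raise ValueError("the normalization size h1 must be positive")
    H = 1000 * h2 / h1
    D = 1000 * (c1 - c2) / h1
    return H, D
```

  • For example, a parent with h1 = 100 and c1 = 50 followed by a smaller character with h2 = 60 and c2 = 20 gives H = 600 and D = 300, whereas two characters of equal normalization size and center give H = 1000 and D = 0.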
  • Thus, it is now possible to determine the inter-character structure candidates, each of which can show a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship, together with their respective evaluation scores (hereinafter referred to as link candidates), by calculating the values of H and D for each pair of candidate characters obtained in step B2 and judging the polygonal region to which the pair belongs in the scatter plot for the corresponding combination of character types. For example, if the point given by the size ratio H and the center difference D of a pair of consecutively arranged characters is found in the polygonal region P1 or P2 in FIG. 11A, the two characters are judged to show a character/superscript relationship. The evaluation score may be higher if they belong to P2 than if they belong to P1, since the number of dots in region P1 is greater than that in region P2. On the other hand, when the point is found in the polygonal region P3 or P4, the two characters are judged to show a character/subscript relationship. The evaluation score may be higher if they belong to P4 than if they belong to P3. Finally, when the point is found in the polygonal region P5 or P6, the two characters are judged to show a horizontal positional relationship. The evaluation score may be higher if they belong to P5 than if they belong to P6.
  • FIG. 12 is a schematic representation of the link candidates generated between pairs of consecutive characters for the math expression component of FIG. 9. In FIG. 12, each link candidate shows a parent candidate character (left), a child candidate character (right), a link type, and an evaluation score. Note that a link candidate is generated for every two consecutive characters, and also for two characters that become consecutive when a character located in an index region between them is skipped (x and y in FIG. 12).
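  • The region lookup described above amounts to a point-in-polygon test in the (H, D) plane. The sketch below uses a standard ray-casting test; the three rectangular regions and their scores are invented placeholders, since the actual polygons P1 to P6 are only given in the figures, and the names `point_in_polygon`, `REGIONS` and `classify` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (H, D)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray toward +H crosses."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical regions for one character-type combination: (relation, score, polygon).
# The sign convention for D assumes that y increases downward in the image.
REGIONS = [
    ("superscript", 100, [(400, 200), (800, 200), (800, 700), (400, 700)]),
    ("subscript", 100, [(400, -700), (800, -700), (800, -200), (400, -200)]),
    ("horizontal", 100, [(800, -150), (1200, -150), (1200, 150), (800, 150)]),
]


def classify(point: Point) -> List[Tuple[str, int]]:
    """Return every (relation, score) whose region contains the point."""
    return [(rel, score) for rel, score, poly in REGIONS if point_in_polygon(point, poly)]
```

  • A pair whose (H, D) point falls in no region yields an empty list, which corresponds to a character combination for which no link candidate is generated.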
  • As shown in FIG. 12, the following link candidates are generated for the characters "c" and "x" by referring to the scatter plot of FIG. 11A:
    (c, x, horizontal, 100),
    (c, X, subscript, 60) and
    (C, X, horizontal, 100).
  • It can be seen that the combination (C, x) cannot exist, because its relationship of H and D is not found in any region of the scatter plots.
  • The following link candidates are generated for the characters "x" and "2", the latter being an index, by referring to the scatter plot of FIG. 11A:
    (X, 2, superscript, 60),
    (x, 2, superscript, 100) and
    (x, 2, horizontal, 20).
  • The following link candidates are generated for the characters "x" and "y" by referring to the scatter plot shown in FIG. 11A, taking the index "2" into consideration:
    (x, y, horizontal, 100),
    (x, y, subscript, 60),
    (X, Y, horizontal, 60),
    (2, y, subscript, 10) and
    (2, Y, subscript, 50).
  • Finally, the following link candidates are generated for the characters "y" and "3", the latter being an index, by referring to the scatter plot of FIG. 11A:
    (y, 3, superscript, 100) and
    (Y, 3, superscript, 50).
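  • The link candidates listed above can be held in a small data structure keyed by the pair of character rectangles they connect. This is only one possible representation: the rectangle numbering 0 to 4 for the glyphs c, x, 2, y, 3, the split of the x/y group by parent rectangle, and the names `LinkCandidate`, `LINKS`, `readings` and `links_into` are assumptions. The relation labels are spelled `horizontal`, `subscript` and `superscript`.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class LinkCandidate(NamedTuple):
    parent: str    # candidate reading of the left (parent) rectangle
    child: str     # candidate reading of the right (child) rectangle
    relation: str  # "horizontal", "subscript" or "superscript"
    score: int     # local evaluation score


# Keys are (parent rectangle, child rectangle); values follow the lists above.
LINKS: Dict[Tuple[int, int], List[LinkCandidate]] = {
    (0, 1): [LinkCandidate("c", "x", "horizontal", 100),
             LinkCandidate("c", "X", "subscript", 60),
             LinkCandidate("C", "X", "horizontal", 100)],
    (1, 2): [LinkCandidate("X", "2", "superscript", 60),
             LinkCandidate("x", "2", "superscript", 100),
             LinkCandidate("x", "2", "horizontal", 20)],
    (1, 3): [LinkCandidate("x", "y", "horizontal", 100),
             LinkCandidate("x", "y", "subscript", 60),
             LinkCandidate("X", "Y", "horizontal", 60)],
    (2, 3): [LinkCandidate("2", "y", "subscript", 10),
             LinkCandidate("2", "Y", "subscript", 50)],
    (3, 4): [LinkCandidate("y", "3", "superscript", 100),
             LinkCandidate("Y", "3", "superscript", 50)],
}


def readings(rect: int) -> Set[str]:
    """All candidate characters proposed for one rectangle by any link."""
    found: Set[str] = set()
    for (parent_rect, child_rect), candidates in LINKS.items():
        for cand in candidates:
            if parent_rect == rect:
                found.add(cand.parent)
            if child_rect == rect:
                found.add(cand.child)
    return found


def links_into(rect: int) -> List[LinkCandidate]:
    """All link candidates that have the given rectangle as their child."""
    return [c for (_, child), cands in LINKS.items() if child == rect for c in cands]
```

  • Grouping the candidates by child rectangle, as `links_into` does, mirrors the idea that one link is eventually chosen for each character rectangle.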
  • In one aspect, this embodiment uses the four scatter plots shown in FIGS. 11A to 11D, which are prepared for combinations of characters of different types. As can be seen from FIGS. 11A to 11D, the distribution of inter-character relationships can vary significantly depending on the character types of each pair. In view of this fact, the scatter plots are prepared for combinations of characters of different types, so that the inter-character relationship of a character pair can be judged by referring to the corresponding scatter plot.
  • According to any of Documents [1], [2] and [3] cited above, an index is detected by examining whether its normalization center is displaced upward or downward from the horizontal center line of the parent character. From the point of view of the scatter plots of FIGS. 11A to 11D, this means that an index is detected using only the vertical coordinate of the scatter plots, and therefore the probability of detecting a false index and that of missing an index are high. In contrast, according to the present invention, an index is judged in a two-dimensional region that also takes the difference in character size into account, using scatter plots prepared for the combinations of characters of different types. As a result, the accuracy of index detection is noticeably improved.
  • Before describing the next step, the reason why the recognition of mathematical expression structures is a problem of finding an optimal path is described below.
  • A mathematical expression has a tree structure, and its symbols are not arranged on a single line. This is why many people do not realize that recognizing it is a problem of finding an optimal path. According to the present invention, an optimal path is obtained by drawing an overall tree that shows the optimal mathematical expression structure, using the link network prepared in step B3. The link of each character to its parent character can be judged by drawing the overall tree. Thus, a set of (parent candidate character (left), child candidate character (right), link type, evaluation score) is referred to as a link candidate, and each character rectangle is made to carry all the link candidates in which it appears on the right. Then, an overall tree is defined by selecting a single link candidate from each rectangle. Such selections may be regarded as an operation of tracing paths. Therefore, recognizing mathematical expression structures is a problem of finding an optimal path.
  • Such paths, however, cannot all be regarded as trees of mathematical expression structures. For example, a character can have only one child character in each position, that is, arranged horizontally on the same line, as a superscript, or as a subscript (absence of duplicated links). In addition, the candidate characters selected in a character rectangle on the basis of its right link, left link or index link must all agree with each other so that they form part of the mathematical expression (selection of a unique candidate). An overall tree that satisfies both of these requirements is referred to as a non-contradictory mathematical expression syntax tree, and the paths forming such a tree are referred to as applicable paths. An optimal path is selected from those applicable paths. In the following description, the simple expression "path" refers to an applicable path obtained from a non-contradictory overall tree.
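  • The two requirements for an applicable path can be checked mechanically. The following sketch assumes that a selection is given as tuples of (parent rectangle, child rectangle, parent character, child character, relation); this tuple layout and the function name `is_consistent` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (parent rectangle, child rectangle, parent character, child character, relation)
SelectedLink = Tuple[int, int, str, str, str]


def is_consistent(selection: List[SelectedLink]) -> bool:
    """Check the two requirements for a non-contradictory tree."""
    chosen_char: Dict[int, str] = {}
    used_slots: Set[Tuple[int, str]] = set()
    for parent_rect, child_rect, parent_char, child_char, relation in selection:
        # Selection of a unique candidate: every link touching a rectangle
        # must read that rectangle as the same character.
        for rect, char in ((parent_rect, parent_char), (child_rect, child_char)):
            if chosen_char.setdefault(rect, char) != char:
                return False
        # Absence of duplicated links: a parent may have at most one child
        # per relation (horizontal, subscript, superscript).
        if (parent_rect, relation) in used_slots:
            return False
        used_slots.add((parent_rect, relation))
    return True
```

  • For the component cx²y³, a selection that reads the second rectangle as "x" in one link and as "X" in another fails the first check, and a selection that gives "x" two superscript children fails the second.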
  • <step B4: Search for optimal path>
  • Then, in step B4, an optimal path for linking the link candidates is searched for by backward (or forward) tracing of the link candidates generated between pairs of characters in step B3. Specifically, the path showing the largest overall evaluation score is searched for among the paths that can link the characters without contradiction, by considering the linking relationships between any two consecutive characters (horizontal positional relationships, character/subscript relationships and character/superscript relationships) and selecting one of the link candidates of each pair of consecutively arranged characters. In addition, in this embodiment, the total of the local evaluation scores of the respective character pairs, which are given by their link candidates, is corrected on the basis of three conditions for global evaluation, which in turn are defined on the basis of the distribution of the heights of the characters contained in the math expression component. The optimal path is searched for by referring to the overall evaluation score thus corrected.
  • If the normalization size of a character in the index region of a parent character is greater than the normalization size of the parent character, as shown in FIG. 14A, the overall evaluation score is reduced. In the case of FIG. 14A, "+" is erroneously judged to be an index of "2", so that the "b" that follows is also misjudged as being located in the index region. Misjudgments of this type often occur when a character is judged by relying only on the local evaluation score. However, since "b" has a size equal to that of "a" and its normalization size is larger than that of "2", the overall evaluation score is reduced.
  • When two consecutive characters are found on the same line and the succeeding one of the two characters is in the index region of the preceding character, as shown in FIG. 14B, the overall evaluation score is also reduced. In other words, the overall evaluation score is reduced when a character belonging to any of the small regions P2, P4 and P6 of the scatter plots of FIGS. 11A to 11D is in the index region of the immediately preceding character found on the baseline. FIG. 14B shows a case where a lowercase letter "x" is erroneously regarded as a capital letter "X". Since the character "B" located near the character "A" on the baseline is found in the index region, the overall evaluation score is reduced.
  • The overall evaluation score is also reduced when the normalization sizes of the alphabetic characters on the baseline differ beyond a certain threshold, as shown in FIG. 14C. FIG. 14C shows a case where capital letters "C" are erroneously regarded as lowercase letters "c". In that case, the normalization size of the lowercase letter "c" becomes larger than the normalization size of the capital letter "A". Therefore, the overall evaluation score is reduced.
  • These are the conditions for global evaluation that must be satisfied so as to maximize the overall evaluation score of the path that can link the characters in a mathematical expression without contradiction, by selecting one of the link candidates of each pair of characters from the viewpoint of a horizontal positional relationship, a character/subscript relationship and a character/superscript relationship. The operation of searching for the optimal path showing the largest overall evaluation score can also be performed at high speed using a beam search technique (also referred to as a prioritized search).
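  • Two of the global evaluation conditions described above (an index that is larger than its parent, and baseline characters whose normalization sizes differ too much) can be sketched as a penalty function. The penalty value, the threshold, the rule that the first rectangle lies on the baseline, and the function name `global_penalty` are all assumptions; the sketch also ignores the character types that the embodiment takes into account.

```python
from typing import Dict, List, Tuple

Link = Tuple[int, int, str]  # (parent rectangle, child rectangle, relation)


def global_penalty(links: List[Link], sizes: Dict[int, float],
                   penalty: int = 50, spread_threshold: float = 1.3) -> int:
    """Return the amount to subtract from the summed local scores.

    links must be listed from left to right; sizes maps each rectangle to its
    normalization size. penalty and spread_threshold are invented values.
    """
    total = 0
    baseline = {0}  # assumption: the first rectangle lies on the baseline
    for parent, child, relation in links:
        if relation == "horizontal":
            if parent in baseline:
                baseline.add(child)
        elif sizes[child] > sizes[parent]:
            # An index with a larger normalization size than its parent.
            total += penalty
    baseline_sizes = [sizes[rect] for rect in baseline]
    if max(baseline_sizes) > spread_threshold * min(baseline_sizes):
        # Baseline characters whose normalization sizes differ too much.
        total += penalty
    return total
```

  • In a situation like FIG. 14A, a character of normalization size 100 that has been linked as an index of a character of size 60 triggers the first penalty, so a path containing that link loses ground against paths that keep the character on the baseline.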
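  • The search for the path with the largest overall evaluation score can be sketched as a beam search over the link candidates, processing the character rectangles from left to right. This is an illustrative sketch only: the rectangle numbering, the beam width, the optional `penalty` hook standing in for the global evaluation, and the name `beam_search` are assumptions, and the candidate values are the ones listed for FIG. 12.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (parent rectangle, parent character, child character, relation, score)
Link = Tuple[int, str, str, str, int]
Chosen = List[Tuple[int, int, str]]  # (parent rectangle, child rectangle, relation)

CANDIDATES: Dict[int, List[Link]] = {  # keyed by child rectangle
    1: [(0, "c", "x", "horizontal", 100), (0, "c", "X", "subscript", 60),
        (0, "C", "X", "horizontal", 100)],
    2: [(1, "X", "2", "superscript", 60), (1, "x", "2", "superscript", 100),
        (1, "x", "2", "horizontal", 20)],
    3: [(1, "x", "y", "horizontal", 100), (1, "x", "y", "subscript", 60),
        (1, "X", "Y", "horizontal", 60), (2, "2", "y", "subscript", 10),
        (2, "2", "Y", "subscript", 50)],
    4: [(3, "y", "3", "superscript", 100), (3, "Y", "3", "superscript", 50)],
}


def beam_search(candidates: Dict[int, List[Link]], beam_width: int = 3,
                penalty: Callable[[Dict[int, str], Chosen], int] = lambda chars, links: 0
                ) -> Optional[Tuple[int, Dict[int, str], Chosen]]:
    """Pick one link per child rectangle, keeping only the best partial paths."""
    # A state is (total score, character per rectangle, used slots, chosen links).
    beam = [(0, {}, frozenset(), [])]
    for child_rect in sorted(candidates):
        expanded = []
        for total, chars, used, links in beam:
            for parent_rect, parent_char, child_char, relation, score in candidates[child_rect]:
                if chars.get(parent_rect, parent_char) != parent_char:
                    continue  # contradicts an earlier reading of the parent
                if chars.get(child_rect, child_char) != child_char:
                    continue  # contradicts an earlier reading of the child
                if (parent_rect, relation) in used:
                    continue  # the parent already has a child in this position
                new_chars = {**chars, parent_rect: parent_char, child_rect: child_char}
                expanded.append((total + score, new_chars,
                                 used | {(parent_rect, relation)},
                                 links + [(parent_rect, child_rect, relation)]))
        expanded.sort(key=lambda state: state[0], reverse=True)
        beam = expanded[:beam_width]
    if not beam:
        return None
    total, chars, _, links = max(beam, key=lambda s: s[0] - penalty(s[1], s[3]))
    return total - penalty(chars, links), chars, links
```

  • With the default zero penalty, `beam_search(CANDIDATES)` selects the readings c, x, 2, y, 3 with "2" as a superscript of "x" and "3" as a superscript of "y", for a total score of 400. Keeping only a few partial paths per step trades completeness for speed; a wider beam explores more alternatives.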
  • FIG. 13 shows an example of the optimal path as judged taking the conditions for global evaluation into consideration. In this way, an optimal link candidate is selected for each inter-character relationship, and the inter-character relationship is judged to be a horizontal positional relationship, a character/superscript relationship, or a character/subscript relationship.
  • The techniques of Documents [1], [2] and [3] cited above lack the concept of global evaluation. If a single character on the baseline is erroneously taken to be an index, all the following characters are correspondingly taken to be indices in error as well. This occurs in part because each character is judged to be an index or an exponent on the basis of an arithmetic operation that uses only local features of the character. In contrast, the present invention adopts the concept of global evaluation for the paths, so that if a character is erroneously taken to be an index, the problem that all subsequent characters are also taken erroneously does not occur. It is then also possible to evaluate the result produced by an external device for detecting mathematical expressions using the technique of global evaluation, and to apply the technique to a complex evaluation operation.
  • Then, the final recognition result for the math expression component is obtained by restoring the indices, accent marks, root signs and so on, if any, that were temporarily removed in step B1 to the optimally linked character string. The final recognition result for the mathematical expression region is then obtained by carrying out the processing of steps B2 to B4 on each of the math expression components. The recognition result data of the document containing text regions and math expression regions is obtained by combining the recognition results of the text regions and of the math expression regions.
  • Thus, a mathematical expression recognition apparatus includes: a character recognition unit configured to recognize characters in a document image that contains a text and a mathematical expression; a first directory configured to store evaluation scores for each word type that can be identified by a normal expression, the evaluation scores showing the likelihood of belonging to a text and the likelihood of belonging to a mathematical expression; an evaluation unit configured to obtain, by referring to the first directory, the evaluation scores showing the likelihood of belonging to the text and that of belonging to the mathematical expression for each of the words contained in the characters recognized by the character recognition unit; and a math expression detecting unit configured to search for an optimal path for linking the words by selecting either text or mathematical expression for each word on the basis of a formative grammar and the evaluation scores showing the likelihood of belonging to the text and that of belonging to a mathematical expression, thereby detecting the characters belonging to the mathematical expression.
  • Since a mathematical expression region is recognized using an ordinary character recognition operation, unexpected characters can occur in the recognition result. In view of this problem, the math expression recognition apparatus includes a directory that is used to classify the words contained in the result of a character recognition operation into different types by means of normal expressions and to obtain, for each word type used for the classification, evaluation scores showing respectively the likelihood of belonging to a text and that of belonging to a mathematical expression. It is therefore possible to assign evaluation scores to every word in a flexible manner by referring to the directory. Each mathematical expression is detected by searching for an optimal path linking each pair of consecutively arranged words, on the basis of the evaluation scores calculated for each word in terms of text and of mathematical expression respectively. With this arrangement, it is possible to detect a mathematical expression region accurately and therefore to recognize the mathematical expressions contained in a document.
  • According to the present embodiment, a mathematical expression recognition apparatus comprises: a character recognition unit configured to recognize characters in a document image including a text and a mathematical expression; a detection unit configured to detect a mathematical expression region from the characters recognized by the character recognition unit; a memory configured to store a plurality of items of sample information showing the relationship of the normalization size and the center position between each pair of consecutively arranged characters in terms of character types, covering a horizontal positional relationship, a character/subscript relationship and a character/superscript relationship; and a unit configured to calculate the relationship of the normalization size and the center position between each pair of consecutively arranged characters included in the mathematical expression region and to obtain link candidates for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship on the basis of the calculated relationship and the sample information corresponding to the types of the two consecutively arranged characters.
  • The mathematical expression recognition apparatus holds a plurality of items of sample information for different type combinations of two consecutively arranged characters. It is therefore possible to recognize the inter-character relationship of each detected pair of consecutively arranged characters as a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship by referring to the sample information corresponding to the types of the characters for which the inter-character relationship is to be judged. With this arrangement, it is possible to reduce the error rate of the operation of determining the positions of characters in a mathematical expression and to improve the efficiency of the operation of recognizing the structure of mathematical expressions.
  • According to the present embodiment, a math expression recognition apparatus comprises: a character recognition unit configured to recognize characters in a document image that includes a mathematical expression; a unit configured to detect a math expression region from the character recognition result obtained by the character recognition unit; a unit configured to store a plurality of items of sample information relating to the inter-character relationship of the normalization sizes and of the center positions of each pair of consecutive characters in terms of the character types and the positional relationships; an inter-character relationship determination unit configured to calculate the relationship of the normalization sizes and of the center positions of each pair of consecutively arranged characters in a math expression region and to obtain link candidates, as combinations of inter-character structure candidates showing a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship and their respective evaluation scores, on the basis of the calculation result and the sample information; a unit configured to store the conditions for global evaluation, which are based on the distribution of the heights of the characters contained in the math expression regions; and a unit configured to search for an optimal path that links the characters in each of the math expression regions without contradiction, to select an inter-character structure candidate with a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship for each pair of consecutively arranged characters, and to recognize the horizontal positional relationship, the character/subscript relationship or the character/superscript relationship of the pair of consecutive characters on the basis of the result of the search operation.
  • It is therefore possible not only to judge the local relationship of each pair of consecutively arranged characters, but also to search for an optimal path that links the characters in a math expression region without contradiction, so as ultimately to maximize the overall evaluation score while taking the conditions for global evaluation into account. Thus, if the positional relationship of a pair of consecutively arranged characters is misjudged, the misjudgment is prevented from affecting the operation of determining the overall structure of a mathematical expression.
  • As described above, the present embodiment provides the following advantages.
  • It is possible to detect math expression regions efficiently by judging whether each word belongs to a text or to a mathematical expression and searching for an optimal path for linking the words on the basis of the formative grammar and the evaluation scores of each word in terms of text and of mathematical expression.
  • It is possible to determine accurately the positional relationship of each pair of consecutively arranged characters as a horizontal positional relationship, a character/subscript relationship, or a character/superscript relationship by preparing a plurality of scatter plots that show the relationship between the normalization sizes and the normalization centers of pairs of consecutively arranged characters.
  • It is possible to prevent any misjudgment of the positional relationship of a pair of consecutively arranged characters from affecting the operation of determining the overall structure of a mathematical expression, not only by determining the positional relationship of each pair of consecutively arranged characters but also by searching for an optimal path while taking the conditions for global evaluation into account.
  • It is possible to reduce the number of characters used for creating link candidates, and thus to improve the efficiency of the overall processing operation, by carrying out a preprocessing operation of decomposing mathematical expressions into components and detecting indices, accent marks, root signs and so on before creating link candidates and searching for optimal paths.
  • While the foregoing description refers to specific embodiments of the present invention, it will be understood that many modifications can be made. The appended claims are intended to cover such modifications as would come within the true scope of the present invention. Accordingly, the presently disclosed embodiments are to be considered in all respects as illustrative and not restrictive, the scope of the present invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the scope of equivalents of the claims are therefore intended to be embraced therein. For example, the present invention can be put into practice as a computer-readable recording medium containing a program that allows a computer to function as predetermined means, to realize a predetermined function or to execute predetermined means. For example, the OCR system 11 of the embodiment described above can be realized entirely by means of software. Therefore, the advantages of the present invention can be obtained by preparing a program for the above processing sequence that a computer can execute, storing it in a computer-readable storage medium, installing it in a computer by way of the storage medium, and causing the computer to execute the program.

Claims (10)

  1. A math expression recognition apparatus, comprising: a character recognition device (112) for recognizing characters in a document image containing a text and a mathematical expression, the document image comprising a string of a plurality of recognized words; a first directory device (201) for storing a pair of evaluation scores for each word type that can be identified by a normal expression, wherein a first evaluation score indicates the likelihood of belonging to the text, and a second evaluation score indicates the likelihood of belonging to the mathematical expression; an evaluation device (113) for obtaining from the first directory device (201), by referring to the first directory device, the first and second evaluation scores showing the likelihood of belonging to the text and of belonging to the mathematical expression for each of the words contained in the characters recognized by the character recognition device; and a math expression detection device (114) operable to search for an optimal path through the evaluation scores along the string of words by selecting either the first evaluation score or the second evaluation score for each word in the string of words in turn, operable to calculate a plurality of sums of the evaluation scores for the string of words, each sum containing either the first evaluation score or the second evaluation score for each word, for those sequential combinations of words that are permitted by a formative grammar showing which part of speech can be linked to which part of speech in the text and which part of speech in the text can be linked to a mathematical expression, and operable to detect, as the optimal path, the path that has the largest sum of the evaluation scores given to the string of words, thereby detecting the characters belonging to the mathematical expression by providing, as the optimal path, the most likely interpretation of the string of words as between text and mathematical expression; the apparatus further comprising, for processing the characters within a recognized math expression region: a memory device (203) for storing a plurality of elements of sample information characterizing a relationship of a normalization size and a center position between each pair of consecutively arranged characters in terms of the types of the characters, the elements of the sample information being divided into a plurality of groups of samples of horizontal positional relationship, character/subscript relationship, and character/superscript relationship; and a determining device (114) for calculating the relationship of the normalization size and the center position between each pair of consecutively arranged characters included in the mathematical expression region, determining which of the plurality of groups the calculated relationship falls within, obtaining connection candidates for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship on the basis of the result of the determination, and recognizing the mathematical expression by searching for an optimal path along the consecutively arranged characters through the connection candidates.
  2. The apparatus according to claim 1, characterized in that the mathematical expression detection means comprises: a second directory means ( 202 ) for storing a connectable part of speech and mathematical expression as formative grammar; and a search facility ( 114 ) to search for a path through the evaluation scores along the concatenated words, and to show the largest evaluation score given to the word as a mathematical expression or text among all possible inter-word connection paths as the optimal path by selecting either the text or the mathematical expression for each word according to the part of speech of the word and of the formative grammar read out by the second directory means.
  3. The apparatus according to claim 2, characterized in that it further comprises: a memory (204) configured to store a global evaluation condition determined on the basis of the distribution of the heights of the characters included in the mathematical expression region, the condition characterizing a horizontal positional relationship, a character/subscript relationship, and a character/superscript relationship; a search device (114) using data from the memory (204) to search for an optimal path connecting the connection candidates along the consecutively arranged characters in each of the mathematical expression regions while maintaining the global evaluation condition characterizing the same horizontal positional relationship, character/subscript relationship and character/superscript relationship; means for selecting an inter-character structure candidate having a horizontal positional relationship, a character/subscript relationship, or a character/superscript relationship for each pair of consecutively arranged characters based on the global evaluation condition and the connection candidates; and means for recognizing the horizontal positional relationship, the character/subscript relationship, or the character/superscript relationship of the pair of consecutively arranged characters based on the result of the search operation.
  4. The apparatus according to claim 3, characterized in that the global evaluation condition includes at least one of: the relationship between the height of a character contained in the subscript region and the height of each of the other characters; the positional relationship between a baseline and a character contained in the subscript region; and the distribution of heights among characters that are located at the same horizontal level.
  5. The apparatus according to claim 2, characterized in that it further comprises: a decomposition device (113) for decomposing each mathematical expression detected by the mathematical expression detector into components and removing at least left indices, accent marks, root signs and dots from each component, and characterized in that the determining device obtains connection candidates for the components from which the left indices, accent marks, root signs or dots have been removed.
  6. A mathematical expression recognition method, comprising: recognizing characters in a document image containing a text and a mathematical expression as a string of a plurality of recognized words; referring to a first directory that stores a pair of evaluation scores for each word type that can be identified by a regular expression, a first evaluation score indicating the likelihood that the word belongs to the text and a second evaluation score indicating the likelihood that the word belongs to the mathematical expression, to obtain the evaluation scores indicating the likelihood of belonging to the text and of belonging to the mathematical expression for each of the recognized words; and searching for an optimal path through the evaluation scores along the string of words by selecting, for each word in the string in turn, either the first evaluation score or the second evaluation score, computing a plurality of sums of the evaluation scores for the string of words, each sum containing either the first evaluation score or the second evaluation score for each word, for those sequential combinations of words allowed by a formative grammar showing which part of speech can be linked to which part of speech in the text and which part of speech in the text can be linked to a mathematical expression, and detecting the characters belonging to the mathematical expression by taking, as the optimal path, the most likely interpretation of the string of words as text or mathematical expression; the method further comprising the steps of, for processing the characters within a recognized mathematical expression region: using a memory device (203) for storing a plurality of elements of sample information, each indicating a relationship of a normalization amount and a center position between a pair of consecutively arranged characters with respect to the types of the characters, the sample information being divided into a plurality of groups of samples of horizontal positional relationship, character/subscript relationship and character/superscript relationship; and using a determining device (114) for calculating the relationship of the normalization amount and the center position between each pair of consecutively arranged characters in the mathematical expression region, determining which of the plurality of groups the calculated relationship falls within, obtaining connection candidates for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship based on the result of the determination, and recognizing the mathematical expression by searching for an optimal path along the consecutively arranged characters through the connection candidates.
  7. The method according to claim 6, comprising the steps of: using a second directory device (202) for storing, as the formative grammar, connectable parts of speech and mathematical expressions; and using a search device (114) to search for a path through the evaluation scores along the string of words and to take, as the optimal path, the path that yields the largest evaluation score given to the words as mathematical expression or text among all possible inter-word connection paths, by selecting either the text or the mathematical expression for each word according to the part of speech of the word and the formative grammar read out from the second directory device.
  8. The method according to claim 7, comprising the steps of: using a memory (204) configured to store a global evaluation condition of the characters contained in the mathematical expression region, which condition characterizes a horizontal positional relationship, a character/subscript relationship and a character/superscript relationship; searching for an optimal path connecting the connection candidates along the consecutively arranged characters in each of the mathematical expression regions while maintaining the global evaluation condition characterizing the same horizontal positional relationship, character/subscript relationship and character/superscript relationship; selecting an inter-character structure candidate having a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship for each pair of consecutively arranged characters based on the global evaluation condition and the connection candidates; and recognizing the horizontal positional relationship, the character/subscript relationship or the character/superscript relationship of the pair of consecutively arranged characters based on the result of the search operation.
  9. The method according to claim 8, characterized in that the global evaluation condition includes at least one of: the relationship between the height of a character contained in a subscript region and the height of each of the other characters; the positional relationship between a baseline and a character contained in the subscript region; and the distribution of heights among characters that are located at the same horizontal level.
  10. The method according to claim 7, comprising the steps of: decomposing each detected mathematical expression into components and removing at least the left indices, accent marks, root signs and dots from each component; and using the determining device to obtain connection candidates for the components from which the left indices, accent marks, root signs or dots have been removed.
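
Illustrative sketch (not part of the claims): the optimal-path search of claims 1 and 6 can be read as a dynamic-programming search over two candidate readings per word. The Python sketch below is one possible interpretation under stated assumptions, not the patented implementation. The names label_words, Word, MATH and grammar, the score values, and the encoding of the formative grammar as a set of allowed (previous tag, next tag) pairs are all hypothetical and do not appear in the patent.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical tag for a word that is read as part of a mathematical expression.
MATH = "MATH"

# (surface form, part of speech, text score, math score); all values are made up.
Word = Tuple[str, str, float, float]


def label_words(
    words: List[Word], grammar: Set[Tuple[str, str]]
) -> Optional[Tuple[float, List[str]]]:
    """Return (largest score sum, 'text'/'math' label per word).

    `grammar` is an assumed encoding of the formative grammar: the set of
    (previous tag, next tag) pairs that may follow one another, where a tag is
    either a part of speech (word read as text) or MATH.
    Returns None if the grammar allows no path at all.
    """
    if not words:
        return 0.0, []

    def states(word: Word):
        _, pos, text_score, math_score = word
        # Each word is either text (tagged by its part of speech) or math.
        return [(pos, text_score, "text"), (MATH, math_score, "math")]

    # best maps the tag of the latest word to (score sum so far, labels so far).
    best = {tag: (score, [label]) for tag, score, label in states(words[0])}
    for word in words[1:]:
        nxt = {}
        for tag, score, label in states(word):
            candidates = [
                (total + score, path + [label])
                for prev_tag, (total, path) in best.items()
                if (prev_tag, tag) in grammar  # only grammar-allowed links
            ]
            if candidates:
                nxt[tag] = max(candidates, key=lambda c: c[0])
        if not nxt:
            return None
        best = nxt
    return max(best.values(), key=lambda c: c[0])


# Made-up example: the sentence fragment "Let x + y be".
words = [
    ("Let", "verb", 2.0, -1.0),
    ("x", "symbol", 0.5, 2.0),
    ("+", "operator", -2.0, 3.0),
    ("y", "symbol", 0.5, 2.0),
    ("be", "verb", 2.0, -1.0),
]
grammar = {
    ("verb", MATH), (MATH, MATH), (MATH, "verb"),
    ("verb", "symbol"), ("symbol", "verb"),
}
result = label_words(words, grammar)
print(result)
```

With these made-up scores the search labels "x + y" as mathematical expression and the surrounding words as text. Keeping only the best partial sum per tag at each word avoids enumerating every combination of readings while still finding the largest sum among the grammar-allowed paths.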
DE2002624128 2001-03-07 2002-03-05 Apparatus and method for recognizing characters and mathematical expressions Active DE60224128T2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2001063968A JP4181310B2 (en) 2001-03-07 2001-03-07 Formula recognition apparatus and formula recognition method
JP2001063968 2001-03-07

Publications (2)

Publication Number Publication Date
DE60224128D1 (en) 2008-01-31
DE60224128T2 (en) 2008-12-04

Family

ID=18922868

Family Applications (1)

Application Number Title Priority Date Filing Date
DE2002624128 Active DE60224128T2 (en) 2001-03-07 2002-03-05 Apparatus and method for recognizing characters and mathematical expressions

Country Status (4)

Country Link
US (1) US7181068B2 (en)
EP (1) EP1239406B1 (en)
JP (1) JP4181310B2 (en)
DE (1) DE60224128T2 (en)

Families Citing this family (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007535740A (en) * 2004-03-23 2007-12-06 Angel Palacios Orueta Managing formulas
US7698638B2 (en) * 2004-09-15 2010-04-13 Microsoft Corporation Systems and methods for automated equation buildup
US7561737B2 (en) * 2004-09-22 2009-07-14 Microsoft Corporation Mathematical expression recognition
US7929767B2 (en) * 2004-09-22 2011-04-19 Microsoft Corporation Analyzing subordinate sub-expressions in expression recognition
US7561739B2 (en) * 2004-09-22 2009-07-14 Microsoft Corporation Analyzing scripts and determining characters in expression recognition
US8156116B2 (en) 2006-07-31 2012-04-10 Ricoh Co., Ltd Dynamic presentation of targeted information in a mixed media reality recognition system
US7702673B2 (en) 2004-10-01 2010-04-20 Ricoh Co., Ltd. System and methods for creation and use of a mixed media environment
US8201076B2 (en) 2006-07-31 2012-06-12 Ricoh Co., Ltd. Capturing symbolic information from documents upon printing
US9020966B2 (en) 2006-07-31 2015-04-28 Ricoh Co., Ltd. Client device for interacting with a mixed media reality recognition system
US8489987B2 (en) 2006-07-31 2013-07-16 Ricoh Co., Ltd. Monitoring and analyzing creation and usage of visual content using image and hotspot interaction
US9176984B2 (en) 2006-07-31 2015-11-03 Ricoh Co., Ltd Mixed media reality retrieval of differentially-weighted links
US9063952B2 (en) 2006-07-31 2015-06-23 Ricoh Co., Ltd. Mixed media reality recognition with image tracking
US8965145B2 (en) 2006-07-31 2015-02-24 Ricoh Co., Ltd. Mixed media reality recognition using multiple specialized indexes
US8249344B2 (en) * 2005-07-01 2012-08-21 Microsoft Corporation Grammatical parsing of document visual structures
US8020091B2 (en) * 2005-07-15 2011-09-13 Microsoft Corporation Alignment and breaking of mathematical expressions in documents
US9171202B2 (en) 2005-08-23 2015-10-27 Ricoh Co., Ltd. Data organization and access for mixed media document system
US7812986B2 (en) 2005-08-23 2010-10-12 Ricoh Co. Ltd. System and methods for use of voice mail and email in a mixed media environment
US9405751B2 (en) 2005-08-23 2016-08-02 Ricoh Co., Ltd. Database for mixed media document system
KR100630200B1 (en) * 2005-08-24 2006-09-22 삼성전자주식회사 Method for operating calculator mode in the portable terminal
JP2007072718A (en) * 2005-09-06 2007-03-22 Univ Of Tokyo Handwritten mathematical expression recognizing device and recognizing method
US20100254606A1 (en) * 2005-12-08 2010-10-07 Abbyy Software Ltd Method of recognizing text information from a vector/raster image
RU2309456C2 (en) * 2005-12-08 2007-10-27 "Аби Софтвер Лтд." Method for recognizing text information in vector-raster image
AU2007215162A1 (en) 2006-02-10 2007-08-23 Nokia Corporation Systems and methods for spatial thumbnails and companion maps for media objects
CN101443787B (en) 2006-02-17 2012-07-18 徕美股份公司 Method and system for verification of uncertainly recognized words in an OCR system
US9384619B2 (en) 2006-07-31 2016-07-05 Ricoh Co., Ltd. Searching media content for objects specified using identifiers
US9721157B2 (en) * 2006-08-04 2017-08-01 Nokia Technologies Oy Systems and methods for obtaining and using information from map images
WO2009075689A2 (en) 2006-12-21 2009-06-18 Metacarta, Inc. Methods of systems of using geographic meta-metadata in information retrieval and document displays
US7885456B2 (en) * 2007-03-29 2011-02-08 Microsoft Corporation Symbol graph generation in handwritten mathematical expression recognition
US8116570B2 (en) * 2007-04-19 2012-02-14 Microsoft Corporation User interface for providing digital ink input and correcting recognition errors
US8009915B2 (en) * 2007-04-19 2011-08-30 Microsoft Corporation Recognition of mathematical expressions
US9373029B2 (en) 2007-07-11 2016-06-21 Ricoh Co., Ltd. Invisible junction feature recognition for document security or annotation
US9530050B1 (en) 2007-07-11 2016-12-27 Ricoh Co., Ltd. Document annotation sharing
US8156115B1 (en) 2007-07-11 2012-04-10 Ricoh Co. Ltd. Document-based networking with mixed media reality
US8176054B2 (en) * 2007-07-12 2012-05-08 Ricoh Co. Ltd Retrieving electronic documents by converting them to synthetic text
US8073258B2 (en) * 2007-08-22 2011-12-06 Microsoft Corporation Using handwriting recognition in computer algebra
US20090245646A1 (en) * 2008-03-28 2009-10-01 Microsoft Corporation Online Handwriting Expression Recognition
US8121412B2 (en) * 2008-06-06 2012-02-21 Microsoft Corporation Recognition of tabular structures
US20100115403A1 (en) * 2008-11-06 2010-05-06 Microsoft Corporation Transforming math text objects using build down and build up
US20100166314A1 (en) * 2008-12-30 2010-07-01 Microsoft Corporation Segment Sequence-Based Handwritten Expression Recognition
JP4775462B2 (en) * 2009-03-12 2011-09-21 カシオ計算機株式会社 Computer and program
JP5471126B2 (en) * 2009-07-31 2014-04-16 カシオ計算機株式会社 Electronic device and program
US8571270B2 (en) 2010-05-10 2013-10-29 Microsoft Corporation Segmentation of a word bitmap into individual characters or glyphs during an OCR process
US8751550B2 (en) 2010-06-09 2014-06-10 Microsoft Corporation Freeform mathematical computations
JP5790070B2 (en) * 2010-08-26 2015-10-07 カシオ計算機株式会社 Display control apparatus and program
CN103250149B (en) * 2010-12-07 2015-11-25 Sk电信有限公司 Method for extracting semantic distance from mathematical statements and classifying mathematical statements by semantic distance, and device therefor
JP5267546B2 (en) 2010-12-22 2013-08-21 カシオ計算機株式会社 Electronic computer and program with handwritten mathematical expression recognition function
US8943113B2 (en) * 2011-07-21 2015-01-27 Xiaohua Yi Methods and systems for parsing and interpretation of mathematical statements
US9058331B2 (en) 2011-07-27 2015-06-16 Ricoh Co., Ltd. Generating a conversation in a social network based on visual search results
US9208218B2 (en) * 2011-10-19 2015-12-08 Zalag Corporation Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results
US9600587B2 (en) 2011-10-19 2017-03-21 Zalag Corporation Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results
CN104067292B (en) * 2012-01-23 2017-05-03 微软技术许可有限责任公司 Formula detection engine
JP5950700B2 (en) * 2012-06-06 2016-07-13 キヤノン株式会社 Image processing apparatus, image processing method, and program
CN103679129A (en) * 2012-09-21 2014-03-26 中兴通讯股份有限公司 Method and device for identifying object in image
JP2014127188A (en) * 2012-12-27 2014-07-07 Toshiba Corp Shaping device and method
US9330070B2 (en) 2013-03-11 2016-05-03 Microsoft Technology Licensing, Llc Detection and reconstruction of east asian layout features in a fixed format document
JP2014203393A (en) * 2013-04-09 2014-10-27 株式会社東芝 Electronic apparatus, handwritten document processing method, and handwritten document processing program
CN103996055B (en) * 2014-06-13 2017-06-09 上海珉智信息科技有限公司 Classifier-based recognition method in an electronic data recognition system for image files
RU2596600C2 (en) 2014-09-02 2016-09-10 Общество с ограниченной ответственностью "Аби Девелопмент" Methods and systems for processing images of mathematical expressions
US10025976B1 (en) * 2016-12-28 2018-07-17 Konica Minolta Laboratory U.S.A., Inc. Data normalization for handwriting recognition
CN108038441A (en) * 2017-12-07 2018-05-15 庞军良 System and method based on image recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL109268A (en) * 1994-04-10 1999-01-26 Advanced Recognition Tech Pattern recognition method and system

Also Published As

Publication number Publication date
US7181068B2 (en) 2007-02-20
DE60224128D1 (en) 2008-01-31
EP1239406A2 (en) 2002-09-11
US20020126905A1 (en) 2002-09-12
EP1239406A3 (en) 2005-03-16
EP1239406B1 (en) 2007-12-19
JP2002269499A (en) 2002-09-20
JP4181310B2 (en) 2008-11-12

Legal Events

Date Code Title Description
8364 No opposition during term of opposition
8320 Willingness to grant licences declared (paragraph 23)