DE60224128T2

DE60224128T2 - Apparatus and method for recognizing characters and mathematical expressions

Info

Publication number: DE60224128T2
Application number: DE60224128T
Authority: DE
Inventors: Masakazu Suzuki (Fukuoka-shi); Kazuaki Yokota (2016 Shimmachi 9-chome, Ome-shi); Yuko Eto (Ome-shi)
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2001-03-07
Filing date: 2002-03-05
Publication date: 2008-12-04
Anticipated expiration: 2022-03-06
Also published as: EP1239406A3; US7181068B2; EP1239406A2; US20020126905A1; JP4181310B2; JP2002269499A; EP1239406B1; DE60224128D1

Description

This invention relates to an apparatus and a method for recognizing mathematical expressions, and to an apparatus and a method for recognizing characters, that can be used to recognize a document image containing mathematical expressions.

Reports on character recognition for printed matter containing mathematical expressions, and on recognition of the structure of mathematical expressions, have appeared for some time, although such reports are not numerous. The characters to be recognized are not necessarily arranged one-dimensionally. More often than not they are arranged two-dimensionally, as with subscripts, superscripts, fractions and so on. A recognition device must therefore determine not only the characters contained in or relating to a mathematical expression but also the structure (positional information) of the expression, so as to know whether each character is arranged as a subscript, a superscript, a denominator, a numerator or something else. Consequently, recognizing a mathematical expression by computer takes much longer than processing ordinary characters.

Reports of results that have made it possible to recognize the structure of a mathematical expression within a practical processing time include Documents [1], [2] and [3] listed below. According to these documents, a rule for determining the positional relationship of the characters in a mathematical expression that includes superscripts and subscripts is defined, and each character is judged to be an ordinary character, a subscript, a superscript, a denominator, a numerator or something else according to its position by referring to the rule, so as to recognize the structure of the mathematical expression.

Document [1]: Masayuki Okamoto, Hashim Msafire Twaayondo, "Structure Recognition of Mathematical Expressions Using Peripheral Distribution Features", Transaction for the Institute of Electronics, Information and Communication, D-II, Vol. J78-D-II, No. 2, pp. 366-370 (1995).
Document [2]: Masayuki Okamoto, Hiroyuki Azuma, "Recognition of Mathematical Expressions with Emphasis on the Layout of Signs", Transaction for the Institute of Electronics, Information and Communication, D-II, Vol. J78-D-II, No. 3, pp. 434-482 (1995).
Document [3]: R. J. Fateman, T. Tokuyasu, B. P. Berman and N. Mitchell, "Optical Character Recognition and Parsing of Typeset Mathematics", Journal of Visual Communication and Image Representation, Vol. 7, No. 1, pp. 2-15 (1995).

In the prior art, however, including the known techniques of the documents listed above, each character is judged to be an ordinary character, a subscript, a superscript, a denominator, a numerator or something else on the basis of its local positional characteristics. Consequently, if the position of a character is misjudged, all subsequent judgments are significantly and adversely affected. For example, if an ordinary character is misinterpreted as a subscript, all the characters that follow it on the same level are likewise misinterpreted as subscripts. In short, a local recognition error in a mathematical expression can severely distort the recognition of its overall structure.

Moreover, the known techniques of the documents listed above relate only to character recognition within a mathematical expression and disclose no technique for detecting a mathematical expression in a text.

An article, "Incorporating Syntactic Constraints in Recognising Handwritten Sentences", by Srihari et al., Center for Document Analysis and Recognition, State University of New York, discloses a method of linguistic analysis that assigns probabilities to candidate recognition results and gives priority to the most probable candidate. A dictionary containing probabilities for combining characters and words is used.

Another article, "Computing Graphs and Graph Transformations", by Blostein et al., Software Practice & Experience, Vol. 29, No. 3, John Wiley & Sons, pp. 197-217, March 1999, discloses graph modifications involving graph reduction, graph rewriting and graph transformation. The method is used to analyze the use of optical character recognition in a mathematics recognizer. One process creates annotations, which are analyzed for practical spatial relationships. The relationships are represented in a graph.

Therefore, if the position of a character is misjudged, all subsequent judgments are significantly and adversely affected. For example, if an ordinary character is misjudged as a subscript, all the ordinary characters arranged after it on the same level as the misjudged character are likewise misjudged as subscripts. In short, a local recognition error in a mathematical expression can severely damage the recognition of its overall structure.

Another article, "Structure Analysis and Recognition of Mathematical Expressions", by Twaakyondo et al., Department of Information Engineering, Shinshu University, Japan, discloses the recognition of printed mathematical expressions using two-dimensional structure analysis that employs bottom-up and top-down strategies. Before processing, individual symbols are recognized and the normal size of a central symbol is estimated. The text is then subjected to root expression analysis, and the heading to heading expression analysis and matrix expression analysis. The analysis of ordinary, non-mathematical text is not considered.

The present invention is directed to an apparatus according to claim 1 and a method according to claim 6. Preferred embodiments are set out in the dependent claims.

The invention can be more fully understood from the following detailed description when taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram of an OCR (optical character recognition) system according to an embodiment of the present invention;

FIG. 2 is a flowchart of the operation of detecting a mathematical expression by the embodiment of FIG. 1;

FIG. 3 is a schematic illustration of the operation of evaluating a mathematical expression/text that is carried out by the embodiment of FIG. 1 to detect a mathematical expression;

FIG. 4 is a schematic illustration of the dictionary for judging a mathematical expression/text that is used by the embodiment of FIG. 1;

FIG. 5 is a schematic illustration of the operation of searching for an optimal path that is used by the embodiment of FIG. 1 to detect a mathematical expression;

FIG. 6 is a schematic illustration of the part-of-speech connection dictionary that is used by the embodiment of FIG. 1;

FIG. 7 is a flowchart of the operation of recognizing a mathematical expression by the embodiment of FIG. 1;

FIG. 8 is a schematic illustration of the operation of decomposing a mathematical expression that is carried out by the embodiment of FIG. 1 to recognize the mathematical expression;

FIG. 9 is a schematic illustration of the operation of detecting candidate characters that is used by the embodiment of FIG. 1 to recognize a mathematical expression;

FIG. 10 is a schematic illustration of the operation of computing the normalization size and the normalization center that is used by the embodiment of FIG. 1 to recognize a mathematical expression;

FIGS. 11A, 11B, 11C and 11D are scatter diagrams that can be used by the embodiment of FIG. 1;

FIG. 12 is a schematic illustration of link candidates that can be generated between two consecutive characters by the embodiment of FIG. 1;

FIG. 13 is a schematic illustration of the operation of searching for an optimal path that is used by the embodiment of FIG. 1 to recognize a mathematical expression; and

FIGS. 14A, 14B and 14C are schematic illustrations of the conditions to be satisfied in computing a global evaluation that is used by the embodiment of FIG. 1 to recognize a mathematical expression.

An embodiment of an apparatus and a method for recognizing mathematical expressions, and of an apparatus and a method for recognizing characters, in accordance with the present invention will now be described with reference to the accompanying drawings.

FIG. 1 is a block diagram of a character recognition system realized by using the embodiment of the present invention. The character recognition (optical character recognition: OCR) system 11 is designed to recognize a printed document 8 that contains mathematical expressions. Such a printed document 8 is typically a scientific or technical document. The system 11 reads the printed document 8 by means of a scanner 10 and performs a processing operation to recognize each of the text regions and each of the mathematical expression regions in the document. The system 11 then outputs electronic document data containing text data and mathematical expression data as recognition result data 20. Documents that can be read by such a system include not only printed documents but also document images containing mathematical expressions that have already been converted to image data.

The OCR system 11 is realized as software, that is, a computer program executed by a computer. As functional modules it typically comprises a layout analysis section 111, an ordinary character recognition section 112, a mathematical expression detection section 113, a mathematical expression recognition section 114, an output conversion section 115, a mathematical expression/text judging dictionary 201, a part-of-speech connection dictionary 202, a scatter diagram information storage section 203 and a global evaluation information storage section 204. The dictionaries and the information of the storage sections are stored in one or more storage media such as semiconductor memories and/or magnetic disks.

The processing operation for recognizing a document proceeds in the sequence of 1) scanning the document image, 2) analyzing the layout, 3) recognizing ordinary characters, 4) detecting mathematical expressions, 5) recognizing mathematical expressions and 6) converting the obtained data into electronic output data. The processing operation is described below with particular regard to the methods for carrying out step 4), detecting mathematical expressions, and step 5), recognizing mathematical expressions.

Before the steps of detecting and recognizing mathematical expressions are described in detail, the flow of the processing operation is summarized below.

First, a page image of the printed document is obtained when the scanner 10 reads the printed document 8 containing mathematical expressions. The layout analysis section 111 then analyzes the layout of each page image and divides it into one or more graphic regions, one or more table regions and one or more text regions. The image data of the graphic regions and table regions are output without further processing. The ordinary character recognition section 112 recognizes the ordinary characters in the text regions. Ordinary characters are recognized by splitting lines and cutting out characters on the basis of a histogram and then recognizing each character. The mathematical expression detection section 113 then detects the mathematical expressions on the basis of the result of this character recognition operation.
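The histogram-based line splitting and character cutting mentioned above can be illustrated with a minimal sketch. It assumes a binarized page given as rows of 0 (white) and 1 (black) pixels and uses plain projection histograms; the function names split_lines and split_characters are hypothetical, and the patent does not specify the histogram procedure at this level of detail.

```python
def split_lines(image):
    """Return (start, end) row ranges of text lines in a binary image.

    `image` is a list of rows; each row is a list of 0 (white) / 1 (black).
    The end index is exclusive.
    """
    # Horizontal projection histogram: number of black pixels per row.
    histogram = [sum(row) for row in image]
    lines, start = [], None
    for y, count in enumerate(histogram):
        if count > 0 and start is None:
            start = y                    # a run of inked rows begins
        elif count == 0 and start is not None:
            lines.append((start, y))     # the run ends at the first blank row
            start = None
    if start is not None:                # run reaches the bottom edge
        lines.append((start, len(histogram)))
    return lines


def split_characters(image, line):
    """Cut one text line into column ranges using a vertical histogram."""
    top, bottom = line
    rows = image[top:bottom]
    # Transpose the line so that columns become rows, then reuse split_lines.
    columns = [list(col) for col in zip(*rows)]
    return split_lines(columns)


page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
]
for line in split_lines(page):
    print(line, split_characters(page, line))
```

Blank rows separate lines and blank columns separate characters, which is why touching or overlapping symbols (common in mathematical expressions) need the more elaborate treatment described later in the document.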

The mathematical expression/text judging dictionary 201 and the part-of-speech connection dictionary 202 are used by the mathematical expression detection section 113 to detect mathematical expressions. The judging dictionary 201 defines, for each word, an evaluation score for the possibility that the word belongs to a text and an evaluation score for the possibility that it belongs to a mathematical expression, on the basis of the category of the word, which can be specified by means of regular expressions. Both scores are thus determined word by word by referring to the judging dictionary 201.
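A judging dictionary of this kind might be sketched as a list of category patterns, each carrying a text score and a mathematical expression score. The sketch assumes that word categories are given as regular expressions; the patterns, the score values and the names JUDGING_DICTIONARY and score_word are all hypothetical and are not taken from the patent.

```python
import re

# Hypothetical entries: (category pattern, text score, math score).
# Entries are tried in order; the first full match wins.
JUDGING_DICTIONARY = [
    (re.compile(r"[A-Za-z]{2,}"), 0.9, 0.1),    # ordinary word
    (re.compile(r"[A-Za-z]"), 0.3, 0.7),        # single letter, e.g. a variable
    (re.compile(r"\d+"), 0.4, 0.6),             # number
    (re.compile(r"[=+\-*/^<>]"), 0.05, 0.95),   # operator symbol
]
DEFAULT_SCORES = (0.5, 0.5)  # assumed fallback for uncategorized words


def score_word(word):
    """Return (text_score, math_score) for one recognized word."""
    for pattern, text_score, math_score in JUDGING_DICTIONARY:
        if pattern.fullmatch(word):
            return text_score, math_score
    return DEFAULT_SCORES


for word in ["where", "x", "=", "42"]:
    print(word, score_word(word))
```

Keeping two scores per word, rather than a hard text/math decision, leaves the final assignment to a later step that can take the neighbouring words into account.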

The part-of-speech connection dictionary 202 defines a formative grammar of texts and mathematical expressions. More specifically, it defines rules for connecting the parts of speech of texts and mathematical expressions. Each recognized character is thus assigned to a mathematical expression region or a text region by determining an optimal connection relationship for the group of recognized characters containing the word, on the basis of the formative grammar provided by the part-of-speech connection dictionary 202 and the text and mathematical expression evaluation scores obtained by referring to the judging dictionary 201.
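One way to picture the search for an optimal connection is a dynamic-programming (Viterbi-style) pass over the words. The sketch below is a simplification under stated assumptions: it uses only two labels, "text" and "math", in place of real parts of speech, multiplies per-word scores by hypothetical connection weights, and is not claimed to be the grammar or search actually used in the patent. The name best_labeling and all numeric values are illustrative.

```python
def best_labeling(word_scores, connection):
    """Pick the highest-scoring text/math label sequence.

    word_scores: list of dicts {"text": score, "math": score}, one per word.
    connection:  dict mapping (previous_label, current_label) to a weight.
    """
    labels = ("text", "math")
    # best[label] = (score of the best path ending in label, that path)
    best = {lab: (word_scores[0][lab], [lab]) for lab in labels}
    for scores in word_scores[1:]:
        step = {}
        for cur in labels:
            candidates = [
                (best[prev][0] * connection[(prev, cur)] * scores[cur],
                 best[prev][1] + [cur])
                for prev in labels
            ]
            step[cur] = max(candidates, key=lambda c: c[0])
        best = step
    return max(best.values(), key=lambda c: c[0])[1]


# Hypothetical scores for the words: where, x, =, 2, holds
scores = [
    {"text": 0.9, "math": 0.1},
    {"text": 0.3, "math": 0.7},
    {"text": 0.05, "math": 0.95},
    {"text": 0.4, "math": 0.6},
    {"text": 0.9, "math": 0.1},
]
# Hypothetical connection weights: staying in a region is favoured.
weights = {
    ("text", "text"): 0.7, ("math", "math"): 0.7,
    ("text", "math"): 0.3, ("math", "text"): 0.3,
}
print(best_labeling(scores, weights))
```

Because the whole sequence is scored at once, a single ambiguous word such as a digit is pulled toward the region of its neighbours instead of being decided in isolation.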

All characters and symbols belonging to mathematical expression regions are sent to the mathematical expression recognition section 114 and subjected to a processing operation for recognizing the structure of each mathematical expression. In this operation, the mathematical expression is decomposed into elements, and each element is checked to determine whether it is a character on a baseline, a subscript or a superscript. A plurality of character size scatter diagrams stored in the scatter diagram information storage section 203 and the conditions for global evaluation stored in the global evaluation information storage section 204, both described in more detail below, are used in this check. Two consecutive characters may lie on the same baseline, or one of them may be a subscript or a superscript of the other. A character size scatter diagram, which provides sample information, shows the normalization sizes of consecutive characters and the distribution of their possible center positions. Inter-character structure candidates and link candidates carrying their respective evaluation scores are thus obtained for any two consecutive characters in a mathematical expression in order to determine their relationship, which may be horizontal juxtaposition (on the same baseline), subscript or superscript.
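The idea of scoring link candidates between two consecutive characters from their normalized sizes and center positions can be sketched as follows. This is only an illustration: instead of the sampled scatter diagrams, it compares each pair against one hypothetical prototype point per relationship and turns the distance into a score. The prototype values, the scoring formula and the name link_candidates are assumptions.

```python
import math

# Hypothetical prototypes: (relative size h2/h1, vertical center offset / h1).
# A positive offset means the second character sits higher than the first.
PROTOTYPES = {
    "horizontal": (1.0, 0.0),
    "superscript": (0.7, 0.45),
    "subscript": (0.7, -0.45),
}


def link_candidates(first, second):
    """Rank the possible relationships between two consecutive characters.

    first/second: dicts with normalized height "h" and center height "cy"
    (y axis pointing up). Returns (relation, score) pairs, best first.
    """
    ratio = second["h"] / first["h"]
    offset = (second["cy"] - first["cy"]) / first["h"]
    candidates = []
    for relation, (p_ratio, p_offset) in PROTOTYPES.items():
        distance = math.hypot(ratio - p_ratio, offset - p_offset)
        candidates.append((relation, 1.0 / (1.0 + distance)))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


base = {"h": 10.0, "cy": 5.0}
raised = {"h": 7.0, "cy": 9.5}
print(link_candidates(base, raised))
```

All three candidates are kept with their scores rather than only the best one, so that a later global evaluation can still overrule a locally plausible but inconsistent choice.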

The conditions of global evaluation are expressed as conditional mathematical expressions for appropriately determining the inter-character structure based on a global evaluation of all the characters contained in a mathematical expression. By using the global evaluation while referring to the local inter-character relationships, it is possible to find an optimal path that relates the characters in a mathematical expression to one another consistently, without any contradiction.

The output conversion section 115 combines the recognition results obtained for the text regions and the mathematical expression regions and outputs data on the recognition results 20.

Mathematical expression detection method

A specific method for detecting a mathematical expression will now be described.

In this embodiment, as shown in FIG. 2, a mathematical expression region is detected by a mathematical expression detection method comprising two steps, step A1 and step A2. This detection method is basically designed to detect a mathematical expression in a document written in English.

In step A1, each word is evaluated as either a mathematical expression (Math) or text (Text) based on ordinary character recognition. A word, as used here, refers to a character string that is separated from other characters by spaces and is detected as a result of character recognition. FIG. 3 shows this procedure.

Referring to FIG. 3, the first row shows an image input to the system 11 (original image). The second row shows the recognition result obtained by the ordinary character recognition operation of the ordinary character recognition section 112. Since the features used for recognizing a mathematical expression are not available to an operation for recognizing ordinary characters, the ordinary character recognition section 112 in this embodiment recognizes a mathematical expression merely as an unexpected string of symbols. In step A1, the recognition result is input, and each of the words in it is evaluated as either a mathematical expression or text. The values listed to the right of the headings Math and Text are the evaluation scores given to each word. In this embodiment, this processing operation is performed by referring to the above-described dictionary for judging a mathematical expression/text 201. FIG. 4 shows examples of data that can be stored in the dictionary for judging a mathematical expression/text 201.

Referring to FIG. 4, the row with line number 1 shows that the word "with" is a preposition (PP) and has an evaluation score of 0 as Math (mathematical expression) and 100 as Text (text). The row with line number 2 shows that the word "where" is a pronoun (PN) and likewise has an evaluation score of 0 as Math and 100 as Text. The row with line number 3 shows that the word "is" is a verb (V) and has an evaluation score of 70 as Math and 70 as Text. The row with line number 4 shows that the word "a" is an article (ART) and has an evaluation score of 90 as Math and 90 as Text. In this way, the dictionary for judging a mathematical expression/text 201 stores practically all the words that may appear in scientific and technical documents, together with their spellings (sequences of character codes), their parts of speech, and the evaluation scores given to them as mathematical expressions and as text.

In addition, because mathematical expressions may be recognized as all sorts of unexpected strings of symbols, this embodiment is arranged to accommodate various symbol strings flexibly by means of regular expressions. Regular expressions are used in character search systems to express the spellings of words in a flexible manner. The meanings of some of the symbols used in regular expressions are shown below.

. represents any single character.
* represents 0 or more repetitions of the immediately preceding character (e.g., .* represents any character string).
[] represents any one of the characters specified inside the square brackets (e.g., [a-z] represents any alphabetic character from "a" through "z").
^ represents characters other than those specified after it (e.g., [^a-z] represents any character other than "a" through "z").
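The four symbols above behave the same way in most modern regular expression engines. The following is a minimal sketch using Python's standard `re` module; the helper name `matches` is introduced here only for illustration and is not part of the described apparatus.

```python
import re


def matches(pattern: str, word: str) -> bool:
    """Return True if the whole word matches the regular expression."""
    return re.fullmatch(pattern, word) is not None


# (pattern, word, expected result)
examples = [
    (r".", "x", True),             # "." matches any single character
    (r".", "xy", False),           # ...but exactly one
    (r".*", "", True),             # "*" allows zero repetitions
    (r".*", ")=,\\", True),        # ".*" matches any character string
    (r"[a-z]", "q", True),         # one character from "a" through "z"
    (r"[a-z]", "Q", False),
    (r"[^a-z]", "(", True),        # one character other than "a" through "z"
    (r"[^a-z]", "f", False),
    (r".*[^a-z].*", "f(x", True),  # a word containing at least one such character
]

for pattern, word, expected in examples:
    assert matches(pattern, word) == expected
```

Whole-word matching (`re.fullmatch`) is used because each dictionary spelling is meant to describe an entire word rather than a substring.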

Accordingly, the row with line number 5 in FIG. 4 shows that the word ".*[^a-z].*" includes one character other than "a" through "z", that is, a symbol. It is a noun (N) and has an evaluation score of 100 as Math and 70 as Text. Similarly, the row with line number 6 shows that the word ".*[^a-z].*[^a-z].*" includes two characters other than "a" through "z", that is, one or two symbols. It is a noun (N) and has an evaluation score of 100 as Math and 40 as Text. The row with line number 7 shows that the word ".*[^a-z].*[^a-z].*[^a-z].*" includes three characters other than "a" through "z", that is, one, two, or three symbols. It is a noun (N) and has an evaluation score of 100 as Math and 20 as Text. The row with line number 8 shows that the word ".*" is a word formed by a single alphabetic character selected from "a" through "z". It is a noun (N) and has an evaluation score of 90 as Math and 40 as Text. Note that a noun (N) indicates that the word is text.

Thus, by referring line by line to the dictionary for judging a mathematical expression/text 201 as shown in FIG. 4, it is possible to determine the part of speech and the evaluation scores as Math and as Text of each word recognized as a result of the character recognition operation.

More specifically, as shown in FIG. 3, an evaluation score of 0 as Math and 100 as Text are given to the word "with" based on the knowledge obtained by referring to line number 1 of FIG. 4. An evaluation score of 90 as Math and 40 as Text are given to the word "f" based on the knowledge obtained by referring to line number 8 of FIG. 4. An evaluation score of 100 as Math and 20 as Text are given to the three-character word "(,\" based on the knowledge obtained by referring to line number 7 of FIG. 4. An evaluation score of 100 as Math and 20 as Text are given to the four-character word ")=,\" based on the knowledge obtained by referring to the dictionary for judging a mathematical expression/text 201, although such an arrangement of characters is not shown in FIG. 4. Similarly, an evaluation score of 0 as Math and 100 as Text are given to the word "where" based on the knowledge obtained by referring to line number 2 of FIG. 4. An evaluation score of 90 as Math and 40 as Text are given to the word "U" based on the knowledge obtained by referring to line number 8 of FIG. 4. An evaluation score of 70 as Math and 70 as Text are given to the word "is" based on the knowledge obtained by referring to line number 3 of FIG. 4. An evaluation score of 90 as Math and 90 as Text are given to the last word "a" based on the knowledge obtained by referring to line number 4 of FIG. 4, because the knowledge of line number 4 has priority over that of line number 8.
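The lookup described above can be sketched as an ordered list of dictionary entries that is scanned until the first spelling matches. This is only an illustration under stated assumptions, not the apparatus's actual data structure: the names `Entry`, `DICTIONARY`, and `scores` are hypothetical, the ordering of the entries is assumed (ordinary words first so that line number 4 takes priority over line number 8), and the spelling of line number 8 is replaced by the assumed pattern `[A-Za-z]` so that both "f" and "U" count as single letters. In this sketch the four-character word ")=,\" simply falls under the pattern of line number 7, whereas the text says it is covered by an entry not shown in FIG. 4.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    pattern: str  # spelling, written as a regular expression
    pos: str      # part of speech
    math: int     # evaluation score as Math
    text: int     # evaluation score as Text


# Scores follow the description of FIG. 4; order and row-8 pattern are ASSUMED.
DICTIONARY = [
    Entry(r"with", "PP", 0, 100),                        # line number 1
    Entry(r"where", "PN", 0, 100),                       # line number 2
    Entry(r"is", "V", 70, 70),                           # line number 3
    Entry(r"a", "ART", 90, 90),                          # line number 4
    Entry(r"[A-Za-z]", "N", 90, 40),                     # line number 8 (assumed pattern)
    Entry(r".*[^a-z].*[^a-z].*[^a-z].*", "N", 100, 20),  # line number 7
    Entry(r".*[^a-z].*[^a-z].*", "N", 100, 40),          # line number 6
    Entry(r".*[^a-z].*", "N", 100, 70),                  # line number 5
]


def scores(word: str) -> Optional[Tuple[int, int]]:
    """Return (Math score, Text score) of the first matching entry, or None."""
    for entry in DICTIONARY:
        if re.fullmatch(entry.pattern, word):
            return entry.math, entry.text
    return None


for word in ["with", "f", "(,\\", ")=,\\", "where", "U", "is", "a"]:
    print(word, scores(word))
```

Scanning in a fixed order is one simple way to express the stated priority of line number 4 over line number 8 for the word "a".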

Then, in the next step, step A2, a processing operation is performed to search for an optimal path based on the evaluation scores and the permitted connections between words. FIG. 5 schematically illustrates this operation. In step A2, the dictionary for connecting parts of speech 202 is used because it shows which part of speech can be connected to which part of speech in text and which part of speech in text can be connected to a mathematical expression. FIG. 6 shows examples of data that can be stored in the dictionary for connecting parts of speech 202.

Referring to FIG. 6, "Text PP → Math" in the first row shows that a preposition (PP) in text can be connected to an immediately following mathematical expression. "Math → Math" in the second row shows that two or more mathematical expressions can be connected. "Math → Text PN" in the third row shows that a mathematical expression can be connected to an immediately following pronoun (PN) in text. "Text PN → Math" in the fourth row shows that a pronoun (PN) in text can be connected to an immediately following mathematical expression. "Text ART → Text N" in the fifth row shows that an article (ART) in text can be connected to an immediately following noun (N) in the text.

The dictionary for connecting parts of speech 202 stores all the combinations that are permitted to be connected. Any other combination is not permitted.

In the search for an optimal path, either Math or Text is chosen for each word with reference to its evaluation scores, and only the permitted connections between the words are traced in accordance with the rules of the formal grammar defined in the dictionary for connecting parts of speech 202. Then, the path showing the largest sum of the evaluation scores given to the words as mathematical expressions and text is selected from all the possible connection paths. For example, the word "with" in FIG. 5 can be connected to the immediately following word "f" in four ways: "with" may be Math and "f" either Math or Text, or "with" may be Text and "f" either Math or Text. However, the path from Text "with" to Math "f" is selected because the sum of the evaluation scores of this combination is the largest. FIG. 5 shows that an optimal path of Text, Math, Math, Math, Text, Math, Text, and Text is selected and traced to connect the eight words beginning with the first word "with" and ending with the last word "a".
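To make the selection rule concrete, the sketch below enumerates every Math/Text assignment, discards those that use a connection not listed for FIG. 6, and keeps the assignment with the largest score sum. It is an illustration only: the names `WORDS`, `ALLOWED`, and `best_path` are hypothetical, and the example is limited to the first six words of FIG. 5 because the connection rules needed for "is" and "a" are not among the five rows described for FIG. 6.

```python
from itertools import product

# (word, part of speech as Text, Math score, Text score), as described for FIG. 3.
WORDS = [
    ("with", "PP", 0, 100),
    ("f", "N", 90, 40),
    ("(,\\", "N", 100, 20),
    (")=,\\", "N", 100, 20),
    ("where", "PN", 0, 100),
    ("U", "N", 90, 40),
]

# The five connection rules described for FIG. 6.
ALLOWED = {
    ("Text PP", "Math"),
    ("Math", "Math"),
    ("Math", "Text PN"),
    ("Text PN", "Math"),
    ("Text ART", "Text N"),
}


def best_path(words, allowed):
    """Try every Math/Text assignment and return (best sum, assignment)."""
    best_total, best_kinds = -1, None
    for kinds in product(("Math", "Text"), repeat=len(words)):
        labels = ["Math" if kind == "Math" else f"Text {word[1]}"
                  for kind, word in zip(kinds, words)]
        # Skip assignments that contain a connection not in the dictionary.
        if any(pair not in allowed for pair in zip(labels, labels[1:])):
            continue
        total = sum(word[2] if kind == "Math" else word[3]
                    for kind, word in zip(kinds, words))
        if total > best_total:
            best_total, best_kinds = total, kinds
    return best_total, best_kinds


print(best_path(WORDS, ALLOWED))
```

For these six words the result is the assignment Text, Math, Math, Math, Text, Math with a sum of 580, which agrees with the first six entries of the path given for FIG. 5. Exhaustive enumeration grows as 2^n with the number of words, so it is practical only for short examples.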

This search algorithm can be implemented by means of a beam search technique (also referred to as a breadth-first search). Beam search is a well-known algorithm in the field of dynamic programming. It is designed to eliminate paths judged unlikely to be selected as the optimal path, thereby reducing the size of the space to be searched as well as the amount of computation and the memory capacity required to find an optimal path.
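A beam search over the Math/Text choices might look like the following sketch. It extends partial paths word by word and, after each word, keeps only the `beam_width` highest-scoring partial paths. The function name, the `beam_width` parameter, and the data layout are assumptions made for illustration; the word scores and connection rules are the ones described for FIG. 3 and FIG. 6, restricted to the first six words of FIG. 5.

```python
# (part of speech as Text, Math score, Text score) for "with", "f", "(,\",
# ")=,\", "where", "U", following the description of FIG. 3.
WORD_SCORES = [
    ("PP", 0, 100),
    ("N", 90, 40),
    ("N", 100, 20),
    ("N", 100, 20),
    ("PN", 0, 100),
    ("N", 90, 40),
]

# The five connection rules described for FIG. 6.
ALLOWED = {
    ("Text PP", "Math"),
    ("Math", "Math"),
    ("Math", "Text PN"),
    ("Text PN", "Math"),
    ("Text ART", "Text N"),
}


def beam_search(word_scores, allowed, beam_width=2):
    """Return (score sum, labels) of the best surviving path, or None."""
    beam = [(0, [])]  # each item: (score sum so far, labels chosen so far)
    for pos, math_score, text_score in word_scores:
        candidates = []
        for total, labels in beam:
            for label, score in (("Math", math_score), (f"Text {pos}", text_score)):
                if labels and (labels[-1], label) not in allowed:
                    continue  # connection not permitted by the dictionary
                candidates.append((total + score, labels + [label]))
        # Prune: keep only the most promising partial paths.
        candidates.sort(key=lambda item: item[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0] if beam else None


print(beam_search(WORD_SCORES, ALLOWED))
```

With a narrow beam the search can in principle discard the partial path that would have become optimal, or run out of permitted continuations; that is the price paid for the reduced computation and memory.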

As a result of the search operation described above, each of the words is judged to belong to either a mathematical expression or text, and mathematical expression regions and text regions are detected. In the case of FIG. 5, it can be seen that the four words "f", "(,\", ")=,\", and "U" are judged to belong to a mathematical expression, whereas all the other words are judged to belong to text. The regions of the image data corresponding to the words judged to belong to mathematical expressions are mathematical expression regions, whereas the regions of the image data corresponding to the words judged to belong to text are text regions.
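Once every word has a Math or Text judgment, regions can be formed from the judgments. The sketch below merges consecutive words with the same judgment into one region; merging adjacent words is an assumption made here for illustration (the text only states that the image regions of Math words are mathematical expression regions), and `group_regions` is a hypothetical helper name.

```python
from itertools import groupby


def group_regions(words, kinds):
    """Merge consecutive words that share a judgment into one region."""
    regions = []
    for kind, run in groupby(zip(words, kinds), key=lambda pair: pair[1]):
        regions.append((kind, [word for word, _ in run]))
    return regions


# The eight words of FIG. 5 and the optimal path stated in the text.
words = ["with", "f", "(,\\", ")=,\\", "where", "U", "is", "a"]
kinds = ["Text", "Math", "Math", "Math", "Text", "Math", "Text", "Text"]

for kind, members in group_regions(words, kinds):
    print(kind, members)
```

In a real system each word would carry the bounding box of its image data, and a region would be the union of the boxes of its member words.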

It should be noted that a more sophisticated formal grammar, such as a context-free grammar, can be defined to describe the connection relationships used for checking the connections of parts of speech for the purposes of this invention. Such a grammar corresponds to a regular grammar but is more advanced than the latter.

Conventional systems of the type under consideration use simple rules for detecting mathematical expressions. For example, when one or more parentheses and/or italic letters are recognized in a group of words, they normally determine that the group of words is a mathematical expression. Such systems normally cannot handle the various symbols included in a recognized mathematical expression. When a word "a" is found in text, such a system cannot determine whether it is the English indefinite article or a character in a mathematical expression. In contrast, this embodiment can determine more accurately whether each word belongs to a mathematical expression or text by referring to the evaluation scores of the word in the manner described above. In addition, since the embodiment uses a formal grammar to check each word, it can determine that a word "a" not immediately followed by a noun belongs to a mathematical expression, based on the rule that only a noun can directly follow "a" when the latter is an English article.

(Mathematical expression recognition method)

To recognize a mathematical expression, it is necessary, in addition to recognizing the characters, to use a technique for checking the structure of the expression with respect to indices, exponents, denominators/numerators, and so on. In this sense, the recognition of a mathematical expression is more complicated than the recognition of ordinary characters. In this embodiment, characters are recognized by means of a known character recognition technique, whereas the structure of a mathematical expression is checked by a method comprising four steps, step B1, step B2, step B3, and step B4, which will be described below with reference to FIG. 7.

In step B1, fraction bars, root signs, and so on are detected in the mathematical expression regions, and denominators, numerators, and the terms inside root signs are isolated. Similarly, any indices, accent marks, root signs, dots, and so on are detected and deleted from the image data of the mathematical expression regions.

For example, if the mathematical expression shown in FIG. 8 is contained in a mathematical expression region detected in the manner described above, it is decomposed into four mathematical expression components, which are surrounded by broken lines in FIG. 8. Then, the left index is deleted from the mathematical expression component (³a → a), and the accent mark "^" or "~" is deleted from the mathematical expression component (xdx^ → xdx). Although not shown in FIG. 8, root signs and dots, if present, are also deleted (√a + b → a + b) (ẋ → x).

Mathematical elements such as denominators/numerators, indices, accent marks, root signs, and dots can be detected relatively accurately in accordance with the above-cited documents [1], [2], and [3]. In many cases, they can be detected by a simple method that relies on the local positional relationships of such elements. Therefore, the subsequent steps B2 through B4 can be limited to subscripts and superscripts (exponents), with the other mathematical elements listed above handled by the simple detection method, in order to reduce the time these steps require.

In the subsequent steps B2 through B4, the mathematical expression components from which fraction bars, accent marks, indices, root signs, and dots have been removed in step B1 are processed.

In step B2, connected black components are extracted from the image data of the mathematical expression components obtained in step B1 and are subjected to a character recognition process. As a result, the candidate characters shown in the lower half of FIG. 9 are obtained for the mathematical expression component of the denominator in FIG. 8. FIG. 9 shows a mathematical expression component cx²y³ whose image data is subjected to character recognition. Each character (connected black component) may be an uppercase letter, a lowercase letter, or a digit.

In step B3, the possibility of linking any two characters is examined for all candidate characters obtained, using the relationship illustrated in FIG. 10. FIG. 10 shows the values (the normalization size and the normalization center) used to determine the positional relationship of any two consecutive characters. The two characters may be arranged horizontally (on the same baseline), or one of them may be a subscript or a superscript of the other. Referring to FIG. 10, the values h1 and h2 are the normalization sizes (heights) of the characters under consideration. For characters arranged on the same line, the normalization size is the corrected size that gives them the same size (height).

Note that the normalization size here refers to the height from the highest point of letters with ascenders (e.g. the top of the letter "d") to the lowest point of letters with descenders (e.g. the bottom of the letter "y"). In other words, the value h1 is the height of the two letters "d" and "y" typed one over the other. "d" is a character whose connected black stroke extends to the top of the ascender, and "y" is a character whose connected black stroke extends to the bottom of the descender. For example, the letter "x" in FIG. 10 has a height smaller than that of the letter "d" or "y". "x" can therefore be made to yield the normalization size h1, which equals the height of "d" and "y" typed one over the other, by multiplying its actual height by a given multiplier. The multiplier that makes each character yield the normalization size is defined in advance, so that every character supplies the normalization size when its actual height is multiplied by its predefined multiplier. For example, the height of the lowercase letter "c" is extended both upward and downward, whereas the height of the uppercase letter "C" is extended only downward.

Similarly, the actual size of the character "2", which is located in an index region, is multiplied by a multiplier defined specifically for that purpose so that it yields the normalization size h2 used for indices. Since characters located in index regions have a small actual size, the normalization size h2 used for the character "2" in an index region is made smaller than the normalization size h1 used for the character "x" on the baseline.
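The normalization described in the preceding paragraphs can be sketched in Python as follows. This is only an illustration under stated assumptions: the table `EXTENSION`, its numeric values, the function name `normalize` and the image-style coordinate convention (y grows downward) are invented for the example. The text itself only states that a multiplier is defined in advance for each character, and that index-region characters use a separately defined one.

```python
# Hypothetical extension factors: fraction of the glyph's own height that is
# added above and below it. The numbers are invented for illustration only.
EXTENSION = {
    "c": (0.5, 0.5),  # lowercase "c": extended upward and downward
    "C": (0.0, 0.5),  # uppercase "C": extended downward only
}


def normalize(char, top, bottom):
    """Return (normalization size, normalization center) for a glyph box.

    Assumes image coordinates: y grows downward, so top < bottom.
    """
    height = bottom - top
    up, down = EXTENSION[char]
    norm_top = top - up * height
    norm_bottom = bottom + down * height
    size = norm_bottom - norm_top
    center = (norm_top + norm_bottom) / 2
    return size, center


size_c, center_c = normalize("c", top=10, bottom=20)
size_C, center_C = normalize("C", top=0, bottom=20)
```

With these invented factors, the effective multiplier is 2.0 for "c" and 1.5 for "C". A real system would need one such table for baseline characters and another for characters in index regions.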

In FIG. 10, c1 and c2 denote the respective normalization centers. The normalization center is used so that all characters on the same line show the same center position with respect to height. Here, the normalization center is the midpoint in the y direction of the rectangle enclosing the normalized character (hereinafter referred to as the character rectangle). When the heights and normalization centers of two adjacent characters are h1 and c1, and h2 and c2, respectively, the character size scatter diagrams shown in FIGS. 11A to 11D are obtained by plotting the normalization size ratio H = (h2/h1) × 1000 against the normalization center offset D = {(c1 - c2)/h1} × 1000.
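The two quantities H and D can be computed directly from the formulas above. The following Python sketch does just that. The function name and the sample numbers are illustrative assumptions; the sign of D depends on the direction of the y axis, which the text does not specify.

```python
def size_and_center_ratios(h1, c1, h2, c2):
    """Compute H and D for a pair of adjacent characters.

    h1, c1: normalization size and center of the first (parent) character
    h2, c2: normalization size and center of the second (child) character
    """
    H = (h2 / h1) * 1000          # relative normalization size
    D = ((c1 - c2) / h1) * 1000   # center offset, scaled by the parent size
    return H, D


# Sample values (invented): the second character is half as tall and its
# center lies 10 units away from the center of the first character.
H, D = size_and_center_ratios(h1=20, c1=15, h2=10, c2=5)
```

Dividing both quantities by h1 makes them independent of the font size of the parent character, so pairs from differently sized expressions can be plotted in the same diagram.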

The four scatter diagrams (sample information) of FIGS. 11A to 11D show the results obtained by observing, for different character types, the relationship between the size ratio H and the center offset D for character pairs located on the same horizontal baseline, character pairs in which one character is a superscript of the other, and character pairs in which one character is a subscript of the other. Specifically, FIG. 11A shows the scatter diagram obtained when both of two consecutive characters are alphabetic characters. Alphabetic characters, as the term is used here, include ordinary alphabetic letters, Greek letters and digits. FIG. 11B shows the scatter diagram obtained when each pair of consecutive characters contains an alphabetic character and an operator. FIG. 11C shows the scatter diagram obtained when each pair of consecutive characters contains an integral sign and an alphabetic character. FIG. 11D shows the scatter diagram obtained when each pair of consecutive characters contains a Σ sign and an alphabetic character.

It is thus possible to determine the inter-character structure candidates, each of which may be a horizontal relationship, a character/subscript relationship or a character/superscript relationship, together with their evaluation scores (hereinafter referred to as link candidates), by computing the values of H and D for each pair of candidate characters obtained in step B2 and determining which polygonal region they fall into in the scatter diagram for the corresponding combination of character types. For example, if the values of H and D for a pair of consecutive characters fall in polygonal region P1 or P2 in FIG. 11A, the two characters are judged to be in a character/superscript relationship. The total evaluation score may be higher if they belong to P2 than if they belong to P1, since the number of points in region P1 is greater than that in region P2. If, on the other hand, the values fall in polygonal region P3 or P4, the two characters are judged to be in a character/subscript relationship. The total evaluation score may be higher if they belong to P4 than if they belong to P3. Finally, if the values fall in polygonal region P5 or P6, the two characters are judged to be in a horizontal relationship. The total evaluation score may be higher if they belong to P5 than if they belong to P6.
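The region lookup described above amounts to a point-in-polygon test in the (H, D) plane. The Python sketch below illustrates the idea with a standard ray-casting test. The regions in `REGIONS`, their coordinates and their scores are invented placeholders and do not reproduce the polygons P1 to P6 of FIG. 11A; the sign convention for D (positive for superscripts) is likewise an assumption.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical regions: (link type, score, polygon vertices in (H, D)).
REGIONS = [
    ("horizontal", 100, [(800, -150), (1200, -150), (1200, 150), (800, 150)]),
    ("superscript", 100, [(400, 200), (900, 200), (900, 700), (400, 700)]),
    ("subscript", 100, [(400, -700), (900, -700), (900, -200), (400, -200)]),
]


def classify(H, D, regions=REGIONS):
    """Return every (link type, score) whose region contains (H, D)."""
    return [(kind, score) for kind, score, poly in regions
            if point_in_polygon(H, D, poly)]
```

A pair whose (H, D) point falls in no region yields an empty list, which corresponds to a combination of candidate characters for which no link candidate is generated.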

FIG. 12 is a schematic illustration of the link candidates generated between pairs of consecutive characters for the mathematical expression component of FIG. 9. In FIG. 12, each link candidate shows a parent candidate character (left), a child candidate character (right), a link type and an evaluation score. Note that a link candidate is generated for every two consecutively arranged characters, and also for two characters that are consecutive apart from a character located in an index region between them (x and y in FIG. 12).

As shown in FIG. 12, the following link candidates are generated for the characters "c" and "x" by referring to the scatter diagram of FIG. 11A:
(c, x, horizontal, 100),
(c, X, subscript, 60) and
(C, X, horizontal, 100).

Note that the combination (C, x) cannot occur, because its values of H and D are not found in any region of the scatter diagrams.

The following link candidates are generated for the character "x" and the index "2" by referring to the scatter diagram of FIG. 11A:
(X, 2, superscript, 60),
(x, 2, superscript, 100) and
(x, 2, horizontal, 20).

The following link candidates are generated for the characters "x" and "y" by referring to the scatter diagram shown in FIG. 11A, taking the index "2" into consideration:
(x, y, horizontal, 100),
(x, Y, subscript, 60),
(X, y, horizontal, 60),
(2, y, subscript, 10) and
(2, Y, subscript, 50).

Finally, the following link candidates are generated for the character "y" and the index "3" by referring to the scatter diagram of FIG. 11A:
(y, 3, superscript, 100) and
(Y, 3, superscript, 50).
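The link candidates listed above can be held in a simple record type and grouped by the character rectangle of the child character. The Python sketch below shows one possible representation. The type name `LinkCandidate`, the left-to-right rectangle numbering (0 for "c" through 4 for "3") and the helper `by_child_rectangle` are assumptions of this example; the tuples themselves are taken from the lists above.

```python
from collections import defaultdict
from typing import NamedTuple


class LinkCandidate(NamedTuple):
    parent: str     # parent candidate character (left)
    child: str      # child candidate character (right)
    link_type: str  # "horizontal", "superscript" or "subscript"
    score: int      # local evaluation score


# (index of the child's character rectangle, link candidate)
CANDIDATES = [
    (1, LinkCandidate("c", "x", "horizontal", 100)),
    (1, LinkCandidate("c", "X", "subscript", 60)),
    (1, LinkCandidate("C", "X", "horizontal", 100)),
    (2, LinkCandidate("X", "2", "superscript", 60)),
    (2, LinkCandidate("x", "2", "superscript", 100)),
    (2, LinkCandidate("x", "2", "horizontal", 20)),
    (3, LinkCandidate("x", "y", "horizontal", 100)),
    (3, LinkCandidate("x", "Y", "subscript", 60)),
    (3, LinkCandidate("X", "y", "horizontal", 60)),
    (3, LinkCandidate("2", "y", "subscript", 10)),
    (3, LinkCandidate("2", "Y", "subscript", 50)),
    (4, LinkCandidate("y", "3", "superscript", 100)),
    (4, LinkCandidate("Y", "3", "superscript", 50)),
]


def by_child_rectangle(candidates):
    """Group link candidates by the rectangle of their child character."""
    grouped = defaultdict(list)
    for rect, cand in candidates:
        grouped[rect].append(cand)
    return dict(grouped)


grouped = by_child_rectangle(CANDIDATES)
```

Grouping by the child rectangle prepares the data for a later selection step in which exactly one link candidate is chosen per rectangle.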

In one aspect, this embodiment uses four scatter diagrams, as shown in FIGS. 11A to 11D, which are prepared for combinations of characters of different types. As can be seen from FIGS. 11A to 11D, the distribution of inter-character relationships can vary significantly depending on the character types of each pair. In view of this, the scatter diagrams are prepared for the combinations of character types so that the inter-character relationship of a character pair can be judged by referring to the corresponding scatter diagram.

According to each of documents [1], [2] and [3] cited above, an index is detected by examining whether its normalization center is displaced upward or downward from the horizontal center line of the parent character. In terms of the scatter diagrams of FIGS. 11A to 11D, this means that an index is detected using only the vertical coordinate of the scatter diagrams, so that the probability of detecting a false index and that of missing an index are both high. In contrast, according to the present invention an index is detected in a two-dimensional region, taking both the character types and the character sizes of the pair into account and preparing a scatter diagram for each combination of character types. As a result, the accuracy of index detection is noticeably improved.

Before the next step is described, the reason why recognizing the structure of a mathematical expression is a problem of finding an optimal path is explained below.

A mathematical expression has a tree structure, and its symbols are not arranged on a single line. This is why many people do not see that recognizing it is a problem of finding an optimal path. According to the invention, an optimal path is obtained by drawing an overall tree that shows the optimal structure of the mathematical expression, using the link network prepared in step B3. The link between each character and its parent character can be judged by drawing an overall tree. Accordingly, a set of (parent candidate character (left), child candidate character (right), link type, evaluation score) is referred to as a link candidate, and each character rectangle is made to carry all link candidates whose child character lies in that rectangle. An overall tree is then defined by selecting a single link candidate from each rectangle. Such selections can be regarded as tracing paths. Recognizing the structure of a mathematical expression is therefore a problem of finding an optimal path.

However, not all such paths can be regarded as a tree of a mathematical expression structure. For example, a character can have only one child character arranged horizontally on the same line, only one arranged as a superscript and only one arranged as a subscript (no duplicate links). In addition, all candidates for a character that are selected in its character rectangle, whether through a link to the left, a link to the right or an index link, must agree with one another for the character to be part of the mathematical expression (unique candidate selection). An overall tree that satisfies these two requirements is referred to as a consistent mathematical expression syntax tree, and the paths forming such an overall tree are referred to as admissible paths. An optimal path is selected from those admissible paths. In the following description, the bare term "path" refers to an admissible path obtained from a consistent overall tree.
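The two requirements just stated (no duplicate links and unique candidate selection) can be checked mechanically. The following Python sketch shows one way to do so. The record type `Link`, its field names and the rectangle numbering are assumptions made for the example, not part of the described device.

```python
from typing import NamedTuple


class Link(NamedTuple):
    parent_rect: int  # index of the parent's character rectangle
    parent_char: str  # reading assumed for the parent rectangle
    child_rect: int   # index of the child's character rectangle
    child_char: str   # reading assumed for the child rectangle
    link_type: str    # "horizontal", "superscript" or "subscript"


def is_consistent(links):
    """Check unique candidate selection and absence of duplicate links."""
    reading = {}        # rectangle index -> character reading chosen so far
    used_slots = set()  # (parent rectangle, link type) pairs already taken
    for link in links:
        # Unique candidate selection: one reading per rectangle.
        for rect, char in ((link.parent_rect, link.parent_char),
                           (link.child_rect, link.child_char)):
            if reading.setdefault(rect, char) != char:
                return False
        # No duplicate links: one child per parent and link type.
        slot = (link.parent_rect, link.link_type)
        if slot in used_slots:
            return False
        used_slots.add(slot)
    return True


good = [Link(0, "c", 1, "x", "horizontal"),
        Link(1, "x", 2, "2", "superscript"),
        Link(1, "x", 3, "y", "horizontal"),
        Link(3, "y", 4, "3", "superscript")]
mixed_reading = [Link(0, "c", 1, "x", "horizontal"),
                 Link(1, "X", 2, "2", "superscript")]
duplicate_link = [Link(1, "x", 2, "2", "horizontal"),
                  Link(1, "x", 3, "y", "horizontal")]
```

Here `good` passes, `mixed_reading` fails because rectangle 1 is read both as "x" and as "X", and `duplicate_link` fails because "x" would have two horizontal children.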

Then, in step B4, an optimal path connecting the link candidates is searched for by tracing backward (or forward) through the link candidates generated between character pairs in step B3. More specifically, the path showing the greatest total evaluation score is sought among the paths that can connect the characters without contradiction, by taking into account the link relationships between any two consecutive characters (horizontal, character/subscript and character/superscript relationships) and selecting one of the link candidates for each pair of consecutive characters. In addition, in this embodiment the total of the local evaluation scores of the respective character pairs, as given by their link candidates, is corrected on the basis of three conditions for global evaluation, which in turn are defined on the basis of the distribution of the heights of the characters contained in the mathematical expression component. The optimal path is then sought by referring to the corrected total evaluation score.
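For the small example cx²y³, the search for the highest-scoring consistent path can be illustrated by brute force. The Python sketch below tries every combination of one link candidate per character rectangle and keeps the best consistent one. It is a simplified illustration: it uses exhaustive enumeration instead of backward or forward tracing, it ignores the global evaluation conditions, and the rectangle numbering (0 for "c" through 4 for "3") and all function names are assumptions of the example.

```python
from itertools import product

# Per child rectangle: (parent rect, parent char, child rect, child char,
# link type, local score). Scores are those listed for FIG. 12.
CANDIDATES = {
    1: [(0, "c", 1, "x", "horizontal", 100),
        (0, "c", 1, "X", "subscript", 60),
        (0, "C", 1, "X", "horizontal", 100)],
    2: [(1, "X", 2, "2", "superscript", 60),
        (1, "x", 2, "2", "superscript", 100),
        (1, "x", 2, "2", "horizontal", 20)],
    3: [(1, "x", 3, "y", "horizontal", 100),
        (1, "x", 3, "Y", "subscript", 60),
        (1, "X", 3, "y", "horizontal", 60),
        (2, "2", 3, "y", "subscript", 10),
        (2, "2", 3, "Y", "subscript", 50)],
    4: [(3, "y", 4, "3", "superscript", 100),
        (3, "Y", 4, "3", "superscript", 50)],
}


def is_consistent(path):
    """One reading per rectangle, one child per (parent, link type)."""
    reading, slots = {}, set()
    for p_rect, p_char, c_rect, c_char, kind, _ in path:
        if reading.setdefault(p_rect, p_char) != p_char:
            return False
        if reading.setdefault(c_rect, c_char) != c_char:
            return False
        if (p_rect, kind) in slots:
            return False
        slots.add((p_rect, kind))
    return True


def best_path(candidates):
    """Exhaustively pick one candidate per rectangle, maximizing the total."""
    best, best_score = None, -1
    for path in product(*(candidates[r] for r in sorted(candidates))):
        if is_consistent(path):
            total = sum(link[5] for link in path)
            if total > best_score:
                best, best_score = path, total
    return best, best_score


path, score = best_path(CANDIDATES)
```

Exhaustive enumeration is only workable for very short expressions, because the number of combinations grows multiplicatively with each rectangle.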

If the normalization size of a character in the index region of a parent character is greater than the normalization size of the parent character, as shown in FIG. 14A, the total evaluation score is reduced. In the case of FIG. 14A, "+" is erroneously judged to be an index of "2", so that "b", which follows it, is also misjudged as being located in the index region. Misjudgments of this type can occur frequently when a character is judged by relying only on the local evaluation score. However, since "b" has a size equal to that of "a" and its normalization size is greater than that of "2", the total evaluation score is reduced.

If two consecutively arranged characters are found on the same line and the second of the two lies in the index region of the first, as shown in FIG. 14B, the total evaluation score is likewise reduced. In other words, the total evaluation score is reduced when a character belongs to any of the small regions P2, P4 and P6 of the scatter diagrams of FIGS. 11A to 11D that lie in the index region of the immediately preceding character found on the baseline. FIG. 14B shows a case where a lowercase letter "x" is erroneously taken to be an uppercase letter "X". Since the character "B", which is arranged next to the character "A" located on the baseline, is found in the index region, the total evaluation score is reduced.

The total evaluation score is also reduced when the normalization sizes of alphabetic characters arranged on the baseline differ by more than a certain limit, as shown in FIG. 14C. FIG. 14C shows a case where an uppercase letter "C" is erroneously taken to be a lowercase letter "c". The normalization size of the lowercase letter "c" then becomes greater than the normalization size of the uppercase letter "A". The total evaluation score is therefore reduced.
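Two of the global evaluation conditions described above (the cases of FIG. 14A and FIG. 14C) lend themselves to a short illustration. The Python sketch below computes a penalty that would be subtracted from the total of the local scores. The penalty amounts, the threshold `max_ratio`, the function name and the way baseline characters are identified are all assumptions of this example; the text gives no numeric values, and the condition of FIG. 14B is not modeled here.

```python
def global_penalty(links, norm_size, root=0,
                   index_penalty=50, baseline_penalty=50, max_ratio=1.2):
    """Return a penalty to subtract from the summed local scores.

    links:     (parent rect, child rect, link type), ordered left to right
    norm_size: mapping from rectangle index to normalization size
    root:      rectangle of the leftmost baseline character
    """
    penalty = 0
    baseline = [root]
    for parent, child, kind in links:
        if kind == "horizontal":
            # A horizontal child of a baseline character is on the baseline.
            if parent in baseline:
                baseline.append(child)
        elif norm_size[child] > norm_size[parent]:
            # FIG. 14A case: an index larger than its parent character.
            penalty += index_penalty
    sizes = [norm_size[r] for r in baseline]
    if max(sizes) > max_ratio * min(sizes):
        # FIG. 14C case: baseline characters differ too much in size.
        penalty += baseline_penalty
    return penalty


# Rectangles 0..3 hold "a", "2", "+", "b"; the sizes are invented.
sizes = {0: 20, 1: 12, 2: 20, 3: 20}
misjudged = [(0, 1, "superscript"), (1, 2, "superscript"), (2, 3, "horizontal")]
correct = [(0, 1, "superscript"), (0, 2, "horizontal"), (2, 3, "horizontal")]
```

With these invented sizes, the misjudged reading, in which "+" is taken as an index of "2", is penalized, while the correct reading is not.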

Thus, the conditions for global evaluation are applied to the total evaluation score of a path that can connect the characters in a mathematical expression without contradiction by selecting, for each character pair, one of its link candidates from among the horizontal, character/subscript and character/superscript relationships. The search for the optimal path that yields the greatest total evaluation score can also be carried out at high speed using beam search (also referred to as breadth-first search).
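Beam search, as mentioned above, extends partial paths one character rectangle at a time and keeps only the best few after each step. The Python sketch below is a generic illustration and not the device's actual procedure: the function signature, the beam width and the simplified validity check are assumptions, and the usage example covers only the first two character pairs of cx²y³.

```python
def beam_search(layers, score_of, is_valid, beam_width=3):
    """Pick one option per layer, keeping the best partial paths only.

    layers:   list of option lists, one per character rectangle
    score_of: function returning the local score of an option
    is_valid: function deciding whether a partial path is consistent
    """
    beam = [(0, ())]  # (total score, partial path)
    for options in layers:
        extended = [
            (total + score_of(option), path + (option,))
            for total, path in beam
            for option in options
            if is_valid(path + (option,))
        ]
        extended.sort(key=lambda item: item[0], reverse=True)
        beam = extended[:beam_width]  # prune to the beam width
    return beam[0] if beam else None


# Link candidates as (parent, child, link type, score) for "c x" and "x 2".
layers = [
    [("c", "x", "horizontal", 100), ("c", "X", "subscript", 60),
     ("C", "X", "horizontal", 100)],
    [("X", "2", "superscript", 60), ("x", "2", "superscript", 100),
     ("x", "2", "horizontal", 20)],
]


def chain_is_valid(path):
    # Simplified check: the child reading of each link must equal the
    # parent reading of the next link.
    return all(a[1] == b[0] for a, b in zip(path, path[1:]))


best = beam_search(layers, score_of=lambda link: link[3],
                   is_valid=chain_is_valid, beam_width=2)
```

Because partial paths outside the beam are discarded, beam search is fast but is not guaranteed to find the globally best path. A wider beam trades speed for a lower risk of pruning the optimum.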

FIG. 13 shows an example of the optimal path as judged by taking the conditions for global evaluation into account. In this way, an optimal link candidate is selected for each inter-character relationship, and the inter-character relationship is judged to be a horizontal relationship, a character/superscript relationship or a character/subscript relationship.

The techniques described in documents [1], [2] and [3] cited above lack the concept of global evaluation. Therefore, if a single character on the baseline is erroneously taken to be an index, all following characters are likewise erroneously taken to be indices. This occurs partly because each character is judged to be an index or an exponent on the basis of a computation that uses only local features of the character. In contrast, the present invention adopts the concept of global evaluation for the subsequent paths, so that the problem of all following characters being misjudged does not arise when one character is erroneously taken to be an index. It is also possible to evaluate the result produced by an external device for recognizing mathematical expressions using the global evaluation technique, and to apply the technique to more complex judgment operations.

Then, the final recognition result for the mathematical expression component is obtained by restoring the indices, accent marks, root signs and so on, if any, that were temporarily removed in step B1 to the optimally linked character string. The final recognition result for the mathematical expression region is then obtained by performing the processing of steps B2 to B4 on each of the mathematical expression components. The recognition result data for a document containing text regions and mathematical expression regions is obtained by combining the recognition results for the text regions with those for the mathematical expression regions.

Accordingly, a mathematical expression recognition apparatus comprises: a character recognition unit configured to recognize characters in a document image containing text and a mathematical expression; a first directory configured to store evaluation scores for each word type that can be identified by means of a regular expression, the scores indicating the likelihood of belonging to text and the likelihood of belonging to a mathematical expression; an evaluation unit configured to obtain, by referring to the first directory, the evaluation scores indicating the likelihood of belonging to text and the likelihood of belonging to a mathematical expression for each of the words contained in the characters recognized by the character recognition unit; and a mathematical expression detection unit configured to search for an optimal path connecting the words by selecting either text or mathematical expression for each word, based on a formative grammar and on the evaluation scores of each word for text and for mathematical expression, thereby detecting the characters that belong to the mathematical expression.

Because a mathematical expression region is recognized by an ordinary character recognition operation, unexpected characters may appear in the recognition result. To address this problem, the mathematical expression recognition apparatus includes a directory that is used to classify the words contained in the character recognition result into different types by means of regular expressions and to obtain, for each word type used in the classification, evaluation scores indicating the likelihood of belonging to text and the likelihood of belonging to a mathematical expression. It is therefore possible to assign evaluation scores to each word in a flexible manner by referring to the directory. Each mathematical expression is detected by searching for an optimal path that connects each pair of consecutive words, covering both text and mathematical expression, on the basis of the evaluation scores computed for each word with respect to text and to mathematical expression. With this arrangement, it is possible to detect a mathematical expression region accurately and hence to recognize the mathematical expressions contained in a document.
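The word-level search described above can be illustrated with a small dynamic-programming sketch. It is not the patented implementation: the word types, the score values, the `allowed` rule standing in for the formative grammar, and all function names are assumptions made for illustration only. The sketch classifies each word with a regular expression, looks up a text score and a math score, and keeps, for each label, the best-scoring path that the toy grammar permits.

```python
import re

# Hypothetical directory: word type, regular expression, and a pair of
# evaluation scores (likelihood of text, likelihood of math expression).
WORD_TYPES = [
    ("word",     re.compile(r"[A-Za-z]{2,}"), {"text": 3.0, "math": 0.5}),
    ("letter",   re.compile(r"[A-Za-z]"),     {"text": 1.5, "math": 1.0}),
    ("number",   re.compile(r"[0-9]+"),       {"text": 1.0, "math": 2.0}),
    ("operator", re.compile(r"[=+\-<>]"),     {"text": 0.0, "math": 3.0}),
]
UNKNOWN = ("unknown", {"text": 1.0, "math": 1.0})


def classify(word):
    """Return (word type, scores) for the first pattern matching the whole word."""
    for name, pattern, score in WORD_TYPES:
        if pattern.fullmatch(word):
            return name, score
    return UNKNOWN


def allowed(prev_label, word_type, label):
    """Toy stand-in for the grammar: an operator labelled as math may not
    directly follow a word labelled as text (an assumed rule)."""
    return not (prev_label == "text" and word_type == "operator" and label == "math")


def label_words(words):
    """Pick text/math per word so that the sum of scores is maximal
    among the label sequences the toy grammar allows."""
    if not words:
        return []
    labels = ("text", "math")
    _, first = classify(words[0])
    # best[label] = (best total so far, label path ending in `label`)
    best = {lab: (first[lab], [lab]) for lab in labels}
    for word in words[1:]:
        word_type, score = classify(word)
        nxt = {}
        for lab in labels:
            options = [
                (total + score[lab], path + [lab])
                for prev, (total, path) in best.items()
                if allowed(prev, word_type, lab)
            ]
            if options:
                nxt[lab] = max(options, key=lambda o: o[0])
        best = nxt
    return max(best.values(), key=lambda o: o[0])[1]


print(label_words(["let", "a", "=", "1"]))
```

In this example the single letter `a` scores higher as text when judged alone, but the path search labels it as math because that is the only way the following `=` can be counted as part of a mathematical expression under the assumed rule.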

According to the present embodiment, a mathematical expression recognition apparatus comprises: a character recognition unit configured to recognize characters in a document image containing text and a mathematical expression; a detection unit configured to detect a mathematical expression region from the characters recognized by the character recognition unit; a memory configured to store a plurality of items of sample information indicating the relationship of the normalized size and the center position between each pair of consecutive characters, according to the character types, for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship; and a unit configured to calculate the relationship of the normalized size and the center position between each pair of consecutive characters contained in the mathematical expression region and to obtain link candidates for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship, based on the calculated relationship and on the sample information that corresponds to the types of the two consecutive characters.

The mathematical expression recognition apparatus holds a plurality of items of sample information for different combinations of types of two consecutive characters. It is therefore possible to recognize the inter-character relationship of each detected pair of consecutive characters as a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship by referring to the sample information that corresponds to the types of the characters whose relationship is to be judged. With this arrangement, it is possible to reduce the error rate in determining the positions of characters in a mathematical expression and to improve the efficiency of recognizing the structure of mathematical expressions.
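As a rough illustration of how sample information could yield link candidates, the following sketch compares the normalized size and center position of a character pair with three groups of samples. Everything concrete here is an assumption: the feature definitions in `features`, the sample values in `SAMPLES`, and the centroid-distance scoring are illustrative choices, not the method specified in this document. In practice one such table would be kept per combination of character types.

```python
from math import hypot

# Hypothetical sample information for ONE combination of character types.
# Each sample is (normalized size, normalized center offset).
SAMPLES = {
    "horizontal":  [(1.00, 0.00), (0.95, 0.05), (1.05, -0.03)],
    "subscript":   [(0.60, -0.35), (0.65, -0.40), (0.70, -0.30)],
    "superscript": [(0.60, 0.35), (0.65, 0.40), (0.70, 0.30)],
}


def features(parent, child):
    """Boxes are (top, bottom) with y growing downward.
    size   = child height / parent height
    center = how far the child's center lies ABOVE the parent's center,
             in units of the parent height (assumed normalization)."""
    p_top, p_bottom = parent
    c_top, c_bottom = child
    p_height = p_bottom - p_top
    size = (c_bottom - c_top) / p_height
    center = ((p_top + p_bottom) / 2 - (c_top + c_bottom) / 2) / p_height
    return size, center


def link_candidates(parent, child, samples=SAMPLES):
    """Return [(relation, score), ...] sorted best first.
    The score decreases with distance to the centroid of each sample group."""
    size, center = features(parent, child)
    candidates = []
    for relation, points in samples.items():
        mean_size = sum(p[0] for p in points) / len(points)
        mean_center = sum(p[1] for p in points) / len(points)
        distance = hypot(size - mean_size, center - mean_center)
        candidates.append((relation, 1.0 / (1.0 + distance)))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# A smaller character raised above the parent's center line.
print(link_candidates((0, 100), (-10, 50)))
```

The function returns all three relations with scores rather than a single decision, so that a later path search can still overrule the locally best candidate.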

According to the present embodiment, a mathematical expression recognition apparatus comprises: a character recognition unit configured to recognize characters in a document image that includes a mathematical expression; a unit configured to detect a mathematical expression region from the result of the character recognition obtained by the character recognition unit; a unit configured to store a plurality of items of sample information concerning the inter-character relationship of the normalized sizes and of the center positions of each pair of consecutive characters, with respect to the character types and to positional relationships including the horizontal positional relationship; an inter-character relationship determination unit configured to compute the relationship of the normalized sizes and of the center positions of each pair of consecutive characters in a mathematical expression region and to obtain link candidates, as combinations of inter-character structure candidates indicating the respective likelihoods of a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship together with their respective evaluation scores, based on the computed result and the sample information; a unit configured to store the conditions for global evaluation, which are determined in advance based on the distribution of the heights of the characters contained in the mathematical expression regions; and a unit configured to search for an optimal path for connecting the characters in each mathematical expression region without contradiction, to select for each pair of consecutive characters an inter-character structure candidate having a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship, and to recognize the horizontal positional relationship, the character/subscript relationship or the character/superscript relationship of the pair of consecutive characters based on the result of the search operation.

Accordingly, it is possible not only to determine the local relationship of each pair of consecutive characters but also to search for an optimal path that connects the characters in a mathematical expression region without contradiction, so as to ultimately maximize the total evaluation score while taking the conditions for global evaluation into account. Thus, if the positional relationship of a pair of consecutive characters is misjudged, the misjudgment is prevented from affecting the determination of the overall structure of the mathematical expression.
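The effect of global evaluation can be shown with a deliberately small sketch. It assumes a toy model in which each relation between consecutive characters shifts a level counter, and a single made-up global condition that penalizes any character placed off the baseline level if it is nearly as tall as the tallest baseline character. The penalty form, the `ratio` and `weight` parameters, the score values and the brute-force enumeration are all illustrative assumptions, not the search procedure of this document.

```python
from itertools import product

# Level change caused by each relation between consecutive characters.
STEP = {"horizontal": 0, "subscript": -1, "superscript": 1}


def levels(relations):
    """Level of every character, with the first character on level 0."""
    out = [0]
    for relation in relations:
        out.append(out[-1] + STEP[relation])
    return out


def global_penalty(heights, char_levels, ratio=0.8, weight=0.5):
    """Assumed global condition: a character off level 0 should be clearly
    smaller than the tallest level-0 character."""
    base = max(h for h, lv in zip(heights, char_levels) if lv == 0)
    offenders = sum(
        1 for h, lv in zip(heights, char_levels) if lv != 0 and h >= ratio * base
    )
    return weight * offenders


def best_path(heights, pair_scores):
    """Enumerate every relation sequence (fine for short expressions) and
    return the one maximizing local scores minus the global penalty."""
    def total(relations):
        local = sum(s[r] for s, r in zip(pair_scores, relations))
        return local - global_penalty(heights, levels(relations))
    return max(product(STEP, repeat=len(pair_scores)), key=total)


# Four equally tall characters; the first pair is locally misjudged.
heights = [10, 10, 10, 10]
pair_scores = [
    {"horizontal": 0.45, "subscript": 0.55, "superscript": 0.00},
    {"horizontal": 0.90, "subscript": 0.05, "superscript": 0.05},
    {"horizontal": 0.90, "subscript": 0.05, "superscript": 0.05},
]
greedy = tuple(max(s, key=s.get) for s in pair_scores)
print(greedy)
print(best_path(heights, pair_scores))
```

Choosing the best candidate pair by pair drags the last three characters into the subscript level, whereas the path that accounts for the height condition keeps all four on the baseline.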

As described above, the present embodiment provides the following advantages.

It is possible to detect mathematical expression regions efficiently by judging whether each word belongs to text or to a mathematical expression and by searching for an optimal path for connecting the words, based on the formative grammar and on the evaluation scores assigned to each word with respect to text and to mathematical expression.

It is possible to determine accurately whether the positional relationship of each pair of consecutive characters is a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship by preparing a plurality of scatter diagrams that show the normalized sizes and normalized centers of pairs of consecutive characters.

It is possible to prevent any misjudgment of the positional relationship of a pair of consecutive characters from affecting the determination of the overall structure of a mathematical expression, by not only determining the local relationship of each pair of consecutive characters but also searching for an optimal path with the conditions for global evaluation taken into account.

It is possible to reduce the number of characters used for generating link candidates, and thereby to improve the efficiency of the overall processing, by performing a preprocessing operation that decomposes mathematical expressions into components and detects indices, accent marks, root signs and so on before link candidates are generated and optimal paths are searched for.

While the foregoing description refers to specific embodiments of the present invention, it will be understood that many modifications can be made. The appended claims are intended to cover such modifications as would fall within the true scope of the present invention. The presently disclosed embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the present invention being indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced within that scope. For example, the present invention can be put into practice as a computer-readable recording medium containing a program that allows a computer to function as predetermined means, allows the computer to realize a predetermined function, or allows the computer to execute predetermined means. For example, the OCR system 11 of the embodiment described above can be realized entirely in software. Therefore, the advantages of the present invention can be obtained by preparing a program for the above processing sequence that a computer can execute, storing it on a computer-readable storage medium, installing it on a computer by way of the storage medium, and causing the computer to execute the program.

Claims

A mathematical expression recognition apparatus, comprising: a character recognition device (112) for recognizing characters in a document image containing a text and a mathematical expression, the document image comprising a string of a plurality of recognized words; a first directory device (201) for storing a pair of evaluation scores for each word type that can be identified by a regular expression, wherein a first evaluation score indicates the likelihood of belonging to the text and a second evaluation score indicates the likelihood of belonging to the mathematical expression; an evaluation device (113) for obtaining, by referring to the first directory device (201), the first and second evaluation scores indicating the likelihood of belonging to the text and the likelihood of belonging to the mathematical expression for each of the words contained in the characters recognized by the character recognition device; and a mathematical expression detection device (114) operable to search for an optimal path through the evaluation scores along the string of words by selecting, for each word in the string in turn, either the first evaluation score or the second evaluation score, operable to calculate a plurality of sums of the evaluation scores for the string of words, each sum containing either the first evaluation score or the second evaluation score for each word, for those sequential combinations of words that are permitted by a formative grammar indicating which part of speech in the text can be linked to which part of speech in the text and which part of speech in the text can be linked to a mathematical expression, and operable to detect, as the optimal path, the path that has the largest sum of the evaluation scores given to the string of words, thereby detecting characters belonging to the mathematical expression by providing, as the optimal path, the most likely interpretation of the string of words as between text and mathematical expression; the apparatus further comprising, for processing the characters within a recognized mathematical expression region: a memory device (203) for storing a plurality of items of sample information indicating a relationship of a normalized size and a center position between each pair of consecutively arranged characters with respect to the types of the characters, the items of sample information being divided into a plurality of groups of samples for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship; and a determining device (114) for calculating the relationship of the normalized size and the center position between each pair of consecutively arranged characters included in the mathematical expression region, determining which of the plurality of groups the calculated relationship falls within, obtaining connection candidates for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship based on the result of the determination, and recognizing the mathematical expression by searching for an optimal path through the connection candidates along the consecutively arranged characters.

The apparatus according to claim 1, characterized in that the mathematical expression detection device comprises: a second directory device (202) for storing, as the formative grammar, connectable parts of speech and mathematical expressions; and a search device (114) for searching for a path through the evaluation scores along the connected words and for detecting, as the optimal path, the path having the largest evaluation score given to the words as mathematical expression or text among all possible inter-word connection paths, by selecting either text or mathematical expression for each word according to the part of speech of the word and the formative grammar read out from the second directory device.

The apparatus according to claim 2, characterized in that it further comprises: a memory (204) configured to store a global evaluation condition determined based on the distribution of the heights of the characters included in the mathematical expression region, the condition characterizing a horizontal positional relationship, a character/subscript relationship and a character/superscript relationship; a search device (114) using data from the memory (204) to find an optimal path for connecting the connection candidates along the consecutively arranged characters in each of the mathematical expression regions while satisfying the global evaluation condition characterizing the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship; means for selecting, for each pair of consecutively arranged characters, an inter-character structure candidate having a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship, based on the global evaluation condition and the connection candidates; and means for recognizing the horizontal positional relationship, the character/subscript relationship or the character/superscript relationship of the pair of consecutively arranged characters based on the result of the search operation.

The apparatus according to claim 3, characterized in that the global evaluation condition includes at least one of: the relationship between the height of a character contained in the subscript region and the height of each of the other characters; the positional relationship between a baseline and a character contained in the subscript region; and the height distribution among characters that are located at the same horizontal level.

The apparatus according to claim 2, characterized in that it further comprises: a decomposition device (113) for decomposing each mathematical expression detected by the mathematical expression detection device into components and removing at least left indices, accent marks, root signs and dots from each component, and characterized in that the determining device obtains connection candidates for the components from which the left indices, accent marks, root signs or dots have been removed.

A mathematical expression recognition method, comprising: recognizing characters in a document image containing a text and a mathematical expression as a string of a plurality of recognized words; referring to a first directory that stores a pair of evaluation scores for each word type that can be identified by a regular expression, wherein a first evaluation score indicates the likelihood that the word belongs to the text and a second evaluation score indicates the likelihood that the word belongs to the mathematical expression, in order to obtain the evaluation scores indicating the likelihood that the word belongs to the text and the likelihood that the word belongs to the mathematical expression for each of the recognized words; and searching for an optimal path through the evaluation scores along the string of words by selecting, for each word in the string in turn, either the first evaluation score or the second evaluation score, and calculating a plurality of sums of the evaluation scores for the string of words, each sum containing either the first evaluation score or the second evaluation score for each word, for those sequential combinations of words that are permitted by a formative grammar indicating which part of speech in the text can be linked to which part of speech in the text and which part of speech in the text can be linked to a mathematical expression, thereby detecting characters belonging to the mathematical expression by providing, as the optimal path, the most likely interpretation of the string of words as between text and mathematical expression; the method further comprising the steps of, for processing the characters within a recognized mathematical expression region: using a memory device (203) for storing a plurality of items of sample information indicating a relationship of a normalized size and a center position between each pair of consecutively arranged characters with respect to the types of the characters, the sample information being divided into a plurality of groups of samples for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship; and using a determining device (114) for calculating the relationship of the normalized size and the center position between each pair of consecutively arranged characters in the mathematical expression region, determining which of the plurality of groups the calculated relationship falls within, obtaining connection candidates for the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship based on the result of the determination, and recognizing the mathematical expression by searching for an optimal path through the connection candidates along the consecutively arranged characters.

The method according to claim 6, comprising the steps of: using a second directory device (202) for storing, as the formative grammar, connectable parts of speech and mathematical expressions; and using a search device (114) to search for a path through the evaluation scores along the connected words and to detect, as the optimal path, the path having the largest evaluation score given to the words as mathematical expression or text among all possible inter-word connection paths, by selecting either text or mathematical expression for each word according to the part of speech of the word and the formative grammar read out from the second directory device.

The method of claim 7, comprising the steps of: using a memory (204) configured to store a global evaluation condition of the characters contained in the mathematical expression region, which condition characterizes a horizontal positional relationship, a character/subscript relationship and a character/superscript relationship; searching for an optimal path for linking the connection candidates along the consecutively arranged characters in each of the mathematical expression regions while maintaining the global evaluation condition that characterizes the horizontal positional relationship, the character/subscript relationship and the character/superscript relationship; selecting an inter-character structure candidate having a horizontal positional relationship, a character/subscript relationship or a character/superscript relationship for each pair of consecutively arranged characters based on the global evaluation condition and the connection candidate; and recognizing the horizontal positional relationship, the character/subscript relationship or the character/superscript relationship of the pair of consecutively arranged characters based on the result of the search operation.

The method according to claim 8, characterized in that the global evaluation condition comprises at least one of the following: the relationship between the height of a character contained in a subscript region and the height of each of the other characters; the positional relationship between a baseline and a character contained in the subscript region; and the distribution of heights among characters that are at the same horizontal level.
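
As a rough illustration of how such a global evaluation condition could be tested against one candidate structure, the sketch below checks two of the three listed criteria. Everything in it is an assumption made for the example: characters are reduced to `(height, level)` pairs with level `0` for the baseline row and `-1` for a subscript region, the thresholds `max_sub_ratio` and `max_spread` are invented, and the baseline-position criterion is omitted.

```python
from collections import defaultdict
from statistics import mean, pstdev


def satisfies_global_condition(chars, max_sub_ratio=0.85, max_spread=0.25):
    """Check a candidate structure given as a list of (height, level) pairs.

    Heights are assumed to be positive, already normalized values.
    """
    by_level = defaultdict(list)
    for height, level in chars:
        by_level[level].append(height)

    # Criterion 1: subscript characters should be clearly smaller than
    # the characters on the baseline row.
    baseline = by_level.get(0)
    if baseline:
        limit = max_sub_ratio * mean(baseline)
        if any(h > limit for h in by_level.get(-1, [])):
            return False

    # Criterion 2: heights of characters on the same horizontal level
    # should not be spread too widely (coefficient of variation).
    for heights in by_level.values():
        avg = mean(heights)
        if avg > 0 and pstdev(heights) / avg > max_spread:
            return False
    return True
```

A path search could call such a check to discard candidate structures that are locally plausible but inconsistent across the whole expression region.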

The method according to claim 7, comprising the steps of: decomposing each mathematical expression detected by the mathematical expression detection unit into components and removing at least the left indices, accent marks, root signs and dots from each component; and using the determination device to obtain connection candidates for the components from which the left indices, accent marks, root signs or dots have been removed.