JPH0950485A

JPH0950485A - Character string recognizing device

Info

Publication number: JPH0950485A
Application number: JP7203148A
Authority: JP
Inventors: Naganori Ishidera; 永記石寺
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1995-08-09
Filing date: 1995-08-09
Publication date: 1997-02-18
Anticipated expiration: 2015-08-09
Also published as: JP2825072B2

Abstract

PROBLEM TO BE SOLVED: To provide a character string recognizing device capable of correctly recognizing a character string even when plural lines of handwritten character strings are input. SOLUTION: A character recognition part 4 performs character recognition processing using standard character patterns and obtains a character code and a character recognition cost indicating the reliability of the recognition result. A language knowledge processing part 7 uses layout knowledge and language knowledge to prepare plural combinations of characters that satisfy the constraint conditions as character string candidates, and calculates a language cost indicating the linguistic reliability of each candidate. A layout analysis part 8 uses layout knowledge to calculate a layout cost indicating the validity of the layout of each candidate. A character string recognition result output part 9 combines the three kinds of cost and outputs, as the recognition result of the character string, the candidate for which the best combined cost is obtained.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method and apparatus for recognizing handwritten character strings, and in particular to a method and apparatus for recognizing handwritten character strings in an optical character reader (OCR).

[0002]

[Prior Art] Character string recognition technology is widely used in, for example, devices that read address information written on mail or forms. One known conventional technique obtains multiple cutout candidates from an input character string, performs character recognition on all of the candidates, and then applies linguistic knowledge to the results to arrive at the final recognized character string.

[0003] However, these techniques assume that character strings are input after being cut out one line at a time. If cutting out a line fails and, for example, two lines are input at once, the character string can no longer be recognized correctly.

[0004] To solve this problem, various schemes have been devised that analyze graphic information in more detail so as to extract character strings accurately. Known examples include the methods described by Gyoten et al. in "Basic study of constraint-satisfaction character region extraction" (IEICE Technical Report, PRU92-119, 1993) and by Nakajima et al. in "Introduction of a trial-and-verification process in address line detection from handwritten mail" (IEICE Technical Report, PRU95-6, 1995). In addition, Japanese Patent Application Laid-Open No. 7-6202, "Character recognition device", discloses a method in which a character string is extracted once and then extracted again.

[0005] However, in free-form handwritten character strings written on mail or forms, the size and shape of the characters within a string vary widely, and the strings may be intertwined with one another. It is therefore difficult to cut out character strings perfectly, one line at a time, from graphic information alone.

[0006] Consequently, conventional character string recognition methods cannot avoid the problem that a character string is not recognized correctly when the input is not a single line.

[0007]

[Problems to Be Solved by the Invention] Accordingly, an object of the present invention is to provide a character string recognition method and apparatus capable of correctly recognizing a character string even when the input handwritten character string spans multiple lines.

[0008]

[Means for Solving the Problems] To solve the above problem, the character string recognition device according to claim 1 comprises: a character string image storage unit that stores an optically scanned character string image; a character cutout unit that reads the character string image and creates character candidate patterns and character candidate graphic information; a character recognition dictionary storage unit that stores standard character patterns; a character recognition unit that matches the standard character patterns stored in the character recognition dictionary storage unit against the character candidate patterns read from the character cutout unit and obtains, as the character recognition result, a character code and a character recognition cost that measures the reliability of the character recognition result; a linguistic knowledge storage unit that stores linguistic knowledge; a layout knowledge storage unit that stores layout knowledge, that is, graphic knowledge such as the conditions under which characters are written; a linguistic knowledge processing unit that reads the character candidate graphic information from the character cutout unit, reads the character codes corresponding to that information from the character recognition unit, reads the layout knowledge from the layout knowledge storage unit, linguistically matches the character codes against the linguistic knowledge stored in the linguistic knowledge storage unit, creates combinations of character codes that satisfy the linguistic knowledge as character string candidates, and obtains a linguistic cost that measures the linguistic reliability of each created candidate; a layout analysis unit that reads the character candidate graphic information from the character cutout unit, the character string candidates from the linguistic knowledge processing unit, and the layout knowledge from the layout knowledge storage unit, and obtains for each character string candidate a layout cost that measures the validity of the candidate with respect to the layout constraints; and a character string recognition result output unit that reads the character recognition cost from the character recognition unit, the linguistic cost of each character string candidate from the linguistic knowledge processing unit, and the layout cost of each character string candidate from the layout analysis unit, and outputs, as the character string recognition result, the character string candidate that yields the best cost when the character recognition cost, the linguistic cost, and the layout cost are combined.

[0009]

[Operation] According to the present invention, when a character string is recognized, characters are first cut out and recognized without assuming any text lines, and combinations of characters that satisfy the linguistic knowledge are created. Many such combinations are obtained as a result; the character string recognition result is then determined from them using the character recognition cost, the linguistic cost, and the layout analysis cost. Because this method judges not only the ambiguity in cutting out individual characters but also the ambiguity in extracting text lines comprehensively, at the same level as the linguistic knowledge, the character string can be recognized correctly even when the input is not a single line.

[0010]

[Embodiment] A first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the first embodiment. The embodiment shown in FIG. 1 comprises: a character string image storage unit 1 that stores an optically scanned character string image; a character cutout unit 2 that reads the character string image and creates character candidate patterns and the character candidate graphic information corresponding to each character candidate pattern; a character recognition dictionary storage unit 3 that stores standard character patterns; a character recognition unit 4 that matches the standard character patterns against the character candidate patterns and obtains, as the character recognition result, a character code and a character recognition cost that measures the reliability of the result; a linguistic knowledge storage unit 5 that stores linguistic knowledge; a layout knowledge storage unit 6 that stores layout knowledge, that is, graphic knowledge such as the conditions under which characters are written; a linguistic knowledge processing unit 7 that reads the character candidate graphic information from the character cutout unit 2, the corresponding character codes from the character recognition unit 4, and the layout knowledge from the layout knowledge storage unit 6, linguistically matches the character codes against the linguistic knowledge in the linguistic knowledge storage unit 5, creates multiple combinations of character codes that satisfy the linguistic knowledge as character string candidates, and obtains for each candidate a linguistic cost that measures its linguistic reliability; a layout analysis unit 8 that reads the character candidate graphic information from the character cutout unit 2, all character string candidates from the linguistic knowledge processing unit 7, and the layout knowledge from the layout knowledge storage unit 6, and obtains for each character string candidate a layout cost that measures the validity of the candidate with respect to the layout constraints; and a character string recognition result output unit 9 that reads the character recognition cost from the character recognition unit 4, the linguistic cost of each character string candidate from the linguistic knowledge processing unit 7, and the layout cost of each character string candidate from the layout analysis unit 8, and outputs, as the character string recognition result, the character string candidate that yields the best cost when the three costs are combined.
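The selection made by the character string recognition result output unit 9 can be sketched as follows. This is only an illustration: the `Candidate` type, the weights, the sample cost values, and the use of a weighted sum in which a lower total is better are all assumptions, since the text states only that the three costs are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str                 # character string candidate
    recognition_cost: float   # from the character recognition unit 4
    linguistic_cost: float    # from the linguistic knowledge processing unit 7
    layout_cost: float        # from the layout analysis unit 8


def total_cost(c, w_rec=1.0, w_lang=1.0, w_layout=1.0):
    # Assumption: the three costs are combined as a weighted sum.
    return (w_rec * c.recognition_cost
            + w_lang * c.linguistic_cost
            + w_layout * c.layout_cost)


def best_candidate(candidates):
    # Assumption: the lowest combined cost is the "best" cost.
    return min(candidates, key=total_cost)


# Hypothetical candidates with made-up cost values.
candidates = [
    Candidate("東京都品川区中延", 1.2, 0.5, 0.3),
    Candidate("東京都品川区荏原", 2.0, 0.5, 0.9),
]
best = best_candidate(candidates)
```

A weighted sum keeps the three knowledge sources on one scale, which matches the idea of judging segmentation, language, and layout together rather than in sequence.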

[0011] Next, the operation of this embodiment will be described.

[0012] The character string image storage unit 1 is an ordinary storage means that stores a character string image input by ordinary image input means such as an image scanner; the character string image is, for example, a binarized image.

[0013] The character cutout unit 2 extracts from the character string image partial images that may each be a single character, as character candidate patterns, and at the same time creates graphic information about each character candidate pattern as character candidate graphic information. Several methods are conceivable for this. As one example, it can be done by creating every combination of the connected black-pixel regions of the binarized character string image and extracting, for each combination, its image together with the height, width, and position coordinates on the image of its circumscribed rectangle. FIG. 2 shows an example. Given a character string image such as "野田" (Noda) as shown in FIG. 2(a), the connected black-pixel regions are extracted as partial images, and each partial image is stored together with the height, width, and position coordinates of its circumscribed rectangle, as shown in FIG. 2(b). Then, as shown in FIG. 2(c), every combination is created while preserving the relative positions of the partial images, a new partial image corresponding to each combination is obtained, and the height, width, and position coordinates of its circumscribed rectangle are computed. This yields the character candidate patterns and the character candidate graphic information.
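The combination step can be illustrated with bounding boxes alone. The sketch below assumes the connected black-pixel regions have already been extracted and are given as `(left, top, right, bottom)` boxes; the function names and the `max_parts` cap on combination size are hypothetical additions for the example, not part of the described method.

```python
from itertools import combinations


def union_box(boxes):
    # Circumscribed rectangle of several (left, top, right, bottom) boxes.
    left = min(b[0] for b in boxes)
    top = min(b[1] for b in boxes)
    right = max(b[2] for b in boxes)
    bottom = max(b[3] for b in boxes)
    return (left, top, right, bottom)


def candidate_boxes(component_boxes, max_parts=3):
    # Enumerate every combination of up to max_parts connected regions
    # (max_parts is an assumed practical cap) and record the position,
    # width, and height of the circumscribed rectangle of each one.
    candidates = []
    for size in range(1, max_parts + 1):
        for combo in combinations(range(len(component_boxes)), size):
            left, top, right, bottom = union_box(
                [component_boxes[i] for i in combo])
            candidates.append({
                "parts": combo,
                "x": left,
                "y": top,
                "width": right - left,
                "height": bottom - top,
            })
    return candidates


# Three hypothetical connected regions.
components = [(0, 0, 10, 20), (12, 0, 20, 20), (25, 2, 45, 22)]
cands = candidate_boxes(components)
```

Because the original box coordinates are kept, the relative positions of the partial images are preserved in every combination.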

[0014] The character recognition dictionary storage unit 3 is an ordinary storage means that stores standard character patterns.

[0015] The character recognition unit 4 reads character candidate patterns from the character cutout unit 2 and standard character patterns from the character recognition dictionary storage unit 3, matches the former against the latter, and obtains as the character recognition result a character code and a character recognition cost that measures the reliability of the result. The recognition can be performed using, for example, the well-known simple similarity method (see "Introduction to Character Recognition", Ohmsha, 1982, pp. 34-35) or the method described by Tsukumo in "Improvement of directional pattern matching and its application to handwritten kanji recognition" (IEICE Technical Report, PRU90-20, 1990).

[0016] The character recognition cost need only be a value that grows as the likelihood that the character recognition result is wrong increases. For example, it can be computed as follows from the character evaluation value described in a patent specification by the same applicant (Japanese Patent Application No. 1-334347, "Character string recognition method and apparatus").

[0017] Let rc be the character recognition cost and r the character evaluation value. Then

$$ r_c = a_1 - r \qquad (1) $$

where a1 is a predetermined constant. Alternatively,

$$ r_c = a_2 / r \qquad (2) $$

may be used, where a2 is a predetermined constant.
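Equations (1) and (2) translate directly into code. In the sketch below the function names are illustrative, and the guard against r = 0 is an added assumption, since the text does not say how a zero evaluation value should be handled.

```python
def recognition_cost_linear(r, a1):
    # Equation (1): rc = a1 - r
    return a1 - r


def recognition_cost_inverse(r, a2):
    # Equation (2): rc = a2 / r
    # Assumption: r = 0 is rejected rather than mapped to an infinite cost.
    if r == 0:
        raise ValueError("character evaluation value r must be non-zero")
    return a2 / r
```

Both forms make the cost fall as the character evaluation value r rises, so they fit the stated requirement only if a larger r means a more reliable result.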

[0018] The linguistic knowledge storage unit 5 is an ordinary storage means that stores, as linguistic knowledge, character code string information for standard words, constraints on how words may be connected, and the like. The information stored as linguistic knowledge may include, for example, the following.

[0019] When reading addresses written on mail or forms, the upper-lower relationships between addresses, represented as a tree structure in FIG. 3, are stored as constraints on word connections. In the address structure of Shinagawa-ku in FIG. 3, for example, "品川区" (Shinagawa-ku) lies below "東京都" (Tokyo), and below it are town names such as "荏原" (Ebara) and "旗の台" (Hatanodai). For addresses, the basic upper-lower relationship generally runs in the order prefecture name, city/ward/county name, town name, and the town-name level may itself have multiple tiers, such as an oaza name and an aza name (or a town name and an affiliated town name). For addresses, the upper-lower relationship between address elements directly constrains their order: address elements are arranged from upper to lower. Based on FIG. 3, the sequence "東京都" - "品川区" - "中延" (Tokyo - Shinagawa-ku - Nakanobu) is allowed, whereas sequences such as "東京都" - "中延" or "中延" - "旗の台" are not. Such connection constraints can be expressed, for example, by describing whether connection is allowed for every pair of address elements, or by listing, for each address element, the address elements that can immediately follow or precede it. When reading character strings in general documents, whether words can be connected is described on the basis of grammatical constraints on word order rather than a hierarchical structure like that of addresses.
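The second representation mentioned above, listing the elements that may immediately follow each address element, can be sketched as a successor table. The table below is a small hand-written excerpt in the spirit of FIG. 3, and the names `ALLOWED_NEXT` and `is_valid_sequence` are hypothetical.

```python
# Allowed successors for each address element (assumed excerpt of FIG. 3).
ALLOWED_NEXT = {
    "東京都": {"品川区"},
    "品川区": {"荏原", "旗の台", "中延"},
}


def is_valid_sequence(elements):
    # A sequence is valid when every adjacent pair appears in the table.
    return all(b in ALLOWED_NEXT.get(a, set())
               for a, b in zip(elements, elements[1:]))
```

With this table, `["東京都", "品川区", "中延"]` is accepted, while `["東京都", "中延"]` and `["中延", "旗の台"]` are rejected, matching the examples in the text.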

[0020] As character code string information for standard words, a triple [word character string U, character string length L of U, position P of the character within U] is stored for each character of every word that appears in the reading target. FIG. 4 shows an example of the character code string information for an address element group like that of FIG. 3. In FIG. 4, the key character to the left of ":" is a character appearing in the address elements, and the corresponding triples are listed to its right. For example, the triple [荏原, 2, 1] for "荏" states that the character "荏" is the first of the two characters of the address element "荏原" (Ebara). Because a character may appear in several address elements, the triples [西中延, 3, 3] [中延, 2, 2] [東中延, 3, 3] for "延" state that the character "延" is the third of the three characters of the address element "西中延" (Nishi-Nakanobu), the second of the two characters of "中延" (Nakanobu), or the third of the three characters of "東中延" (Higashi-Nakanobu). Although the address element (word) U in each triple is shown as a character string in the example of FIG. 4, it may instead be represented by a code value associated with the address element (word).
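A character-keyed index of such triples can be built from a word list in a few lines. The sketch below is an illustration of the data structure only; `build_char_index` is a hypothetical name, and the word list is limited to the address elements quoted in the text.

```python
from collections import defaultdict


def build_char_index(words):
    # Map each character to a list of (word U, length L, position P) triples,
    # with P counted from 1 as in the examples of FIG. 4.
    index = defaultdict(list)
    for word in words:
        for pos, ch in enumerate(word, start=1):
            index[ch].append((word, len(word), pos))
    return index


index = build_char_index(["荏原", "西中延", "中延", "東中延"])
```

Looking up a recognized character in this index immediately gives every word it could belong to and where in that word it would sit.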

[0021] The layout knowledge storage unit 6 is an ordinary storage means that stores, as layout knowledge, information about character layout such as formats. The information stored as layout knowledge may include, for example, the following.

[0022] Focusing on one character in the written character string, parameters can be stored that determine a front writing area, which specifies where the immediately preceding character may be written relative to the character in focus.

[0023] For example, the front writing area for a character can be drawn as shown in FIG. 5(a). Such an area can be defined as the area that satisfies the following expression.

[0024] The front writing area for a character block can be expressed as the following area in polar coordinates whose origin is the center point of that character block.

[0025]

$$
R_f > \begin{cases}
r_1 & 0 < \theta < \theta_1 \\
0 & \theta_1 < \theta < \theta_2 \\
r_2 & \theta_2 < \theta < \theta_3 \\
r_1 / \cos\theta & \theta_3 < \theta < 2\pi
\end{cases} \qquad (3)
$$

Here, the direction θ = 0 is the basic writing direction of the character string (vertical, from top to bottom, for vertical writing; horizontal, from left to right, for horizontal writing) rotated 90 degrees counterclockwise. Note that π/2 < θ1 < π and 3π/2 < θ2 < θ3 = Arccos(r1/r2) < 2π.

[0026] FIG. 5(a) is an example for mail written vertically. It expresses that the line preceding the current line may lie to the right of the reference character used to determine the front writing area, and that a character in the same line may be written above the reference character.

[0027] When the front writing area is defined by equation (3), it suffices to store the parameters r1, r2, θ1, θ2, and θ3.
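A membership test for the area of equation (3) might look like the sketch below. Several points are assumptions rather than statements from the text: a point is treated as inside when its distance from the block center is below the piecewise bound; θ3 is taken on the branch 2π - arccos(r1/r2) so that it falls in (3π/2, 2π) as required; and the coordinates are assumed to be already rotated so that the x axis points along θ = 0 with angles increasing counterclockwise. The function names are hypothetical.

```python
import math


def front_area_bound(theta, r1, r2, theta1, theta2):
    # Piecewise radial bound of equation (3); theta is reduced to [0, 2*pi).
    theta = theta % (2 * math.pi)
    # Assumed branch of Arccos so that 3*pi/2 < theta3 < 2*pi.
    theta3 = 2 * math.pi - math.acos(r1 / r2)
    if theta < theta1:
        return r1
    if theta < theta2:
        return 0.0
    if theta < theta3:
        return r2
    return r1 / math.cos(theta)


def in_front_area(x, y, r1, r2, theta1, theta2):
    # (x, y) is relative to the center of the reference character block.
    # Assumption: inside means distance < bound for the point's direction.
    radius = math.hypot(x, y)
    theta = math.atan2(y, x)
    return radius < front_area_bound(theta, r1, r2, theta1, theta2)
```

On this branch the bound is continuous at θ3, because r1 / cos θ3 equals r2 there.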

[0028] Alternatively, a line break may be assumed possible immediately before the reference character only when that character is the first character of a word. In that case an area like that of FIG. 5(a) is used as the front writing area, and otherwise the front writing area is defined as in FIG. 5(b).

[0029] The area in this case can be defined, for example, as the area satisfying the following expression.

[0030]

$$
R_f > \begin{cases}
r_3 & \pi - \theta_4 < \theta < \pi + \theta_4 \\
0 & \text{else}
\end{cases} \qquad (4)
$$

Here, the direction θ = 0 is the basic writing direction of the character string, and 0 < θ4 < π/2.

[0031] When equation (4) is used to determine the front writing area for characters other than the first character of a word, it suffices to store r3 and θ4.
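The sector of equation (4) can be tested in the same spirit. The sketch assumes that a point belongs to the area when its distance from the block center is below the bound, and that the x axis of the local coordinates points along the basic writing direction (θ = 0); `in_sector_area` is a hypothetical name.

```python
import math


def in_sector_area(x, y, r3, theta4):
    # Equation (4): bound r3 inside pi - theta4 < theta < pi + theta4,
    # bound 0 elsewhere. (x, y) is relative to the block center.
    radius = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    bound = r3 if abs(theta - math.pi) < theta4 else 0.0
    # Assumption: inside means distance < bound.
    return radius < bound
```

Since θ = π points against the writing direction, the sector covers the positions just before the reference character on the same line.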

[0032] The front writing area can also be defined as a combination of rectangular areas, by storing the coordinate information of several rectangles relative to an origin at the center point of the reference character block. In this case, it suffices to store the number of rectangles that make up the front writing area and the center coordinates, height, and width of each. As a further alternative, the inside of a circle of radius r4 centered on the reference character may be used as the front writing area, in which case it suffices to store r4.
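Both alternative definitions reduce to simple geometric tests. In the sketch below the `Rect` type and the function names are illustrative assumptions, and points on a boundary are counted as inside, which the text does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class Rect:
    cx: float      # center x, relative to the reference block center
    cy: float      # center y, relative to the reference block center
    height: float
    width: float


def in_rect_area(x, y, rects):
    # Inside the front writing area if the point lies in any stored rectangle.
    return any(abs(x - r.cx) <= r.width / 2 and abs(y - r.cy) <= r.height / 2
               for r in rects)


def in_circle_area(x, y, r4):
    # Inside the circle of radius r4 around the reference character.
    return math.hypot(x, y) <= r4
```

The number of rectangles mentioned in the text is simply the length of the `rects` list here.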

[0033] Another possible item of layout knowledge is the applicable area: when a character is skipped in recognizing a character string, the area in which the skipped character may have been written can be stored as the applicable area.

[0034] Let block i be the character block corresponding to the i-th element of the character string and block i+j the character block corresponding to the (i+j)-th element. The applicable area is then the area between block i and block i+j in which block i+k, the character block corresponding to the (i+k)-th element, may be written. Here j > 1 and j > k > 0.

[0035] The applicable area can be defined, for example, as follows. Rd is defined as the area satisfying the following expression with the center point of block i as the origin.

[0036] Ru is defined as the area satisfying the following expression with the center point of block i as the origin.

[0037]

$$
R_u > \begin{cases}
r_5 & \pi - \theta_5 < \theta < \pi + \theta_5 \\
0 & \text{else}
\end{cases} \qquad (6)
$$

Here r5 is the distance between the center point of block i and the center point of block i+j, and θ5 is the angular deviation from the direction of the straight line connecting the center point of block i to the center point of block i+j, with a value of π/2 or less. The overlap of the areas Rd and Ru determined by equations (5) and (6) is taken as the applicable area. FIG. 5(c) shows an example, in which Rd is the area hatched with lines sloping down to the right and Ru is the area hatched with lines sloping up to the right. Because r5 can only be computed once the relationship between block i and block i+j is known, it need not be stored in advance; it suffices to store θ5.

[0038] The applicable area can also be defined and stored as a combination of rectangular areas. For example, it can be expressed as a rectangle centered on the midpoint of the straight line connecting the center point of block i and the center point of block i+j. The height and width of the rectangle can then be expressed as ratios to the distance between the two center points, so it suffices to store those ratios as the parameters that determine the height and width.
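The rectangle-based applicable area can be derived from the two block centers and the stored ratios, for example as follows. The function name, the dictionary layout, and the choice of an axis-aligned rectangle are assumptions made for this sketch.

```python
import math


def applicable_rect(center_i, center_ij, height_ratio, width_ratio):
    # center_i and center_ij are the (x, y) centers of block i and block i+j.
    (xi, yi), (xj, yj) = center_i, center_ij
    dist = math.hypot(xj - xi, yj - yi)
    # The rectangle is centered on the midpoint of the connecting line and
    # sized as stored ratios of the center-to-center distance.
    return {
        "cx": (xi + xj) / 2,
        "cy": (yi + yj) / 2,
        "height": height_ratio * dist,
        "width": width_ratio * dist,
    }
```

Storing ratios instead of absolute sizes lets the same parameters serve documents written at different character sizes.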

[0039] As further layout knowledge, a writing-prohibited area can be defined and stored: when a character string is recognized, nothing may be written between any two consecutive characters, that is, between block i and block i+j.

[0040] This area can be defined in the same way as the applicable area. For example, the overlap of the area Rd computed by equation (5) and the area Ru computed by equation (6) with k = 1 may be used as the writing-prohibited area. The same parameters as for the applicable area therefore suffice for the writing-prohibited area.

[0041] Furthermore, a line end area can be defined and stored as layout knowledge in order to check whether a character is at the end of a line. Since normally nothing is written after the character at the end of a line, the line end area is defined as an area after the end of the line in which no character may be written.

[0042] To define this area, for example, the area whose origin is the center point of the block located at the end of the line is defined as follows, where θ6 is the angular deviation from the basic writing direction of the character string.

[0043] Here 0 < θ6 < π/2. In this case, the values to be stored are r6 and θ6.

【００４４】また、ラインエンド領域は矩形領域として
定義して記憶しておくこともでき、この場合は行の終端
に位置するブロックの中心点と原点とした矩形の相対的
な座標情報を記憶すれば良い。The line-end area can also be defined and stored as a rectangular area. In this case, it suffices to store the coordinates of the rectangle relative to the center point of the block at the end of the line, taken as the origin.

【００４５】これ以外のレイアウト知識として、単語に
よっては文字の大きさが単語内でも大きく変わることが
あり、例えば図６のように「霞ヶ関」という単語であれ
ば「霞」と「関」の間にくる「ヶ」は小さく書かれるこ
とが多いといった知識を単語内ブロックサイズ情報とし
て記憶することができる。As further layout knowledge, the size of characters can vary considerably even within a single word, depending on the word. For example, as shown in FIG. 6, in the word 「霞ヶ関」 (Kasumigaseki) the 「ヶ」 between 「霞」 and 「関」 is often written small. Such knowledge can be stored as intra-word block size information.

【００４６】例えば、登録されているｎ文字で構成され
る単語内ブロックサイズ情報として、単語内標準ブロッ
クサイズ比［Ｈ１，Ｖ１，Ｈ２，Ｖ２，…，Ｈｎ，Ｖ
ｎ］を記憶する。単語内標準ブロックサイズ比は例えば
以下のように算出することができる。文書毎に記載され
ている文字の大きさは異なるので、まず始めにある単語
の構成要素である文字ブロックの高さと幅のデータから
［ｈ１，ｖ１，ｈ２，ｖ２，…，ｈｎ，ｖｎ］というデ
ータ形式を作成し、この各要素の値をｈ１で割ることに
より一度正規化を行ったものについて平均を求めること
により算出することができる。For example, the intra-word block size information for a registered word of n characters is stored as the intra-word standard block size ratio [H1, V1, H2, V2, ..., Hn, Vn]. This ratio can be calculated, for example, as follows. Because character size differs from document to document, the height and width data of the character blocks that make up a given word are first arranged as [h1, v1, h2, v2, ..., hn, vn]; each element is then divided by h1 to normalize the sample, and the normalized samples are averaged.
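The normalization and averaging described in paragraph [0046] can be sketched as follows. This is only an illustrative sketch: the function names and the sample measurements are hypothetical, and the patent does not prescribe any particular implementation.

```python
from typing import List, Sequence


def normalize_sample(sizes: Sequence[float]) -> List[float]:
    """Divide one sample [h1, v1, ..., hn, vn] by its own h1."""
    h1 = sizes[0]
    if h1 <= 0:
        raise ValueError("h1 must be positive")
    return [value / h1 for value in sizes]


def standard_block_size_ratio(samples: Sequence[Sequence[float]]) -> List[float]:
    """Average the normalized samples element by element.

    The result plays the role of [H1, V1, ..., Hn, Vn] in the text.
    """
    normalized = [normalize_sample(sample) for sample in samples]
    count = len(normalized)
    return [sum(column) / count for column in zip(*normalized)]


# Hypothetical measurements of one 3-character word in two documents
# written at different sizes; the middle character is small in both.
samples = [
    [40.0, 40.0, 20.0, 20.0, 40.0, 36.0],
    [20.0, 22.0, 10.0, 8.0, 22.0, 20.0],
]
ratio = standard_block_size_ratio(samples)
```

Because every sample is divided by its own h1, the first element of the result is always 1 and the remaining elements express sizes relative to the first block.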

【００４７】また単語内標準ブロックサイズ比だけでな
く［Ｈ１，Ｖ１，Ｈ２，Ｖ２，…，Ｈｎ，Ｖｎ］の共分
散行列を算出し記憶しておくこともできる。In addition to the intra-word standard block size ratio, the covariance matrix of [H1, V1, H2, V2, ..., Hn, Vn] can be calculated and stored.

【００４８】レイアウト知識として記憶できる情報はこ
れだけでなく、例えば罫線と文字との相対的な関係や模
様と文字との相対的な関係などを記憶しておくこともで
きる。The information that can be stored as layout knowledge is not limited to this. For example, the relative relationship between ruled lines and characters, the relative relationship between patterns and characters, and the like can also be stored.

【００４９】言語知識処理部７は、文字切り出し部２か
ら文字候補図形情報を読み込みこの文字候補図形情報に
対応する文字コードを文字認識部４から読み込みレイア
ウト知識記憶部からレイアウト知識を読み込み言語知識
記憶部５の言語知識を読み込み言語知識と文字コードと
の言語的照合を行い言語知識を満たすような文字コード
の組み合わせを文字列候補として作成し作成された全て
の文字列候補に対して言語的信頼性を表現する尺度とし
ての言語的コストを得る手段である。The language knowledge processing unit 7 is a means that reads the character candidate graphic information from the character cutout unit 2, reads the character codes corresponding to that information from the character recognition unit 4, reads the layout knowledge from the layout knowledge storage unit and the language knowledge from the language knowledge storage unit 5, linguistically collates the language knowledge with the character codes, creates as character string candidates the combinations of character codes that satisfy the language knowledge, and obtains, for every character string candidate created, a linguistic cost as a measure of its linguistic reliability.

【００５０】ここで言う言語的照合には、いくつかの方
法が考えられる。一例を挙げれば、以下のような方法が
考えられる。Several methods can be considered for the linguistic collation referred to here. As an example, the following method can be considered.

【００５１】文字切り出し部２から読み込んだ文字候補
図形情報に対応する文字コードの各々をキーとして、言
語知識の一つである標準単語の文字コード列情報を検索
する。そして、標準単語の文字コード列情報のキー文字
のなかに該当する文字があったら、その文字に対応する
３項情報を読み出して登録する。Using each of the character codes corresponding to the character candidate graphic information read from the character cutout unit 2 as a key, the character code string information of standard words, which is one kind of language knowledge, is searched. If a key character in the character code string information of a standard word matches, the three-term information corresponding to that character is read out and registered.

【００５２】次に、登録された３項情報のなかの２つの
３項情報を連結し、読み取り結果の候補を作成する方法
としては、例えば次のようなものが考えられる。Next, two pieces of the registered three-term information are concatenated to create candidates for the reading result; for example, the following method can be used.

【００５３】３項情報連鎖を作成する第一の実現方法
は、２つの３項情報［Ｕ１，Ｌ１，Ｐ１］と［Ｕ２，Ｌ
２，Ｐ２］について、条件ａ：［Ｕ１，Ｌ１，Ｐ１］に対応する文字候補図形
が［Ｕ２，Ｌ２，Ｐ２］に対応する文字候補図形の前方
記載領域にある条件ｂ：Ｕ１＝Ｕ２かつＰ１＜Ｐ２条件ｃ：Ｕ１はＵ２の前方記載領域に存在し得るとしたとき、条件ａかつ（条件ｂまたは条件ｃ）が成立
する際に、［Ｕ１，Ｌ１，Ｐ１］の後に［Ｕ２，Ｌ２，
Ｐ２］を連結して、可能なすべての組み合わせを形成し
た上で、各組み合わせのコスト計算を行なう。In a first method of creating a three-term information chain, the following conditions are considered for two pieces of three-term information [U1, L1, P1] and [U2, L2, P2]:
Condition a: the character candidate graphic corresponding to [U1, L1, P1] lies in the forward writing area of the character candidate graphic corresponding to [U2, L2, P2].
Condition b: U1 = U2 and P1 < P2.
Condition c: U1 can exist in the forward writing area of U2.
When condition a and (condition b or condition c) hold, [U2, L2, P2] is concatenated after [U1, L1, P1]; all possible combinations are formed in this way, and the cost of each combination is then calculated.

【００５４】この他にも例えば、本願と同一出願人によ
る特許明細書（特願平６−３１７１６３「文字列読み取
り装置」）に記載されている方法を用いることもでき
る。In addition to this, for example, a method described in a patent specification (Japanese Patent Application No. 6-317163 “character string reading device”) by the same applicant as the present application can be used.

【００５５】言語的コストは、様々なものが考えられる
が、例えば読み飛ばした文字数をそのまま言語的コスト
としてもよいし、読み飛ばした文字数を文字列の全文字
数で割ったものでもよい。また、住所などを読み取る場
合文字列の最初には都道府県や市区郡などの情報が書か
れていることが多く、最後のほうは町名が書かれている
ことが多く、しばしば住所読み取りでは町名が読めない
場合のほうが認識誤りを起こしやすいので文字列の最後
の方になればなるほど強い重みを付けてコストを計算す
ることもできる。さらに、連続した読み飛ばしがある場
合、読み飛ばしの個数が多い程大きなコストを与えるこ
ともできる。Various linguistic costs are possible. For example, the number of skipped characters may be used directly as the linguistic cost, or the number of skipped characters divided by the total number of characters in the character string may be used. When reading addresses and the like, the beginning of the character string usually contains information such as the prefecture, city, ward, or county, and the town name usually comes toward the end; in address reading, recognition errors are often more likely when the town name cannot be read, so the cost can also be calculated with weights that grow toward the end of the character string. Furthermore, when skips occur consecutively, a larger cost can be assigned the more consecutive skips there are.
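The four variants of the linguistic cost mentioned in paragraph [0055] could look roughly like the sketch below. The text fixes neither the weighting scheme nor the penalty for consecutive skips, so the linear position weight and the squared run length used here are assumptions chosen only for illustration, as are all function names.

```python
from typing import Sequence


def skip_count_cost(skipped: Sequence[bool]) -> float:
    """Number of skipped characters, used directly as the cost."""
    return float(sum(skipped))


def skip_ratio_cost(skipped: Sequence[bool]) -> float:
    """Skipped characters divided by the total number of characters."""
    return sum(skipped) / len(skipped)


def end_weighted_cost(skipped: Sequence[bool]) -> float:
    """Assumed weighting: a skip at 0-based position i costs (i + 1) / n,
    so skips near the end of the string cost more."""
    n = len(skipped)
    return sum((i + 1) / n for i, is_skipped in enumerate(skipped) if is_skipped)


def run_length_cost(skipped: Sequence[bool]) -> float:
    """Assumed penalty: each run of consecutive skips costs its length squared,
    so longer runs are penalized more than the same number of isolated skips."""
    cost = 0
    run = 0
    for is_skipped in skipped:
        if is_skipped:
            run += 1
        else:
            cost += run * run
            run = 0
    return float(cost + run * run)


# True marks a character that had to be skipped during linguistic collation.
skipped = [False, False, True, True, False, True]
costs = (
    skip_count_cost(skipped),
    skip_ratio_cost(skipped),
    end_weighted_cost(skipped),
    run_length_cost(skipped),
)
```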

【００５６】この他にも例えば、本願と同一出願人によ
る特許明細書（特願平６−３１７１６３「文字列読み取
り装置」）に記載されている方法を用いることもでき
る。In addition to this, for example, a method described in a patent specification (Japanese Patent Application No. 6-317163 “character string reading device”) by the same applicant as the present application can be used.

【００５７】レイアウト解析部８は、文字切り出し部２
から文字候補図形情報を読み込み言語知識処理部７から
文字列候補を読み込みレイアウト知識記憶部６からレイ
アウト知識を読み込み各文字列候補毎にレイアウト的制
約の妥当性を表現する尺度としてのレイアウトコストを
得る手段である。The layout analysis unit 8 is a means that reads the character candidate graphic information from the character cutout unit 2, the character string candidates from the language knowledge processing unit 7, and the layout knowledge from the layout knowledge storage unit 6, and obtains, for each character string candidate, a layout cost as a measure of how well the candidate satisfies the layout constraints.

【００５８】レイアウトコストを計算するには、いくつ
かの方法が考えられる。一例を挙げれば、次のようにし
て実現できる。There are several methods for calculating the layout cost. For example, this can be realized as follows.

【００５９】図７に示すように、ある文字列候補のｉ番
目の文字に対応するブロックｉの幅をｘ_i、高さをｙ_i
として、ブロックｉの面積の平方根をｓ_iとする。同様
にｊ番目の文字に対応するブロックｊの面積の平方根を
ｓ_jとする。さらにブロックｉの中心からブロックｊの
中心を結ぶベクトルｖ_ijと文字列記載基本方向とのなす
角度をω_ijとし、ブロックｉとブロックｊの中心点間の
距離をｄ_ijとする。このとき以下のようなコストを算出
する。As shown in FIG. 7, let x_i and y_i be the width and height of block i, the block corresponding to the i-th character of a character string candidate, and let s_i be the square root of the area of block i. Similarly, let s_j be the square root of the area of block j, the block corresponding to the j-th character. Further, let ω_ij be the angle between the vector v_ij from the center of block i to the center of block j and the basic writing direction of the character string, and let d_ij be the distance between the center points of block i and block j. The following costs are then calculated.

【００６０】
$$c1_{(i,j)} = \begin{cases} d_{ij} / (s_i + s_j) & \text{(blocks } i \text{ and } j \text{ both exist)} \\ \alpha_1 & \text{(otherwise)} \end{cases} \tag{8}$$
$$c2_{(i,j)} = \begin{cases} |s_i - s_j| / (s_i + s_j) & \text{(blocks } i \text{ and } j \text{ both exist)} \\ \alpha_2 & \text{(otherwise)} \end{cases} \tag{9}$$
$$c3_{(i,j)} = \begin{cases} \omega_{ij} & \text{(blocks } i \text{ and } j \text{ both exist)} \\ \alpha_3 & \text{(otherwise)} \end{cases} \tag{10}$$
このときα１とα２とα３は予め与えられた定数である。Here, α1, α2, and α3 are constants given in advance.
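Equations (8) to (10) can be illustrated with a short sketch. The `Block` class, the default values of α1 to α3, the horizontal default writing direction, and the choice to measure ω_ij as an unsigned angle in [0, π] are all assumptions made for this example; the text only says that ω_ij is the angle between v_ij and the basic writing direction.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Block:
    cx: float      # x coordinate of the block center
    cy: float      # y coordinate of the block center
    width: float
    height: float

    @property
    def s(self) -> float:
        """Square root of the block area (s_i in the text)."""
        return math.sqrt(self.width * self.height)


def pair_costs(
    bi: Optional[Block],
    bj: Optional[Block],
    alpha: Tuple[float, float, float] = (1.0, 1.0, math.pi),
    direction: Tuple[float, float] = (1.0, 0.0),
) -> Tuple[float, float, float]:
    """Return (c1, c2, c3) for blocks i and j; a missing block is passed as None."""
    if bi is None or bj is None:
        return tuple(alpha)  # constants alpha1, alpha2, alpha3
    dx, dy = bj.cx - bi.cx, bj.cy - bi.cy
    d_ij = math.hypot(dx, dy)
    c1 = d_ij / (bi.s + bj.s)
    c2 = abs(bi.s - bj.s) / (bi.s + bj.s)
    # Unsigned angle between v_ij and the basic writing direction.
    dot = dx * direction[0] + dy * direction[1]
    cross = direction[0] * dy - direction[1] * dx
    c3 = abs(math.atan2(cross, dot))
    return c1, c2, c3


a = Block(0.0, 0.0, 10.0, 10.0)
b = Block(20.0, 0.0, 10.0, 10.0)   # same size, straight ahead
c = Block(0.0, 20.0, 20.0, 20.0)   # larger, perpendicular to the writing direction
```

With these sample blocks, the pair (a, b) gets small size and angle costs, while (a, c) is penalized on both, which matches the intent that neighbouring characters on a line are similar in size and aligned with the writing direction.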

【００６１】さらに、図８に示すように、ある文字列候
補のｉ番目とｊ番目の要素に対応するブロックｉの中心
からブロックｊの中心を結ぶベクトルをｖ_ij、ブロック
中心間の距離をｄ_ij、ある文字列候補のｊ番目とｋ番目
の要素に対応するブロックｊの中心からブロックｋの中
心を結ぶベクトルをｖ_jk、ブロック中心間の距離を
ｄ_jk、ｖ_ijとｖ_jkのなす角をθ_ijkとする。このとき以
下のようなコストを算出する。Further, as shown in FIG. 8, let v_ij be the vector from the center of block i to the center of block j, where blocks i and j correspond to the i-th and j-th elements of a character string candidate, and let d_ij be the distance between their centers. Likewise, let v_jk be the vector from the center of block j to the center of block k, where blocks j and k correspond to the j-th and k-th elements of the candidate, let d_jk be the distance between their centers, and let θ_ijk be the angle between v_ij and v_jk. The following costs are then calculated.

【００６２】
$$c4_{(i,j,k)} = \begin{cases} \theta_{ijk} & \text{(blocks } i, j, k \text{ all exist)} \\ \alpha_4 & \text{(otherwise)} \end{cases} \tag{11}$$
$$c5_{(i,j,k)} = \begin{cases} |d_{ij} - d_{jk}| / (d_{ij} + d_{jk}) & \text{(blocks } i, j, k \text{ all exist)} \\ \alpha_5 & \text{(otherwise)} \end{cases} \tag{12}$$
このときα４とα５は予め定められた定数である。これらのコストを計算する際に、ｉ＋１＝ｊ以外のときｃ１_(i,j)＝０、ｃ２_(i,j)＝０、ｃ３_(i,j)＝０とおいてもよい。また、ｉ＋２＝ｊ＋１＝ｋ以外のときｃ４_(i,j,k)＝０、ｃ５_(i,j,k)＝０とおいてもよい。Here, α4 and α5 are predetermined constants. When calculating these costs, c1_(i,j) = 0, c2_(i,j) = 0, and c3_(i,j) = 0 may be set whenever i + 1 = j does not hold. Likewise, c4_(i,j,k) = 0 and c5_(i,j,k) = 0 may be set whenever i + 2 = j + 1 = k does not hold.
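A minimal sketch of equations (11) and (12) follows. Block centers are passed as plain coordinate pairs, θ_ijk is taken as an unsigned angle, and the default values of α4 and α5 are placeholders; all of these are assumptions for illustration rather than details given in the text.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def triple_costs(
    p_i: Optional[Point],
    p_j: Optional[Point],
    p_k: Optional[Point],
    alpha4: float = math.pi,
    alpha5: float = 1.0,
) -> Tuple[float, float]:
    """Return (c4, c5) for three block centers; a missing block is None."""
    if p_i is None or p_j is None or p_k is None:
        return alpha4, alpha5
    v_ij = (p_j[0] - p_i[0], p_j[1] - p_i[1])
    v_jk = (p_k[0] - p_j[0], p_k[1] - p_j[1])
    d_ij = math.hypot(v_ij[0], v_ij[1])
    d_jk = math.hypot(v_jk[0], v_jk[1])
    # c4: unsigned angle between v_ij and v_jk (bend of the character path).
    dot = v_ij[0] * v_jk[0] + v_ij[1] * v_jk[1]
    cross = v_ij[0] * v_jk[1] - v_ij[1] * v_jk[0]
    c4 = abs(math.atan2(cross, dot))
    # c5: relative change in character pitch; 0 when both distances are 0.
    total = d_ij + d_jk
    c5 = abs(d_ij - d_jk) / total if total > 0 else 0.0
    return c4, c5


straight = triple_costs((0.0, 0.0), (10.0, 0.0), (20.0, 0.0))
bent = triple_costs((0.0, 0.0), (10.0, 0.0), (10.0, 30.0))
```

Evenly spaced, collinear centers give zero for both costs, while a right-angle turn with a change in spacing raises both.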

【００６３】また、文字列候補中に存在すべき文字コー
ドが抜けてしまったことで読み飛ばした虫食い照合の部
分がある場合、虫食い部分でｃ１_(i,j)、ｃ２_(i,j)、
ｃ３_(i,j)、ｃ４_(i,j,k)、ｃ５_(i,j,k)にペナルティ
ーとして定数値を与えるのではなく、虫食い部分に対応
させることができるような文字候補パターンが対応可能
領域に存在するならば、その部分に仮の対応文字を設定
し、そのままｃ１_(i,j)、ｃ２_(i,j)、ｃ３_(i,j)、ｃ
４_(i,j,k)、ｃ５_(i,j,k)を計算することもできる。If a character string candidate contains a worm-eaten match, that is, a part skipped because a character code that should be present in the candidate is missing, constant penalty values need not be assigned to c1_(i,j), c2_(i,j), c3_(i,j), c4_(i,j,k), and c5_(i,j,k) at the worm-eaten part. Instead, if a character candidate pattern that can be associated with the worm-eaten part exists in the applicable area, a provisional character can be assigned to that part and c1_(i,j), c2_(i,j), c3_(i,j), c4_(i,j,k), and c5_(i,j,k) calculated as usual.

【００６４】仮対応をさせるには、例えば対応可能領域
にブロック中心点が位置するようなブロックを検出し、
検出されたブロックを仮対応候補としてこれらのブロッ
クのあらゆる可能な組み合わせを仮対応の結果とすれば
よい。この結果、虫食い箇所への仮対応のさせかたは複
数あることになる。複数の仮対応可能性を考え、その中
から一番コストの低い値になった場合を仮対応の結果と
考えて、そのときのコストをレイアウトコストとしても
よいし、全ての仮対応の平均コストをレイアウトコスト
としてもよい。複数の仮対応をさせた結果を図９に示
す。図９の例は文字列として「川越市上野田町」と認識
しようとした場合である。図９（ａ）では、「上野田
町」の「野」と「田」が虫食いになっている。図９
（ｂ）では、仮対応候補となった三つのブロックを用い
てあらゆる可能な組み合わせを示している。To make a provisional assignment, for example, the blocks whose center points lie in the applicable area are detected and taken as provisional assignment candidates, and every possible combination of these blocks is treated as a provisional assignment result. As a result, there are multiple ways of provisionally filling a worm-eaten part. Among these possibilities, the one with the lowest cost may be taken as the provisional assignment result and its cost used as the layout cost, or the average cost over all provisional assignments may be used as the layout cost. FIG. 9 shows the result of making multiple provisional assignments, for the case of trying to recognize the character string 「川越市上野田町」. In FIG. 9(a), 「野」 and 「田」 of 「上野田町」 are worm-eaten (skipped). FIG. 9(b) shows every possible combination using the three blocks that became provisional assignment candidates.

【００６５】仮対応の結果も利用してレイアウトコスト
を計算する場合には、虫食い部分に仮対応する文字候補
パターンがない場合にのみ、ペナルティーとして各コス
トに定数値を付与すればよいことになる。When the layout cost is calculated using the results of provisional assignment as well, a constant penalty value needs to be added to each cost only when there is no character candidate pattern that can be provisionally assigned to the worm-eaten part.

【００６６】また、ある候補文字列に余分なブロックが
混入したかどうかを評価してレイアウトコストの一つと
することもできる。これは、ある候補文字列内におい
て、言語的に連続する２文字間に設定される記載禁止領
域に他の文字候補パターンが存在するかどうかを評価
し、他の文字候補パターンが存在する場合その文字候補
パターンのブロックの大きさが大きいほどコストが高く
なるように設定すればよい。図１０にその例を示してい
る。図１０は文字列として「川越市野田町」と認識しよ
うとした場合である。図１０（ａ）では「田」と「町」
の間の記載禁止領域に他の文字候補パターンが存在しな
いので「野田町」と読むことはかなり妥当であるが、図
１０（ｂ）では「田」と「町」の間に大きなブロックが
混入しており「野田町」と読む妥当性は低く、図１０
（ｃ）では混入したブロックが小さいので妥当性が少し
あるといった内容をコストに反映させる。コストの計算
は、例えば以下のような方法で行うことができる。Whether an extra block has been mixed into a candidate character string can also be evaluated and used as one of the layout costs. To do so, it is checked whether another character candidate pattern exists in the writing-prohibited area set between two linguistically consecutive characters of the candidate; if one exists, the cost is set higher the larger the block of that character candidate pattern is. FIG. 10 shows an example, for the case of trying to recognize the character string 「川越市野田町」. In FIG. 10(a), no other character candidate pattern exists in the writing-prohibited area between 「田」 and 「町」, so reading 「野田町」 is quite plausible. In FIG. 10(b), a large block is mixed in between 「田」 and 「町」, so reading 「野田町」 is implausible. In FIG. 10(c), the mixed-in block is small, so the reading remains somewhat plausible. The cost reflects these differences and can be calculated, for example, as follows.

【００６７】ある文字列候補のｉ番目の要素に対応する
ブロックｉとｉ＋１番目の要素に対応するブロックｉ＋
１の面積の平方根をそれぞれｓ_i、ｓ_i+1として、ブロ
ックｉとブロックｉ＋１との間に設定された記載禁止領
域とブロックｉとブロックｉ＋１以外のブロックとの重
なっている部分の面積の平方根の総和をＳとすると以下
のようなコストが算出できる。Let s_i and s_{i+1} be the square roots of the areas of block i and block i + 1, the blocks corresponding to the i-th and (i + 1)-th elements of a character string candidate. Let S be the sum, over all blocks other than block i and block i + 1, of the square root of the area of the part of each block that overlaps the writing-prohibited area set between block i and block i + 1. The following cost can then be calculated.

【００６８】
$$c6_{(i,i+1)} = \begin{cases} S / (s_i + s_{i+1}) & \text{(blocks } i \text{ and } i+1 \text{ both exist)} \\ 0 & \text{(otherwise)} \end{cases} \tag{13}$$
また、全体的な文字列の行らしさを評価するために以下のようなコストを考えることもできる。In addition, the following cost can be used to evaluate how line-like the character string is as a whole.
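Equation (13) can be sketched as below, under the assumption that blocks and the writing-prohibited area are axis-aligned rectangles given as (left, top, right, bottom). The text allows other shapes for the area; the rectangle representation and the function names are chosen here only to keep the example short.

```python
import math
from typing import Iterable, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def area(rect: Rect) -> float:
    return max(0.0, rect[2] - rect[0]) * max(0.0, rect[3] - rect[1])


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def c6(
    block_i: Optional[Rect],
    block_next: Optional[Rect],
    prohibited: Rect,
    other_blocks: Iterable[Rect],
) -> float:
    """Cost of extra blocks intruding between two consecutive characters."""
    if block_i is None or block_next is None:
        return 0.0
    # S: sum of the square roots of the overlaps with the prohibited area.
    s_total = sum(math.sqrt(overlap_area(prohibited, other)) for other in other_blocks)
    return s_total / (math.sqrt(area(block_i)) + math.sqrt(area(block_next)))


block_i = (0.0, 0.0, 10.0, 10.0)
block_next = (20.0, 0.0, 30.0, 10.0)
prohibited = (10.0, 0.0, 20.0, 10.0)

clean = c6(block_i, block_next, prohibited, [])
small_intruder = c6(block_i, block_next, prohibited, [(12.0, 2.0, 16.0, 6.0)])
large_intruder = c6(block_i, block_next, prohibited, [(10.0, 0.0, 20.0, 10.0)])
```

The three sample calls mirror the three situations described for FIG. 10: no intruding block, a small one, and a large one, with the cost growing in that order.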

【００６９】今、文字数Ｎの文字列候補があったとする
と、文字列記載基本方向をＨ軸、文字列記載基本方向と
垂直な方向をＶ軸として候補文字列のｉ番目の文字に対
応するブロックｉの中心点の座標をＢ_iＨ_m、Ｂ_iＶ_m
と置きなおし、Ｂ_iＶ_mの平均値をＡｖｅＶ_mとし、｜
Ｂ_NＨ_m−Ｂ_OＨ_m｜をＬｉｎｅＨとすると、以下のよ
うなコストを計算することができる。Now suppose there is a character string candidate of N characters. Taking the basic writing direction of the character string as the H axis and the direction perpendicular to it as the V axis, rewrite the coordinates of the center point of block i, the block corresponding to the i-th character of the candidate, as (B_i H_m, B_i V_m). Let AveV_m be the mean of B_i V_m, and let LineH be |B_N H_m − B_0 H_m|. The following cost can then be calculated.

【００７０】
$$c7_i = \frac{|B_i V_m - \mathit{AveV}_m|}{\mathit{LineH}} \tag{14}$$
この一例を図１１に示す。図１１では文字数Ｎ＝６の場合の例を示している。An example is shown in FIG. 11, for the case where the number of characters is N = 6.
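Equation (14) measures how far each block center strays from the mean V coordinate, relative to the length of the line. A small sketch, assuming the centers are already expressed in (H, V) coordinates and that LineH is taken between the first and last blocks of the candidate:

```python
from typing import List, Sequence, Tuple


def line_likeness_costs(centers: Sequence[Tuple[float, float]]) -> List[float]:
    """Return c7_i for each block center given as (H, V).

    Assumes the centers are listed in string order, so the first and last
    entries are the two ends of the line.
    """
    v_mean = sum(v for _, v in centers) / len(centers)
    line_h = abs(centers[-1][0] - centers[0][0])
    if line_h == 0:
        raise ValueError("LineH is zero; the line has no extent along the H axis")
    return [abs(v - v_mean) / line_h for _, v in centers]


# Hypothetical centers of a four-character line with a slight vertical wobble.
centers = [(0.0, 0.0), (10.0, 2.0), (20.0, -2.0), (30.0, 0.0)]
costs = line_likeness_costs(centers)
```

Dividing by LineH makes the cost independent of the overall scale of the handwriting.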

【００７１】さらに、住所は必ずしも一行だけで記載さ
れているとは限らないので二行にまたがって住所が書か
れている場合についても評価することができる。この場
合、行が改行されている部分を検出して二行に分割し、
次にそれぞれの行らしさを評価すればよい。Further, an address is not always written on a single line, so the case where an address runs over two lines can also be evaluated. In this case, the place where the line breaks is detected, the string is split into two lines, and the line-likeness of each line is then evaluated.

【００７２】改行の検出には様々な方法が考えられる。
ここで一例を挙げるならば、例えばｃ１_(i,j)とｃ３
_(i,j)とｃ５_{(i-1,j-1,k-1)}とｃ５_(i,j,k)が同時に大
きな値を持つ部分があったとすると、ｉ文字目とｊ文字
目の間が行の切れ目であると考えて二行にわけることが
できる。ここで、ｉ＋２＝ｊ＋１＝ｋの関係がある。図
１２に、この一例を示す。図１２（ａ）のようにｃ１
（６，７）とｃ３（６，７）とｃ５（５，６，７）とｃ
５（６，７，８）が同時に大きくなると、ブロック６と
ブロック７が行の切れ目であることが検出できる。Various methods can be used to detect a line break. As one example, if c1_(i,j), c3_(i,j), c5_(i-1,j-1,k-1), and c5_(i,j,k) all take large values at the same place, the break between lines can be taken to lie between the i-th and j-th characters, and the string can be split into two lines there. Here i + 2 = j + 1 = k. FIG. 12 shows an example. As in FIG. 12(a), when c1(6,7), c3(6,7), c5(5,6,7), and c5(6,7,8) are all large at once, it can be detected that the line break lies between block 6 and block 7.
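The line-break test of paragraph [0072] can be sketched as a threshold check. The text only says the four costs are "large at the same time"; the explicit thresholds `t1`, `t3`, and `t5`, the dictionary layout of the costs, and the sample values are assumptions for this illustration.

```python
from typing import Dict, List, Tuple

PairCosts = Dict[Tuple[int, int], float]
TripleCosts = Dict[Tuple[int, int, int], float]


def detect_line_breaks(
    c1: PairCosts,
    c3: PairCosts,
    c5: TripleCosts,
    t1: float,
    t3: float,
    t5: float,
) -> List[Tuple[int, int]]:
    """Return pairs (i, i + 1) between which a line break is assumed.

    A break is reported when c1(i, j), c3(i, j), c5(i-1, i, j) and
    c5(i, j, j+1) all exceed their thresholds, with j = i + 1.
    """
    breaks = []
    for (i, j) in sorted(c1):
        if j != i + 1 or (i, j) not in c3:
            continue
        before = c5.get((i - 1, i, j))
        after = c5.get((i, j, j + 1))
        if before is None or after is None:
            continue
        if c1[(i, j)] > t1 and c3[(i, j)] > t3 and before > t5 and after > t5:
            breaks.append((i, j))
    return breaks


# Hypothetical costs around blocks 5 to 8, shaped like the FIG. 12 example.
c1 = {(5, 6): 0.6, (6, 7): 3.0, (7, 8): 0.6}
c3 = {(5, 6): 0.05, (6, 7): 2.8, (7, 8): 0.05}
c5 = {(4, 5, 6): 0.0, (5, 6, 7): 0.7, (6, 7, 8): 0.7, (7, 8, 9): 0.0}
breaks = detect_line_breaks(c1, c3, c5, t1=2.0, t3=1.5, t5=0.5)
```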

【００７３】改行が検出された場合、行の切れ目で計算
されるレイアウトコストをクリアすることもできる。ま
た改行部分で二行に分割してそれぞれの行らしさの総和
を求めることもできる。When a line break is detected, the layout cost calculated at the break can be cleared. Alternatively, the string can be split into two lines at the break and the sum of the line-likeness of the two lines obtained.

【００７４】また、行の切れ目が確からしいかどうかを
評価することもできる。これは、ある第ｍ行目の行の終
端に対応するブロックの後方にラインエンド領域を設定
しその領域に終端のブロック以外のブロックが存在した
場合ペナルティーとして定数α６をｃ８_mとして与え
て、存在しなかった場合にはα７を与えれば良い。この
ときα６とα７は予め決められた定数であり、α６＞＞
α７である。また、文字列として住所を認識する場合な
どは、住所の最終文字とラインエンドのブロックが一致
したときｃ８_m＝０とし、また住所の最終文字のライン
エンド領域に数字として認識できるような文字候補ブロ
ックが存在した場合もｃ８_m＝０とすることができる。Whether a line break is plausible can also be evaluated. A line-end area is set behind the block corresponding to the end of the m-th line; if a block other than the end block exists in that area, the constant α6 is assigned to c8_m as a penalty, and otherwise α7 is assigned. Here α6 and α7 are predetermined constants with α6 >> α7. When an address is recognized as the character string, c8_m = 0 may be used when the last character of the address coincides with the line-end block, and also when a character candidate block recognizable as a digit exists in the line-end area of the last character of the address.
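The line-end cost c8_m can be sketched as follows. The rectangular line-end area, the test on block centers, the values of α6 and α7, and the single `exempt` flag standing in for the two address-specific exceptions are all assumptions made to keep the example small.

```python
from typing import Iterable, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)
Point = Tuple[float, float]


def line_end_cost(
    line_end_area: Rect,
    other_centers: Iterable[Point],
    alpha6: float = 10.0,
    alpha7: float = 0.0,
    exempt: bool = False,
) -> float:
    """Return c8_m for one line.

    other_centers holds the centers of all blocks except the end block.
    exempt=True models the address cases in which c8_m is simply set to 0.
    """
    if exempt:
        return 0.0
    left, top, right, bottom = line_end_area
    for x, y in other_centers:
        if left <= x <= right and top <= y <= bottom:
            return alpha6  # something is written beyond the supposed line end
    return alpha7


# Hypothetical line-end area to the right of the last block of a line.
area_after_end = (100.0, 0.0, 160.0, 20.0)
quiet = line_end_cost(area_after_end, [(40.0, 10.0), (70.0, 10.0)])
violated = line_end_cost(area_after_end, [(40.0, 10.0), (120.0, 10.0)])
```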

【００７５】さらに、図１３のように文字行らしさを算
出する際、候補文字列の各ブロックの中心点を基にして
最小自乗法で求めた直線の方向に基づいて文字列記載基
本方向を新たに求めることもできる。Further, as in FIG. 13, when calculating the line-likeness of characters, the basic writing direction of the character string can be newly determined from the direction of the straight line fitted by the least squares method to the center points of the blocks of the candidate character string.
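One way to obtain a direction from the block centers is an ordinary least-squares fit of y on x, sketched below. The text does not say which least-squares formulation is meant, so this choice is an assumption; it also cannot represent a perfectly vertical line, for which the axes would have to be swapped.

```python
import math
from typing import Sequence, Tuple


def fit_writing_direction(centers: Sequence[Tuple[float, float]]) -> float:
    """Return the angle (radians) of the least-squares line through the centers.

    Fits y = a * x + b and returns atan(a), so the result lies in (-pi/2, pi/2).
    """
    n = len(centers)
    mean_x = sum(x for x, _ in centers) / n
    mean_y = sum(y for _, y in centers) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in centers)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in centers)
    if sxx == 0:
        raise ValueError("all centers share one x value; fit x on y instead")
    return math.atan(sxy / sxx)


horizontal = fit_writing_direction([(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)])
diagonal = fit_writing_direction([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
```

The returned angle could then replace the predefined basic writing direction when computing direction-dependent costs for that candidate.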

【００７６】また、単語内標準ブロックサイズ情報を用
いてある文字列候補中の単語の図形情報による単語らし
さを評価することもできる。今、評価している文字列候
補内のある単語のブロックの座標情報とその単語に対応
する単語内標準ブロックサイズ情報とを比較して単語ら
しさを評価することができる。The intra-word standard block size information can also be used to evaluate, from graphic information, how word-like a word in a character string candidate is. The word-likeness is evaluated by comparing the coordinate information of the blocks of a word in the character string candidate under evaluation with the intra-word standard block size information for that word.

【００７７】例えばこの比較にベクトル間のユークリッ
ド距離を用いて単語らしさを算出することができる。単
語らしさの評価にはマハラノビス距離等を用いてもよ
い。この単語らしさの総和をｃ９として候補文字列のレ
イアウトコストの一つであるとすることもできる。For example, the word-likeness can be calculated using the Euclidean distance between the vectors in this comparison. The Mahalanobis distance or the like may also be used. The sum of the word-likeness values may be taken as c9, one of the layout costs of the candidate character string.
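The Euclidean variant of the word-likeness can be sketched as below. It assumes the observed block sizes are first divided by h1, in the same way as when the standard ratio was built; the function name and the sample values are hypothetical.

```python
import math
from typing import Sequence


def word_likeness(observed_sizes: Sequence[float], standard_ratio: Sequence[float]) -> float:
    """Euclidean distance between the normalized observation and the standard ratio.

    observed_sizes is [h1, v1, ..., hn, vn] measured on the candidate word;
    a smaller distance means the word looks more like the registered word.
    """
    if len(observed_sizes) != len(standard_ratio):
        raise ValueError("observation and standard ratio must have the same length")
    h1 = observed_sizes[0]
    normalized = [value / h1 for value in observed_sizes]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(normalized, standard_ratio)))


# Hypothetical standard ratio for a 3-character word whose middle character is small.
standard = [1.0, 1.0, 0.5, 0.5, 1.0, 1.0]
typical = word_likeness([30.0, 30.0, 15.0, 15.0, 30.0, 30.0], standard)
atypical = word_likeness([30.0, 30.0, 30.0, 30.0, 30.0, 30.0], standard)
```

If a covariance matrix of the ratios has been stored as well, the Mahalanobis distance mentioned in the text could replace the Euclidean distance here.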

【００７８】今、ある文字列候補の文字数をＮ個、文字
行数をＭだとすると、その文字列候補に対するレイアウ
トコストは以下のようにして算出される。Assuming that the number of characters of a character string candidate is N and the number of character lines is M, the layout cost for the character string candidate is calculated as follows.

【００７９】
$$LC1 = \frac{1}{N} \sum_i \sum_j c1_{(i,j)} \tag{15}$$
$$LC2 = \frac{1}{N} \sum_i \sum_j c2_{(i,j)} \tag{16}$$
$$LC3 = \frac{1}{N} \sum_i \sum_j c3_{(i,j)} \tag{17}$$
$$LC4 = \frac{1}{N} \sum_i \sum_j \sum_k c4_{(i,j,k)} \tag{18}$$
$$LC5 = \frac{1}{N} \sum_i \sum_j \sum_k c5_{(i,j,k)} \tag{19}$$
$$LC6 = \frac{1}{N} \sum_i c6_{(i,i+1)} \tag{20}$$
$$LC7 = \frac{1}{N} \sum_i c7_i \tag{21}$$
$$LC8 = \sum_m c8_m \tag{22}$$
$$LC9 = c9 \tag{23}$$
文字列認識結果出力部９は、文字認識部４から文字認識コストを読み込み、言語知識処理部７から各文字列候補の言語的コストを読み込み、レイアウト解析部８から各文字列候補のレイアウトコストを読み込み、これら三つのコストを組み合わせたときに最も良い（低い）コストが得られる文字列候補を文字列認識結果として出力する手段である。The character string recognition result output unit 9 is a means that reads the character recognition cost from the character recognition unit 4, the linguistic cost of each character string candidate from the language knowledge processing unit 7, and the layout cost of each character string candidate from the layout analysis unit 8, and outputs, as the character string recognition result, the character string candidate that yields the best (lowest) cost when these three kinds of cost are combined.

【００８０】文字認識コストと言語的コストとレイアウ
トコストを組み合わせて文字列認識コストを計算するに
は、いくつかの方法が考えられる。一例を挙げれば、次
のようにして実現できる。There are several methods for calculating the character string recognition cost by combining the character recognition cost, the linguistic cost, and the layout cost. For example, this can be realized as follows.

【００８１】ある文字列候補の全文字数をＮ個だとする
とある文字列のｉ番目の文字の文字認識コストをｒｃ_i
とすると、全文字認識コストＲＣは以下のように算出さ
れる。Let N be the total number of characters in a character string candidate and rc_i the character recognition cost of its i-th character. The total character recognition cost RC is then calculated as follows.

【００８２】
$$RC = \sum_i rc_i \tag{24}$$
ただしｉ番目の文字に対応する文字認識結果が存在しない場合はｒｃ_i＝βとする。このときβは予め定められた定数である。このとき文字列の最後のほうになればなるほど強い重みを付けてコストを計算することもできる。However, when no character recognition result exists for the i-th character, rc_i = β is used, where β is a predetermined constant. The cost can also be calculated with weights that grow toward the end of the character string.

【００８３】また、言語的コストをＫＣとする。The linguistic cost is KC.

【００８４】いま文字列認識コストＳＣを以下の式によ
って定義する。Now, the character string recognition cost SC is defined by the following equation.

【００８５】
$$SC = \gamma\,RC + \delta\,KC + \varepsilon_1 LC1 + \varepsilon_2 LC2 + \varepsilon_3 LC3 + \varepsilon_4 LC4 + \varepsilon_5 LC5 + \varepsilon_6 LC6 + \varepsilon_7 LC7 + \varepsilon_8 LC8 + \varepsilon_9 LC9 \tag{25}$$
このときγ、δ、ε１、ε２、ε３、ε４、ε５、ε６、ε７、ε８、ε９は予め定められた定数である。このコストは、文字列の認識結果が正常である可能性が高いときに低い値になり、間違っている可能性が高いときほど大きな値になるようなものであればどんなコストでもよい。Here, γ, δ, ε1, ε2, ε3, ε4, ε5, ε6, ε7, ε8, and ε9 are predetermined constants. Any cost may be used as long as it takes a low value when the recognition result of the character string is likely to be correct and a larger value the more likely the result is to be wrong.

【００８６】このようにして算出された文字列認識コス
トが最も良い値（低い値）になった文字列候補を文字列
認識結果として出力することにより文字列を認識する。A character string is recognized by outputting the character string candidate whose character string recognition cost calculated as described above has the best value (low value) as a character string recognition result.
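The weighted sum of equation (25) and the selection of the lowest-cost candidate can be sketched as follows. The candidate records, their labels, and all numeric values are hypothetical; only the form of the weighted sum comes from the text.

```python
from typing import Dict, List, Sequence


def string_recognition_cost(
    rc: float,
    kc: float,
    lc: Sequence[float],
    gamma: float,
    delta: float,
    eps: Sequence[float],
) -> float:
    """SC = gamma*RC + delta*KC + sum of eps_n * LCn over the nine layout costs."""
    if len(lc) != len(eps):
        raise ValueError("one weight is needed per layout cost")
    return gamma * rc + delta * kc + sum(e * cost for e, cost in zip(eps, lc))


def best_candidate(
    candidates: List[Dict],
    gamma: float,
    delta: float,
    eps: Sequence[float],
) -> Dict:
    """Return the candidate with the lowest string recognition cost."""
    return min(
        candidates,
        key=lambda c: string_recognition_cost(c["RC"], c["KC"], c["LC"], gamma, delta, eps),
    )


# Two hypothetical candidates, each with RC, KC and the nine layout costs LC1..LC9.
candidates = [
    {"label": "A", "RC": 2.0, "KC": 1.0, "LC": [0.5] * 9},
    {"label": "B", "RC": 1.0, "KC": 0.0, "LC": [1.0] * 9},
]
weights = [1.0] * 9
winner = best_candidate(candidates, gamma=1.0, delta=1.0, eps=weights)
```

Candidate B has the better character recognition and linguistic costs, but its layout costs are worse, so with equal weights candidate A is selected. This is the kind of trade-off the combined cost is meant to capture.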

【００８７】次に本発明の第２の実施例を説明する。本
実施例では文字列を認識する際に第一の実施例で示した
ように文字列認識コストを計算して最良コストを認識結
果とする方法以外に、学習を用いて認識系を構成し、構
成された認識系を用いて文字列を認識する機能を文字列
認識結果出力部９に備える。Next, a second embodiment of the present invention will be described. In this embodiment, in addition to the method of the first embodiment, in which a character string recognition cost is calculated and the candidate with the best cost is taken as the recognition result, the character string recognition result output unit 9 has a function of constructing a recognition system by learning and recognizing character strings with the constructed recognition system.

【００８８】学習を用いて認識系を構成するためには、
例えば以下のような方法が考えられる。A recognition system can be constructed by learning, for example, as follows.

【００８９】予めいくつかの文字列画像に対する全文字
列候補を作成し、全文字列候補中に正解が存在すればそ
の文字列候補に正解であるという情報を付与し、その他
の文字列候補に不正解であるという情報を付与する。こ
のように正解情報と不正解情報が付与された文字列候補
の各コストＲＣ、ＫＣ、ＬＣ１、ＬＣ２、ＬＣ３、ＬＣ
４、ＬＣ５、ＬＣ６、ＬＣ７、ＬＣ８、ＬＣ９を入力と
考え学習を行う。All character string candidates are created in advance for a number of character string images. If the correct answer is among the candidates, that candidate is labeled as correct and the other candidates are labeled as incorrect. Learning is then performed with the costs RC, KC, LC1, LC2, LC3, LC4, LC5, LC6, LC7, LC8, and LC9 of the labeled character string candidates as inputs.

【００９０】学習を行うには、例えば重回帰分析や判別
分析などの統計的な方法を用いたり、ニューラルネット
を用いることによって学習を行うことができる。The learning can be performed by using a statistical method such as a multiple regression analysis or a discriminant analysis, or by using a neural network.

【００９１】重回帰分析を用いて学習を行う場合、文字
列候補が正解ならば例えば−１、不正解ならな例えば１
を出力するように学習を行えばよい。このようにして学
習によって構成された認識系を用いて認識を行う際には
認識系が最も低い値を出力した文字列候補を文字列認識
結果とすればよい。When multiple regression analysis is used for learning, the system may be trained to output, for example, −1 when the character string candidate is correct and, for example, 1 when it is incorrect. When recognition is performed with a recognition system constructed by such learning, the character string candidate for which the system outputs the lowest value is taken as the character string recognition result.

【００９２】また、重回帰分析によって得られる結果
は、式（２５）におけるＳＣを計算するためのパラメー
タγ、δ、ε１、ε２、ε３、ε４、ε５、ε６、ε
７、ε８、ε９と考えることも可能であり、重回帰分析
によって得られたパラメータを第一の実施例の式（２
５）のパラメータとすることもできる。また、学習に判
別分析を用いる場合も同様である。The result of the multiple regression analysis can also be regarded as the parameters γ, δ, ε1, ε2, ε3, ε4, ε5, ε6, ε7, ε8, and ε9 used to calculate SC in equation (25), so the parameters obtained by the multiple regression analysis can be used as the parameters of equation (25) in the first embodiment. The same applies when discriminant analysis is used for learning.
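The multiple regression learning described above can be illustrated with a small least-squares fit. The sketch below solves the normal equations by Gauss-Jordan elimination in plain Python and, to stay short, uses only two cost features per candidate instead of the eleven costs named in the text. The added bias term, the training values, and the function names are assumptions of this example.

```python
from typing import List, Sequence


def fit_linear(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Least-squares weights (one per feature, then a bias) via the normal equations."""
    rows = [list(f) + [1.0] for f in features]  # append a constant bias input
    m = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[p] * r[q] for r in rows) for q in range(m)]
        + [sum(r[p] * t for r, t in zip(rows, targets))]
        for p in range(m)
    ]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system; more varied training data is needed")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]


def predict(weights: Sequence[float], feature: Sequence[float]) -> float:
    """Regression output; the candidate with the lowest output would be chosen."""
    return sum(w * x for w, x in zip(weights, feature)) + weights[-1]


# Hypothetical training data: two costs per candidate, target -1 for a
# correct candidate and +1 for an incorrect one, as in the text.
train_x = [(1.0, 1.0), (1.0, 2.0), (4.0, 5.0), (5.0, 4.0)]
train_y = [-1.0, -1.0, 1.0, 1.0]
weights = fit_linear(train_x, train_y)
```

The fitted feature weights play the role that γ, δ, and ε1 to ε9 play in equation (25): a candidate with low costs receives a lower output than one with high costs.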

【００９３】また、ニューラルネットを用いる場合は、
ＲＣ、ＫＣ、ＬＣ１、ＬＣ２、ＬＣ３、ＬＣ４、ＬＣ
５、ＬＣ６、ＬＣ７、ＬＣ８、ＬＣ９を入力とし、正解
の場合１、不正解の場合０の教師信号を与えることによ
り学習させたニューラルネットを用いて文字列認識を行
うこともできる。When a neural network is used, character string recognition can also be performed with a neural network trained by taking RC, KC, LC1, LC2, LC3, LC4, LC5, LC6, LC7, LC8, and LC9 as inputs and giving a teacher signal of 1 for a correct answer and 0 for an incorrect answer.

【００９４】次に本発明の第三の実施例を説明する。本
実施例では文字列を認識する際に、最小コストを与える
文字列候補で文字認識結果の文字コードが一部得られて
いない等の未解決の部分に対して認識結果を確かめるた
めに文字切り出しと認識をもう一度行い直し、その結果
に基づいて文字列認識を行う検証機能を文字列認識結果
出力部９に備える。Next, a third embodiment of the present invention will be described. In this embodiment, the character string recognition result output unit 9 has a verification function: when the character string candidate giving the minimum cost has unresolved parts, for example parts for which no character code was obtained in the character recognition result, character cutout and recognition are performed again on those parts to confirm the recognition result, and character string recognition is performed based on the outcome.

【００９５】この検証機能は、例えば本願と同一出願人
による特許明細書（特願平６−３１７１６３「文字列読
み取り装置」）に記載されている方法を用いて実現する
ことができる。This verification function can be realized by using, for example, a method described in a patent specification (Japanese Patent Application No. 6-317163 “character string reading device”) by the same applicant as the present application.

【００９６】[0096]

【発明の効果】以上説明したように本発明によれば、始
めに言語知識処理を施して複数の文字列候補を作成し、
これらの文字列候補に対してレイアウトコストを計算
し、文字認識コストと言語的コストとレイアウトコスト
を同時に用いて文字列を認識するので、文字列が複数行
入力されても認識対象の文字列を正しく認識することが
できる。As described above, according to the present invention, language knowledge processing is first applied to create a plurality of character string candidates, a layout cost is calculated for these candidates, and the character string is recognized using the character recognition cost, the linguistic cost, and the layout cost together. The target character string can therefore be recognized correctly even when multiple lines of character strings are input.

[Brief description of drawings]

【図１】本発明の一実施例の概略構成を示すブロック
図。FIG. 1 is a block diagram showing a schematic configuration of an embodiment of the present invention.

【図２】図１に示すブロック図の文字切り出し部の処理
を説明するための図。FIG. 2 is a view for explaining processing of a character cutout unit in the block diagram shown in FIG. 1;

【図３】住所要素の一覧の例を示す図。FIG. 3 is a diagram showing an example of a list of address elements.

【図４】文字コード列情報の内容の例を示す図。FIG. 4 is a diagram showing an example of the contents of character code string information.

【図５】レイアウト知識の例を示す図。FIG. 5 is a diagram showing an example of layout knowledge.

【図６】単語の図形的知識を説明する図。FIG. 6 is a view for explaining graphical knowledge of words.

【図７】図１に示すブロック図のレイアウト解析部の処
理を説明するための図。FIG. 7 is a view for explaining processing of a layout analysis unit in the block diagram shown in FIG. 1;

【図８】図１に示すブロック図のレイアウト解析部の処
理を説明するための図。FIG. 8 is a view for explaining processing of a layout analysis unit in the block diagram shown in FIG. 1;

【図９】虫食い照合に対する仮対応の例を説明する図。FIG. 9 is a view for explaining an example of provisional assignment for worm-eaten matching.

【図１０】余分なブロックが混入したときのコストの計
算方法について説明する図。FIG. 10 is a diagram illustrating a method of calculating a cost when an extra block is mixed.

【図１１】文字行らしさを説明する図。FIG. 11 is a diagram illustrating the line-likeness of characters.

【図１２】改行検出してから文字文字行らしさを評価す
ることを説明する図。FIG. 12 is a view for explaining that the line-likeness of characters is evaluated after a line break is detected.

【図１３】文字列記載基本方向を最小自乗法で求めるこ
とを説明する図。FIG. 13 is a view for explaining that a basic direction in which a character string is written is determined by a least square method.

[Explanation of symbols]

１ 画像記憶部 / 1 Image storage unit
２ 文字切り出し部 / 2 Character cutout unit
３ 文字認識辞書記憶部 / 3 Character recognition dictionary storage unit
４ 文字認識部 / 4 Character recognition unit
５ 言語知識記憶部 / 5 Language knowledge storage unit
６ レイアウト知識記憶部 / 6 Layout knowledge storage unit
７ 言語知識処理部 / 7 Language knowledge processing unit
８ レイアウト解析部 / 8 Layout analysis unit
９ 文字列認識結果出力部 / 9 Character string recognition result output unit

Claims

[Claims]

1. A character string recognition device that receives an image of a handwritten character string and recognizes the character string, comprising: a character string image storage unit that stores an optically scanned character string image; a character cutout unit that reads the character string image and creates character candidate patterns and character candidate graphic information; a character recognition dictionary storage unit that stores standard character patterns; a character recognition unit that collates the standard character patterns stored in the character recognition dictionary storage unit with the character candidate patterns read from the character cutout unit and obtains, as a character recognition result, a character code and a character recognition cost as a measure of the reliability of the character recognition result; a language knowledge storage unit that stores language knowledge; a layout knowledge storage unit that stores layout knowledge, which is graphical knowledge such as the conditions under which characters are written; a language knowledge processing unit that reads the character candidate graphic information from the character cutout unit, reads the character codes corresponding to the character candidate graphic information from the character recognition unit, reads the layout knowledge from the layout knowledge storage unit, linguistically collates the language knowledge stored in the language knowledge storage unit with the character codes, creates as character string candidates the combinations of the character codes that satisfy the language knowledge, and obtains a linguistic cost as a measure of the linguistic reliability of each created character string candidate; a layout analysis unit that reads the character candidate graphic information from the character cutout unit, reads the character string candidates from the language knowledge processing unit, reads the layout knowledge from the layout knowledge storage unit, and obtains, for each character string candidate, a layout cost as a measure of the validity of the layout constraints; and a character string recognition result output unit that reads the character recognition cost from the character recognition unit, the linguistic cost of each character string candidate from the language knowledge processing unit, and the layout cost of each character string candidate from the layout analysis unit, and outputs, as the character string recognition result, the character string candidate that yields the best cost when the character recognition cost, the linguistic cost, and the layout cost are combined.

2. The character string recognition device according to claim 1, wherein the layout analysis unit has a function of calculating the layout cost by provisionally associating character candidate graphic information that is not a component of the character string with a skipped character, in a case where a character code that should be present in the character string candidate is missing so that a character is linguistically skipped, and character candidate graphic information that is not a component of the character string exists in an area where the skipped character may have been written.

3. The character string recognition device according to claim 1, wherein the layout analysis unit has a function of calculating the layout cost by applying a penalty when character candidate graphic information corresponding to a character code that is not a component of the character string exists within a linguistically continuous part of the character string candidate.

4. The character string recognition device according to claim 1, wherein the layout analysis unit has a function of detecting whether a line break occurs in the middle of a character string candidate and of changing the method of calculating the layout cost depending on whether such a line break occurs.

5. The character string recognition device according to claim 1, wherein the layout analysis unit has a function of calculating the layout cost by evaluating, for each word present in a character string candidate, how word-like it is on the basis of graphic information.

6. The character string recognition device according to claim 1, wherein the character string recognition result output unit has a function of constructing, by learning, a means for recognizing a character string by combining the character recognition cost, the linguistic cost, and the layout cost.

7. The character string recognition device according to claim 1, wherein the character string recognition result output unit has a verification function that, when the reliability of the recognized character string is low, determines whether or not the recognized character string exists.
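For illustration only, the selection step recited in claim 1 (outputting the character string candidate with the best combined cost) might be sketched as follows. The claims do not specify how the three costs are combined; the weighted sum, the convention that a lower cost is better, the weight values, and the names `Candidate`, `combined_cost`, and `best_candidate` are all assumptions made for this sketch. Claim 6 indicates that the combination may be constructed by learning, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One character string candidate with its three costs (hypothetical type)."""
    text: str
    recognition_cost: float  # from character recognition
    linguistic_cost: float   # from linguistic knowledge processing
    layout_cost: float       # from layout analysis


def combined_cost(c, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three costs; the linear form is an assumption."""
    w_rec, w_lang, w_layout = weights
    return (w_rec * c.recognition_cost
            + w_lang * c.linguistic_cost
            + w_layout * c.layout_cost)


def best_candidate(candidates, weights=(1.0, 1.0, 1.0)):
    """Return the candidate with the lowest combined cost (lower is assumed better)."""
    if not candidates:
        raise ValueError("no character string candidates supplied")
    return min(candidates, key=lambda c: combined_cost(c, weights))
```

With equal weights, a candidate whose recognition cost is slightly worse can still be selected if its linguistic and layout costs are clearly better, which is the effect the combination of three costs is meant to achieve.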