JP2746345B2

JP2746345B2 - Post-processing method for character recognition

Info

Publication number: JP2746345B2
Application number: JP63206530A
Authority: JP
Inventors: 隆邦嶺脇; 道義立川
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1988-08-20
Filing date: 1988-08-20
Publication date: 1998-05-06
Anticipated expiration: 2013-05-06
Also published as: JPH0256085A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to a post-processing method for character recognition.

[Conventional technology]

In general, a character recognition device recognizes characters by comparing a feature pattern extracted from a character image in the input image against standard feature patterns registered in a dictionary in advance.

However, with such character-by-character recognition, several candidates are often found for a single character when there are many similar characters, as with kanji. Post-processing is therefore needed to decide on the final, valid candidate character.

Conventionally, methods have been studied that perform such post-processing on character strings of a single semantic category, such as addresses only, names only, or book titles only, and correct the character-by-character recognition results through knowledge processing that uses word knowledge. This is effective when the position of each character string in the document and the semantic category it belongs to are known, for example when characters are written in preset frames. It cannot cope, however, when the target document contains character strings belonging to several semantic categories and their positions in the document are ambiguous.

[Problems to be solved by the invention]

Character strings that belong to two or more different semantic categories and are printed consecutively in characters of the same size, such as the address and telephone number on a business card, are cut out as a single character string block. Such a block should be divided by semantic category, processed, and output, but conventional post-processing methods process the block as a whole.

The object of the present invention is to provide a post-processing method for character recognition that divides a block of character strings belonging to two or more different semantic categories by semantic category, then processes and outputs the parts.

[Means for solving the problem]

The present invention is a post-processing method for character recognition in which character-by-character recognition is performed on each character string block cut out from an input image, and the recognition result is corrected by knowledge processing that uses word knowledge for each semantic category. The method is characterized in that several knowledge processes, one per semantic category, are applied to a single character string block; it is determined whether character string groups of several semantic categories exist in that block; and, if they do, the character string groups are divided by semantic category. It is further characterized in that the resulting character string of each divided part is output with a label of its semantic category attached.

[Operation]

When knowledge processing for one semantic category leaves part of the character string block of interest unprocessed, knowledge processing for another semantic category is applied to the unprocessed character string. The block is thereby divided into two or more parts by semantic category, and the knowledge-processing result is output for each semantic category.

[Example]

An embodiment of the present invention is described below with reference to the drawings.

First, the processing in one embodiment of the present invention is described using the business card image 21 shown in FIG. 2 as an example.

When character string blocks are cut out from this business card image 21, the rectangular regions 31 to 35 shown in FIG. 3 are cut out as character string blocks. The blocks are cut out, for example, by merging the circumscribed rectangles of black pixels in the business card image, although other methods may be used.
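The rectangle-merging step mentioned above could be sketched roughly as follows. The text does not state the merging criterion, so the rule used here (vertical overlap plus a small horizontal gap), the `max_gap` parameter, and all function names are assumptions made purely for illustration.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixel coordinates
Rect = Tuple[int, int, int, int]


def should_merge(a: Rect, b: Rect, max_gap: int) -> bool:
    """Assumed rule: the rectangles overlap vertically and are close horizontally."""
    vertical_overlap = min(a[3], b[3]) - max(a[1], b[1]) >= 0
    horizontal_gap = max(a[0], b[0]) - min(a[2], b[2])  # negative if they overlap
    return vertical_overlap and horizontal_gap <= max_gap


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle that encloses both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_rects(rects: List[Rect], max_gap: int = 10) -> List[Rect]:
    """Merge rectangles pairwise until no pair satisfies the assumed rule."""
    blocks = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if should_merge(blocks[i], blocks[j], max_gap):
                    blocks[i] = union(blocks[i], blocks[j])
                    del blocks[j]
                    merged = True
                    break
            if merged:
                break
    # Return blocks in reading order (top to bottom, then left to right).
    return sorted(blocks, key=lambda r: (r[1], r[0]))
```

With a rule of this kind, an address and a telephone number printed on the same line in the same size end up in one block, which is exactly the situation the method sets out to handle.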

The content of each character string block obtained in this way is unknown. However, if the target image is specified in advance as a horizontally written business card image, or is determined to be one when the character string blocks are cut out, the relative positions and ordering of the character strings of each semantic category within the image are constrained to some extent. Candidate semantic categories for each block can therefore be estimated from its relative position in the image and its character size (for example, the height of the block). That is, semantic categories can be estimated along the lines of: one character string block looks like a company name, another looks like a job title.

Next, the characters in each character string block are cut out, and character-by-character recognition is performed by comparing each character's feature pattern against the standard feature patterns registered in the character dictionary, which determines the candidate characters. The number of candidates per character can be fixed in advance, or it may be determined, for example, by comparing each candidate's distance difference from the first-ranked candidate with a preset threshold.
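The threshold-based choice of candidate count described above might look like the following minimal sketch. It assumes that the recognizer returns a distance per candidate where a smaller value means a better match; the function name, the `max_candidates` cap, and the sample distances are hypothetical.

```python
from typing import List, Tuple


def select_candidates(
    scored: List[Tuple[str, float]], threshold: float, max_candidates: int = 5
) -> List[str]:
    """Keep candidates whose distance is within `threshold` of the best one."""
    if not scored:
        return []
    # Assumption: smaller distance means a closer match to the standard pattern.
    ranked = sorted(scored, key=lambda item: item[1])
    best_distance = ranked[0][1]
    kept = [ch for ch, dist in ranked if dist - best_distance <= threshold]
    return kept[:max_candidates]


# Illustrative distances only; they are not taken from the patent.
print(select_candidates([("港", 10.0), ("湊", 12.0), ("洪", 30.0)], threshold=5.0))
```

A clearly recognized character thus yields a single candidate, while an ambiguous one keeps several for the knowledge processing to choose from.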

The result of character-by-character recognition for one of these character string blocks is, for example, as shown in FIG. 4.

This character string block consists of an address character string and a telephone number character string. Because the two are printed consecutively in characters of the same size, they are cut out as a single block. Such blocks occur frequently in business card images and the like, and it is desirable to divide the character strings in them by semantic category.

The candidate character string of each character string block obtained by the character-by-character recognition above is then corrected, starting from its head, by knowledge processing that compares it against knowledge dictionaries of words for each semantic category. The knowledge dictionary used is the one for the semantic category estimated, as described above, from the block's relative position and similar cues. Alternatively, knowledge processing may be performed for several semantic categories in turn, and the result of the category that yields the most plausible outcome may be selected.

Taking this character string block as an example, suppose that comparing the candidate character string against the knowledge dictionary for its estimated semantic category, address, yields the part 「〒223横浜市港北区新栄町」 (postal code 223, Shin'ei-cho, Kohoku-ku, Yokohama) as the processing result, and that the part of the string that follows matches nothing in the knowledge dictionary and remains unprocessed.
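The head-first dictionary matching described above can be illustrated with a small sketch. The patent does not spell out the matching algorithm, so the greedy longest-match strategy used here is an assumption, as are the function names, the per-position candidate lists, and the toy dictionary.

```python
from typing import List, Sequence, Set, Tuple


def word_fits(word: str, candidates: Sequence[Sequence[str]], start: int) -> bool:
    """True if every character of `word` is among the candidates at its position."""
    if start + len(word) > len(candidates):
        return False
    return all(ch in candidates[start + k] for k, ch in enumerate(word))


def match_from_head(
    candidates: Sequence[Sequence[str]], dictionary: Set[str], start: int = 0
) -> Tuple[str, int]:
    """Return (corrected text, index of the first unprocessed position)."""
    words = sorted(dictionary, key=len, reverse=True)  # try longer words first
    pos = start
    out: List[str] = []
    while pos < len(candidates):
        hit = next((w for w in words if w and word_fits(w, candidates, pos)), None)
        if hit is None:
            break  # nothing in this category's dictionary matches; the rest is unprocessed
        out.append(hit)
        pos += len(hit)
    return "".join(out), pos


# Hypothetical candidate lists (one list per character position) and address dictionary.
lattice = [["横", "構"], ["浜", "兵"], ["市", "巾"], ["港", "湊"],
           ["北", "比"], ["区", "凶"], ["電", "雷"], ["話", "詰"]]
text, next_pos = match_from_head(lattice, {"横浜市", "港北区"})
print(text, next_pos)
```

The returned index marks where the unprocessed part begins, which is the quantity the next step counts and compares with a threshold.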

The number of characters in this unprocessed string is counted and compared with a preset threshold to judge whether the unprocessed part contains a character string of another semantic category. In this example the unprocessed string is 16 characters long, which is long enough to infer that a character string of another semantic category is included.

The semantic category of the unprocessed string is then estimated from the relative position of the character string block in the business card image, from information on which semantic categories have already appeared in the processed strings (for example, the company name and personal name have already been found and will not appear again), and from the length of the string. In this example the category is judged likely to be either "telephone number" or "address supplement", so the string is first searched for a keyword indicating telephone number information.

In this example the keyword 「電話」 ("telephone") is found, so the characters after 「電話」 are processed as a character string representing a telephone number. Then, as shown in FIG. 5, the character string of the block is divided before and after the 「電話」 part, and the resulting string of each part is output (written to the result memory) with a label of its semantic category attached.
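The keyword-based split and labeled output could be sketched as below. Whether the keyword itself is kept in the output is not stated in the text (FIG. 5 is not reproduced here), so dropping it is an assumption; the function name, the label strings, and the placeholder telephone digits are likewise hypothetical.

```python
from typing import List, Tuple


def split_at_keyword(
    text: str, keyword: str, first_label: str, second_label: str
) -> List[Tuple[str, str]]:
    """Split `text` before and after `keyword`; return (label, string) pairs."""
    index = text.find(keyword)
    if index < 0:
        # Keyword not found: the whole string stays in the first category.
        return [(first_label, text)]
    head = text[:index]
    tail = text[index + len(keyword):]  # assumption: the keyword itself is dropped
    return [(first_label, head), (second_label, tail)]


# The digits after the keyword are placeholders, not data from the patent.
for label, value in split_at_keyword(
    "〒223横浜市港北区新栄町電話000-000-0000", "電話", "address", "telephone"
):
    print(label, value)
```

Returning (label, string) pairs mirrors the idea of writing each part to the result memory together with its semantic category label.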

If part of the string after 「電話」 remains unprocessed and contains many characters, the same knowledge processing is repeated on that unprocessed string.

The example described here is the case where an address and a telephone number are contained in one character string block, but other combinations are handled by the same procedure.

Thus, in this embodiment, knowledge processing for one semantic category is applied to the character recognition result of a character string block. If an unprocessed string remains and, judging from its length and the plausible semantic categories, a string of another semantic category is determined to remain in the block, knowledge processing for another semantic category is applied to the unprocessed string. This operation is repeated, and the knowledge-processing result is output for each semantic category.
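The overall loop summarized above might be sketched as follows. This is a simplified illustration under several assumptions: each category's knowledge processing is modeled as a function that reports how many leading characters it could process, the categories are tried in a fixed order, the `min_unprocessed` threshold value is arbitrary, and any leftover text is emitted under a hypothetical "unprocessed" label.

```python
from typing import Callable, Dict, List, Tuple

# A processor takes the remaining text and returns the number of leading
# characters it could process for its category (0 if none).
Processor = Callable[[str], int]


def split_block(
    text: str,
    processors: Dict[str, Processor],
    order: List[str],
    min_unprocessed: int = 4,
) -> List[Tuple[str, str]]:
    """Apply category processors in turn and return (label, string) pairs."""
    results: List[Tuple[str, str]] = []
    remaining = text
    untried = list(order)
    while remaining and untried:
        label = untried.pop(0)
        consumed = processors[label](remaining)
        if consumed == 0:
            continue  # this category matched nothing; try the next one
        results.append((label, remaining[:consumed]))
        remaining = remaining[consumed:]
        # Too short a remainder is not taken to hold another category's string.
        if len(remaining) < min_unprocessed:
            break
    if remaining:
        results.append(("unprocessed", remaining))  # hypothetical label
    return results


# Toy processors for illustration only.
def address(s: str) -> int:
    prefix = "横浜市港北区"
    return len(prefix) if s.startswith(prefix) else 0


def telephone(s: str) -> int:
    return len(s) if s.startswith("電話") else 0


print(split_block("横浜市港北区電話000-0000",
                  {"address": address, "telephone": telephone},
                  ["address", "telephone"]))
```

In the device described next, the length check corresponds to the role of the determination unit 9 and the processors to the matching unit 7, with the category order coming from the estimation step rather than a fixed list.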

An example of a character recognition post-processing device that performs the processing described above is explained with reference to FIG. 1.

In FIG. 1, the document to be processed (a business card or other document) is read optically by the scanner 1 and stored in the image memory 2 as a binary image. The character string block extraction unit 3 cuts character string blocks out of this input image and stores their images in the block image memory 4. At this point, information used for semantic category estimation, such as the relative position of each character string block, is also extracted. Information about the document, such as the fact that it is a horizontally written business card, may be specified in advance or determined automatically from information such as the relative positions and sizes of the character string blocks.

For the image of each character string block, the character extraction and recognition unit 5 cuts out the characters, extracts their feature patterns, and performs character-by-character recognition by comparison against the character dictionary 6.

The post-processing described above is applied to the candidate character strings obtained by this character-by-character recognition of each character string block.

That is, for each character string block, the post-processing and knowledge dictionary matching unit 7 performs knowledge processing by comparing the candidate character string against the knowledge dictionary 8 of words for each semantic category. The semantic category used here is estimated as described above. The knowledge-processing result is then output, together with the semantic category label, to the other-category character string presence determination unit 9.

The other-category character string presence determination unit 9 compares the processing result sent from the post-processing and knowledge dictionary matching unit 7 with the candidate character string from character-by-character recognition, and judges, from the number of characters and similar cues as described above, whether the unprocessed part contains a character string of another category. If it judges that one is contained, it has the post-processing and knowledge dictionary matching unit 7 perform knowledge processing for another semantic category on the unprocessed candidate string, and writes the resulting string for the already processed part to the result memory 10 together with its semantic category label.

Reference numeral 11 denotes a control unit that controls the flow of the above processing.

The present invention can also be applied to correction processing for speech recognition.

[Effect of the invention]

As is clear from the above description, according to the present invention, when a single character string block cut out from an input image contains character strings of two or more different semantic categories, the block can be divided by semantic category and the processing results output.

[Brief description of the drawings]

FIG. 1 is a schematic block diagram showing an example of a character recognition device embodying the present invention. FIGS. 2 to 5 illustrate the processing in one embodiment of the present invention: FIG. 2 shows an example of a business card image, FIG. 3 shows the result of cutting character string blocks out of the business card image of FIG. 2, FIG. 4 shows the result of character-by-character recognition for a character string block in FIG. 3, and FIG. 5 shows the result of knowledge processing for a character string block in FIG. 3.

1: scanner, 2: image memory, 3: character string block extraction unit, 4: block image memory, 5: character extraction and recognition unit, 6: character dictionary, 7: post-processing and knowledge dictionary matching unit, 8: knowledge dictionary, 9: other-category character string presence determination unit, 10: result memory.

Claims

(57) [Claims]

1. A post-processing method for character recognition in which character-by-character recognition is performed on each character string block cut out from an input image and the recognition result is corrected by knowledge processing that uses word knowledge for each semantic category, characterized in that a plurality of knowledge processes, one per semantic category, are performed on one character string block, it is determined whether character string groups of a plurality of semantic categories exist in that character string block, and, if they exist, the character string groups are divided by semantic category.

2. The post-processing method for character recognition according to claim 1, characterized in that the character string resulting from processing each of the divided parts is output with a label of its semantic category added.