JP6118646B2

JP6118646B2 - Form processing device, form processing method, form processing program

Info

Publication number: JP6118646B2
Application number: JP2013118849A
Authority: JP
Inventors: 高橋 寿一; 新庄 広; 中島 和樹; 木村 博文
Original assignee: Hitachi Information and Telecommunication Engineering Ltd
Current assignee: Hitachi Information and Telecommunication Engineering Ltd
Priority date: 2013-06-05
Filing date: 2013-06-05
Publication date: 2017-04-19
Anticipated expiration: 2033-06-05
Also published as: JP2014235694A

Description

The present invention relates to image processing technology for OCR (Optical Character Reader) devices and the like.

Reading characters with OCR generally involves the following steps: binarization, character line extraction, character segmentation, character identification, and language collation. Binarization generates, from an input color or grayscale image, a binary image in which characters are black and everything else is white. Character line extraction locates the line regions in the image where characters are written. Character segmentation locates the individual character regions within a character line region. Character identification performs character recognition on the image of each character region and obtains a recognition result (character code) and a confidence value. Language collation compares the recognition result against pre-registered words or the like and corrects recognition errors. For example, to recognize a country name, the recognition result is compared with a dictionary of country names, and the best-matching word is taken as the recognition result.
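The language collation step described above can be illustrated with a minimal sketch. The dictionary contents, the function name `collate`, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration; the text does not specify how the best-matching word is determined.

```python
import difflib

# Hypothetical dictionary of registered country names (illustrative only).
COUNTRY_DICTIONARY = ["JAPAN", "FRANCE", "GERMANY", "CANADA", "BRAZIL"]


def collate(recognized, dictionary):
    """Return (best_word, similarity) for the dictionary word closest to the OCR output."""
    best_word, best_ratio = None, 0.0
    for word in dictionary:
        # SequenceMatcher.ratio() is a similarity in [0, 1]; 1.0 means identical.
        ratio = difflib.SequenceMatcher(None, recognized, word).ratio()
        if ratio > best_ratio:
            best_word, best_ratio = word, ratio
    return best_word, best_ratio


if __name__ == "__main__":
    # "JAPAM" stands in for a recognition result with one misidentified character.
    print(collate("JAPAM", COUNTRY_DICTIONARY))
```

In this sketch a single misidentified character is corrected because the registered word still has the highest similarity to the recognition result.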

Errors occur not only in character identification but also in character segmentation. For example, “∝” may be wrongly segmented as “oc”. Such segmentation errors are also caused by faint or smudged characters. One way to avoid them is to run character identification on every conceivable segmentation pattern and take, as the final result, the recognition result that best matches in language collation.

In OCR for forms, ruled lines and characters are in some cases printed on the paper in advance as part of the form (referred to as pre-printing), and the characters written over them must be recognized. The written characters to be read by OCR are generally in black or blue. The pre-printed portion is often printed in a different color so that it can be distinguished from the written characters.

One conceivable method of recognizing a character line on a form in which pre-printing and written characters are mixed is to drop out the pre-printing (that is, not recognize it as characters) and recognize only the written characters. In this case, if characters that should be recognized are in the same color as the pre-printing, they are dropped out together with it, and language collation can no longer be performed.

Patent Document 1 below discloses, as a technique for solving this problem, a method of recognizing pre-printed characters and written characters together. Specifically, binary images extracted for each of red, green, blue, and black are combined into a composite image, and character recognition is performed on the composite image to recognize both handwritten characters and pre-printed characters.

Patent Document 1: JP 2009-265751 A

When recognizing characters written on a pre-printed form, the characters may be written overlapping the pre-printed portion. In this case, with the technique described in Patent Document 1, the pre-printed portion and the character portion still overlap in the composite image, so the characters cannot be identified. In addition, depending on the type of form, the pre-print color is not always fixed, so the technique of Patent Document 1 may be unable to separate the pre-printed portion from the character portion properly.

The present invention has been made in view of the above problems, and its object is to provide a form processing technique that can recognize written characters accurately even when the pre-printed portion and the written character portion of a form overlap.

The form processing apparatus according to the present invention recognizes characters for each color from a form image, generates a plurality of candidate character strings by rearranging the order of the recognized characters, and evaluates the candidate character strings by comparing them with character string patterns in a notation dictionary.

With the form processing apparatus according to the present invention, written characters can be recognized accurately even when they overlap the pre-printed portion of the form.

FIG. 1 is a functional block diagram of a form processing apparatus 100 according to Embodiment 1.
FIG. 2 is a diagram showing an example of a character recognition result when pre-printed characters and written characters overlap.
FIG. 3 is a diagram showing an example in which pre-printed characters and written characters are separated by extracting colors from a color form image.
FIG. 4 is a diagram illustrating the operation flow of the form processing apparatus 100.
FIG. 5 is a flowchart illustrating the details of step S406.
FIG. 6 is a diagram illustrating candidate character strings generated in step S406.
FIG. 7 is a diagram illustrating the details of steps S407 to S408.
FIG. 8 is a diagram showing an example in which the output unit 180 displays the evaluation result of the character string evaluation unit 160 on a screen in GUI form.
FIG. 9 is a diagram illustrating a plurality of binarized images with different degrees of color dropout.

<Embodiment 1>
FIG. 1 is a functional block diagram of a form processing apparatus 100 according to Embodiment 1 of the present invention. The form processing apparatus 100 is an apparatus that processes form images, and includes an image input unit 110, a color extraction unit 120, a character recognition unit 130, a character string generation unit 140, a notation dictionary reading unit 151, a notation dictionary 152, a character string evaluation unit 160, a storage device 170, and an output unit 180.

The image input unit 110 is configured using a device such as a scanner and captures a form as image data. The color extraction unit 120 extracts the colors in the captured form image. The character recognition unit 130 recognizes characters for each extracted color. The character string generation unit 140 generates, from the per-color character identification results, character strings that are candidates for the final recognition result, by the processing described later with reference to FIGS. 4 to 7. The notation dictionary reading unit 151 reads the character string patterns stored in the notation dictionary 152. The character string evaluation unit 160 performs language collation between the candidate character strings generated by the character string generation unit 140 and the character string patterns in the notation dictionary 152, and evaluates the collation results by scoring them. The output unit 180 outputs the evaluation result of the character string evaluation unit 160.

The notation dictionary 152 is a dictionary database that the form processing apparatus 100 refers to when processing a form. Specifically, the notation dictionary 152 stores a character recognition dictionary referred to during character recognition, a knowledge dictionary referred to during language collation, layout information referred to when identifying the form layout, and the like. The knowledge dictionary has items such as date, amount, address, name, and account number. Each dictionary holds character string patterns such as those illustrated later in FIG. 7. The storage device 170 stores the form image data processed by the form processing apparatus 100.

The character recognition unit 130 performs binarization of the form image, character line extraction, character segmentation, and character recognition. In character line extraction, for example, rectangular regions expected to contain characters can be connected, and the rectangle enclosing all of them can be regarded as a character line.
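The character line extraction example above, in which the rectangle enclosing a group of connected character regions is regarded as one character line, might be sketched as follows. The `(left, top, right, bottom)` tuple layout and the function name `enclosing_rectangle` are assumptions for illustration; how the regions are detected and grouped is outside this sketch.

```python
def enclosing_rectangle(rects):
    """Return the smallest (left, top, right, bottom) rectangle containing all rects.

    Each rectangle is a (left, top, right, bottom) tuple in image coordinates.
    """
    if not rects:
        raise ValueError("at least one rectangle is required")
    lefts, tops, rights, bottoms = zip(*rects)
    return (min(lefts), min(tops), max(rights), max(bottoms))


if __name__ == "__main__":
    # Three character-sized regions assumed to belong to the same line.
    regions = [(10, 5, 20, 15), (22, 4, 30, 16), (32, 6, 40, 14)]
    print(enclosing_rectangle(regions))
```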

Each of the above functional units can be implemented in hardware, such as a circuit device that realizes the function, or by having an arithmetic unit such as a CPU (Central Processing Unit) execute software that implements the function.

The notation dictionary 152 and the storage device 170 can be realized by storing data in a storage device such as a hard disk drive. The output of the output unit 180 may be, for example, data output over any communication line, or a GUI (Graphical User Interface) displayed on a screen.

FIG. 2 is a diagram showing an example of a character recognition result when pre-printed characters and written characters overlap. In the color form image 201, the written characters 203 overlap the pre-printed characters 202 because of print misalignment. The binarized image 204 is the color form image 201 after binarization. The region 205 is a region where the pre-printed characters 202 and the written characters 203 overlap. The character line 206 shows the result of extracting the character line portion from the binarized image 204. The recognition result 207 is the result of segmenting characters from the character line 206 and performing character identification. Characters are identified where the pre-printed characters 202 and the written characters 203 do not overlap, but where they overlap, either no identification result is obtained or a wrong character is identified.

As illustrated in FIG. 2, when pre-printed characters and written characters overlap, it is difficult to obtain a recognition result for the overlapping portion by performing character recognition directly on the binarized image. In Embodiment 1, therefore, the pre-printed portion and the written character portion are separated by the method described below.

FIG. 3 is a diagram showing an example in which pre-printed characters and written characters are separated by extracting colors from a color form image. The binarized image 301 is a binarized image in which the written character color is kept and the other colors are dropped out. The character line 302 is the result of extracting a character line from the binarized image 301. The character string 303 is the character string obtained by performing character segmentation and character identification on the character line 302. The binarized image 304 is a binarized image in which the pre-print color is kept and the other colors are dropped out. The character line 305 is the result of extracting a character line from the binarized image 304. The character string 306 is the character string obtained by performing character segmentation and character identification on the character line 305. Because each character's identification result is associated with its coordinates in the form image, the spaces between characters may be omitted.

As illustrated in FIG. 3, using a binarized image generated for each color makes it possible to recognize written characters and pre-printed characters separately. However, because the two recognition results are independent of each other, language collation cannot be performed on the character string formed by the two together. In Embodiment 1, therefore, candidate character strings that combine pre-printed characters and written characters are additionally generated. Details of the candidate character strings are described later with reference to FIG. 6.

FIG. 4 is a diagram illustrating the operation flow of the form processing apparatus 100. Each step of FIG. 4 is described below.

(FIG. 4: Steps S401 to S403)
The image input unit 110 acquires a color form image (S401). The color extraction unit 120 extracts the pre-print color and the written character color from the color form image (S402). At this time, similar colors may be treated as the same color. The following steps S404 to S405 are repeated for each color extracted in step S402.
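One possible way to extract colors while treating similar colors as the same color, as in step S402, is to quantize each pixel into a coarse bucket and count the buckets. This is only a sketch under assumptions: the image is a list of rows of `(r, g, b)` tuples, and the bucket width `step`, the threshold `min_pixels`, and the white `background` are hypothetical parameters not taken from the text.

```python
from collections import Counter


def extract_colors(image, step=64, min_pixels=2, background=(255, 255, 255)):
    """Return quantized color buckets present in the image, most frequent first.

    Pixels whose channels fall into the same bucket of width `step` are
    treated as the same color. The background bucket is excluded.
    """
    def bucket(pixel):
        return tuple(channel // step for channel in pixel)

    counts = Counter(bucket(pixel) for row in image for pixel in row)
    counts.pop(bucket(background), None)  # ignore the paper color
    return [b for b, n in counts.most_common() if n >= min_pixels]


if __name__ == "__main__":
    W = (255, 255, 255)
    image = [
        [W, (0, 0, 0), (10, 5, 0), W],
        [(20, 20, 20), W, (250, 10, 10), (255, 0, 0)],
    ]
    # Near-black pixels share one bucket and near-red pixels share another.
    print(extract_colors(image))
```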

(FIG. 4: Step S404)
The character recognition unit 130 generates a binarized image by keeping one of the colors extracted in step S402 and dropping out the other colors. The character recognition unit 130 then performs layout analysis on the generated binarized image and extracts character lines. For example, ruled lines and dotted lines in the form image can be extracted, table portions can be extracted based on those lines, and character lines can be extracted from inside and outside the tables. The layout of the entire form image may be recognized by the same method or, for example, by using the layout information in the notation dictionary 152.
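The binarization in step S404, which keeps one color and drops out the others, can be sketched as below. The image representation (rows of `(r, g, b)` tuples), the Euclidean distance in RGB space, and the `tolerance` value are assumptions chosen for illustration; the text does not specify how closeness to the kept color is judged.

```python
def binarize_keep_color(image, keep_color, tolerance=60):
    """Return a 0/1 grid: 1 where a pixel is close to keep_color, 0 elsewhere.

    Pixels farther than `tolerance` (Euclidean distance in RGB) from
    keep_color are dropped out, that is, treated as background.
    """
    tol_sq = tolerance * tolerance
    result = []
    for row in image:
        out_row = []
        for r, g, b in row:
            dist_sq = ((r - keep_color[0]) ** 2
                       + (g - keep_color[1]) ** 2
                       + (b - keep_color[2]) ** 2)
            out_row.append(1 if dist_sq <= tol_sq else 0)
        result.append(out_row)
    return result


if __name__ == "__main__":
    image = [
        [(255, 255, 255), (0, 0, 0), (255, 0, 0)],
        [(30, 30, 30), (255, 255, 255), (200, 0, 0)],
    ]
    print(binarize_keep_color(image, (0, 0, 0)))    # keep dark (written) pixels
    print(binarize_keep_color(image, (255, 0, 0)))  # keep red (pre-print) pixels
```

Running the function once per extracted color yields one binarized image per color, which corresponds to the repetition of steps S404 to S405.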

(FIG. 4: Step S405)
The character recognition unit 130 performs character recognition to identify each character in the character lines extracted in step S404. Specifically, it segments characters from the character line to generate character patterns, and identifies each character by comparing the generated character patterns with the character patterns held in the character recognition dictionary of the notation dictionary 152. The character recognition unit 130 stores the coordinates of each identified character pattern in the form image, together with the identification result, in the storage device 170.

(FIG. 4: Step S406)
The character string generation unit 140 uses the characters recognized in step S405 to generate candidate character strings to be matched against the character string patterns held in the knowledge dictionary of the notation dictionary 152. Specifically, it generates a plurality of character strings as candidates by rearranging the order of the pre-printed characters and the written characters. Details of this step are described later with reference to FIG. 5.

(FIG. 4: Steps S407 to S408)
The character string evaluation unit 160 performs language collation by comparing the candidate character strings generated in step S406 with the character string patterns held in the knowledge dictionary of the notation dictionary 152 (S407). The character string evaluation unit 160 assigns a score to each candidate character string based on the result of language collation (S408).

(FIG. 4: Step S409)
The output unit 180 outputs the evaluation result of the character string evaluation unit 160. An example screen for outputting the evaluation result as a GUI is described later with reference to FIG. 8.

FIG. 5 is a flowchart illustrating the details of step S406. Each step shown in FIG. 5 is described below.

(FIG. 5: Steps S501 to S502)
The character string generation unit 140 sets the written characters and the pre-printed characters recognized in step S405 as the initial candidate character strings (S501). Based on the coordinates associated with each character, the character string generation unit 140 checks whether the written characters and the pre-printed characters overlap on a line basis (S502). If the lines overlap, the process proceeds to step S503; if not, the candidate character strings generated in step S501 are used as they are.

(FIG. 5: Step S502: Supplement)
The character string generation unit 140 may allow some tolerance when determining whether the written characters and the pre-printed characters overlap on a line basis. For example, if the height of the portion where the written characters and the pre-printed characters overlap is within half the character height, they can be regarded as being written on the same line. In this way, even when the written characters are slightly displaced in the height direction, they can be processed as being on the same line as the pre-printed characters.

(FIG. 5: Step S503)
Based on the coordinates of each character obtained by the character recognition in step S405, the character string generation unit 140 sorts the written characters and the pre-printed characters, each by left-edge coordinate. As a result, for example, the characters in the character line 302 of FIG. 3 are left-justified to form the character string 303.

(FIG. 5: Step S504)
Based on the coordinates of each character, the character string generation unit 140 checks whether the written characters and the pre-printed characters overlap on a character basis. If they do, the process proceeds to step S505; if not, the character strings generated in step S503 are used as the candidate character strings.
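Steps S503 and S504 can be sketched as a sort by left-edge coordinate followed by a horizontal overlap test between character boxes. The `Char` record, its field names, and the strict-inequality overlap test are assumptions for illustration; the text states only that the coordinates of each character are used.

```python
from dataclasses import dataclass


@dataclass
class Char:
    text: str   # identified character
    left: int   # left-edge x coordinate in the form image
    right: int  # right-edge x coordinate in the form image


def sort_by_left(chars):
    """Step S503 sketch: order characters by their left-edge coordinate."""
    return sorted(chars, key=lambda c: c.left)


def chars_overlap(a, b):
    """True if the horizontal extents of two character boxes intersect."""
    return a.left < b.right and b.left < a.right


def any_character_overlap(written, preprinted):
    """Step S504 sketch: does any written character overlap any pre-printed one?"""
    return any(chars_overlap(w, p) for w in written for p in preprinted)


if __name__ == "__main__":
    written = [Char("5", 35, 43), Char("2", 25, 33)]
    preprinted = [Char("年", 30, 38)]
    print("".join(c.text for c in sort_by_left(written)))
    print(any_character_overlap(written, preprinted))
```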

(FIG. 5: Step S505)
The character string generation unit 140 generates candidate character strings, which are candidates for the final character recognition result, by rearranging the order of the written characters and the pre-printed characters. For example, candidate character strings can be generated by rearranging the character order based on hypotheses such as the following.

(FIG. 5: Step S505: Example methods for rearranging the character order)
(Method Example 1) Assume that the written characters and the pre-printed characters overlap on a character basis because the written characters have shifted to the left, and shift the written characters X characters to the right.
(Method Example 2) Assume that the written characters and the pre-printed characters overlap on a character basis because the written characters have shifted to the right, and shift the written characters X characters to the left.
(Method Example 3) Assume that the written characters and the pre-printed characters overlap on a character basis because the written characters have shifted far to the right, and generate a character string in which all the written characters are placed first, in order from the left, followed by the pre-printed characters.
(Method Example 4) Assume that the written characters and the pre-printed characters overlap on a character basis because the written characters have shifted far to the left, and generate a character string in which all the pre-printed characters are placed first, in order from the left, followed by the written characters.
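The four method examples above can be sketched as follows. This is one possible reading, not the patented procedure itself: "shifting the written characters by X characters" is interpreted here as offsetting their left coordinates by X character widths and then merging both character sets by position. The `(left_x, char)` tuples, `char_width`, and `max_shift` are hypothetical names and parameters.

```python
def merge_by_position(written, preprinted, shift=0):
    """Merge two lists of (left_x, char) by x position after shifting written chars."""
    # The middle element breaks ties so a written char precedes a pre-printed one.
    shifted = [(x + shift, 0, ch) for x, ch in written]
    fixed = [(x, 1, ch) for x, ch in preprinted]
    return "".join(ch for _, _, ch in sorted(shifted + fixed))


def generate_candidates(written, preprinted, char_width, max_shift=2):
    """Return de-duplicated candidate strings in generation order."""
    w = "".join(ch for _, ch in sorted(written))
    p = "".join(ch for _, ch in sorted(preprinted))
    candidates = [w, p]  # initial candidates (step S501)
    for k in range(1, max_shift + 1):
        # Method Example 1: written characters drifted left, so move them right.
        candidates.append(merge_by_position(written, preprinted, k * char_width))
        # Method Example 2: written characters drifted right, so move them left.
        candidates.append(merge_by_position(written, preprinted, -k * char_width))
    candidates.append(w + p)  # Method Example 3
    candidates.append(p + w)  # Method Example 4
    return list(dict.fromkeys(candidates))  # drop duplicates, keep order


if __name__ == "__main__":
    # Written digits printed 15 units too far right over a pre-printed date frame.
    written = [(25, "2"), (35, "5"), (65, "0"), (75, "6"), (105, "0"), (115, "5")]
    preprinted = [(30, "年"), (70, "月"), (110, "日")]
    for candidate in generate_candidates(written, preprinted, char_width=10):
        print(candidate)
```

Generating several hypotheses and leaving the choice to language collation means the generator does not need to know the actual direction or amount of misalignment.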

(FIG. 5: Step S505: Supplement)
When the spacing between written characters is equal to or greater than the height of the character line, the character recognition unit 130 can also determine that the written characters contain a space character. The same applies to pre-printed characters. The character string generation unit 140 can also generate candidate character strings by inserting pre-printed characters into the spaces within the written characters, or by inserting written characters into the spaces within the pre-printed characters. When characters are written in multiple colors, the characters to be inserted can also be combined color by color.
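The space handling described in this supplement might be sketched as below: a gap at least as wide as the line height is treated as a space, and characters of the other color are inserted into those spaces. The `(left, right, char)` tuples and the rule of appending leftover fillers at the end are assumptions for illustration.

```python
def split_on_gaps(chars, line_height):
    """Split (left, right, char) tuples into groups wherever a space is detected.

    A space is assumed when the gap between consecutive characters is
    greater than or equal to the height of the character line.
    """
    groups, current, prev_right = [], "", None
    for left, right, ch in sorted(chars):
        if prev_right is not None and left - prev_right >= line_height:
            groups.append(current)
            current = ""
        current += ch
        prev_right = right
    if current:
        groups.append(current)
    return groups


def fill_spaces(groups, fillers):
    """Insert one filler per space; any fillers left over are appended at the end."""
    out = []
    for i, group in enumerate(groups):
        out.append(group)
        if i < len(fillers):
            out.append(fillers[i])
    out.extend(fillers[len(groups):])
    return "".join(out)


if __name__ == "__main__":
    written = [(0, 8, "2"), (9, 17, "5"), (30, 38, "0"),
               (39, 47, "6"), (60, 68, "0"), (69, 77, "5")]
    groups = split_on_gaps(written, line_height=10)
    print(groups)
    print(fill_spaces(groups, ["年", "月", "日"]))
```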

FIG. 6 is a diagram illustrating candidate character strings generated in step S406. The candidate character string table 601 is a list of candidate character strings. Written characters are shown in dark type, and pre-printed characters in light type. Candidate character strings 602 and 603 are those generated in step S501. Candidate character strings 607 and 608 are generated by (Method Example 3) and (Method Example 4), respectively.

FIG. 7 is a diagram illustrating the details of steps S407 to S408. For the purpose of explanation, it shows examples of the character string patterns held in the notation dictionary 152. The collation table 701 shows the result of collating the candidate character strings with the notation dictionary 152 in steps S407 to S408. The configuration of the notation dictionary 152 is described first, followed by the collation table 701.

The notation dictionary 152 can be written, for example, in RTN (Recursive Transition Network) format. When collating a candidate character string with the notation dictionary 152, the candidate character string can, for example, first be converted into a network form called a candidate character network, which takes into account ambiguity in character segmentation and in character identification results, and the two can then be collated. The format of the notation dictionary 152 and the collation method are not limited to these; it is sufficient that the character string patterns held in the notation dictionary 152 can be compared with the candidate character strings.

Date notation patterns 1521 and 1522 are examples of date notation expressed in RTN format. YYYY represents the year (Western calendar), MM the month, and DD the day. Parentheses indicate that the enclosed element may be omitted, and “|” indicates that one of the alternatives on either side of it is present. For example, (年|.|-) matches if any one of “年” (year), “.”, or “-” is present. Because this pattern is enclosed in parentheses, it may also be skipped in collation. Amount notation patterns 1523 and 1524 are examples of amount notation expressed in RTN format. “N” represents a digit, and “^” represents repetition of the digit N. These notations are only examples, and the character string patterns held in the notation dictionary 152 are not limited to them.

The collation table 701 shows the result of the character string evaluation unit 160 collating the candidate character strings 602 to 608 shown in FIG. 6 with the notation dictionary 152 shown in FIG. 7. The candidate character string 602 matches “YYMMDD” of the date notation pattern 1521 and also matches “NNNNNN” of the amount notation pattern 1524. The candidate character string 606 matches “YY年MM月DD日” of the date notation pattern 1521. The candidate character strings 603 to 605 and 607 to 608 match none of the notation patterns.
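The collation of candidate character strings against notation patterns can be approximated with regular expressions. Note that this sketch substitutes Python's `re` module for the RTN format named in the text, and the two patterns below are hypothetical stand-ins modeled on the date and amount notations described above, not the actual contents of patterns 1521 to 1524.

```python
import re

# Hypothetical stand-ins for the notation patterns (not the actual dictionary).
NOTATION_PATTERNS = {
    # Two- or four-digit year, month, day, each with an optional separator.
    "date": re.compile(r"(\d{4}|\d{2})(年|\.|-)?\d{2}(月|\.|-)?\d{2}(日)?"),
    # One or more digits, mirroring "N" repeated.
    "amount": re.compile(r"\d+"),
}


def matching_patterns(candidate):
    """Return the names of all notation patterns that the whole candidate matches."""
    return [name for name, pattern in NOTATION_PATTERNS.items()
            if pattern.fullmatch(candidate)]


if __name__ == "__main__":
    for candidate in ["250605", "25年06月05日", "2年50月60日5"]:
        print(candidate, matching_patterns(candidate))
```

As in the collation table, a digits-only candidate matches both a date pattern and an amount pattern, which is why a scoring step is needed to choose among matching candidates.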

The character string evaluation unit 160 assigns a score to each candidate character string that matches a notation pattern using, for example, the following calculation formula.

In the collation table 701, the candidate character string 606 has the highest score, so it can be taken as the final result of character string recognition. The recognition result is therefore regarded as a character string in date notation. The score calculation formula is not limited to the above, and other formulas can also be used.

FIG. 8 is a diagram showing an example in which the output unit 180 displays the evaluation result of the character string evaluation unit 160 on a screen in GUI form. The input image display section 801 displays the color form image input to the form processing apparatus 100. The target region 802 is the region on which character recognition is performed. The character line recognition result field 803 displays the result of recognizing character lines for each color in the target region 802. The collation table field 804 displays the collation table 701 described with reference to FIG. 7. The thick frame in the collation table field 804 indicates the candidate character string that became the final character recognition result. The final recognition result 805 displays that final result. The user can also correct the final recognition result 805. In that case, the initial value shown in the correction input field is the final recognition result 805 (that is, the candidate character string with the highest evaluation).

<Embodiment 1: Summary>
As described above, the form processing apparatus 100 according to the first embodiment recognizes characters in the form image for each color, generates candidate character strings by rearranging the character order, and evaluates whether each candidate character string is correct by collating it with the notation dictionary 152. As a result, character strings can be recognized accurately even when pre-printed characters and written characters overlap.

In addition, the form processing apparatus 100 according to the first embodiment recognizes space characters contained in a character line and generates candidate character strings by inserting characters of another color into the space portions. As a result, character recognition can be performed accurately on pre-printed forms whose space portions are intended to be filled in with written characters.
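The space-filling step summarized above can be sketched as follows. This is only an illustration under stated assumptions: the `Char` record, the `fill_spaces` function, the centre-point test, and the `min_gap` value of 20 pixels are hypothetical and do not come from the patent, which only states that a space is recognized when the distance between characters is at least a predetermined value.

```python
from dataclasses import dataclass


@dataclass
class Char:
    text: str
    left: int   # x coordinate of the left edge (pixels)
    right: int  # x coordinate of the right edge (pixels)


def fill_spaces(base, other, min_gap=20):
    """Insert characters of another color into the gaps of a base-color line.

    A gap between neighbouring base characters is treated as a space when it
    is at least `min_gap` pixels wide (an assumed value).
    """
    out = []
    for i, ch in enumerate(base):
        out.append(ch.text)
        if i + 1 < len(base):
            gap_start, gap_end = ch.right, base[i + 1].left
            if gap_end - gap_start >= min_gap:
                # Other-color characters whose centre falls inside the space.
                inside = [o for o in other
                          if gap_start <= (o.left + o.right) / 2 <= gap_end]
                out.extend(o.text for o in sorted(inside, key=lambda o: o.left))
    return "".join(out)


# Pre-printed date frame with blanks, and handwritten digits in another color.
preprinted = [Char("平", 0, 18), Char("成", 20, 38), Char("年", 80, 98),
              Char("月", 140, 158), Char("日", 200, 218)]
handwritten = [Char("2", 42, 56), Char("5", 60, 74),
               Char("6", 110, 124), Char("5", 172, 186)]

candidate = fill_spaces(preprinted, handwritten)
print(candidate)  # 平成25年6月5日
```

A candidate produced this way would then be collated with the notation dictionary 152 like any other candidate character string.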

<Embodiment 2>
The first embodiment described recognizing characters for each color and rearranging the character order. When characters are recognized for each color, colors other than the recognition target are dropped out. If the written characters or the pre-printing have uneven shading, this color dropout may cause part of a character to be lost. The second embodiment of the present invention therefore describes a method that generates a plurality of binarized images with different degrees of color dropout and compensates for the uneven shading by comparing them. Since the configuration of the form processing apparatus 100 is the same as in the first embodiment, the following mainly describes the operation related to uneven shading.

FIG. 9 illustrates a plurality of binarized images with different degrees of color dropout. In step S403, the character recognition unit 130 generates a binarized image by keeping the color values that lie within a predetermined range centered on the color value extracted in step S402 and dropping out all other color values. It further obtains a plurality of binarized images by changing the threshold of the color values to be dropped out. The dropout threshold may be expressed, for example, as a color value in the RGB color space, as a color value in the HSV color space obtained by converting the RGB values into hue, saturation, and luminance or brightness, or in another color space. Here, it is assumed that the binarized images 901 to 906 shown in FIG. 9 have been generated.

The binarized image 901 is generated by setting a small threshold centered on the written character color (narrowing the range that is kept). The binarized image 903 is generated by setting a large threshold centered on the written character color (widening the range that is kept). The binarized image 902 uses a threshold roughly midway between those of 901 and 903. The binarized image 904 is generated by setting a small threshold centered on the pre-printed character color. The binarized image 906 is generated by setting a large threshold centered on the pre-printed character color. The binarized image 905 uses a threshold roughly midway between those of 904 and 906.
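As a rough illustration of generating several binarized images around one center color, the sketch below keeps pixels whose RGB value lies within a given distance of the center and drops out the rest. The function names, the use of Euclidean distance in RGB space, and the radii 30, 60, and 90 are assumptions made for this example; the patent leaves the color space and the threshold values open.

```python
from math import dist


def binarize_by_color(image, center, radius):
    """Keep pixels within `radius` of `center` in RGB space; drop out the rest.

    Returns a 2D list where 1 marks a kept (character) pixel and 0 marks a
    dropped-out pixel.
    """
    return [[1 if dist(pixel, center) <= radius else 0 for pixel in row]
            for row in image]


def binarize_variants(image, center, radii=(30, 60, 90)):
    """Generate one binarized image per radius (small, medium, large range)."""
    return {r: binarize_by_color(image, center, r) for r in radii}


# Toy 1x4 image: dark blue ink, faded blue ink, red pre-print, white paper.
image = [[(10, 10, 200), (60, 60, 220), (200, 20, 20), (255, 255, 255)]]
ink_blue = (0, 0, 200)  # assumed written character color

variants = binarize_variants(image, ink_blue)
for radius, binary in variants.items():
    print(radius, binary)
```

In this toy input the faded ink pixel survives only with the largest radius, which mirrors the role of the binarized images 903 and 906 when strokes are faint.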

The character recognition unit 130 recognizes the characters in a character line by comparing the plurality of binarized images generated as in FIG. 9. For example, when the written characters have uneven shading and parts of them are missing, the binarized images 903 and 906, generated with the larger threshold, allow the characters to be recognized with fewer missing parts. Conversely, when the written characters have bled and parts of them are filled in, the binarized images 901 and 904, generated with the smaller threshold, allow the characters to be recognized with less filling-in. In either case, the character recognition result with the highest likelihood may be adopted, for example.
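Selecting the result with the highest likelihood across the binarized images could look like the following minimal sketch. The recognizer outputs and likelihood values are made-up illustration data, and `best_recognition` is a hypothetical helper, not a function described in the patent.

```python
def best_recognition(results):
    """Return the (text, likelihood) pair with the highest likelihood."""
    return max(results, key=lambda result: result[1])


# Hypothetical recognizer outputs for the binarized images 901 to 903.
results_by_image = {
    901: ("3", 0.41),  # small keep-range: part of the stroke is missing
    902: ("8", 0.77),
    903: ("8", 0.92),  # large keep-range: fewest missing strokes
}

text, likelihood = best_recognition(results_by_image.values())
print(text, likelihood)
```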

When step S504 checks whether character lines overlap on a per-character basis, the plurality of binarized images generated from the same character line are excluded from the check. This is because these binarized images share the same character coordinates and therefore necessarily overlap.

<Embodiment 2: Summary>
As described above, the form processing apparatus 100 according to the second embodiment generates a plurality of binarized images by changing the range over which color dropout is performed, and identifies the most likely character recognition result by comparing the recognition results obtained from these images. As a result, character recognition can be performed accurately even when the written characters have uneven shading or bleeding.

<Embodiment 3>
In the first and second embodiments, a color form image may contain both portions where pre-printed characters and written characters overlap and portions where they do not. For the portions where they do not overlap, a conventional character recognition method alone is considered sufficient. Therefore, for example, conventional character recognition may first be performed on the input color form image, and the method described in the first embodiment may be applied only to the portions that could not be recognized. This removes the need to check the entire form for overlapping characters and shortens the processing time.
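The two-stage flow described above can be sketched as follows. Everything here is an assumption for illustration: the patent does not define what "could not be recognized" means, so the sketch treats a confidence below `min_confidence` as a failure, and both recognizers are stubs standing in for real OCR routines.

```python
def recognize_form(regions, conventional_ocr, overlap_aware_ocr,
                   min_confidence=0.8):
    """Run conventional OCR first; fall back to the overlap-aware method
    only for regions whose confidence is below `min_confidence`."""
    results = {}
    for name, region in regions.items():
        text, confidence = conventional_ocr(region)
        if confidence < min_confidence:
            text, confidence = overlap_aware_ocr(region)
        results[name] = text
    return results


fallback_calls = []  # records which regions needed the slower method


def conventional_ocr(region):
    # Stub: pretends that overlapping regions cannot be read.
    if region["overlapping"]:
        return "", 0.0
    return region["truth"], 0.95


def overlap_aware_ocr(region):
    # Stub standing in for the per-color method of the first embodiment.
    fallback_calls.append(region["truth"])
    return region["truth"], 0.9


regions = {
    "name": {"truth": "Hitachi", "overlapping": False},
    "date": {"truth": "平成25年6月5日", "overlapping": True},
}

results = recognize_form(regions, conventional_ocr, overlap_aware_ocr)
print(results, fallback_calls)
```

Only the overlapping region reaches the slower method, which is the source of the processing-time saving the text describes.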

In the first and second embodiments, the output unit 180 may output only one final recognition result 805, or it may output a plurality of recognition results, for example in descending order of score. In the latter case, the number of recognition results to output or the score range may be determined in advance or may be specified by the user.
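One way to output either a single result or several results limited by count or score range is sketched below. The `select_results` helper, its parameters, and the candidate scores are hypothetical illustration values, not data from the patent.

```python
def select_results(scored, top_n=None, min_score=None):
    """Return candidates in descending score order, optionally limited by
    a maximum count (`top_n`) or a minimum score (`min_score`)."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if min_score is not None:
        ranked = [item for item in ranked if item[1] >= min_score]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Hypothetical candidate character strings with scores.
scored = [("平成25年6月5日", 0.9), ("平成年月日2565", 0.2), ("25平成6年5月日", 0.4)]

single = select_results(scored, top_n=1)         # one final result
several = select_results(scored, min_score=0.3)  # all results in a score range
print(single)
print(several)
```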

The present invention is not limited to the embodiments described above and includes various modifications. The above embodiments are described in detail to make the present invention easy to understand, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of a given embodiment. In addition, for part of the configuration of each embodiment, other configurations can be added, deleted, or substituted.

Some or all of the above configurations, functions, processing units, processing means, and the like may be realized in hardware, for example by designing them as integrated circuits. The above configurations, functions, and the like may also be realized in software, with a processor interpreting and executing programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be stored in a recording device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.

100: form recognition device, 110: image input unit, 120: color extraction unit, 130: character recognition unit, 140: character string generation unit, 151: notation dictionary reading unit, 152: notation dictionary, 160: character string evaluation unit, 170: storage device, 180: output unit.

Claims

A form processing apparatus for processing a form image, comprising:
an image input unit that receives the form image;
a color extraction unit that extracts colors included in the form image;
a character recognition unit that recognizes characters described in the form image for each color extracted by the color extraction unit;
a character string generation unit that generates a plurality of candidate character strings by connecting the characters recognized by the character recognition unit for each color;
a notation dictionary reading unit that reads the character string pattern from a notation dictionary that stores a plurality of character string patterns including a character string pattern described in the form image;
a character string evaluation unit that evaluates the degree of matching between each candidate character string and the character string pattern in the notation dictionary by comparing each candidate character string with the character string pattern in the notation dictionary; and
an output unit that outputs the evaluation result by the character string evaluation unit and the candidate character string.

The form processing apparatus according to claim 1, wherein
the character string generation unit generates the candidate character string for each color by connecting the characters recognized by the character recognition unit for each color extracted by the color extraction unit, and
the character string evaluation unit,
when the character strings for each color generated by the character recognition unit overlap in character units in the form image, generates the plurality of candidate character strings by connecting the characters while shifting their positions in the character strings, and,
when the character strings for each color generated by the character recognition unit do not overlap in character units in the form image, uses the character strings for each color generated by the character recognition unit as the candidate character strings.

The form processing apparatus according to claim 2, wherein,
when the distance between characters recognized for each color is equal to or greater than a predetermined value, the character recognition unit recognizes that a space character is described between the characters, and
the character string generation unit generates the candidate character string by inserting a character of another color into the part that the character recognition unit recognized as a space character for a certain color.

The form processing apparatus according to claim 1, wherein the character recognition unit
extracts a partial area in which characters are described in the form image by dividing the form image into a part having a color value within a predetermined range centered on the color extracted by the color extraction unit and the remaining part, and
further extracts a plurality of the partial areas in which the same character is described by performing the division while changing the predetermined range, and specifies the most plausible recognition result by comparing the results of recognizing the character using each of the partial areas.

The form processing apparatus according to claim 2, wherein the character string generation unit regards a plurality of characters recognized by the character recognition unit within a predetermined range in the height direction of the characters as characters described in the same line, and generates the candidate character string for each color extracted by the color extraction unit by concatenating the characters regarded as being described in the same line.

The form processing apparatus according to claim 1, wherein the output unit outputs the evaluation result only when the matching degree evaluated by the character string evaluation unit is within a predetermined range.

The form processing apparatus according to claim 1, wherein the output unit outputs the character string pattern evaluated by the character string evaluation unit together with the evaluation result.

The form processing apparatus according to claim 7, wherein the output unit distinguishes and outputs the character string pattern evaluated by the character string evaluation unit for each color extracted by the color extraction unit.

The output unit is configured as a display unit that displays the evaluation result and the candidate character string on a screen,
The display unit has a correction input field for correcting the evaluation result,
The form processing apparatus according to claim 7, wherein the initial value displayed in the correction input field is the candidate character string having the highest evaluation result.

A form processing method for processing a form image, comprising:
an image input step of receiving the form image;
a color extraction step of extracting colors included in the form image;
a character recognition step of recognizing characters described in the form image for each color extracted in the color extraction step;
a character string generation step of generating a plurality of candidate character strings by concatenating the characters recognized for each color in the character recognition step;
a notation dictionary reading step of reading the character string pattern from a notation dictionary that stores a plurality of character string patterns including the character string pattern described in the form image;
a character string evaluation step of evaluating the degree of matching between each candidate character string and the character string pattern in the notation dictionary by comparing each candidate character string with the character string pattern in the notation dictionary; and
an output step of outputting the evaluation result of the character string evaluation step and the candidate character string.

A form processing program for causing a computer to execute the form processing method according to claim 10.