JP2012141670A

JP2012141670A - Apparatus, method and program for recognizing form

Info

Publication number: JP2012141670A
Application number: JP2010292253A
Authority: JP
Inventors: Masaya Sunakawa; 真哉砂川; Hirotaka Inoue; 博貴井上; Kazuo Nakamura; 一夫中村; Masanori Nakabayashi; 正典中林; Katsutoshi Obara; 勝利小原
Original assignee: Fujitsu Frontech Ltd
Current assignee: Fujitsu Frontech Ltd
Priority date: 2010-12-28
Filing date: 2010-12-28
Publication date: 2012-07-26

Abstract

PROBLEM TO BE SOLVED: To provide an apparatus for recognizing an unspecified number of kinds of forms without using a form defined in advance, while increasing binarization processing speed without deteriorating character recognition accuracy.
SOLUTION: The form recognition apparatus recognizes characters written on a form that includes a table composed of cells partitioned by ruled lines, each cell either containing or not containing a character or character string. The apparatus defines a table structure that determines the arrangement of the cells of the table, extracts from an image a title that is a predetermined character or character string, identifies the position in the table structure of the cell containing the title, detects the filled-in cells, that is, the cells of the defined table structure in which some character or character string has been entered, and selects only the detected filled-in cells as binarization cells. The binarization cells in the table are binarized to generate a binary image, and the characters written on the form are recognized from that binary image.

Description

The present invention relates to a form recognition apparatus, method, and program used to recognize characters on a document containing a table, such as a form, and in particular to a form recognition apparatus, method, and program that include binarization of image data containing a table, such as a form.

In today's economic activity, particularly that of companies, many types of forms are often handled at the same time. For example, when a customer conducts a transaction with a financial institution, the customer fills in the required items on one of several types of forms and submits it to the institution. Having received the completed form, the financial institution carries out the prescribed processing and responds to the customer based on what is written on the form. The entries include numbers such as monetary amounts. Such forms may differ in the size of the form itself or in the ruled lines that define the table cells, and even when the ruled-line structure is the same, the items to be entered in the cells may differ. In addition, on a form used to request the same processing for several different targets, such as a transfer request form used at a financial institution, the area of the table that is filled in may differ from customer to customer.

Many of these forms are now processed as digitized documents. Image recognition has therefore attracted attention as a technique for converting paper documents into digitized documents. Conventionally known image recognition techniques include OCR (Optical Character Recognition).

In recent years, apparatuses have become known that automatically read the contents of digitized forms so that an unspecified number of types of forms can be processed by a single machine.
To process an unspecified number of types of forms, such an apparatus must infer the format of each form: it recognizes the ruled lines of the table on the form to define the cell structure, recognizes the table headings (items), and automatically infers from the type of each heading what each cell should contain. The method of recording form formats in advance and identifying the format by reading an identification symbol attached to the form is not applicable here.

Each cell of a table on a form generally contains characters representing a heading, or characters, numbers, or symbols representing the content for that heading, so such an apparatus must automatically associate each heading in the table with the characters, numbers, and symbols related to it. For example, if the heading is "Amount", the related content is a number whose unit is yen, dollars, or the like. If the form format is defined in advance, such associations can be stored in a database. If the format cannot be defined in advance, however, the associations must be generated automatically.

To read a paper form, recognize the ruled lines of its table, and perform character recognition by OCR or the like, a binary image of the entire form must be generated before recognition. Character recognition such as OCR is then performed on the binary image. In general, processing an unspecified number of types of forms requires binarizing the entire surface of the form.

For example, an optical character recognition apparatus is known (Patent Document 1) that recognizes data characters (keywords) written in entry fields from an optically read image of a form in which, for each of several headings (items), a preprinted item name and an entry field for data characters such as numbers and kanji related to that heading are arranged in a table. One example of such an apparatus performs OCR on the entire form image using a system dictionary to recognize the table structure and the characters in the table cells; refers to re-OCR designation information, which defines in advance for each heading whether OCR must be performed again, to identify the entry fields to be re-processed; and then performs partial OCR on the identified entry field for each heading using a dictionary suited to it. Such an apparatus can handle forms whose formats are not registered in advance and whose item layouts differ. For example, OCR on the entire form image recognizes the layout of the cells for each heading, and partial OCR can use a user dictionary suited to the target entry field, such as a dictionary containing digits for a field that should hold an amount.
Because this method changes the recognition method according to the attribute of the item name, for example by changing the dictionary used for OCR, it can improve the recognition accuracy of the data characters.

To speed up processing in such apparatuses, an apparatus is known that preferentially searches regions in which item-name characters are likely to be written (Patent Document 2).
There are also image processing apparatuses that aim to improve character recognition accuracy by generating a binary image suited to character recognition of forms.

One known example of such an apparatus determines whether characters are present in a region and, when it determines that no characters exist there, fills the region with white pixels (Patent Document 3). This apparatus determines the presence or absence of characters from the number of edges of connected components within a given region, for example the number of black-white transitions encountered when the region is scanned horizontally or vertically. If it determines that the region contains no characters, it sets every pixel in the region to white.

There is also an image processing apparatus that removes the influence of background noise from an image containing written characters and thereby contributes to subsequent processing such as character recognition (Patent Document 4). It compares the density distribution of an unwritten margin, or of a region of a blank form prepared in advance, with the density distribution of each small region obtained by dividing an arbitrarily designated region, and determines whether characters exist in each small region. It then removes noise from the original image by replacing the density of regions judged to contain no writing with a predetermined background density.

Patent Document 1: Japanese Patent No. 4347677
Patent Document 2: JP 2009-93305 A
Patent Document 3: JP 2000-331118 A
Patent Document 4: Japanese Patent No. 4393411

In general, however, both the table structure and the filled-in area of the table vary from customer to customer. When forms are recognized with the conventional techniques described above, generating the binarized image used for character recognition from the digitized document takes time.

Furthermore, in an apparatus that determines whether characters are present in each character region and fills regions judged to contain no characters with white pixels, the area over which the binarization threshold is calculated does not change, so it is difficult to reduce the time required for binarization significantly.

Accordingly, there is a demand for a form recognition apparatus that recognizes an unspecified number of types of forms without using a predefined form and that speeds up binarization without degrading character recognition accuracy.

A form recognition apparatus according to one aspect of the present invention recognizes characters written on a form that includes a table composed of cells delimited by ruled lines, each cell either containing or not containing a character or character string. The apparatus comprises: table structure defining means for defining a table structure that determines the arrangement of the cells in the table; heading extraction means for extracting, from the image, a heading that is a predetermined character or character string; heading position specifying means for specifying, in the table structure, the position of a heading cell, which is a cell containing the heading; filled-in cell detection means for detecting, among the cells of the table structure defined by the table structure defining means, filled-in cells in which some character or character string has been entered; binarization target cell selection means for selecting only the filled-in cells detected by the filled-in cell detection means as binarization target cells; binarization means for binarizing the binarization target cells in the table to generate a binary image; and character recognition means for recognizing the characters written on the form from the binary image.

A form recognition method and program according to other aspects of the present invention are, respectively, a method that realizes the means used in the above form recognition apparatus, and a program containing computer-readable instructions that cause a computer to realize the functions of the means used in the above form recognition apparatus.

With such a form recognition method and apparatus, a form can be recognized without using a predefined form, and the time required for binarization can be reduced because only part of the table on the form is binarized. As a result, recognition of free-format forms can be sped up without lowering recognition accuracy.

A block diagram of the apparatus of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of the present invention.
A diagram explaining the processing in the table structure information extraction unit.
A diagram showing an example of a form processed by the apparatus of the present invention.
A diagram showing part of the example form shown in FIG. 4.
A block diagram of the apparatus of Example 1 of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of Example 1.
An example of a binary image in which the ruled lines are retained.
A diagram showing an example of a form processed by the apparatus of Example 1.
A block diagram of the apparatus of Example 2 of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of Example 2.
A diagram showing part of an example form processed by the apparatus of Example 2.
A diagram showing the range excluded from binarization in part of the example form shown in FIG. 12.
A block diagram of the apparatus of Example 3 of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of Example 3.
A diagram showing part of an example form processed by the apparatus of Example 3.
A block diagram of the apparatus of Example 4 of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of Example 4.
A diagram explaining ladder-frame extraction in the keyword recognition processing method applied to the apparatus of Example 4.
A diagram showing the range excluded from binarization in part of an example form processed by the apparatus of Example 4.
A diagram explaining ladder-frame grouping in the keyword recognition processing method applied to the apparatus of Example 4.
A block diagram of the apparatus of Example 5 of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of Example 5.
A diagram showing the range excluded from binarization in part of an example form processed by the apparatus of Example 5.
A block diagram of the apparatus of Example 6 of the present invention.
A flowchart showing the keyword recognition processing method applied to the apparatus of Example 6.
A diagram explaining the check for a ruled line that divides a cell into upper and lower parts in the keyword recognition processing method applied to the apparatus of Example 6.
A diagram showing the processing that treats kana item flag 1 in part of an example form processed by the apparatus of Example 6 as a range excluded from binarization.

(General Description)
A form recognition apparatus according to one aspect of the present invention recognizes characters written on a form that includes a table composed of cells delimited by ruled lines, each cell either containing or not containing a character or character string. The apparatus comprises: table structure defining means for defining a table structure that determines the arrangement of the cells in the table; heading extraction means for extracting, from the image, a heading that is a predetermined character or character string; heading position specifying means for specifying, in the table structure, the position of a heading cell, which is a cell containing the heading; filled-in cell detection means for detecting, among the cells of the table structure defined by the table structure defining means, filled-in cells in which some character or character string has been entered; binarization target cell selection means for selecting only the filled-in cells detected by the filled-in cell detection means as binarization target cells; binarization means for binarizing the binarization target cells in the table to generate a binary image; and character recognition means for recognizing the characters written on the form from the binary image.

Here, the image of the form may be a color image.
The table structure defining means binarizes the entire digitized form image and then processes the binarized image so that only the ruled lines remain. This may be done, for example, by extracting the horizontal and vertical lines in the form image. The horizontal and vertical lines may also be extracted and the remaining area painted over with a background color such as white. Once the ruled lines of the table are determined, the cells they delimit are determined, and a column number and a row number can be assigned to each cell. As described below, the table structure defining means may also extract merged cells containing multiple cells, such as ladder-frame cells, by extracting lines that are not solid, such as dotted or broken lines.
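The line-extraction and numbering step described above can be sketched as follows. This is only an illustrative sketch, not the apparatus's actual procedure: it assumes the binarized image is a list of rows with 1 for black and 0 for white, that ruled lines are one pixel thick and span most of the image, and the names `find_ruled_lines`, `cells_from_lines`, and `min_fraction` are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]           # assumed encoding: 1 = black, 0 = white
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def find_ruled_lines(img: Image, min_fraction: float = 0.8):
    """Return the row and column indices that are mostly black.

    A row (column) counts as a horizontal (vertical) ruled line when at
    least min_fraction of its pixels are black. The ratio is an assumption.
    """
    height, width = len(img), len(img[0])
    rows = [y for y in range(height) if sum(img[y]) >= min_fraction * width]
    cols = [x for x in range(width)
            if sum(img[y][x] for y in range(height)) >= min_fraction * height]
    return rows, cols


def cells_from_lines(rows: List[int], cols: List[int]) -> Dict[Tuple[int, int], Box]:
    """Assign 1-based (row number, column number) to each cell interior."""
    cells = {}
    for r, (top, bottom) in enumerate(zip(rows, rows[1:]), start=1):
        for c, (left, right) in enumerate(zip(cols, cols[1:]), start=1):
            # The interior excludes the ruled lines themselves.
            cells[(r, c)] = (top + 1, left + 1, bottom - 1, right - 1)
    return cells
```

Thicker or broken ruled lines would need run-length or morphological handling, which this sketch omits.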

The table structure defining means may also detect the ruled lines of the table by detecting points in the image from which at least two lines extend, that is, intersections, or line segments at least one end of which touches another line. The outer frame of the table may be defined by extracting the outermost intersections, that is, those closest to the top, bottom, left, and right edges of the form.

The heading extraction means performs character recognition on a binarized image, obtained by binarizing all or part of the digitized form image, while referring to a heading dictionary in which headings are stored in advance, and extracts the headings on the form. Headings are generally written in an upper row of the table, for example the first row. Character recognition may therefore be performed preferentially on the upper rows of the table by referring to the table structure defined by the table structure defining means.

The heading position specifying means specifies information on the position of each heading, that is, its row number and column number in the table. More specifically, it compares the table structure defined by the table structure defining means with each heading extracted by the heading extraction means and its position, for example its distance from the upper-left corner of the form, and specifies the heading's position in the table, that is, its column number and row number.
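One way to turn a heading's pixel position into a row and column number is a binary search over the ruled-line coordinates. The sketch below assumes the ruled lines are available as sorted lists of y and x coordinates measured from the upper-left corner of the form; `locate_cell` and its parameters are hypothetical names, not taken from the publication.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def locate_cell(x: int, y: int,
                row_lines: List[int],
                col_lines: List[int]) -> Optional[Tuple[int, int]]:
    """Return the 1-based (row number, column number) of the cell containing
    point (x, y), or None when the point lies outside the table.

    row_lines and col_lines are the sorted y and x coordinates of the
    horizontal and vertical ruled lines (assumed input format).
    """
    r = bisect_right(row_lines, y)   # number of horizontal lines at or above y
    c = bisect_right(col_lines, x)   # number of vertical lines at or left of x
    if not (0 < r < len(row_lines) and 0 < c < len(col_lines)):
        return None
    return r, c
```

For instance, with horizontal lines at y = 0, 3, 6 and vertical lines at x = 0, 4, 8, a heading whose reference point is (5, 1) falls in row 1, column 2.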

The binarization target cell selection means excludes the table region associated with a specific heading from binarization. For example, the column under a specific heading is excluded. Put another way, only the row or column regions associated with specific headings are binarized, and the cells in those regions are selected as binarization target cells.

However, the binarization target cell selection means need not always select the binarization target region column by column. For example, when a single row provides two entry fields containing the same information, such as "Name" and "Furigana", only the cells in which the furigana is entered may be binarized. In that case, binarization target cells may be selected on alternate rows.

The binarization means binarizes the cells selected by the binarization target cell selection means. The binary image it produces should preferably be suited to the character recognition technique used by the character recognition means.

Here, characters include numbers, symbols, alphabetic letters, katakana, hiragana, kanji, and the like. In some cases they may be symbols that represent characters, such as barcodes.
A ruled line may be a solid line, a dotted line, a broken line, a dash-dot line, or the like. For example, the ladder-frame cells used for entering a sequence of digits such as an amount are separated by dotted lines. As noted above, ladder frames may be extracted by the table structure defining means.

The filled-in cell detection means determines whether characters are written in each cell of the table defined by the table structure defining means, and detects the filled-in cells in which characters have been entered.
Whether characters are written in a cell may be determined by checking the number of black pixels in the cell. Alternatively, the color of the cell as a whole may be detected and the determination made from that color, more specifically from its lightness. Determining from lightness requires only one color measurement per cell, so it is faster than other methods such as counting edges.
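Both checks can be sketched in a few lines. The code below is an illustration under assumptions: cells are given as inclusive `(top, left, bottom, right)` boxes, the binary image uses 1 for black, and the grayscale image uses 0 to 255. The names `count_black`, `filled_cells`, `mean_lightness`, and the `min_black` threshold are hypothetical, and averaging pixel values is only a stand-in for the single per-cell color measurement mentioned in the text.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def count_black(binary: List[List[int]], box: Box) -> int:
    """Count black pixels (value 1) inside the cell interior."""
    top, left, bottom, right = box
    return sum(binary[y][x]
               for y in range(top, bottom + 1)
               for x in range(left, right + 1))


def filled_cells(binary: List[List[int]],
                 cells: Dict[Tuple[int, int], Box],
                 min_black: int = 1) -> List[Tuple[int, int]]:
    """Return the keys of cells whose black-pixel count reaches min_black."""
    return [key for key, box in cells.items()
            if count_black(binary, box) >= min_black]


def mean_lightness(gray: List[List[int]], box: Box) -> float:
    """Average gray level of a cell; lower values suggest writing is present."""
    top, left, bottom, right = box
    values = [gray[y][x]
              for y in range(top, bottom + 1)
              for x in range(left, right + 1)]
    return sum(values) / len(values)
```

A suitable `min_black` depends on scan resolution and noise, so it would have to be tuned on real forms.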

When the form image is a color image, the filled-in cell detection means may determine whether a cell is filled in using a vector in the HSV color space, which consists of the three components hue, saturation, and value (lightness), for the color of the cell.

In an HSV vector, the hue axis, which represents the color, generally ranges from 0° to 360°; saturation, which represents vividness, ranges from 0 to 100%; and value, which represents brightness, ranges from 0 to 100%. A small value means the color is close to black. For example, a cell containing at least a predetermined number of pixels whose value is below 20% may be judged to be a filled-in cell.
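The 20% rule can be illustrated with Python's standard `colorsys` module, which returns hue, saturation, and value each scaled to the range 0 to 1. In this sketch the cell is assumed to be a flat list of 8-bit `(r, g, b)` tuples; `is_filled_hsv` and the `min_dark_pixels` default are hypothetical, since the text says only "a predetermined number".

```python
import colorsys
from typing import Iterable, Tuple


def is_filled_hsv(pixels: Iterable[Tuple[int, int, int]],
                  v_threshold: float = 0.20,
                  min_dark_pixels: int = 3) -> bool:
    """Judge a cell as filled in when enough of its pixels are dark.

    A pixel is dark when its HSV value is below v_threshold (20% here).
    min_dark_pixels is an assumed placeholder for the predetermined count.
    """
    dark = 0
    for r, g, b in pixels:
        _, _, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if v < v_threshold:
            dark += 1
    return dark >= min_dark_pixels
```

Because HSV value is the largest of the three RGB components, strongly colored ink such as bright blue has a high value and would not be counted as dark by this rule alone.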

The filled-in cell detection means may also use a vector in RGB space rather than HSV space for the color of the cell to determine whether the cell is filled in.
Only the cells detected by the filled-in cell detection means are treated as binarization target cells, and the binarization means binarizes them to generate a binary image.

The character recognition means recognizes the characters written on the form from the binary image. A known technique such as OCR can be used as the character recognition means.
Thus the form recognition apparatus according to this aspect can recognize a form without using a predefined form, and because it binarizes only part of the table on the form, it reduces the time required for binarization. As a result, recognition of free-format forms can be sped up without lowering recognition accuracy.
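The time saving comes from visiting only the pixels inside the selected cells. A minimal sketch of that idea follows, assuming a grayscale image with levels 0 to 255 and inclusive cell boxes. The fixed `threshold` is a simplification chosen for illustration; the publication refers to calculating a binarization threshold but this sketch does not attempt that, and `binarize_cells` is a hypothetical name.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def binarize_cells(gray: List[List[int]],
                   boxes: Iterable[Box],
                   threshold: int = 128) -> List[List[int]]:
    """Binarize only the given cells; everything else stays white (0).

    Pixels outside the boxes are never read, so the work is proportional
    to the total area of the binarization target cells.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                out[y][x] = 1 if gray[y][x] < threshold else 0
    return out
```

In the usage below, only the left half of a small image is a target cell, so the dark pixels in the right half are left untouched:

```python
gray = [[10, 200, 10, 200],
        [10, 200, 10, 200]]
result = binarize_cells(gray, [(0, 0, 1, 1)])
```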

In a form recognition apparatus according to another aspect of the present invention, when a cell located at the relative position corresponding to the heading, among the cells of the table structure defined by the table structure defining means, is not a filled-in cell, the binarization target cell selection means further excludes from the binarization target cells the cells at predetermined positions relative to that unfilled cell.

On a form, when a detail row of the table contains many unfilled cells, that row is likely to be an unfilled detail line. Therefore, when the number of unfilled cells in a detail row is at or above a certain number, the time required to binarize the form can be reduced by not binarizing any of the cells in that row.
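The row-skipping rule can be sketched as a filter over the set of filled-in cells. The representation is an assumption made for illustration: cells are identified by 1-based `(row, column)` pairs, the first `header_rows` rows hold headings, and `min_empty` stands for the "certain number" of unfilled cells. `select_detail_cells` is a hypothetical name.

```python
from typing import List, Set, Tuple


def select_detail_cells(filled: Set[Tuple[int, int]],
                        n_rows: int,
                        n_cols: int,
                        min_empty: int,
                        header_rows: int = 1) -> List[Tuple[int, int]]:
    """Return the filled-in cells to binarize, skipping sparse detail rows.

    A detail row with min_empty or more unfilled cells is treated as an
    unfilled detail line, and none of its cells are selected.
    """
    targets = []
    for r in range(header_rows + 1, n_rows + 1):
        row_filled = [(r, c) for c in range(1, n_cols + 1) if (r, c) in filled]
        empty = n_cols - len(row_filled)
        if empty >= min_empty:
            continue  # whole row excluded from binarization
        targets.extend(row_filled)
    return targets
```

A stray mark in an otherwise empty row is therefore not binarized, which saves time at the cost of ignoring rows that are only partly completed.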

A form recognition apparatus according to another aspect of the present invention further includes multiple-cell defining means for defining, in the table structure, a group of multiple cells located at the relative position corresponding to the heading, and the binarization target cell selection means recognizes that group of cells as a binarization target cell.

Here, the multi-cells defined by the multi-cell defining means include ladder frame cells. When the number of black pixels in the cells is checked, a ladder frame cell can be recognized by detecting a run of cells in the table in which a black-pixel count at or above a certain threshold appears periodically, caused by ruled lines such as the dotted lines that partition the ladder frame. For binarization, such a ladder frame cell may be treated as a multi-cell, that is, as a single cell.

On forms, ladder frames are often used for fields in which numbers are entered. In addition, a form generally has a subtotal field in its bottom row in many cases. The subtotal field may be extracted by finding, in the bottom row of the table, a run of cells in which a black-pixel count at or above a certain threshold appears periodically, and the subtotal field may then be treated as a multi-cell, that is, as a single cell.
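The periodicity check used to find ladder frames and the subtotal field could look like the sketch below. It assumes that each divider line is one pixel wide and that a per-column black-pixel profile of the cell row is already available; the function name `looks_like_ladder`, the minimum of three dividers, and the `tolerance` parameter are illustrative assumptions, not values from the specification.

```python
from typing import List


def looks_like_ladder(column_black: List[int], threshold: int,
                      tolerance: int = 1) -> bool:
    """Guess whether a cell row is a ladder frame.

    column_black[x] is the number of black pixels in pixel column x.
    Columns at or above `threshold` are treated as divider lines; the row
    is called a ladder frame when at least three dividers are found and
    their spacing is constant within `tolerance` pixels.
    """
    peaks = [x for x, n in enumerate(column_black) if n >= threshold]
    if len(peaks) < 3:
        return False
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return max(gaps) - min(gaps) <= tolerance
```

Handwriting inside the cells can also produce tall columns of black pixels, so a practical detector would probably need a more robust test than this constant-gap check.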

A multi-cell may also be simply a collection of several cells.
In this way, by recognizing a ladder frame, for example, as one multi-cell and treating it as a binarization target cell, the number of binarization operations can be reduced, and the time required to binarize a form whose format is not given in advance can be reduced further.

Also, for example, on a form where the field for a multi-digit number is a single cell rather than a ladder frame cell, characters are not necessarily written from one end of the cell to the other. In such a case, the region other than the one in which characters are written may be set as a region excluded from binarization, and its binarization may be omitted.

A form recognition apparatus according to another aspect of the present invention further includes filled-in multi-cell detecting means for detecting a filled-in multi-cell, that is, a run of consecutive cells in which characters are written among the plurality of cells. From among the multi-cells located at the relative position corresponding to the heading, with respect to the position of the cell containing the heading identified by the heading position identifying means, the binarization target cell selecting means selects as binarization target cells only the multi-cells detected by the filled-in multi-cell detecting means.

For example, an amount field in a form table is given a large number of digit positions so that an amount does not overflow. On a form, the subtotal field at the bottom of the table holds the largest number of digits, so the rows above it contain numbers with no more digits than that. Cells for digit positions above that maximum are therefore blank. If binarization of that region can be omitted, the area to be binarized can be reduced.

A form recognition apparatus according to another aspect of the present invention further includes merged-cell defining means for defining, in the table structure, a merged cell made up of a plurality of cells located at the relative position corresponding to the heading. The binarization target cell selecting means selects as the binarization target cell only a part of the merged cell located at the relative position corresponding to the heading, with respect to the position of the cell containing the heading identified by the heading position identifying means.

Here, a "merged cell", also called a ladder frame cell or ladder-style cell, refers to a cell in which the digit positions are separated by dotted or broken lines so that a multi-digit number can be entered. Merged cells are not limited to numbers: they also include cells whose character positions are separated by dotted or broken lines so that, when hiragana or katakana is entered, a voicing mark (dakuten) or the like is handled as one character.

This aspect may further include filled-in cell detecting means for detecting, from at least some of the cells of the table structure defined by the table structure defining means, filled-in cells in which some character or character string is written, and maximum cell count defining means for defining the maximum number of filled-in cells in the merged cell. The binarization target cell selecting means may then select, depending on the heading, that maximum number of cells as binarization target cells from the plurality of cells contained in the merged cell located at the relative position corresponding to the heading, with respect to the position of the cell containing the heading.

In general, a form that contains monetary amounts may have a subtotal or total field in the bottom row of the table. The numbers entered in the rows of that column then have no more digits than the number entered in the subtotal field. The number of digits of the number in the subtotal field can therefore be extracted as the maximum number of digits, and, when the cells in each row of that column are binarized, binarization of the cells for digit positions beyond the maximum can be omitted.
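A minimal sketch of the maximum-digit rule is given below. It assumes right-aligned amounts, with index 0 as the leftmost (most significant) digit cell of the ladder frame; the function name `digit_cells_to_binarize` and its arguments are hypothetical.

```python
from typing import List


def digit_cells_to_binarize(subtotal_filled: List[bool],
                            n_rows: int) -> List[List[int]]:
    """Pick, for each detail row, the digit cells worth binarizing.

    subtotal_filled[i] says whether digit cell i of the subtotal ladder
    frame contains writing. The leftmost filled cell of the subtotal marks
    the maximum digit count for the column, so every detail row keeps only
    the cells from that position to the right end.
    """
    width = len(subtotal_filled)
    first = next((i for i, f in enumerate(subtotal_filled) if f), width)
    return [list(range(first, width)) for _ in range(n_rows)]
```

For a ten-cell amount field whose subtotal occupies four digits, each detail row would have six of its ten cells skipped.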

The binarization means may also, depending on the heading, skip binarization while moving through the merged cell leftward from the rightmost cell or rightward from the leftmost cell, until it finds the start of the run of filled-in cells in the merged cell.
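The directional skip can be illustrated as below. The sketch assumes a caller-supplied predicate `is_filled` (for example, a black-pixel test per cell) and a flag `from_left` that the caller derives from the heading; both names are hypothetical, and the specification does not say which headings map to which direction.

```python
from typing import Callable, List


def cells_after_skip(n_cells: int, is_filled: Callable[[int], bool],
                     from_left: bool = True) -> List[int]:
    """Walk a merged cell and skip blank cells until writing starts.

    Returns the cell indices, in scan order, from the first filled cell to
    the far end of the merged cell. An entirely blank merged cell yields an
    empty list, so nothing in it is binarized.
    """
    order = list(range(n_cells)) if from_left else list(range(n_cells - 1, -1, -1))
    for pos, idx in enumerate(order):
        if is_filled(idx):
            return order[pos:]
    return []
```

A right-aligned amount would typically be scanned from the left, while a left-aligned entry such as a name written in kana would be scanned from the right.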

In a form recognition apparatus according to another aspect of the present invention, the binarization means, depending on the heading, omits binarization while moving through each multi-cell rightward from the leftmost cell or leftward from the rightmost cell, until it finds the end cell of the filled-in multi-cell.

According to this aspect, binarization of the blank part of a single cell can be omitted, and the area to be binarized can be reduced.
A form recognition method according to one aspect of the present invention is a form recognition method for recognizing characters written on a form that includes a table composed of cells partitioned by ruled lines, each cell either containing or not containing a character or character string, the method comprising: a step of defining, for the table, a table structure that determines the arrangement of the cells; a step of extracting from the image a heading, which is a predetermined character or character string; a step of identifying, in the table structure, the position of a heading cell, which is a cell containing the heading; a step of detecting, among the cells of the table structure defined by the table structure defining means, filled-in cells in which some character or character string is written; a step of selecting only the filled-in cells detected by the filled-in cell detecting means as binarization target cells; a step of binarizing the binarization target cells in the table to generate a binary image; and a step of recognizing the characters written on the form from the binary image.

The form recognition method according to this aspect can recognize a form without using a predefined form definition, and, because only part of the table area on the form is binarized, the time required for binarization is reduced. As a result, recognition of free-format forms can be sped up without degrading recognition accuracy.

A form recognition method according to another aspect of the present invention includes a step of excluding from the binarization target cells, when a specific cell of the table structure determined by the heading is not a filled-in cell, the cells at predetermined relative positions with respect to that unfilled cell.

A method of binarizing a form image according to another aspect of the present invention further includes a step of defining, in the table structure, a plurality of cells located at the relative position corresponding to the heading, and the step of selecting the binarization target cell recognizes the plurality of cells as the binarization target cell.

A form recognition method according to another aspect of the present invention further includes a step of detecting a filled-in multi-cell, that is, a run of consecutive cells in which characters are written among the plurality of cells. From among the multi-cells located at the relative position corresponding to the heading, with respect to the position of the cell containing the heading identified by the heading position identifying means, the step of selecting the binarization target cell selects as binarization target cells only the multi-cells detected by the filled-in multi-cell detecting means.

A form recognition method according to another aspect of the present invention includes a step of excluding from the binarization target cells, when a cell of the table structure defined by the table structure defining means that is located at the relative position corresponding to the heading is not a filled-in cell, the cells at predetermined relative positions with respect to that unfilled cell.

A form recognition method according to another aspect of the present invention further includes a step of detecting, from at least some of the cells of the defined table structure, filled-in cells in which some character or character string is written, and a step of defining the maximum number of filled-in cells in the merged cell. Selecting the binarization target cell further includes selecting, depending on the heading, that maximum number of cells as binarization target cells from the plurality of cells contained in the merged cell located at the relative position determined by the heading, with respect to the position of the cell containing the heading.

A form recognition method according to another aspect of the present invention further includes a step of defining, in the table structure, a merged cell containing a plurality of cells located at the relative position corresponding to the heading. The step of selecting the binarization target cell selects as the binarization target cell only a part of the merged cell located at the relative position corresponding to the heading, with respect to the position of the cell containing the heading identified by the heading position identifying means.

A form recognition method according to another aspect of the present invention further includes a step of detecting, from at least some of the cells of the table structure, filled-in cells in which some character or character string is written, and a step of defining the maximum number of filled-in cells in the merged cell. The step of selecting the binarization target cell further includes a step of selecting, depending on the heading, that maximum number of cells as binarization target cells from the plurality of cells contained in the merged cell located at the relative position corresponding to the heading, with respect to the position of the cell containing the heading.

A form recognition method according to another aspect of the present invention includes a step of skipping binarization, depending on the heading, while moving through the merged cell leftward from the rightmost cell or rightward from the leftmost cell, until the start of the run of filled-in cells in the merged cell is found.

A form recognition method according to another aspect of the present invention includes a step of omitting binarization, depending on the heading, while moving rightward from the leftmost cell or leftward from the rightmost cell of the multi-cell recognized by the binarization target cell selecting means, until the end cell of the multi-cell is found.

A program according to one aspect of the present invention causes a computer that can be used as a form recognition apparatus for recognizing characters written on a form, the form including a table composed of cells partitioned by ruled lines, each cell either containing or not containing a character or character string, to realize: a table structure defining function of defining, for the table, a table structure that determines the arrangement of the cells; a heading extracting function of extracting from the image a heading, which is a predetermined character or character string; a heading position identifying function of identifying, in the table structure, the position of a heading cell, which is a cell containing the heading; a filled-in cell detecting function of detecting, among the cells of the table structure defined by the table structure defining means, filled-in cells in which some character or character string is written; a binarization target cell selecting function of selecting only the filled-in cells detected by the filled-in cell detecting means as binarization target cells; a binarizing function of binarizing the binarization target cells in the table to generate a binary image; and a character recognition function of recognizing the characters written on the form from the binary image.

The image data binarization method according to the present invention described above, and the form recognition apparatus 10 that uses it, are described in general terms with reference to FIGS. 1 to 5.
FIG. 1 is a block diagram of the form recognition apparatus 10 of the present invention.

The form recognition apparatus 10 according to the present invention includes a scanner 100, a ruled line binarization processing unit 110, a table structure information extraction unit 120, a heading binarization processing unit 130, a heading/item extraction unit 140, a simple character recognition processing unit 142, a binarization exclusion processing unit 150, a ladder frame grouping processing unit 152, an entry presence determination unit 154, a partial binarization processing unit 160, and a character recognition processing unit 170. These are electrically connected to each other. Each of these processing units may be, for example, a processor dedicated to its processing, or a combination of a general-purpose computer and a medium containing computer-recognizable instructions for performing that processing. When the processing units are realized by a general-purpose computer, the function of each processing unit is realized by a program containing instructions that cause the computer to execute that function.

The apparatus 10 also includes an original image storage unit 200, a ruled line binary image storage unit 202, a table structure information storage unit 204, a heading binarization storage unit 206, a heading/item information storage unit 208, a heading dictionary unit 210, a binarization exclusion range storage unit 212, and a threshold storage unit 214. These may be memories such as RAM or ROM. They may be arranged in the same device as the processing units described above, or provided as separate devices.

These storage units can receive and store the data output from the processing units described above.
The original image storage unit 200 is communicably connected to the scanner 100 and stores the image captured by the scanner. The ruled line binary image storage unit 202 is communicably connected to the ruled line binarization processing unit 110 and stores its result. The table structure information storage unit 204 and the heading binarization storage unit 206 are communicably connected to the table structure information extraction unit 120 and the heading binarization processing unit 130, respectively, and store the results of those processing units. The heading/item information storage unit 208 and the heading dictionary unit 210 are both communicably connected to the heading/item extraction unit 140. The heading/item extraction unit 140 extracts headings and items from the form image while referring to the heading dictionary unit 210, and outputs the result to the heading/item information storage unit 208. The heading dictionary unit 210 holds the character information of headings set in advance. The binarization exclusion range storage unit 212 is communicably connected to the binarization exclusion processing unit 150 and stores the ranges, selected by the binarization exclusion processing unit 150, that are not to be binarized. The threshold storage unit 214 is communicably connected to the entry presence determination unit 154, which uses the threshold stored in the threshold storage unit 214 to determine whether characters are written in a given cell.

The entry presence determination unit 154 may determine whether characters or the like are written in a cell on the basis of the number of black pixels in that cell or their distribution. In this case, whether a pixel is a black pixel can be determined using the magnitude and/or direction of a vector in the HSV color space, which consists of the three components hue, saturation, and value (brightness). Alternatively, whether characters or the like are written in a cell may be determined by detecting the number of connected components in the cell.
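As an illustration of the black-pixel test, the sketch below converts each RGB pixel to HSV with Python's standard `colorsys` module and counts pixels whose value component is low. It is a simplification: the specification speaks of the magnitude and/or direction of the HSV vector, whereas this sketch looks at the value component only. The cutoff `max_value` and the count `min_black` are hypothetical stand-ins for whatever the threshold storage unit 214 would hold.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit RGB


def is_black(pixel: Pixel, max_value: float = 0.35) -> bool:
    """Treat a pixel as black when its HSV value component is low."""
    r, g, b = (c / 255.0 for c in pixel)
    _h, _s, v = colorsys.rgb_to_hsv(r, g, b)
    return v <= max_value


def cell_has_entry(cell: List[List[Pixel]], min_black: int) -> bool:
    """Judge a cell as filled in when it holds at least min_black black pixels."""
    black = sum(1 for row in cell for p in row if is_black(p))
    return black >= min_black
```

Colored ink or a tinted background would need the hue and saturation components as well, which is presumably why the specification refers to the full HSV vector.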

The partial binarization processing unit 160 binarizes the form image except for the binarization exclusion ranges selected by the binarization exclusion processing unit 150.
The character recognition processing unit 170 recognizes the characters in the image binarized by the partial binarization processing unit 160, using known character recognition means such as OCR (Optical Character Recognition).

The table structure defining means includes the ruled line binarization processing unit 110, the table structure extraction unit 120, the ruled line binary image storage unit 202, and the table structure information storage unit 204.
The heading extracting means includes the heading binarization processing unit 130, the heading/item extraction unit 140, the simple character recognition processing unit 142, the heading binarization storage unit 206, the heading/item information storage unit 208, and the heading dictionary unit 210. The heading position identifying means includes the table structure extraction unit 120, the ruled line binary image storage unit 202, the table structure information storage unit 204, the heading binarization processing unit 130, the heading/item extraction unit 140, the simple character recognition processing unit 142, the heading binarization storage unit 206, the heading/item information storage unit 208, and the heading dictionary unit 210. The binarization target cell selecting means includes the binarization exclusion processing unit 150 and the binarization exclusion range storage unit 212. The binarization means includes the partial binarization processing unit 160. The filled-in cell detecting means includes the entry presence determination unit 154 and the threshold storage unit 214. The multi-cell defining means includes the ruled line binarization processing unit 110, the table structure extraction unit 120, the ruled line binary image storage unit 202, the table structure information storage unit 204, and the ladder frame grouping processing unit 152. The character recognition means includes the character recognition processing unit 170. The maximum digit count detecting means includes the entry presence determination unit 154. The merged-cell defining means includes the ruled line binarization processing unit 110, the table structure extraction unit 120, and the ruled line binary image storage unit 202. Each of the above means may be defined by a program containing computer-recognizable instructions that realize the function of that means. For example, the table structure defining means may define a computer-recognizable program, or part of a program, that realizes the table structure defining function corresponding to the function of that means.

FIG. 2 is a flowchart showing the processing of the keyword recognition technique applied to the apparatus 10 of the present invention.
In S1000, the image of the form is read with the scanner 100. In the form recognition apparatus 10 according to the present invention, the form image is desirably a color image. Using a color image makes it easier to separate characters from noise, or characters from the background color, and enables highly accurate form recognition. The forms handled by the form recognition apparatus 10 need not, however, be in color. The image read with the scanner 100 is stored in the original image storage unit 200. When S1000 ends, the process proceeds to S1005.

In S1005, the ruled line binarization processing unit 110 binarizes the color image that was stored in the original image storage unit 200 in S1000 and outputs a binary image in which the ruled lines are retained. More specifically, in S1005 the input color image is binarized and a binary image is output in which only the ruled lines remain and everything else is set to the background color (for example, white). The binary image retaining the ruled lines is output to the ruled line binary image storage unit 202. Because this binarization is intended to extract ruled lines, the unit of the binarization processing range, which consists of a plurality of pixels, may be made too large for recognizing characters but small enough for recognizing lines, so as to minimize processing time.

The binary image in which only the ruled lines of the color image remain is generated by extracting straight-line components, that is, the horizontal and vertical lines on the form. When S1005 ends, the process proceeds to S1010.
In S1010, the table structure extraction unit 120 refers to the binary image retaining the ruled lines stored in the ruled line binary image storage unit 202, computes the cells and the final outline (table shape) from the information on the intersections of horizontal and vertical lines, and outputs the result, with column number and row number information (coordinates) added to each cell, to the table structure information storage unit 204 as table structure information. Instead of being expressed by the two components column number and row number, the position of each cell may be expressed by a single number, assigned from the cell in the upper left corner toward the cell in the lower right corner, from left to right within a row and from top to bottom across rows.
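The two ways of addressing a cell described above convert into each other as in the following sketch, which assumes a regular grid with `n_cols` cells per row and zero-based numbering; the function names are hypothetical.

```python
from typing import Tuple


def cell_number(row: int, col: int, n_cols: int) -> int:
    """Serial number of a cell, counted left to right, then top to bottom."""
    return row * n_cols + col


def cell_position(number: int, n_cols: int) -> Tuple[int, int]:
    """Inverse of cell_number: recover (row, col) from the serial number."""
    return divmod(number, n_cols)
```

A table whose rows have different numbers of cells would need a lookup table instead of this arithmetic.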

The processing of the table structure extraction unit 120 in S1010 is described with reference to FIG. 3. The points P1, shown as large circles in FIG. 3, are provisional end points of the table with the ruled lines shown in FIG. 3. By detecting the intersections P3 of vertical and horizontal lines, shown as small circles, each quadrangle whose vertices are intersections P3 is recognized as an individual cell. If the table structure extraction unit 120 takes the quadrangle whose vertices are the points P1 as the table shape, intersections that protrude from it, such as point P2, appear. A quadrangle that has point P2 as a vertex is therefore also recognized as a cell of the table, and the final table shape is extracted.
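One simple way to turn detected intersections into cells is sketched below. It assumes that intersections have already been snapped to exact shared coordinates and that no cell spans more than one grid step; the function name and the tuple layouts are illustrative and not taken from the specification.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]           # (x, y) of a ruled-line intersection
Cell = Tuple[int, int, int, int]  # (left, top, right, bottom)


def cells_from_intersections(points: Set[Point]) -> List[Cell]:
    """List every rectangle whose four corners are all detected intersections.

    Adjacent x and y coordinates are paired up, so a cell that protrudes
    beyond the provisional outline is still found as long as its own four
    corners were detected. Cells come out top to bottom, left to right.
    """
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    cells = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            corners = {(left, top), (right, top), (left, bottom), (right, bottom)}
            if corners <= points:
                cells.append((left, top, right, bottom))
    return cells
```

Because every intersection is considered, not only those inside the provisional outline, a protruding intersection like P2 contributes its cell without special handling.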

In S1010 the table structure extraction unit 120 may also extract "ladder frames", or "merged cells", as described later. A ladder frame is a table frame that defines consecutive cells whose digit positions are separated by dotted lines so that a multi-digit number can be entered. In the example of a form to be processed by the apparatus 10 shown in FIG. 4, the columns under the headings "Account number", "Amount", and "Fee" are ladder frame cells consisting of 10, 10, and 5 consecutive cells, respectively, and the ruled lines that define a ladder frame cell constitute the ladder frame. The position of each ladder frame cell, the number of cells it contains, and so on may be stored in the table structure information storage unit 204 as ladder frame information. When S1010 ends, the process proceeds to S1015.

Ｓ１０１５では、表構造抽出部１２０によって抽出された表構造情報を用いて、表の一番上の行を対象に、文字を残した二値画像を作成し、見出し二値化記憶部２０６に出力する。より詳細には、表構造情報を用いて、表の一番上の行から罫線を削除したカラー画像において、表の一番上の行の領域で二値化処理を行って文字を残した二値画像を生成しても良い。Ｓ１０１５は、見出し二値化処理部１３０によって処理される。Ｓ１０１５の処理が終了するとＳ１０２０に進む。 In S1015, using the table structure information extracted by the table structure extraction unit 120, a binary image that retains the characters is created for the top row of the table and output to the heading binarization storage unit 206. More specifically, the table structure information may be used to remove the ruled lines from the top row of the table in the color image, and binarization may then be performed on the region of the top row to generate a binary image that retains the characters. S1015 is performed by the heading binarization processing unit 130. When the process of S1015 ends, the process proceeds to S1020.

Ｓ１０１５での見出し二値化処理部１３０による二値化処理では、二値化処理範囲の単位の大きさ（一般に複数の画素を含み、その領域内で二値化閾値が定義される領域）は、罫線二値化処理部１１０が行う二値化処理の際に用いられる大きさと同等または小さいことが望ましい。このように、罫線二値化処理部１１０と見出し二値化処理部１３０の二値化処理での個々の二値化処理範囲の単位の大きさを変化させることによって、帳票の二値化処理速度を改善することができる。 In the binarization performed by the heading binarization processing unit 130 in S1015, the unit size of the binarization processing range (a region that generally contains a plurality of pixels and within which a binarization threshold is defined) is preferably equal to or smaller than the unit size used in the binarization performed by the ruled line binarization processing unit 110. By varying the unit size of the individual binarization processing ranges between the ruled line binarization processing unit 110 and the heading binarization processing unit 130 in this way, the binarization processing speed for the form can be improved.
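The idea of a binarization unit, a block of pixels that shares one threshold, can be illustrated with a minimal sketch. The function name `binarize_blockwise`, the list-of-rows image layout, and the use of the block mean as the threshold are all assumptions made for illustration; the description does not state how the threshold inside a unit is chosen.

```python
def binarize_blockwise(gray, block):
    """Binarize a grayscale image (list of rows, values 0-255) unit by unit.

    Each block x block unit gets its own threshold. The unit mean is used
    here purely as an illustrative assumption. Pixels darker than the
    threshold become 1 (ink); all others become 0 (background).
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            values = [gray[y][x] for y in rows for x in cols]
            threshold = sum(values) / len(values)
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


# One dark pixel in an otherwise light 2 x 2 image, processed as a single unit.
result = binarize_blockwise([[10, 200], [200, 200]], 2)
```

Under this sketch, a larger `block` means fewer thresholds to compute, which is one plausible reading of why a coarse unit suits ruled lines while a finer unit is kept for characters.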

Ｓ１０２０では、Ｓ１０１５で作成された表の一番上の行中の文字を残した二値画像を対象に文字認識を行い、抽出した文字情報を見出し辞書部２１０に記憶されている情報と比較し、見出しとみなした文字情報を見出し・項目情報記憶部２０８に出力する。 In S1020, character recognition is performed on the binary image created in S1015, which retains the characters in the top row of the table. The extracted character information is compared with the information stored in the heading dictionary unit 210, and the character information regarded as a heading is output to the heading/item information storage unit 208.

Ｓ１０２０に引き続くＳ１０２５では、表の一番上の行中の文字を残した二値画像中の文字が見出し辞書部２１０に無かった場合、認識対象外の見出し項目とし、Ｓ１０３０に進む。また、表の一番上の行中の文字を残した二値画像中の文字が見出し辞書部２１０にある場合、Ｓ１０３５に進む。Ｓ１０２０及びＳ１０２５は、見出し・項目抽出部１４０によって処理される。 In S1025, which follows S1020, if the characters in the binary image retaining the characters of the top row of the table are not found in the heading dictionary unit 210, they are treated as a heading item that is not a recognition target, and the process proceeds to S1030. If the characters are found in the heading dictionary unit 210, the process proceeds to S1035. S1020 and S1025 are performed by the heading/item extraction unit 140.

図２のフローチャート中の分岐において、“Ｙ”はイエス（肯定）を表し、“Ｎ”はノー（否定）を表す。図２以降の図中に表れるフローチャートにおいても同様である。
Ｓ１０３０では、Ｓ１０２５で選択された二値化対象外の見出しが属する表の列を二値化対象外とし、二値化対象外範囲記憶部２１２に出力する。Ｓ１０３０の処理は、二値化対象外処理部１５０によって処理される。Ｓ１０３０の処理が終わると、Ｓ１０３５に進む。 In the branch in the flowchart of FIG. 2, “Y” represents yes (affirmative) and “N” represents no (denied). The same applies to the flowcharts appearing in the drawings after FIG.
In S1030, the column of the table to which the non-binarization heading selected in S1025 belongs is excluded from the binarization target, and is output to the binarization non-target range storage unit 212. The processing of S1030 is processed by the binarization target non-processing unit 150. When the process of S1030 ends, the process proceeds to S1035.

Ｓ１０３５では、Ｓ１０２０及びＳ１０２５で抽出された見出しを対象に、右詰か左詰かの位置属性付けを行う。この処理は、見出し・項目抽出部１４０によって行われ、結果は、見出し・項目情報記憶部２０８に出力される。Ｓ１０３５の処理が終わると、Ｓ１０４０に進む。 In S1035, each heading extracted in S1020 and S1025 is assigned a position attribute indicating whether its data is right-justified or left-justified. This processing is performed by the heading/item extraction unit 140, and the result is output to the heading/item information storage unit 208. When the process of S1035 ends, the process proceeds to S1040.

この位置属性情報は、見出し辞書部２１０に、Ｓ１０２０及びＳ１０２５で抽出された見出しを表す文字と共に記憶されていて良い。
Ｓ１０４０では、記入有無判定部１５４によって、Ｓ１０３０で抽出した見出しごとの位置属性情報に基づいて、見出し列ごとに、文字の記入がない空白部分を特定し、さらに二値化対象外処理部１５０によって、二値化対象外範囲記憶部２１２に出力する。Ｓ１０４０の処理が終わると、Ｓ１０４５に進む。 This position attribute information may be stored in the headline dictionary unit 210 together with characters representing the headlines extracted in S1020 and S1025.
In S1040, the entry presence/absence determination unit 154 identifies, for each heading column, the blank portions in which no characters are entered, based on the position attribute information for each heading extracted in S1030, and the binarization non-target processing unit 150 outputs them to the binarization non-target range storage unit 212. When the process of S1040 ends, the process proceeds to S1045.

Ｓ１０４５では、記入有無判定部１５４によってＳ１０１０で抽出した表構造情報と、閾値記憶部２１４に記憶されている、予め設定されたセルごとの閾値、見出しごとの閾値等に基づいて、表の下の行から表の各セルに文字が記入されているか否かを判定する。 In S1045, the entry presence/absence determination unit 154 determines, starting from the bottom row of the table, whether characters are entered in each cell of the table, based on the table structure information extracted in S1010 and on the preset per-cell thresholds, per-heading thresholds, and the like stored in the threshold storage unit 214.

Ｓ１０４５に引き続くＳ１０５０では、表中に空白とみなせる行（明細行）があるか否かを判定する。この処理は、記入有無判定部１５４によって行われる。もし、空白とみなせる行があれば処理はＳ１０５５に進み、空白とみなせる行は、二値化対象外処理部１５０によって、二値化対象外範囲として二値化対象外範囲記憶部２１２に出力される。その後、処理はＳ１０６０に進む。もし、空白とみなせる行がなければ、処理は直接、Ｓ１０６０に進む。 In S1050, which follows S1045, it is determined whether the table contains any row (detail row) that can be regarded as blank. This processing is performed by the entry presence/absence determination unit 154. If there is such a row, the process proceeds to S1055, where the binarization non-target processing unit 150 outputs the row to the binarization non-target range storage unit 212 as a binarization non-target range. The process then proceeds to S1060. If there is no row that can be regarded as blank, the process proceeds directly to S1060.
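The bottom-up blank-row check of S1045 to S1055 might look roughly like the following sketch. The per-cell dark-pixel counts, the single shared threshold, and the decision to stop scanning at the first filled row are assumptions for illustration; the description only says that preset per-cell and per-heading thresholds are used and that the scan starts from the bottom row.

```python
def blank_rows_from_bottom(ink_counts, cell_threshold):
    """Return the indices of trailing rows in which every cell looks empty.

    ink_counts[r][c] is a hypothetical count of dark pixels in cell (r, c).
    A cell is treated as filled when its count exceeds cell_threshold.
    """
    blank = []
    for row in range(len(ink_counts) - 1, -1, -1):
        if any(count > cell_threshold for count in ink_counts[row]):
            break  # assumption: detail rows are filled top-down, so stop here
        blank.append(row)
    return sorted(blank)


# Four detail rows; only the first two contain enough ink to count as filled.
counts = [[50, 40], [30, 0], [0, 2], [1, 0]]
rows_to_skip = blank_rows_from_bottom(counts, cell_threshold=5)
```

Rows returned by such a check would then be recorded as binarization non-target ranges, so that none of their cells are binarized later.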

Ｓ１０６０では、Ｓ１０１０で抽出されたはしご枠情報に基づいて、表の中のはしご枠を抽出し、表構造情報記憶部２０４に出力する。Ｓ１０６０の処理が終了すると、Ｓ１０６５に進む。はしご枠の抽出方法は、後述のようなヒストグラムを用いる方法のほか、表中の実線で囲まれた領域内を区分する点罫線を抽出することによって行っても良い。 In S1060, a ladder frame in the table is extracted based on the ladder frame information extracted in S1010, and is output to the table structure information storage unit 204. When the process of S1060 ends, the process proceeds to S1065. The method of extracting the ladder frame may be performed by extracting dotted ruled lines that divide the area surrounded by the solid line in the table, in addition to a method using a histogram as described later.

Ｓ１０６５では、Ｓ１０６０の結果を参照しながら、表の中にはしご枠があるかの判定を行う。Ｓ１０６０及びＳ１０６５の処理は表構造情報抽出部１２０によって行われる。
Ｓ１０６５での判定の結果が「ある」であればＳ１０７０に、結果が「なし」であればＳ１０９５に進む。Ｓ１０７０及びＳ１０９５では、簡易文字認識処理部１４２によって、表構造情報抽出部１２０に記憶された表構造情報を用いて、表の下から２行分を対象に文字認識を行う。結果として得られる文字情報は、見出し辞書部２１０に記憶されているデータと突き合わされ、小計欄とみなされた見出しの文字情報を見出し・項目情報記憶部２０８に出力する。Ｓ１０７０では、はしご枠セルが存在するので、小計記載欄を抽出する際に、はしご枠セルを小計記載欄の候補として優先的に検索しても良い。Ｓ１０７０及びＳ１０９５の処理が終わるとそれぞれ、Ｓ１０７５及びＳ１１００に進む。 In S1065, it is determined whether there is a ladder frame in the table with reference to the result of S1060. The processing of S1060 and S1065 is performed by the table structure information extraction unit 120.
If the result of determination in S1065 is “Yes”, the process proceeds to S1070. If the result is “None”, the process proceeds to S1095. In S1070 and S1095, the simple character recognition processing unit 142 performs character recognition for two rows from the bottom of the table using the table structure information stored in the table structure information extracting unit 120. The resulting character information is matched with the data stored in the heading dictionary unit 210 and the heading character information regarded as a subtotal column is output to the heading / item information storage unit 208. In S1070, since a ladder frame cell exists, when extracting the subtotal description column, the ladder frame cell may be preferentially searched as a candidate for the subtotal description column. When the processes of S1070 and S1095 are finished, the process proceeds to S1075 and S1100, respectively.

Ｓ１０７５及びＳ１１００では、小計欄が抽出できたかを判定する。
Ｓ１０７５で小計欄が抽出された場合はＳ１０８０に、小計欄が抽出されなかった場合には、Ｓ１０８５に進む。Ｓ１１００で小計欄が抽出された場合はＳ１１０５に、小計欄が抽出されなかった場合には、Ｓ１１１５に進む。 In S1075 and S1100, it is determined whether the subtotal column has been extracted.
If the subtotal field is extracted in S1075, the process proceeds to S1080, and if the subtotal field is not extracted, the process proceeds to S1085. If the subtotal field is extracted in S1100, the process proceeds to S1105. If the subtotal field is not extracted, the process proceeds to S1115.

Ｓ１０８０では、小計欄ははしご枠であるので、記入有無判定部１５４によって、記入桁数を算出する。
Ｓ１１０５では、小計欄ははしご枠ではないので、記入有無判定部１５４によって、記入範囲を示す座標を算出する。そして、この記入範囲を示す座標から、小計記入欄中で数字などの文字が記載された長さである小計記入長を算出する。 In S1080, since the subtotal column is a ladder frame, the entry presence / absence determination unit 154 calculates the number of digits to be entered.
In S1105, since the subtotal column is not a ladder frame, the entry presence / absence determination unit 154 calculates coordinates indicating the entry range. Then, from the coordinates indicating the entry range, a subtotal entry length that is a length in which characters such as numbers are described in the subtotal entry field is calculated.

Ｓ１０８０に引き続くＳ１０８５では、Ｓ１０８０で算出された桁数以外の、たとえば金額が記入される可能性があるセル範囲を二値化対象外範囲とし、二値化対象外範囲記憶部２１２に出力する。また、小計欄がはしご枠ではない場合、Ｓ１１１０では、Ｓ１１０５で算出された小計記入長以外の、たとえば金額が記入されるセル範囲を二値化対象外範囲とし、二値化対象外範囲記憶部２１２に出力する。 In S1085, which follows S1080, the part of the cell range in which, for example, an amount may be entered that lies outside the number of digits calculated in S1080 is set as a binarization non-target range and output to the binarization non-target range storage unit 212. Likewise, when the subtotal field is not a ladder frame, in S1110 the part of the cell range in which, for example, an amount is entered that lies outside the subtotal entry length calculated in S1105 is set as a binarization non-target range and output to the binarization non-target range storage unit 212.
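As a rough illustration of S1080 and S1085 for a ladder frame, the sketch below counts the digits entered in the subtotal field and derives which digit positions of an amount cell can be skipped. It assumes that amounts are right-justified and that no detail amount has more digits than the subtotal; both assumptions, and the function names, are introduced here and are not stated in the description.

```python
def count_filled_digits(filled_flags):
    """Number of digit positions from the leftmost filled one to the right end.

    filled_flags[i] is True when digit cell i (0 = leftmost) contains ink.
    """
    for index, filled in enumerate(filled_flags):
        if filled:
            return len(filled_flags) - index
    return 0


def ladder_exclusion(num_cells, filled_digits):
    """Digit-cell indices (0 = leftmost) that need not be binarized."""
    if not 0 <= filled_digits <= num_cells:
        raise ValueError("filled_digits must be between 0 and num_cells")
    return list(range(num_cells - filled_digits))


# A 10-digit subtotal ladder cell in which only the rightmost 6 digits are used.
subtotal_flags = [False] * 4 + [True] * 6
digits = count_filled_digits(subtotal_flags)
skippable = ladder_exclusion(10, digits)
```

In this reading, the four leftmost digit cells of every amount cell in the column become binarization non-target ranges.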

Ｓ１０８５及びＳ１１１０の処理が終了するとそれぞれ、Ｓ１０９０及びＳ１１１５に進む。
Ｓ１０９０では、はしご枠グルーピング処理部１５２によって、Ｓ１０８５で特定された二値化対象外範囲をグルーピングし、グルーピング結果を二値化対象外範囲記憶部２１２に出力する。グルーピングされたセルは「複数セル」としても参照され得る。 When the processing of S1085 and S1110 ends, the process proceeds to S1090 and S1115, respectively.
In S1090, the ladder frame grouping processing unit 152 groups the binarization non-target range specified in S1085, and outputs the grouping result to the binarization non-target range storage unit 212. Grouped cells may also be referred to as “multiple cells”.

Ｓ１１１５では、Ｓ１０１０で抽出された表構造情報及びＳ１０２０で抽出された見出し情報に基づいて、受取人項目の一番上のセルを対象に、上下に分割する罫線があるかのチェックを行う。ここでは受取人項目を例に説明をしているが、氏名とふりがなの組のように、ルビを振るためにセル中に上下に分割する罫線を用いることが好適なセルを対象としても良い。 In S1115, based on the table structure information extracted in S1010 and the heading information extracted in S1020, the top cell of the recipient item is checked for a ruled line that divides it into upper and lower parts. The recipient item is used as the example here, but the check may target any cell in which a ruled line dividing the cell into upper and lower parts is suitable for adding ruby (phonetic reading) text, such as a name paired with its furigana.

Ｓ１１１５に引き続くＳ１１２０では、受取人項目の罫線有無の判定を行い、罫線がありの場合はＳ１１２５に、罫線がなしの場合はＳ１１３５に進む。
Ｓ１１２５及びＳ１１３５では、簡易文字認識処理部１４２によって、受取人項目の罫線の上または下にカナ項目があるかのチェックを行う。カナ項目としては、平仮名、片仮名の他、アルファベットなど外国語で用いられる文字であっても良い。 In S1120 subsequent to S1115, it is determined whether or not the recipient item has a ruled line. If there is a ruled line, the process proceeds to S1125, and if there is no ruled line, the process proceeds to S1135.
In S1125 and S1135, the simple character recognition processing unit 142 checks whether there is a kana item above or below the ruled line of the recipient item. As the kana item, in addition to hiragana and katakana, characters used in foreign languages such as alphabet may be used.

Ｓ１１２５でカナ項目が存在するとの結果であればＳ１１４５に進み、存在しないとの結果であれば、Ｓ１１３０に進む。
Ｓ１１３５でカナ項目が存在するとの結果であればＳ１１５０に進み、存在しないとの結果であれば、Ｓ１１４０に進む。 If it is determined in step S1125 that a kana item exists, the process advances to step S1145. If the result does not exist, the process advances to step S1130.
If it is determined in step S1135 that a kana item exists, the process advances to step S1150. If the result does not exist, the process advances to step S1140.

Ｓ１１３０及びＳ１１４０では、二値化対象処理部１５０によって、受取人項目のカナ項目以外の列を二値化対象外範囲とみなし、二値化対象外範囲記憶部２１２に出力する。
Ｓ１１３０及びＳ１１４０では、受取人項目のカナ項目以外の列を二値化対象外範囲とみなすが、受取人項目に漢字記入欄しか設けられていない場合は、受取人項目のカナ項目以外の列を二値化対象外範囲とせずに、二値化処理の対象として残しておいても良い。 In S1130 and S1140, the binarization non-target processing unit 150 regards the columns of the recipient item other than the kana item as a binarization non-target range and outputs them to the binarization non-target range storage unit 212.
In S1130 and S1140, the columns of the recipient item other than the kana item are regarded as a binarization non-target range. However, if the recipient item provides only a kanji entry field, those columns may be left as binarization targets instead of being set as a non-target range.

Ｓ１１３０及びＳ１１４０の処理が終了すると、Ｓ１１５０に進む。
一方、Ｓ１１４５では、受取人項目にカナ項目があるので、Ｓ１０１０で抽出された表構造情報に基づいて、受取人項目列中のカナ項目ではない項目が記入されたセルの位置を特定し、二値化対象外処理部１５０によって二値化対象外範囲とみなした後、二値化対象外範囲記憶部２１２に出力する。Ｓ１１４５の処理が終了するとＳ１１５０に進む。 When the processing of S1130 and S1140 ends, the process proceeds to S1150.
On the other hand, in S1145, since there is a Kana item in the recipient item, based on the table structure information extracted in S1010, the position of a cell in which an item that is not a Kana item is entered is specified. After being regarded as a binarization non-target range by the binarization non-target processing unit 150, it is output to the binarization non-target range storage unit 212. When the process of S1145 ends, the process proceeds to S1150.

Ｓ１１５０では、二値化対象外範囲記憶部２１２を参照することによって、二値化対象外範囲があるかないかを判定する。もし、二値化対象外範囲があればＳ１１５５に、なければＳ１１６０に進む。 In S1150, it is determined by referring to the binarization non-target range storage unit 212 whether there is a binarization non-target range. If there is a non-binarization target range, the process proceeds to S1155, and if not, the process proceeds to S1160.

Ｓ１１５５では、二値化対象外処理部１５０によって、二値化対象外範囲記憶部２１２から二値化対象外範囲に関する情報を読み出し、最終的な二値化対象範囲を画定する。その後、Ｓ１１６０に進む。 In S1155, the binarization non-target processing unit 150 reads the information on the binarization non-target ranges from the binarization non-target range storage unit 212 and defines the final binarization target range. The process then proceeds to S1160.
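Defining the final binarization target range in S1155 amounts to removing the accumulated non-target ranges from the full set of cells. A minimal sketch follows, assuming that non-target ranges are held as a set of hypothetical `(row, column)` cell coordinates; the description stores coordinates but does not specify their format.

```python
def final_target_cells(n_rows, n_cols, non_target_cells):
    """Return the cells still to be binarized, in row-major order."""
    return [
        (row, col)
        for row in range(n_rows)
        for col in range(n_cols)
        if (row, col) not in non_target_cells
    ]


# A 2 x 3 table in which the whole last column was marked as non-target.
excluded = {(0, 2), (1, 2)}
targets = final_target_cells(2, 3, excluded)
```

Only the cells in `targets` would then be passed to the cell-by-cell binarization of S1160.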

Ｓ１１６０では、部分二値化処理部１６０によって、Ｓ１１５５によって画定された最終的な二値化対象範囲のセルに対し、セル単位の二値化処理を行う。処理が終了すると、Ｓ１１６５に進む。Ｓ１１６０での二値化処理では、一つ一つの二値化処理範囲の単位の大きさは、罫線二値化処理部１１０が行う二値化処理の際に用いられる大きさと同等または小さいことが望ましい。このようにすることによって、生成された二値画像の文字認識時の精度を犠牲にせず、二値化処理の速度を改善することができる。
Ｓ１１６５では、文字認識処理部１７０によって、帳票上の表の二値化対象範囲に対し文字認識を行う。 In S1160, the partial binarization processing unit 160 performs cell-by-cell binarization on the cells in the final binarization target range defined in S1155. When the processing ends, the process proceeds to S1165. In the binarization in S1160, the unit size of each binarization processing range is preferably equal to or smaller than the unit size used in the binarization performed by the ruled line binarization processing unit 110. This improves the speed of the binarization without sacrificing character recognition accuracy on the generated binary image.
In S1165, the character recognition processing unit 170 performs character recognition on the binarization target range of the table on the form.

図４は、本発明の装置が処理対象とする帳票の例を示す図である。帳票に含まれる表は図４の場合、２２行、３２列の構造を有している。ここで、「口座番号」の見出しの列は１０列、「金額」の見出しの列は１０列、「手数料」の見出しの列は５列と数えている。図４の場合、２２行×３２列＝７０４個のセルを含む。 FIG. 4 is a diagram showing an example of a form to be processed by the apparatus of the present invention. In the case of FIG. 4, the table included in the form has a structure of 22 rows and 32 columns. Here, the column under the heading “Account Number” is counted as 10 columns, the column under “Amount” as 10 columns, and the column under “Fee” as 5 columns. The table in FIG. 4 therefore contains 22 rows × 32 columns = 704 cells.

図５は、図４に示されている帳票でのセル、はしご形式のセル（はしご枠のセル）、見出し、認識対象文字データを示している。
実線、破線等の直線である罫線で区切られた領域がセルである。セルは表を構成する単位要素でもある。はしご形式のセルは、点罫線等により分割されている複数のセルを含む。このはしご形式のセルは複数セルとしても参照される。図５の例では、口座番号、金額及び手数料の記入欄がはしご形式のセルである。 FIG. 5 shows cells, ladder format cells (ladder frame cells), headings, and recognition target character data in the form shown in FIG.
A region delimited by a ruled line that is a straight line such as a solid line or a broken line is a cell. A cell is also a unit element constituting a table. The ladder format cell includes a plurality of cells divided by dotted ruled lines or the like. This ladder type cell is also referred to as a plurality of cells. In the example of FIG. 5, the entry columns for the account number, amount, and fee are ladder-type cells.

図５では、表の第１行目の「項番」、「金融機関名」、「支店名」、「預金種目」、「口座番号」、「受取人」、「金額」、「振込区分」、「手数料」、「摘要」が見出しである。また、「依頼人」も見出しに含まれる。文字認識の対称となる認識対象文字データは、表の見出しの列の各々の行に記入される文字または文字列である。 In FIG. 5, the headings are “Item No.”, “Financial Institution Name”, “Branch Name”, “Deposit Type”, “Account Number”, “Recipient”, “Amount”, “Transfer Category”, “Fee”, and “Summary” in the first row of the table. “Client” is also a heading. The recognition target character data, which is the subject of character recognition, is the characters or character strings entered in each row of a heading's column in the table.

上記のような構成を取ることによって、不特定多数の種類の帳票を、予め定義されたフォームを用いずに認識する装置において、文字認識精度を劣化させることなく、二値化処理を高速化した帳票認識装置を得ることができる。 By adopting the configuration as described above, the binarization process has been speeded up without degrading the character recognition accuracy in a device that recognizes an unspecified number of types of forms without using a predefined form. A form recognition device can be obtained.

（例１）
以下では、幾つかの例を参照して、本発明に従う画像データの二値化処理方法及びそれを用いた帳票認識装置のより詳細な説明を行う。以下では、同一または類似の機能を有する要素、または同一または類似の処理を行うステップは上述の装置１０と同じ参照符号を付与し、詳細な説明を省略する。 (Example 1)
Hereinafter, with reference to some examples, the image data binarization processing method according to the present invention and the form recognition apparatus using the method will be described in more detail. Hereinafter, elements having the same or similar functions, or steps for performing the same or similar processing are given the same reference numerals as those of the apparatus 10 described above, and detailed description thereof is omitted.

本例は、見出し情報から認識対象外の項目に関連するセルを二値化対象外範囲とすることによって、部分二値化する範囲を削減する方法及びそれを用いた装置に関する。
以下、図６乃至９を参照して、本例を説明する。図６は本発明の例１の装置のブロック図である。図７は本発明の本例の装置２０に適用されるキーワード認識技術の処理方法を示すフローチャートである。図８は罫線を残した二値画像の例である。図９は本発明の例１の装置が処理対象とする帳票の例を示す図である。 This example relates to a method for reducing the range to be partially binarized by setting cells related to items that are not to be recognized from the heading information as binarization-free ranges, and an apparatus using the method.
Hereinafter, this example will be described with reference to FIGS. 6 to 9. FIG. 6 is a block diagram of the apparatus of Example 1 of the present invention. FIG. 7 is a flowchart showing the processing method of the keyword recognition technique applied to the apparatus 20 of this example. FIG. 8 is an example of a binary image in which the ruled lines are retained. FIG. 9 is a diagram showing an example of a form to be processed by the apparatus of Example 1 of the present invention.

図６に示されているように、本例に従う帳票認識装置２０は、スキャナ１００、罫線二値化処理部１１０、表構造情報抽出部１２０、見出し二値化処理部１３０、見出し・項目抽出部１４０、及び二値化対象外処理部１５０を含んでいる。これらは互いに電気的に接続されている。 As shown in FIG. 6, the form recognition apparatus 20 according to this example includes a scanner 100, a ruled line binarization processing unit 110, a table structure information extraction unit 120, a heading binarization processing unit 130, a heading / item extraction unit. 140, and a binarized non-processing unit 150. These are electrically connected to each other.

また、装置２０は、原画像記憶部２００、罫線二値画像記憶部２０２、表構造情報記憶部２０４、見出し二値化記憶部２０６、見出し・項目情報記憶部２０８、見出し辞書部２１０、及び二値化対象外範囲記憶部２１２、を含んでいる。これらの記憶部は、上述の装置１０と同様に対応する処理部に接続される。 The apparatus 20 also includes an original image storage unit 200, a ruled line binary image storage unit 202, a table structure information storage unit 204, a heading binarization storage unit 206, a heading/item information storage unit 208, a heading dictionary unit 210, and a binarization non-target range storage unit 212. As in the apparatus 10 described above, these storage units are connected to their corresponding processing units.

図７は装置２０で実行される処理を表すフローチャートである。
本例の装置２０では、図４のような帳票上の表において、認識対象項目以外の項目を二値化対象外とすることによって二値化面積を削減する処理を実行する。 FIG. 7 is a flowchart showing processing executed by the apparatus 20.
In the apparatus 20 of this example, the binarized area is reduced by excluding items other than the recognition target items from the binarization target in the table on the form as shown in FIG.

装置１０で実行されるステップＳ１０００〜Ｓ１０１５を処理した後、本例では、Ｓ２０００で簡易文字認識を行う。
Ｓ１００５で得られる罫線を残した二値画像の例を図８に示す。図８は図４の帳票上の表の罫線を残した二値画像の一部である。 After processing steps S1000 to S1015 executed by the apparatus 10, in this example, simple character recognition is performed in S2000.
An example of a binary image leaving the ruled line obtained in S1005 is shown in FIG. FIG. 8 is a part of a binary image in which the ruled lines of the table on the form of FIG. 4 are left.

Ｓ２０００では、見出し・項目抽出部１４０を用いて、Ｓ１０１５で作成された表の一番上の行中の文字を残した二値画像を対象に文字認識を行い、次のＳ２００５に進む。
Ｓ２００５では、抽出した文字情報を見出し辞書部２１０に記憶されている情報と突合せ、抽出した文字情報が見出し項目かどうかを判定する。 In S2000, the headline / item extraction unit 140 is used to perform character recognition on the binary image in which the characters in the top row of the table created in S1015 are left, and the process proceeds to the next S2005.
In S2005, the extracted character information is matched with information stored in the heading dictionary unit 210, and it is determined whether the extracted character information is a heading item.

もし、文字情報が見出し項目であればＳ２０２０に進む。Ｓ２０２０では、見出しの位置、例えば列番号及び行番号を表構造情報と比較することによって画定し、見出し項目とみなした文字情報を列番号及び行番号情報と共に、見出し・項目情報記憶部２０８に出力する。 If the character information is a heading item, the process proceeds to S2020. In S2020, the position of the heading, for example its column number and row number, is determined by comparison with the table structure information, and the character information regarded as a heading item is output to the heading/item information storage unit 208 together with the column number and row number.

もし、文字情報が見出し項目になければ、認識対象外の見出し項目として、列番号及び行番号情報と共に、見出し・項目情報記憶部２０８に出力し、Ｓ２０２５に進む。
Ｓ２０２５では、認識対象外の見出し項目があるかを判定し、なければ二値化面積を削減する処理は終了する。 If the character information is not a heading item, it is output to the heading/item information storage unit 208, together with its column number and row number, as a heading item that is not a recognition target, and the process proceeds to S2025.
In S2025, it is determined whether there is a heading item that is not a recognition target, and if not, the process of reducing the binarized area ends.

もし認識対象外の見出し項目があれば、Ｓ２０３０で、表構造情報記憶部２０４に記憶されている表構造情報を参照しつつ、Ｓ２０２５で特定された認識対象外の見出し項目を含む列全体を二値化対象外範囲として、列単位で認識外フラグを付加する。 If there is a heading item that is not a recognition target, then in S2030, with reference to the table structure information stored in the table structure information storage unit 204, the entire column containing the non-target heading item identified in S2025 is set as a binarization non-target range, and a non-recognition flag is added column by column.

図９は、認識外フラグ１が付与され、二値化対象外範囲とされた列の例を示している。
最後に、Ｓ２０３０の後のＳ２０１０で、二値化対象外範囲記憶部２１２によって、Ｓ２０３０で設定されたフラグに基づいて、認識対象外の見出し項目の列を二値化対象外として、二値化対象外範囲記憶部２１２に座標を出力して、二値化面積を削減する処理を終了する。 FIG. 9 shows an example of a column to which the non-recognition flag 1 is assigned and is set as a binarization non-target range.
Finally, in S2010, which follows S2030, the columns of heading items that are not recognition targets are excluded from binarization based on the flags set in S2030, their coordinates are output to the binarization non-target range storage unit 212, and the process of reducing the binarized area ends.
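The flow of Example 1 (dictionary lookup, column-wise flagging, and exclusion) can be sketched as follows. The dictionary contents, the column indices, and the function names are hypothetical and chosen only to illustrate the flow; the flag values follow the description, where a non-recognition flag of 1 marks a column as a binarization non-target range.

```python
def flag_columns(recognized_headings, dictionary):
    """Map each column index to 1 (non-recognition flag set) or 0.

    recognized_headings maps a column index to the heading text read from
    the top row; a heading missing from the dictionary is not a target.
    """
    return {
        col: 0 if text in dictionary else 1
        for col, text in recognized_headings.items()
    }


def excluded_cells(flags, n_rows):
    """Expand flagged columns into (row, column) cells to skip."""
    return {
        (row, col)
        for col, flag in flags.items()
        if flag == 1
        for row in range(n_rows)
    }


# Hypothetical dictionary and headings, for illustration only.
heading_dictionary = {"Financial Institution Name", "Account Number", "Amount"}
headings = {0: "Item No.", 1: "Financial Institution Name", 2: "Summary"}
flags = flag_columns(headings, heading_dictionary)
skipped = excluded_cells(flags, n_rows=2)
```

Flagging per column rather than per cell keeps the bookkeeping small: one lookup per heading decides the fate of every cell beneath it.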

本例で二値化面積を削減する処理を行った後に、図２のＳ１１５０〜Ｓ１１６５を行うことによって、帳票の認識を行うことができる。 After performing the process of reducing the binarized area in this example, the forms can be recognized by performing S1150 to S1165 in FIG.

図４に示した帳票を例に本例の装置２０の効果を説明する。
もし本例に示したような二値化対象外範囲を二値化処理せずに、図４に示した帳票上の表の全てのセルを二値化処理をする場合、処理回数は、行数２２×列数３２＝７０４回である。それに対し、本例の装置３０を用いることによる二値化処理回数は、行数２２×列数３＝６６回を削減することができる。 The effect of the apparatus 20 of this example will be described using the form shown in FIG. 4 as an example.
If all cells in the table on the form shown in FIG. 4 were binarized, without skipping binarization non-target ranges as in this example, the number of binarization operations would be 22 rows × 32 columns = 704. In contrast, using the apparatus 20 of this example reduces the number of binarization operations by 22 rows × 3 columns = 66.
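The cell-count arithmetic above can be checked directly. The row, column, and excluded-column counts come from the description; the remaining count and the percentage are derived here and are not figures stated in the source.

```python
rows, cols = 22, 32        # table size in FIG. 4
excluded_cols = 3          # columns flagged as non-recognition targets

total_cells = rows * cols               # binarization operations without exclusion
saved_cells = rows * excluded_cols      # operations avoided by excluding 3 columns
remaining_cells = total_cells - saved_cells
saved_percent = 100 * saved_cells / total_cells

print(total_cells, saved_cells, remaining_cells, saved_percent)
# prints: 704 66 638 9.375
```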

（例２）
例２は、各項目のデータの記載位置に位置属性付けを行い、セルごとの空白部分と特定し対象外とすることで、部分二値化する範囲を削減する方法及びそれを用いた装置に関する。 (Example 2)
Example 2 relates to a method for reducing the range of partial binarization by assigning a position attribute to the description position of data of each item, specifying a blank part for each cell, and excluding it, and an apparatus using the same .

図１０乃至１３を参照して、本例を説明する。図１０は本発明の例２の装置３０のブロック図である。図１１は本発明の例２の装置に適用されるキーワード認識技術の処理方法を示すフローチャートである。図１２は本発明の例２の装置が処理対象とする帳票の例の一部を示す図である。図１３は図１２に示した帳票の例の一部の二値化対象外範囲を示す図である。 This example will be described with reference to FIGS. 10 to 13. FIG. 10 is a block diagram of the apparatus 30 of Example 2 of the present invention. FIG. 11 is a flowchart showing the processing method of the keyword recognition technique applied to the apparatus of Example 2 of the present invention. FIG. 12 is a diagram showing part of an example of a form to be processed by the apparatus of Example 2 of the present invention. FIG. 13 is a diagram showing the binarization non-target range for part of the example form shown in FIG. 12.

図１０に示されているように、本例に従う帳票認識装置３０は、表構造情報抽出部１２０、見出し・項目抽出部１４０、二値化対象外処理部１５０、及び記入有無判定部１５４を含んでいる。これらは互いに電気的に接続されている。 As shown in FIG. 10, the form recognition apparatus 30 according to this example includes a table structure information extraction unit 120, a heading / item extraction unit 140, a binarization target non-processing unit 150, and an entry presence / absence determination unit 154. It is out. These are electrically connected to each other.

また、装置３０は、原画像記憶部２００、表構造情報記憶部２０４、見出し・項目情報記憶部２０８、見出し辞書部２１０、二値化対象外範囲記憶部２１２、及び閾値記憶部２１４を含んでいる。 The apparatus 30 also includes an original image storage unit 200, a table structure information storage unit 204, a headline / item information storage unit 208, a headline dictionary unit 210, a binarized non-target range storage unit 212, and a threshold storage unit 214. Yes.

装置３０では、見出し・項目情報記憶部２０８及び見出し辞書部２１０は見出し・項目抽出部１４０に、表構造情報記憶部２０４は表構造情報抽出部１２０に、閾値記憶部２１４及び原画像記憶部２００は記入有無判定部１５４に接続されている。 In the apparatus 30, the heading/item information storage unit 208 and the heading dictionary unit 210 are connected to the heading/item extraction unit 140, the table structure information storage unit 204 is connected to the table structure information extraction unit 120, and the threshold storage unit 214 and the original image storage unit 200 are connected to the entry presence/absence determination unit 154.

図１１は、装置３０で実行される処理を表すフローチャートである。
本例の装置３０では、見出し項目を対象にデータ記載位置が左詰か右詰かの位置属性付けを行い、項目列ごとに空白部分が存在するセルにチェックを施し、特定した空白部分を部分二値化対象外範囲として、二値化処理面積を削減する。 FIG. 11 is a flowchart showing processing executed by the device 30.
In the apparatus 30 of this example, each heading item is assigned a position attribute indicating whether its data is left-justified or right-justified, the cells of each item column are checked for blank portions, and the identified blank portions are set as partial binarization non-target ranges, thereby reducing the binarization processing area.

図１１に示される処理では、Ｓ３０００で、例１のＳ２０００と同様に、表の一番上の行中の文字を残した二値画像を対象に文字認識を行い、抽出した文字情報を見出し辞書部２１０に記憶されている情報と突合せ、見出しとみなした文字情報を見出し・項目情報記憶部２０８に出力する。 In the process shown in FIG. 11, in S3000, as in S2000 of Example 1, character recognition is performed on the binary image that retains the characters in the top row of the table, the extracted character information is matched against the information stored in the heading dictionary unit 210, and the character information regarded as a heading is output to the heading/item information storage unit 208.

次にＳ１０２５では、表の一番上の行中の文字を残した二値画像中の文字が見出し辞書部２１０に無かった場合、部分二値化する範囲を削減する処理を終了する。また、表の一番上の行中の文字を残した二値画像中の文字が見出し辞書部２１０にある場合、Ｓ１０３５に進む。 Next, in S1025, if the characters in the binary image retaining the characters of the top row of the table are not found in the heading dictionary unit 210, the process of reducing the range to be partially binarized ends. If the characters are found in the heading dictionary unit 210, the process proceeds to S1035.

Ｓ１０３５では、見出し・項目情報記憶部２０８に記憶されている認識対象とした見出し情報を参照し、認識外フラグが付加されていない項目に位置属性フラグを付加する。フラグ例としては、０なら左詰、１なら右詰である。Ｓ１０３５の処理は、見出し・項目抽出部１４０で行われる。 In S1035, the heading information as the recognition target stored in the heading / item information storage unit 208 is referred to, and the position attribute flag is added to the item to which the unrecognized flag is not added. As an example of the flag, 0 is left justified and 1 is right justified. The processing of S1035 is performed by the headline / item extraction unit 140.

図１２は、位置属性フラグの付加の例を示している。「金融機関名」なる見出しに対しては、位置属性フラグ０が付加され、認識対象文字データである「ＡＡＡ銀行」、「ＢＢＢ銀行」などは、セル内で左詰で記入される。一方、「口座番号」なる見出しに対しては、位置属性フラグ１が付加され、口座番号は右詰で記入される。 FIG. 12 shows an example of adding a position attribute flag. A position attribute flag 0 is added to the heading “financial institution name”, and “AAA bank”, “BBB bank”, etc., which are recognition target character data, are left-justified in the cell. On the other hand, the position attribute flag 1 is added to the heading “account number”, and the account number is written right-justified.

Ｓ１０３５の次のＳ３００５では、空白部分のチェックが行われる。
Ｓ３００５の処理を、図１３を参照しながら説明する。
原画像記憶部２００中の、スキャナで読み込まれた帳票のカラー画像を参照し、記入有無判定部１５４が、前述のように設定されたフラグに基づいて、帳票上の表の列ごとにヒストグラムを用いた黒画素判定をドット単位で行い、空白部分を特定する。ここで、ヒストグラムとは、図１３に示されているように、結合セル（はしご枠セル）内の黒画素領域分布を示す。図１３中のヒストグラムで、横軸はセル内の左端からの距離、縦軸は黒画素数である。ヒストグラムは、二値化処理を行う個々の単位領域が黒画素であるか、そうでないかを、その領域の色のＨＳＶ空間におけるベクトルから判定した結果の分布であっても良い。例えば、明度が２０％より小さい単位領域を黒画素と判定しても良い。はしご枠セルでヒストグラムを作成すると、桁を区別する罫線で黒画素数の鋭いピークが表れる。このピークの情報を利用して、はしご枠セルの抽出を行う、または確認を行っても良い。また、このピークは、空白部分として認識される。 In step S3005 subsequent to step S1035, a blank portion is checked.
The process of S3005 will be described with reference to FIG.
With reference to the color image of the form read by the scanner and held in the original image storage unit 200, the entry presence/absence determination unit 154 performs, based on the flags set as described above, black pixel determination in dot units using a histogram for each column of the table on the form, and identifies the blank portions. Here, the histogram indicates the distribution of black pixel regions within a combined cell (ladder frame cell), as shown in FIG. 13. In the histogram in FIG. 13, the horizontal axis is the distance from the left edge of the cell and the vertical axis is the number of black pixels. The histogram may instead be the distribution obtained by judging, from the vector of each unit region's color in HSV space, whether each unit region subject to binarization is a black pixel. For example, a unit region whose value (brightness) is less than 20% may be judged to be a black pixel. When a histogram is created for a ladder frame cell, sharp peaks in the black pixel count appear at the ruled lines that separate the digits. This peak information may be used to extract or confirm ladder frame cells. These peaks are also recognized as blank portions.
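The black-pixel histogram described above can be sketched with the standard-library `colorsys` module. The 20% value threshold is the example figure from the description; treating each pixel as one unit region, the RGB tuple layout, and the function names are assumptions made for illustration.

```python
import colorsys


def is_black(rgb, max_value=0.20):
    """Judge a pixel black when its HSV value (brightness) is below max_value."""
    r, g, b = (channel / 255.0 for channel in rgb)
    _, _, value = colorsys.rgb_to_hsv(r, g, b)
    return value < max_value


def column_histogram(cell_pixels):
    """Count black pixels at each horizontal offset from the cell's left edge.

    cell_pixels is a list of rows, each row a list of (R, G, B) tuples.
    """
    width = len(cell_pixels[0])
    return [sum(is_black(row[x]) for row in cell_pixels) for x in range(width)]


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
cell = [
    [WHITE, BLACK, WHITE],
    [WHITE, BLACK, BLACK],
]
histogram = column_histogram(cell)
```

In a ladder frame cell, the dotted digit separators would show up in such a histogram as narrow peaks at regular offsets, which is the property the description proposes to use for extracting or confirming ladder frames.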

The identified blank portions are then output to the binarization non-target range storage unit 212.
The heading “financial institution name” is given the column position attribute flag “0”, which indicates that the check for the presence of black pixels, performed to identify the blank portion, starts from the right of the cell. The heading “account number” is given the column position attribute flag “1”, which indicates that the same check starts from the left of the cell.
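One possible reading of the direction-dependent check is sketched below. It assumes a per-column black-pixel histogram in which ruled-line peaks have already been discounted; the function name and the half-open range convention are assumptions made for illustration.

```python
def blank_range(histogram, position_flag):
    """Return the blank column range (start, end), half-open, or None if there is none.

    position_flag 0: entries are left-justified, so scan from the right edge.
    position_flag 1: entries are right-justified, so scan from the left edge.
    """
    width = len(histogram)
    if position_flag == 0:
        x = width
        while x > 0 and histogram[x - 1] == 0:
            x -= 1
        return (x, width) if x < width else None
    if position_flag == 1:
        x = 0
        while x < width and histogram[x] == 0:
            x += 1
        return (0, x) if x > 0 else None
    raise ValueError("position_flag must be 0 or 1")
```

Scanning from the side opposite to the justification means the scan stops at the first black column, so only the blank run is visited.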

In S2010, which follows S3005, based on the flag set in S1030, the coordinates of the blank portions in the columns of heading items that are not recognition targets are treated as binarization non-targets and output to the binarization non-target range storage unit 212.

The effect of the apparatus 30 of this example is described using the form shown in FIG. 4.
With the apparatus 30 of this example, the binarization processing area can be reduced by a total of 19%: 4% for the column headed “No.”, 8% for the “Transfer” column, and 7% for the “Summary” column.

(Example 3)
This example relates to an apparatus that reduces the range and the number of partial binarization operations by checking whether cell information has been entered in the table.

This example is described with reference to FIGS. 14 to 16. FIG. 14 is a block diagram of the apparatus 40 of Example 3 of the present invention. FIG. 15 is a flowchart showing the processing method of the keyword recognition technique applied to the apparatus of Example 3 of the present invention. FIG. 16 shows part of an example of a form processed by the apparatus of Example 3 of the present invention.

As shown in FIG. 14, the form recognition apparatus 40 according to this example includes a table structure information extraction unit 120, a binarization non-target processing unit 150, and an entry presence/absence determination unit 154. These are electrically connected to one another.
The apparatus 40 also includes an original image storage unit 200, a table structure information storage unit 204, a binarization non-target range storage unit 212, a threshold storage unit 214, and a histogram storage unit 216. The table structure information storage unit 204 is connected to the table structure information extraction unit 120, the binarization non-target range storage unit 212 is connected to the binarization non-target processing unit 150, and the threshold storage unit 214, the histogram storage unit 216, and the original image storage unit 200 are connected to the entry presence/absence determination unit 154.

FIG. 15 is a flowchart showing the processing executed by the apparatus 40.
The apparatus 40 of this example checks, row by row, whether cell information has been entered in the table, excludes from binarization any row in which at least a fixed number of cells are unfilled, and thereby reduces the binarization processing area.

本例の処理では、表構造情報記憶部２０４から、処理をしたい帳票の電子化された画像に関する表構造情報を参照し、帳票に含まれる表の一番下の行から上の行に向かって、記入の有無をチェックする。 In the processing of this example, the table structure information related to the digitized image of the form to be processed is referred to from the table structure information storage unit 204, and from the bottom line of the table included in the form to the top line. , Check for entry.

First, in S4000, the color image of the form is referenced from the original image storage unit 200, and the entry presence/absence determination unit 154 checks the number of black pixels to determine, for each cell, whether an entry is present. When an entry is judged to be present, an entry flag is added. For example, the entry flag “1” may indicate that an entry is present, and the entry flag “0” may indicate an unclear state in which the cell cannot be asserted to contain an entry.

FIG. 16 shows an example of table cells to which entry flags have been added. Shown below each cell in FIG. 16 is a histogram of the number of black pixels in that cell.
At this point, a cell column in whose histogram black-pixel counts at or above a fixed threshold appear periodically is output to the histogram storage unit 216 as the histogram of a ladder frame cell. The threshold is stored in advance in the threshold storage unit 214.
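The periodicity test for ladder frame cells might be sketched as follows. The description only states that counts at or above a threshold appear periodically; the spacing tolerance, the minimum number of peaks, and the function names below are assumptions introduced for illustration, and the threshold is assumed to be positive.

```python
def peak_positions(histogram, threshold):
    """Return the centre index of each run of columns at or above the threshold."""
    peaks, start = [], None
    # A trailing 0 acts as a sentinel so that a run touching the right edge is closed.
    for x, count in enumerate(list(histogram) + [0]):
        if count >= threshold and start is None:
            start = x
        elif count < threshold and start is not None:
            peaks.append((start + x - 1) // 2)
            start = None
    return peaks

def looks_like_ladder(histogram, threshold, tolerance=1, min_peaks=3):
    """Judge the histogram periodic when the peak spacing is nearly constant."""
    peaks = peak_positions(histogram, threshold)
    if len(peaks) < min_peaks:
        return False
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return max(gaps) - min(gaps) <= tolerance
```

A histogram that passes this test would then be stored as that of a ladder frame cell.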

Ladder frame cells, or candidates for them, may also be extracted by referring to the table structure information and comparing the length of the cell containing the heading characters with the length of the location corresponding to that cell (for example, the column corresponding to the cell containing the heading). In general, the ruled lines that separate the digits of a ladder frame cell are often dotted or broken lines. Ladder frame cells, or candidates for them, may therefore also be extracted by extracting ruled lines on the form that are not solid, such as dotted or broken lines.

In S4005, which follows S4000, it is determined whether all cells belonging to the row are unfilled or unclear. If they are, the unfilled flag “1”, which indicates that all cells belonging to the row are unfilled or unclear, is set, the processing proceeds to S4010, the row to be processed is moved up by one, and the processing returns to S4000.

Ｓ４００５で、その行に記入フラグ“1”と記入フラグ“0”が立っているセルが一定数以上見出されると、その行には文字が記入されたセルが存在することになる。この時は、Ｓ４０１５に進み、二値化対象外処理部１５０によって、未記入フラグ“１”が立てられた行を、二値化対象外範囲記憶部２１２に、二値化対象外の座標として出力し、二値化をする範囲と回数を削減する処理を終了する。 In S4005, when a certain number or more of cells having the entry flag “1” and the entry flag “0” are found in the row, there are cells in which characters are entered in the row. At this time, the process proceeds to S4015, and the line where the unfilled flag “1” is set by the binarization non-target processing unit 150 is set in the binarization non-target range storage unit 212 as coordinates that are not binarized. The process of outputting and reducing the range and number of times of binarization is completed.
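The loop formed by S4000 to S4015 can be sketched as a bottom-up scan. The sketch assumes that a black-pixel count is already available for every cell and that a cell receives entry flag 1 when its count reaches a threshold; the parameter names and the default of one filled cell are illustrative assumptions.

```python
def rows_to_skip(black_counts_by_row, pixel_threshold, min_filled_cells=1):
    """Scan rows from the bottom up and return the rows excluded from binarization.

    black_counts_by_row[r][c] is the black-pixel count of cell c in row r,
    with row 0 at the top of the table (assumed layout).
    """
    skipped = []
    for r in range(len(black_counts_by_row) - 1, -1, -1):
        # S4000: entry flag 1 when the count suggests an entry, otherwise 0.
        entry_flags = [1 if n >= pixel_threshold else 0 for n in black_counts_by_row[r]]
        # S4005: stop at the first row that contains enough filled cells.
        if sum(entry_flags) >= min_filled_cells:
            break
        # S4010: mark the row as unfilled and move up one row.
        skipped.append(r)
    # S4015: these rows would be output as binarization non-target coordinates.
    return skipped
```

Because the scan stops at the first filled row, blank rows that lie between filled rows are not excluded by this step.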

本例で二値化をする範囲と回数を削減する処理を行った後に、図２のＳ１１５０〜Ｓ１１６５を行うことによって、帳票の認識を行うことができる。 In this example, after performing the process of reducing the binarization range and the number of times, the forms can be recognized by performing S1150 to S1165 in FIG.

The effect of the apparatus 40 of this example is described using the form shown in FIG. 4.
With the apparatus 40 of this example, binarization of the cells in the 13th to 20th rows of the table on the form in FIG. 4 can be omitted. That is, of the 22 rows in total, counting the row containing the headings at the top of the table and the row containing the subtotal column at the bottom, 8 rows are not binarized. The number of binarization operations can therefore be reduced by about 34%.

(Example 4)
This example relates to an apparatus that, when cells such as those for an amount or a fee form a ladder frame, groups the cells within the ladder frame into a single ladder frame cell, obtains from the bottom row of the table the maximum number of digits entered in the ladder frame cell, and reduces the range over which partial binarization is performed in ladder frame cells. A cell produced by this grouping may also be referred to as a plurality of cells.

This example is described with reference to FIGS. 17 to 21. FIG. 17 is a block diagram of the apparatus 50, and FIG. 18 is a flowchart showing the processing method of the keyword recognition technique applied to the apparatus 50. FIG. 19 illustrates the extraction of ladder frames in that processing method. FIG. 20 shows the binarization non-target ranges in part of an example of a form processed by the apparatus 50. FIG. 21 illustrates the grouping of ladder frames in that processing method.

図１７に示されているように、本例に従う帳票認識装置５０は、表構造情報抽出部１２０、見出し・項目抽出部１４０、簡易文字認識処理部１４２、二値化対象外処理部１５０、はしご枠グルーピング処理部１５２、記入有無判定部１５４を含んでいる。 As shown in FIG. 17, the form recognition device 50 according to this example includes a table structure information extraction unit 120, a heading / item extraction unit 140, a simple character recognition processing unit 142, a non-binarization target processing unit 150, a ladder. A frame grouping processing unit 152 and an entry presence / absence determination unit 154 are included.

また、装置５０は、原画像記憶部２００、表構造情報記憶部２０４、見出し辞書部２１０、二値化対象外範囲記憶部２１２、閾値記憶部２１４、ヒストグラム記憶部２１６を含んでいる。表構造情報記憶部２０４は表構造情報抽出部１２０に、見出し辞書部２１０は見出し・項目抽出部１４０に、二値化対象外範囲記憶部２１２は二値化対象外処理部１５０に接続されている。原画像記憶部２００、閾値記憶部２１４、及びヒストグラム記憶部２１６は記入有無判定部１５４に接続されている。 The device 50 also includes an original image storage unit 200, a table structure information storage unit 204, a heading dictionary unit 210, a binarized non-target range storage unit 212, a threshold storage unit 214, and a histogram storage unit 216. The table structure information storage unit 204 is connected to the table structure information extraction unit 120, the heading dictionary unit 210 is connected to the heading / item extraction unit 140, and the binarization non-target range storage unit 212 is connected to the binarization target non-processing unit 150. Yes. The original image storage unit 200, the threshold storage unit 214, and the histogram storage unit 216 are connected to the entry presence / absence determination unit 154.

FIG. 18 is a flowchart showing the processing executed by the apparatus 50.
In this processing, the subtotal field located at the bottom of the table for a given heading is first extracted based on the table structure information, and the number of digits entered in the subtotal field, which is a ladder frame cell, is taken as the maximum number of digits. The number of digits entered in each row of that heading's column does not exceed the number of digits entered in the subtotal field, so in each row the digit positions beyond the maximum number of digits are blank. By not binarizing these blank portions, both the number of binarization operations and the binarization processing area can be reduced.

In S1060, the ladder frames in the table are extracted and output to the table structure information storage unit 204. In the next step, S1065, the table structure information in the table structure information storage unit 204 is referenced, the entry presence/absence determination unit 154 checks the cells in the table (details) using histograms, ladder frame information is extracted by, for example, the method described in Example 3 above, and the ladder flag “1” is assigned to each cell regarded as a ladder frame. FIG. 19 shows an example in which ladder flags have been assigned to cells. As FIG. 19 shows, a ladder frame is characterized by histogram peaks that recur at regular spatial intervals, caused for example by the broken lines that separate the digits of a number. The threshold used to identify ladder frames is stored in advance in the threshold storage unit 214.

Ｓ１０６５ではしご枠があると判定された場合には、Ｓ１０７０に進み、簡易文字認識処理部１４２によって、表構造情報記憶部２０４の情報から帳票中の一番下行から２行分を対象に文字認識を行う。次のＳ１０７０では、抽出した文字情報を見出し辞書部２１０と比較し、小計欄とみなした見出し情報を（図示されていない）見出し・項目情報記憶部２０８に出力する。ほとんどの場合、小計欄は帳票上の表の最下部に位置し、合計欄と共に記載されることが多いため、表の下から２行分の文字認識を行うことが望ましい。 If it is determined in step S1065 that there is a ladder frame, the process advances to step S1070, and the simple character recognition processing unit 142 performs character recognition for the two lines from the bottom line in the form from the information in the table structure information storage unit 204. I do. In next step S1070, the extracted character information is compared with the heading dictionary unit 210, and the heading information regarded as a subtotal column is output to the heading / item information storage unit 208 (not shown). In most cases, since the subtotal column is located at the bottom of the table on the form and is often described together with the total column, it is desirable to perform character recognition for two lines from the bottom of the table.

If it is determined in S1065 that there is no ladder frame, the processing for reducing the range of partial binarization in ladder frame cells ends.
In S5000, it is determined whether the character information regarded as a subtotal field is present among the headings. If so, the processing proceeds to S1080, where the cells in the column of the subtotal heading are checked using histograms. When the subtotal entry field is a ladder frame, as with the amount field and the fee field in FIG. 20, the color image is referenced from the original image storage unit 200, the entry presence/absence determination unit 154 checks the number of black pixels in the ladder frame, and the number of digits is calculated.

In S1155, which follows S1080, among the ladder frames of the amount and fee fields in each detail row, the binarization non-target processing unit 150 treats the ladder cells outside the calculated number of digits as binarization non-target regions, outputs them to the binarization non-target range storage unit 212, and deletes the ladder flags of those cells from the table structure information.
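The digit-count step of S1080 and the exclusion step of S1155 might look like the following sketch. It assumes that numbers are written right-justified in the ladder frame and that each digit cell of the subtotal already carries an entry flag; both function names are hypothetical.

```python
def subtotal_digit_count(subtotal_flags):
    """Count digits from the leftmost filled digit cell to the right edge.

    subtotal_flags[i] is 1 when digit cell i (left to right) contains an entry.
    """
    for i, flag in enumerate(subtotal_flags):
        if flag:
            return len(subtotal_flags) - i
    return 0

def non_target_digit_cells(num_digit_cells, digit_count):
    """Indices of the digit cells to the left of the rightmost digit_count cells."""
    return list(range(num_digit_cells - digit_count))
```

For a seven-digit ladder frame whose subtotal occupies four digits, the three leftmost digit cells of every detail row would be excluded.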

次のＳ５００５では、はしご枠グルーピング処理部１５２により、図２１に示されているように、帳票全体のはしごフラグが付加されたセルのうち、隣接するセルをグルーピングし、次のＳ５０１０でグルーピング結果を二値化対象外範囲記憶部２１２に出力する。 In the next S5005, as shown in FIG. 21, the ladder frame grouping processing unit 152 groups adjacent cells among the cells to which the ladder flag of the entire form is added, and the grouping result is displayed in the next S5010. The data is output to the binarized non-target range storage unit 212.
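As a rough illustration of the grouping in S5005, the sketch below merges horizontally adjacent cells that carry the ladder flag. Grouping within a single row only, and the (row, first column, last column) result format, are assumptions; FIG. 21 may define adjacency differently.

```python
def group_ladder_cells(ladder_flags):
    """Group horizontally adjacent ladder-flagged cells in each row.

    ladder_flags[r][c] is 1 when the cell at row r, column c has the ladder flag.
    Returns a list of (row, first_col, last_col) tuples, one per group.
    """
    groups = []
    for r, row in enumerate(ladder_flags):
        start = None
        # A trailing 0 closes a group that reaches the end of the row.
        for c, flag in enumerate(list(row) + [0]):
            if flag and start is None:
                start = c
            elif not flag and start is not None:
                groups.append((r, start, c - 1))
                start = None
    return groups
```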

After the processing of this example for reducing the range and number of binarization operations, the form can be recognized by performing S1150 to S1165 of FIG. 2.
The effect of the apparatus 50 of this example is described using the form shown in FIG. 20.

With the apparatus 50 of this example, the binarization processing area can be reduced by four digits' worth in the “Amount” column and one digit's worth in the fee column. This area corresponds to about 10% of the entire table.
As described above, when the subtotal field and the amount field are ladder frames, identifying the number of digits entered in the subtotal also identifies the maximum number of digits of the amount and fee under each heading, so the binarization target range can be narrowed. The binarization processing in the form recognition apparatus can thus be made faster.

(Example 5)
This example relates to an apparatus 60 that, when cells such as those for an amount or a fee are not ladder frames, refers to the number of characters in the string entered in the subtotal (the length of the entered string), identifies the blank portion in each cell, and reduces the range over which partial binarization is performed.

This example is described with reference to FIGS. 22 to 24.
As shown in FIG. 22, the form recognition apparatus 60 according to this example includes a table structure information extraction unit 120, a heading/item extraction unit 140, a simple character recognition processing unit 142, a binarization non-target processing unit 150, and an entry presence/absence determination unit 154.

また、装置６０は、原画像記憶部２００、表構造情報記憶部２０４、見出し辞書部２１０、閾値記憶部２１４を含んでいる。表構造情報記憶部２０４は表構造情報抽出部１２０に、見出し辞書部２１０は見出し・項目抽出部１４０に接続されている。原画像記憶部２００及び閾値記憶部２１４は記入有無判定部１５４に接続されている。 The device 60 includes an original image storage unit 200, a table structure information storage unit 204, a heading dictionary unit 210, and a threshold storage unit 214. The table structure information storage unit 204 is connected to the table structure information extraction unit 120, and the heading dictionary unit 210 is connected to the heading / item extraction unit 140. The original image storage unit 200 and the threshold value storage unit 214 are connected to the entry presence / absence determination unit 154.

FIG. 23 is a flowchart showing the processing executed by the apparatus 60 of this example.
In the apparatus 60 of this example, when the subtotal field, which was a ladder frame cell in Example 4, is a single cell rather than a ladder frame, the number of digits of the number entered in the subtotal field is extracted as the maximum string length. The number entered in each row of that heading's column is no longer than the number entered in the subtotal field, so in each row the positions beyond the maximum string length are blank. By not binarizing these blank portions, both the number of binarization operations and the binarization processing area can be reduced.

Ｓ１０７０及びＳ５０００の処理を行った後、Ｓ６０００では、図２４に示されているように、原画像記憶部２００からカラー画像を参照し、記入有無判定部１５４によりはしご枠の黒画素数をヒストグラムを用いてチェックを行い、小計の文字列長を算出する。 After performing the processing of S1070 and S5000, in S6000, as shown in FIG. 24, a color image is referred to from the original image storage unit 200, and the number of black pixels in the ladder frame is displayed as a histogram by the entry presence / absence determination unit 154. Use this to check and calculate the subtotal string length.

Ｓ６０００の処理が終わると、Ｓ１１５５に進み処理をする。即ち、二値化対象外処理部１５０によって、Ｓ６０００で算出された小計の文字列長を超えたセル中の領域を二値化対象外範囲とする。 When the process of S6000 ends, the process proceeds to S1155 for processing. In other words, the binarized non-processing unit 150 sets the area in the cell that exceeds the character string length of the subtotal calculated in S6000 as the binarized non-target range.
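A minimal sketch of S6000 and the subsequent S1155 is given below. It assumes a right-justified subtotal, a per-column black-pixel histogram of the subtotal cell, and detail cells of the same width as the subtotal cell; the helper names and the page-coordinate convention are illustrative only.

```python
def entry_width(histogram):
    """Width, in pixel columns, from the leftmost black column to the right edge."""
    for x, count in enumerate(histogram):
        if count > 0:
            return len(histogram) - x
    return 0

def non_target_range(cell_left, cell_width, subtotal_width):
    """Half-open horizontal range of a detail cell that lies beyond the subtotal length."""
    return (cell_left, cell_left + cell_width - subtotal_width)
```

With a subtotal that is five columns wide in an eight-column cell starting at x = 100, the range from 100 up to 103 of each detail cell would be skipped.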

The effect of the apparatus 60 of this example is described using the form shown in FIG. 24.
With the apparatus 60 of this example, the binarization processing area can be reduced by four digits' worth in the “Amount” column and one digit's worth in the fee column. This area corresponds to 10% of the entire table.
The processing described in Example 3 may also be performed first, and the above processing may then be skipped for cells to which the unfilled flag has been added as a result.

(Example 6)
This example relates to an apparatus 70 that, when the heading information for the recipient contains both a kana part and a kanji item, treats the kanji item in the data cells of that item as a binarization non-target range, thereby reducing the range over which partial binarization is performed.

This example is described with reference to FIGS. 25 to 28.
As shown in FIG. 25, the form recognition apparatus 70 according to this example includes a table structure information extraction unit 120, a heading/item extraction unit 140, a simple character recognition processing unit 142, a binarization non-target processing unit 150, and an entry presence/absence determination unit 154.
The apparatus 70 also includes an original image storage unit 200, a table structure information storage unit 204, a heading dictionary unit 210, a binarization non-target range storage unit 212, and a threshold storage unit 214. The table structure information storage unit 204 is connected to the table structure information extraction unit 120, and the heading dictionary unit 210 is connected to the heading/item extraction unit 140. The original image storage unit 200 and the threshold storage unit 214 are connected to the entry presence/absence determination unit 154. The binarization non-target range storage unit 212 is connected to the binarization non-target processing unit 150.

FIG. 26 is a flowchart showing the processing executed by the apparatus 70 of this example.
In S7000, heading information is referenced from the heading/item information storage unit 208 (not shown in FIG. 26), and the coordinate positions of the recipient and account number headings in the table structure are identified.

In the next step, S7005, it is determined whether a recipient heading is present; if so, the processing proceeds to S7010. If there is no recipient heading, the processing for reducing the range of partial binarization ends.
In S7010, it is determined whether an account number heading is present; if so, the processing proceeds to S7015. If there is no account number heading, the processing for reducing the range of partial binarization ends.

Ｓ７０１５では、表構造情報記憶部２０４から表構造情報を参照し、口座番号項目の一つ下のセルを参照し、１項目（１明細）分の高さを算出する。次のＳ７０２０では、Ｓ７０１５で算出した１明細分の高さを参照し、見出し受取人項目から１つ下のセルを特定し、図２７に示されているように、記入有無判定部１５４によりセル中の上下に分割する罫線があるかヒストグラムを用いてチェックを実施する。罫線チェックに使用する閾値は、予め閾値記憶部２１４に記憶されている。その後、Ｓ７０２５に進む。 In S7015, the table structure information is referenced from the table structure information storage unit 204, the cell immediately below the account number item is referenced, and the height for one item (one item) is calculated. In next step S7020, the height of one item calculated in step S7015 is referred to specify a cell below the headline recipient item, and as shown in FIG. A histogram is used to check whether there are ruled lines that divide in the top and bottom. The threshold used for the ruled line check is stored in advance in the threshold storage unit 214. Thereafter, the process proceeds to S7025.

In S7025, it is determined whether the recipient item cell contains a horizontal ruled line. If it does, the processing proceeds to S7030. In S7030, the color image is referenced from the original image storage unit 200, and the simple character recognition processing unit 142 performs a category check on the upper and lower parts of the recipient cell to identify the position of the kana item. If there is no ruled line, the determination processing of S7025 is performed over the entire recipient cell.

次のＳ１１２５では、簡易文字認識処理部１４２によって、受取人項目にカナ項目があるかのチェックを行う。もしＳ１１２５でカナ項目があると判定された場合には、Ｓ７０３５に進む。 In next step S1125, the simple character recognition processing unit 142 checks whether there is a kana item in the recipient item. If it is determined in S1125 that there is a kana item, the process proceeds to S7035.

If it is determined in S1125 that there is no kana item, the processing proceeds to S1130.
In S7035, for each detail row of the recipient item column, kana item flags are assigned to the upper and lower parts of the cell, as shown in FIG. 28.

In S1155, which follows S7035, the binarization non-target processing unit 150 treats the ranges in the recipient item column whose kana item flag indicates that they are not kana items as binarization non-target ranges and outputs them to the binarization non-target range storage unit 212.
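The selection of non-kana ranges can be illustrated as follows. The description performs the category check on the image; this sketch instead assumes, purely for illustration, that the simple character recognition has already produced a text string for each sub-row, and it classifies that string by the Unicode katakana ranges. The function names and the tuple layout are hypothetical.

```python
def is_kana(text):
    """True when every non-space character is full-width or half-width katakana."""
    def kana_char(ch):
        code = ord(ch)
        return 0x30A0 <= code <= 0x30FF or 0xFF65 <= code <= 0xFF9F
    stripped = [ch for ch in text if not ch.isspace()]
    return bool(stripped) and all(kana_char(ch) for ch in stripped)

def non_kana_ranges(sub_rows):
    """Return the vertical ranges (top, bottom) of sub-rows that are not kana items.

    sub_rows is a list of (recognized_text, top, bottom) tuples (assumed layout).
    """
    return [(top, bottom) for text, top, bottom in sub_rows if not is_kana(text)]
```

For a recipient cell whose upper part holds the kana reading and whose lower part holds the kanji name, only the lower part would be returned as a binarization non-target range.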

The effect of the apparatus 70 of this example is described using the form shown in FIG. 28.
With the apparatus 70 of this example, of the “(Furigana)” and “Recipient” columns in FIG. 4, binarization of the “Recipient” column, which normally contains kanji, is omitted. Since these two columns together occupy 13% of the total area, the binarization processing area can be reduced by about half of that, or 7%.

Several examples of the present invention have been described above; these examples may of course be combined.
As for the number of binarization operations, combining the techniques of Example 1 and Example 3, for instance, gives a total reduction of 308 operations: 66 from Example 1 and a further 242 from Example 3. As for the binarization area, combining the techniques of Examples 2, 4, and 6, for instance, reduces the area by 19% + 10% + 7% = 36%.

Claims

A form recognition device for recognizing characters described in a form including a table composed of cells that are separated by a ruled line and include or do not include characters or character strings,
In the table, a table structure defining means for defining a table structure that defines the arrangement of the cells;
Headline extracting means for extracting a headline that is a predetermined character or character string from the image;
Heading position specifying means for specifying the position of a heading cell that is a cell including the heading in the table structure;
Of the cells of the table structure defined by the table structure defining means, a filled cell detection means for detecting a filled cell in which some character or character string is entered in the cell;
Binarization target cell selection means for selecting only the filled cells detected by the filled cell detection means as binarization target cells;
Binarization means for performing binarization processing of the binarization target cell in the table and generating a binary image;
Character recognition means for recognizing characters described in a form from the binary image;
A form recognition device comprising:

The form recognition device according to claim 1, wherein, when a cell located at a relative position corresponding to the heading, among the cells of the table structure defined by the table structure defining means, is not a filled cell, the binarization target cell selection means excludes from the binarization target cells a cell located at a predetermined relative position with respect to the cell that is not a filled cell.

A plurality of cells defining means for defining, in the table structure, a plurality of cells located at relative positions corresponding to the headings;
Including
The form recognition apparatus according to claim 1, wherein the binarization target cell selection unit recognizes the plurality of cells as the binarization target cells.

Filled multiple cell detection means for detecting a plurality of filled cells in which cells in which characters are written among the plurality of cells are continuous,
Including
The form recognition device according to claim 3, wherein the binarization target cell selection means selects as binarization target cells, from among the plurality of cells located at a relative position corresponding to the heading with respect to the position of the cell including the heading specified by the heading position specifying means, only the plurality of cells detected by the filled multiple cell detection means.

A merged cell defining means for defining in the table structure a merged cell including a plurality of cells located at relative positions corresponding to the headings;
Including
The binarization target cell selecting means selects only a part of the merged cell located at a relative position corresponding to the heading with respect to the position of the cell including the heading specified by the heading position specifying means, The form recognition apparatus according to claim 1, wherein the form recognition device is selected as the binarization target cell.

The form recognition apparatus according to claim 1, further comprising: filled cell detection means for detecting, from at least some of the cells of the table structure defined by the table structure defining means, a filled cell in which any character or character string is written; and maximum cell number defining means for defining a maximum number of the filled cells in the merged cell, wherein the binarization target cell selecting means further selects as binarization target cells, depending on the heading, the maximum number of cells from among the plurality of cells included in the merged cell located at the relative position corresponding to the heading with respect to the position of the cell including the heading.

The form recognition apparatus according to any one of claims 1 to 6, wherein the image of the form is a color image, and the filled cell detection means determines whether or not a cell is filled using a vector in the HSV color space composed of the three components hue, saturation, and value (brightness) of the color of the cell.
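
By way of non-limiting illustration, a filled-cell test based on HSV vectors might be sketched as follows. The claim states only that an HSV vector of the cell color is used. Comparing each pixel with the cell background color, the function names `hsv_distance` and `is_filled`, and both threshold values are assumptions of this sketch.

```python
import colorsys
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]  # 8-bit red, green, blue


def hsv_distance(a: RGB, b: RGB) -> float:
    """Euclidean distance between two colors treated as HSV vectors."""
    ha, sa, va = colorsys.rgb_to_hsv(*(c / 255.0 for c in a))
    hb, sb, vb = colorsys.rgb_to_hsv(*(c / 255.0 for c in b))
    dh = abs(ha - hb)
    dh = min(dh, 1.0 - dh)  # hue is circular, so take the shorter way round
    return (dh * dh + (sa - sb) ** 2 + (va - vb) ** 2) ** 0.5


def is_filled(
    pixels: Iterable[RGB],
    background: RGB,
    dist_threshold: float = 0.3,  # hypothetical value
    min_ratio: float = 0.02,      # hypothetical value
) -> bool:
    """Treat a cell as filled if enough pixels differ from its background color."""
    pixels = list(pixels)
    if not pixels:
        return False
    ink = sum(1 for p in pixels if hsv_distance(p, background) > dist_threshold)
    return ink / len(pixels) >= min_ratio
```

Working on the color image in this way lets the filled state of a cell be judged before any binarization is performed, including for cells with a tinted background.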

A form recognition method for recognizing characters written on a form including a table composed of cells that are separated by ruled lines and that may or may not contain characters or character strings, the method comprising:
defining, for the table, a table structure that determines the arrangement of the cells;
extracting a heading, which is a predetermined character or character string, from an image of the form;
identifying, in the table structure, the position of a heading cell, which is the cell containing the heading;
detecting, among the cells of the defined table structure, a filled cell in which any character or character string is written;
selecting only the detected filled cells as binarization target cells;
performing binarization processing on the binarization target cells in the table to generate a binary image; and
recognizing the characters written on the form from the binary image.
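
By way of non-limiting illustration, the binarization step restricted to the selected cells might be sketched as follows. The function name `binarize_targets`, the bounding-box representation of cells, the grayscale input, and the fixed threshold are assumptions of this sketch. The specification does not prescribe a particular binarization algorithm here.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right are exclusive


def binarize_targets(
    gray: List[List[int]],
    boxes: Dict[str, Box],
    targets: Set[str],
    threshold: int = 128,  # hypothetical fixed threshold
) -> List[List[int]]:
    """Binarize only the pixels inside the binarization target cells.

    Returns an image of the same size in which 1 marks a dark (ink) pixel.
    Pixels outside the target cells are never examined and remain 0.
    """
    height, width = len(gray), len(gray[0])
    binary = [[0] * width for _ in range(height)]
    for name in targets:
        top, left, bottom, right = boxes[name]
        for y in range(top, bottom):
            for x in range(left, right):
                binary[y][x] = 1 if gray[y][x] < threshold else 0
    return binary
```

Because the loops visit only the pixels of the selected cells, the work done grows with the area of the filled cells rather than with the area of the whole form.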

A program for causing a computer, usable as a form recognition apparatus for recognizing characters written on a form including a table composed of cells that are separated by ruled lines and that may or may not contain characters or character strings, to realize:
a table structure defining function for defining, for the table, a table structure that determines the arrangement of the cells;
a heading extraction function for extracting a heading, which is a predetermined character or character string, from an image of the form;
a heading position specifying function for specifying, in the table structure, the position of a heading cell, which is the cell containing the heading;
a filled cell detection function for detecting, among the cells of the table structure defined by the table structure defining function, a filled cell in which any character or character string is written;
a binarization target cell selection function for selecting only the filled cells detected by the filled cell detection function as binarization target cells;
a binarization function for performing binarization processing on the binarization target cells in the table to generate a binary image; and
a character recognition function for recognizing the characters written on the form from the binary image.