JP2020087112A

JP2020087112A - Document processing apparatus and document processing method

Info

Publication number: JP2020087112A
Application number: JP2018222392A
Authority: JP
Inventors: 高野橋 健太 (Kenta Takanohashi); 新庄 広 (Hiroshi Shinjo); 大館 良介 (Ryosuke Odate); 寺下 直行 (Naoyuki Terashita)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2018-11-28
Filing date: 2018-11-28
Publication date: 2020-06-04

Abstract

To improve reading accuracy even when various kinds of documents coexist, while suppressing an increase in cost. SOLUTION: A position collation unit 104 receives document image data 113 and document definition data 110b, and extracts a character string image 114 serving as a candidate for a character string to be read from the document image data 113, on the basis of position information 111b included in the document definition data 110b. A character recognition unit 105 receives the character string image 114 and performs character recognition processing on it. An attribute collation unit 106 receives a character code 115 and the document definition data 110b, and determines a read result 116 on the basis of a result of collating the character code 115 against attribute information 112b included in the simple document definition data 110b. SELECTED DRAWING: Figure 1

Description

本発明は、帳票を読取り可能な帳票処理装置および帳票処理方法に関する。 The present invention relates to a form processing device and a form processing method capable of reading a form.

従来、帳票上の文字列を認識するとき、認識すべき文字列の位置、属性、文字種、文字列の周囲の枠線のサイズや種類、枠線中に書かれているプレ印刷文字などを帳票定義データとして予め登録し、その帳票定義データに基づいて文字領域を決定して認識を行っていた。 Conventionally, when recognizing a character string on a form, the position, attribute, and character type of the character string to be recognized, the size and type of the frame lines around the character string, the pre-printed characters written inside the frame, and the like are registered in advance as form definition data, and the character area is determined and recognized based on that form definition data.

また、多数の種類の帳票が混在した環境で処理を行う場合には、帳票定義データを複数登録し、処理対象の帳票種を特定した上で、適切な帳票定義データを選択する必要があった。一方、帳票定義データを利用せずに、帳票中の項目名や項目値の候補を自動的に抽出し、それらの位置関係などから尤もらしい読取り項目を決定する技術もあった。 Also, when processing is performed in an environment in which many types of forms are mixed, it has been necessary to register multiple sets of form definition data, identify the type of the form to be processed, and then select the appropriate form definition data. On the other hand, there has also been a technique that, without using form definition data, automatically extracts candidate item names and item values in a form and determines the likely items to read from their positional relationships and the like.

帳票定義データを作成する技術に関しては、例えば、特許文献１に記載の技術がある。特許文献１には、「指定された読取り領域周辺あるいは内部のプレ印刷文字、記入文字と定義画像データ入力時に自動抽出した枠、罫線等のレイアウト情報を基にして定義データの自動作成を行う」という記載がある。 As a technique for creating form definition data, there is, for example, the technique described in Patent Document 1. Patent Document 1 states that "definition data is automatically created based on pre-printed characters and entered characters around or inside a designated reading area, and on layout information such as frames and ruled lines automatically extracted when the definition image data is input."

帳票種を特定する技術に関しては、例えば、特許文献２に記載の技術がある。特許文献２には、「本発明では、登録用カラー帳票画像の画素値の度数分布と、処理対象のカラー帳票画像の画素値の度数分布を作成する。各色成分毎に度数分布の相関係数を算出し、相関係数からカラー画像間の類似度を算出する。最も高い類似度が所定値以上のとき、類似度が最高値をとる登録カラー画像が、処理対象のカラー帳票画像と同一種であると判定する」という記載がある。 As a technique for identifying the form type, there is, for example, the technique described in Patent Document 2. Patent Document 2 states: "In the present invention, a frequency distribution of the pixel values of a registration color form image and a frequency distribution of the pixel values of a color form image to be processed are created. A correlation coefficient of the frequency distributions is calculated for each color component, and the similarity between the color images is calculated from the correlation coefficients. When the highest similarity is equal to or greater than a predetermined value, the registered color image with the highest similarity is determined to be of the same type as the color form image to be processed."

帳票定義データを用いることなく帳票上の文字列を認識する技術に関しては、例えば、特許文献３に記載の技術がある。特許文献３には、「帳票画像から文字列領域を検出する文字列検出部と、前記文字列領域の個々の文字を認識する文字列認識部と、帳票画像内の文字列に対し、当該文字列が項目名である確率を表す項目名尤度を計算する項目名尤度計算部と、帳票画像内の文字列に対し、当該文字列が表記辞書に登録された単語や文字列の文法表記ルールに一致する確率を表す項目値尤度を計算する項目値尤度計算部と、帳票画像内の文字列ペアに対し、当該文字列ペアの文字列の枠または文字列矩形に基づいて、当該文字列ペアの配置関係が項目名−項目値関係として妥当であるかを表す配置尤度を計算する配置尤度計算部と、前記項目名尤度、項目値尤度、配置尤度を基に、当該文字列ペアの項目名−項目値としての尤もらしさを表す評価値を計算する項目名−項目値関係評価値計算部と、前記項目名−項目値関係評価値計算部の出力する前記評価値により、帳票画像内での項目名−項目値関係の対応付けを決定する項目名−項目値関係決定部を有することを特徴とする」という記載がある。 As a technique for recognizing a character string on a form without using form definition data, there is, for example, the technique described in Patent Document 3. Patent Document 3 states that the invention "is characterized by having: a character string detection unit that detects character string regions from a form image; a character string recognition unit that recognizes the individual characters in the character string regions; an item name likelihood calculation unit that calculates, for a character string in the form image, an item name likelihood representing the probability that the character string is an item name; an item value likelihood calculation unit that calculates, for a character string in the form image, an item value likelihood representing the probability that the character string matches a word registered in a notation dictionary or a grammatical notation rule for character strings; a placement likelihood calculation unit that calculates, for a character string pair in the form image, a placement likelihood representing whether the placement relationship of the pair is valid as an item name-item value relationship, based on the frames or character string rectangles of the pair; an item name-item value relationship evaluation value calculation unit that calculates, based on the item name likelihood, the item value likelihood, and the placement likelihood, an evaluation value representing the plausibility of the pair as an item name-item value pair; and an item name-item value relationship determination unit that determines the item name-item value correspondences in the form image according to the evaluation values output by the item name-item value relationship evaluation value calculation unit."

特開２００４−２５８７０６号公報 JP 2004-258706 A
特開２００２−２４８２９号公報 JP 2002-24829 A
特開２００２−２４８２９号公報 JP 2002-24829 A

しかしながら、特許文献１、２の技術では、処理の対象となる帳票種の多様な状況が想定されていない。 However, the technologies of Patent Documents 1 and 2 do not assume various situations of the form type to be processed.

すなわち、特許文献１の技術においては、帳票定義データ作成の一部自動化が実現されているものの、多種の帳票定義を作成するためには、高いコストが発生していた。特許文献２の技術においては、帳票種が多く、多種の帳票が混在する場合に精度良く文字列を見分けることが困難だった。 That is, although the technique of Patent Document 1 realizes partial automation of the form definition data creation, a high cost is required to create various form definitions. In the technique of Patent Document 2, there are many types of forms, and it is difficult to accurately identify a character string when various types of forms are mixed.

特許文献３では、項目名が存在せず、項目値のみが書かれた帳票については読取りが困難だった。また、項目名に対応する尤もらしい項目値の候補が複数存在する場合に帳票の読取り精度が低下することがあった。 In Patent Document 3, it is difficult to read the form in which the item name does not exist and only the item value is written. In addition, when there are a plurality of likely item value candidates corresponding to the item name, the reading accuracy of the form may decrease.

本発明は、上記事情に鑑みなされたものであり、その目的は、多種の帳票が混在する場合においても、コストの増大を抑制しつつ、帳票の読取り精度を向上させることが可能な帳票処理装置および帳票処理方法を提供することにある。 The present invention has been made in view of the above circumstances, and an object thereof is to provide a form processing apparatus and a form processing method capable of improving the reading accuracy of forms while suppressing an increase in cost, even when various types of forms are mixed.

上記目的を達成するため、第１の観点に係る帳票処理装置は、帳票画像データから抽出された文字列画像の位置情報と、読取り対象文字列に対して定義された位置情報との照合結果に基づいて、前記読取り対象文字列の候補となる文字列画像を決定する位置照合部と、前記読取り対象文字列の候補となる文字列画像に基づいて文字認識を行う文字認識部と、前記文字認識部による文字認識結果の属性と、前記読取り対象文字列に対して定義された属性情報との照合結果に基づいて、前記読取り対象文字列の候補となる文字列画像の読取結果を決定する属性照合部とを備える。 In order to achieve the above object, a form processing apparatus according to a first aspect includes: a position matching unit that determines a character string image serving as a candidate for a read target character string, based on a result of matching position information of character string images extracted from form image data against position information defined for the read target character string; a character recognition unit that performs character recognition based on the character string image serving as the candidate for the read target character string; and an attribute matching unit that determines a reading result of the character string image serving as the candidate for the read target character string, based on a result of matching an attribute of the character recognition result from the character recognition unit against attribute information defined for the read target character string.

本発明によれば、多種の帳票が混在する場合においても、コストの増大を抑制しつつ、帳票の読取り精度を向上させることができる。 According to the present invention, even when various types of forms are mixed, it is possible to improve the reading accuracy of the forms while suppressing an increase in cost.

図１は、第１実施形態に係る帳票処理装置の構成を示すブロック図である。 FIG. 1 is a block diagram showing the configuration of the form processing apparatus according to the first embodiment.
図２は、帳票定義データの作成者に提示される帳票画像データの画面表示例を示す図である。 FIG. 2 is a diagram showing a screen display example of the form image data presented to the creator of the form definition data.
図３は、第１実施形態に係る帳票処理装置で用いられる簡易帳票定義データのデータ構造の一例を示す図である。 FIG. 3 is a diagram showing an example of the data structure of simplified form definition data used in the form processing apparatus according to the first embodiment.
図４は、帳票画像データから抽出された特徴量と帳票定義データとの対応関係を示す図である。 FIG. 4 is a diagram showing a correspondence relationship between the feature amount extracted from the form image data and the form definition data.
図５は、第１実施形態に係る帳票処理装置の処理を示すフローチャートである。 FIG. 5 is a flowchart showing the processing of the form processing apparatus according to the first embodiment.
図６は、第２実施形態に係る帳票処理装置の構成を示すブロック図である。 FIG. 6 is a block diagram showing the configuration of the form processing apparatus according to the second embodiment.
図７は、第２実施形態に係る帳票処理装置の処理を示すフローチャートである。 FIG. 7 is a flowchart showing the processing of the form processing apparatus according to the second embodiment.
図８は、第３実施形態に係る帳票処理装置のハードウェア構成を示すブロック図である。 FIG. 8 is a block diagram showing the hardware configuration of the form processing apparatus according to the third embodiment.

実施形態について、図面を参照して説明する。なお、以下に説明する実施形態は特許請求の範囲に係る発明を限定するものではなく、また、実施形態の中で説明されている諸要素およびその組み合わせの全てが発明の解決手段に必須であるとは限らない。 Embodiments will be described with reference to the drawings. Note that the embodiments described below do not limit the invention according to the claims, and not all of the elements and combinations thereof described in the embodiments are necessarily essential to the solution provided by the invention.

図１は、第１実施形態に係る帳票処理装置の構成を示すブロック図である。
図１において、帳票処理装置は、スキャナ１０１、特徴抽出部１０２、帳票定義データベース１０３、位置照合部１０４、文字認識部１０５、属性照合部１０６、読取結果データベース１０７および帳票定義データ作成部１０８を備える。帳票定義データ作成部１０８は、ディスプレイ１０９に接続されている。 FIG. 1 is a block diagram showing the configuration of the form processing apparatus according to the first embodiment.
In FIG. 1, the form processing apparatus includes a scanner 101, a feature extraction unit 102, a form definition database 103, a position matching unit 104, a character recognition unit 105, an attribute matching unit 106, a reading result database 107, and a form definition data creation unit 108. The form definition data creation unit 108 is connected to a display 109.

スキャナ１０１は、帳票を入力とし、図示しないランプを用いて光を帳票に照射し、その透過光を図示しない撮像素子を利用して電気信号へと変換し、帳票画像データ１１３として出力する。帳票画像データ１１３は、帳票に記入されている文字列の文字列画像を含む。スキャナ１０１は、カラー帳票画像データを出力してもよいし、モノクロ帳票画像データを出力してもよい。スキャナ１０１は、ノイズ、モアレおよび裏写りなどを軽減する前処理を帳票画像データ１１３に適用してもよい。 The scanner 101 receives a form as an input, irradiates the form with light using a lamp (not shown), converts the transmitted light into an electric signal using an image pickup device (not shown), and outputs the form image data 113. The form image data 113 includes a character string image of a character string entered on the form. The scanner 101 may output color form image data or monochrome form image data. The scanner 101 may apply pre-processing for reducing noise, moiré, show-through, etc. to the form image data 113.

特徴抽出部１０２は、帳票画像データ１１３を入力とし、予め定められた複数の特徴に関する特徴量１１４ａ、１１４ｂを帳票画像データ１１３から抽出し、帳票定義データベース１０３に出力する。帳票画像データ１１３の特徴としては、例えば、帳票全体の輝度ヒストグラム情報、帳票に記載された枠線の接続関係、枠線の交点の接続関係などを設定することができる。特徴抽出部１０２は、帳票画像データ１１３に含まれる特徴を数値化することで特徴量１１４ａ、１１４ｂを抽出する。 The feature extraction unit 102 receives the form image data 113 as input, extracts feature amounts 114a and 114b relating to a plurality of predetermined features from the form image data 113, and outputs them to the form definition database 103. As characteristics of the form image data 113, for example, brightness histogram information of the entire form, connection relations of frame lines described in the form, connection relations of intersections of frame lines, and the like can be set. The feature extraction unit 102 extracts the feature amounts 114a and 114b by digitizing the features included in the form image data 113.
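
The luminance histogram mentioned above as one possible feature can be illustrated with a minimal Python sketch. It is not the apparatus's actual implementation: the function name `luminance_histogram`, the 8-bit grayscale input, and the bin count are all assumptions made for illustration.

```python
from typing import List, Sequence


def luminance_histogram(pixels: Sequence[Sequence[int]], bins: int = 8) -> List[float]:
    """Return a normalized histogram of 8-bit luminance values.

    Hypothetical stand-in for one feature the feature extraction unit 102
    might compute; the patent text does not specify the bin layout.
    """
    if bins <= 0 or 256 % bins != 0:
        raise ValueError("bins must be a positive divisor of 256")
    width = 256 // bins          # number of luminance levels per bin
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            if not 0 <= value <= 255:
                raise ValueError("luminance values must be in 0..255")
            counts[value // width] += 1
            total += 1
    if total == 0:
        return [0.0] * bins      # empty image: all-zero feature vector
    # Normalize so that images of different sizes remain comparable.
    return [count / total for count in counts]
```

Normalizing by the pixel count is a design choice of this sketch, so that the resulting vector can be compared across scans of different resolutions.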

特徴量１１４ａは、帳票定義データ作成部１０８で作成された帳票定義データ１１０ａと紐付けされた状態で帳票定義データベース１０３に格納される。特徴量１１４ｂは、位置照合部１０４および属性照合部１０６に出力する帳票定義データ１１０ｂを帳票定義データ１１０ａから選択するために用いられる。このとき、帳票定義データベース１０３は、特徴量１１４ｂを特徴量１１４ａと比較し、その比較結果に基づいて帳票定義データ１１０ｂを選択することができる。 The feature amount 114a is stored in the form definition database 103 in a state of being associated with the form definition data 110a created by the form definition data creating unit 108. The feature amount 114b is used to select the form definition data 110b output to the position matching unit 104 and the attribute matching unit 106 from the form definition data 110a. At this time, the form definition database 103 can compare the feature amount 114b with the feature amount 114a and select the form definition data 110b based on the comparison result.

帳票定義データベース１０３は、特徴量１１４ａと特徴量１１４ｂとの比較結果に基づいて帳票定義データ１１０ｂを選択することにより、帳票定義データ１１０ｂの抽出の信頼性を向上させることが可能になる。 The form definition database 103 can improve the reliability of extraction of the form definition data 110b by selecting the form definition data 110b based on the comparison result of the feature amount 114a and the feature amount 114b.

特徴抽出部１０２は、機械学習を用いて帳票画像データ１１３から特徴を抽出してもよい。例えば、多数の帳票画像データ１１３を収集し、帳票種でカテゴリ分けする。次に、前段が畳み込み層、後段が全結合層のニューラルネットワークモデルを設定し、入力を帳票画像データ１１３、出力を帳票種として、収集した帳票画像データを教師データとして用いて多クラス識別器を構築する。以上の手順で構築したニューラルネットワークモデルの畳み込み層を特徴抽出器として利用できる。このとき、ニューラルネットワークモデルの畳み込み層に帳票画像データ１１３を入力すると、特徴量１１４ａ、１１４ｂが出力される。 The feature extraction unit 102 may extract features from the form image data 113 using machine learning. For example, a large number of form image data 113 are collected and categorized by form type. Next, a neural network model with convolutional layers in the front stage and fully connected layers in the rear stage is set up, and a multi-class classifier is constructed with the form image data 113 as input and the form type as output, using the collected form image data as training data. The convolutional layers of the neural network model constructed by this procedure can be used as a feature extractor. In this case, when the form image data 113 is input to the convolutional layers of the neural network model, the feature amounts 114a and 114b are output.

このように、機械学習を用いて特徴を抽出することで、特徴の選択と特徴抽出器の構築を自動化することができる。また、多クラス識別器を精度良く構築できた場合には、多クラス識別器から取り出した畳み込み層も、帳票画像の特徴を精度良く抽出する特徴抽出器であることが期待できる。 As described above, by extracting features using machine learning, it is possible to automate feature selection and feature extractor construction. Further, when the multi-class classifier can be constructed with high accuracy, the convolutional layer extracted from the multi-class classifier can be expected to be a feature extractor that accurately extracts the features of the form image.

帳票定義データ作成部１０８は、帳票画像データ１１３を入力として、帳票画像データ１１３を帳票定義データ１１０ａの作成者に提示する。そして、帳票定義データ作成部１０８は、作成者の読取り対象文字列の位置と属性の指定に基づいて、帳票定義データ１１０ａを作成し、帳票定義データベース１０３に出力する。 The form definition data creation unit 108 receives the form image data 113 and presents the form image data 113 to the creator of the form definition data 110a. Then, the form definition data creation unit 108 creates the form definition data 110 a based on the position and attribute specification of the read target character string by the creator and outputs it to the form definition database 103.

図２は、帳票定義データの作成者に提示される帳票画像データの画面表示例を示す図である。
図２において、画面３０１は、図１のディスプレイ１０９に表示される。画面３０１は、帳票画像データ１１３、カーソル３０３および確定ボタン３０４を表示する。 FIG. 2 is a diagram showing a screen display example of the form image data presented to the creator of the form definition data.
In FIG. 2, the screen 301 is displayed on the display 109 of FIG. 1. The screen 301 displays the form image data 113, the cursor 303, and the confirm button 304.

帳票定義データ１１０ａの作成者は、図示しないポインティングデバイスを用いてカーソル３０３を画面３０１上で移動させることができる。ポインティングデバイスとしては、例えば、マウス、タッチペンまたはタッチパネルなどを用いることができる。 The creator of the form definition data 110a can move the cursor 303 on the screen 301 using a pointing device (not shown). As the pointing device, for example, a mouse, a touch pen, a touch panel, or the like can be used.

帳票定義データ１１０ａの作成者は、例えば、帳票画像データ１１３の位置Ｐ１〜Ｐ４に記入された文字列が読取り対象文字列として定義された帳票定義データ１１０ａを作成するものとする。 It is assumed that the creator of the form definition data 110a creates, for example, the form definition data 110a in which the character strings entered in the positions P1 to P4 of the form image data 113 are defined as the read target character strings.

このとき、帳票定義データ１１０ａの作成者は、帳票画像データ１１３の各位置Ｐ１〜Ｐ４と、各位置Ｐ１〜Ｐ４の文字列の属性を指定する。 At this time, the creator of the form definition data 110a specifies the positions P1 to P4 of the form image data 113 and the attributes of the character strings at the positions P1 to P4.

例えば、帳票定義データ１１０ａの作成者は、帳票画像データ１１３の位置Ｐ２と、位置Ｐ２の文字列の属性を指定するものとする。このとき、帳票定義データ１１０ａの作成者は、位置Ｐ２の左上座標Ｐ２−１をカーソル３０３で指し示し、ポインティングデバイスのボタンを押下することで、帳票画像データ１１３の位置Ｐ２の左上座標Ｐ２−１が、位置Ｐ２の読取り対象文字列の位置情報１１１ａとして定義される。また、帳票定義データ１１０ａの作成者は、位置Ｐ２の右下座標Ｐ２−２をカーソル３０３で指し示し、ポインティングデバイスのボタンを押下することで、帳票画像データ１１３の位置Ｐ２の右下座標Ｐ２−２が、位置Ｐ２の読取り対象文字列の位置情報１１１ａとして定義される。 For example, suppose the creator of the form definition data 110a specifies the position P2 of the form image data 113 and the attribute of the character string at the position P2. The creator points to the upper-left coordinate P2-1 of the position P2 with the cursor 303 and presses the button of the pointing device, whereby the upper-left coordinate P2-1 of the position P2 of the form image data 113 is defined as position information 111a of the read target character string at the position P2. Likewise, the creator points to the lower-right coordinate P2-2 of the position P2 with the cursor 303 and presses the button of the pointing device, whereby the lower-right coordinate P2-2 of the position P2 of the form image data 113 is defined as position information 111a of the read target character string at the position P2.

このとき、画面３０１は、位置情報１１１ａが示す範囲を矩形で表示する。例えば、画面３０１は、位置Ｐ２の読取り対象文字列について、対向する頂点として左上座標Ｐ２−１および右下座標Ｐ２−２を持つ矩形を表示する。 At this time, the screen 301 displays the range indicated by the position information 111a in a rectangular shape. For example, the screen 301 displays a rectangle having an upper left coordinate P2-1 and a lower right coordinate P2-2 as opposite vertices for the read target character string at the position P2.

次に、帳票定義データ１１０ａの作成者は、例えば、位置Ｐ２の読取り対象文字列について、左上座標Ｐ２−１および右下座標Ｐ２−２が設定された状態で、位置Ｐ２を表す矩形内をカーソル３０３で指し示し、ポインティングデバイスのボタンを押下することで、予め決められた属性群を表示させる。そして、帳票定義データ１１０ａの作成者は、その属性群から属性を指定することで、その指定した属性が、位置Ｐ２の読取り対象文字列の属性情報１１２ａとして定義される。このとき、位置Ｐ２の読取り対象文字列の属性情報１１２ａが、位置Ｐ２を表す矩形の近辺に表示される。例えば、位置Ｐ２の読取り対象文字列の属性情報１１２ａが支店名である場合、支店名という属性情報１１２ａが、位置Ｐ２を表す矩形の近辺に表示される。 Next, for example, with the upper-left coordinate P2-1 and the lower-right coordinate P2-2 set for the read target character string at the position P2, the creator of the form definition data 110a points inside the rectangle representing the position P2 with the cursor 303 and presses the button of the pointing device to display a predetermined attribute group. The creator then specifies an attribute from the attribute group, whereby the specified attribute is defined as the attribute information 112a of the read target character string at the position P2. At this time, the attribute information 112a of the read target character string at the position P2 is displayed near the rectangle representing the position P2. For example, when the attribute information 112a of the read target character string at the position P2 is a branch name, the attribute information 112a "branch name" is displayed near the rectangle representing the position P2.

また、帳票定義データ１１０ａの作成者は、読取り対象文字列について、属性情報１１２ａが設定された状態で、位置情報１１１ａを表す矩形をカーソル３０３で指し示し、ポインティングデバイスのボタンを押下することで、予め決められた属性群を表示させる。そして、帳票定義データ１１０ａの作成者は、その属性群から、現在設定されている属性情報１１２ａとは異なる属性を指定することで、現在設定されている属性情報１１２ａとは異なる属性が、カーソル３０３で指し示した位置の読取り対象文字列の属性情報１１２ａとして定義される。このとき、新たに定義された属性情報１１２ａが、カーソル３０３で指し示した位置を表す矩形の近辺に表示される。 Further, with the attribute information 112a already set for a read target character string, the creator of the form definition data 110a points to the rectangle representing the position information 111a with the cursor 303 and presses the button of the pointing device to display the predetermined attribute group. The creator then specifies, from the attribute group, an attribute different from the currently set attribute information 112a, whereby that attribute is defined as the attribute information 112a of the read target character string at the position pointed to by the cursor 303. At this time, the newly defined attribute information 112a is displayed near the rectangle representing the position pointed to by the cursor 303.

帳票定義データ１１０ａの作成者は、以上の操作を繰り返すことで、帳票画像データ１１３の各位置Ｐ１〜Ｐ４について、各読取り対象文字列の位置情報１１１ａと属性情報１１２ａとの組を定義する。 By repeating the above operation, the creator of the form definition data 110a defines a set of position information 111a and attribute information 112a of each read target character string for each position P1 to P4 of the form image data 113.

そして、帳票定義データ１１０ａの作成者は、帳票画像データ１１３の各位置Ｐ１〜Ｐ４について、読取り対象文字列の位置情報１１１ａと属性情報１１２ａとの組を設定すると、確定ボタン３０４をカーソル３０３にて指し示し、ポインティングデバイスのボタンを押下することにより、読取り対象文字列の位置情報１１１ａと属性情報１１２ａとの組を確定させる。 Then, the creator of the form definition data 110a sets a set of the position information 111a of the read target character string and the attribute information 112a for each of the positions P1 to P4 of the form image data 113, and sets the confirm button 304 with the cursor 303. By pointing and pressing the button of the pointing device, the set of the position information 111a and the attribute information 112a of the read target character string is fixed.

帳票定義データ作成部１０８は、読取り対象文字列の位置情報１１１ａと属性情報１１２ａとの組が確定されると、読取り対象文字列の位置情報１１１ａと属性情報１１２ａとの組が定義された帳票定義データ１１０ａを作成し、帳票定義データベース１０３に出力する。 When the sets of position information 111a and attribute information 112a of the read target character strings are confirmed, the form definition data creation unit 108 creates form definition data 110a in which those sets are defined, and outputs it to the form definition database 103.
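
The pairing of position information and attribute information described above could be represented, for example, as in the following Python sketch. The class names `ReadTarget` and `FormDefinition`, their fields, and the validation rule are hypothetical; the actual data structure of the form definition data is the one shown in FIG. 3, which is not reproduced here.

```python
from dataclasses import dataclass, field, replace
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) in image pixels; an assumed convention


@dataclass(frozen=True)
class ReadTarget:
    """One read target: a rectangle (position information) plus an attribute."""
    upper_left: Point
    lower_right: Point
    attribute: str

    def center(self) -> Tuple[float, float]:
        # Center = average of the upper-left and lower-right corners.
        (x1, y1), (x2, y2) = self.upper_left, self.lower_right
        return ((x1 + x2) / 2, (y1 + y2) / 2)


@dataclass
class FormDefinition:
    """Hypothetical container for the position/attribute pairs of one form."""
    targets: List[ReadTarget] = field(default_factory=list)

    def add_target(self, upper_left: Point, lower_right: Point, attribute: str) -> int:
        # Reject rectangles whose corners were clicked in the wrong order.
        if upper_left[0] >= lower_right[0] or upper_left[1] >= lower_right[1]:
            raise ValueError("upper_left must be above and to the left of lower_right")
        self.targets.append(ReadTarget(upper_left, lower_right, attribute))
        return len(self.targets) - 1

    def set_attribute(self, index: int, attribute: str) -> None:
        # Re-selecting an attribute replaces the one currently set.
        self.targets[index] = replace(self.targets[index], attribute=attribute)
```

For instance, two clicks at (100, 40) and (220, 60) followed by choosing "branch name" would correspond to `add_target((100, 40), (220, 60), "branch name")`.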

このように、図２の画面構成をとることで、帳票定義データ１１０ａの作成者は、ポインティングデバイスを用いた簡単な操作で位置情報１１１と属性情報１１２を簡単に設定することが可能となり、帳票定義データ１１０ａを低コストで作成することができる。 With the screen configuration of FIG. 2, the creator of the form definition data 110a can set the position information 111 and the attribute information 112 with simple operations using a pointing device, so the form definition data 110a can be created at low cost.

帳票定義データベース１０３は、帳票定義データ１１０ａの作成時には、帳票画像データ１１３の特徴量１１４ａに紐付けて、その帳票画像データ１１３についての帳票定義データ１１０ａを格納する。また、帳票定義データベース１０３は、帳票の読取り時には、帳票画像データ１１３の特徴量１１４ｂに基づいて、帳票定義データ１１０ｂを選択する。そして、帳票定義データ１１０ｂに含まれる位置情報１１１ｂを位置照合部１０４に出力し、帳票定義データ１１０ｂに含まれる属性情報１１２ｂを属性照合部１０６に出力する。 When creating the form definition data 110a, the form definition database 103 stores the form definition data 110a for the form image data 113 in association with the feature amount 114a of the form image data 113. Further, the form definition database 103 selects the form definition data 110b based on the feature amount 114b of the form image data 113 when reading the form. Then, the position information 111b included in the form definition data 110b is output to the position matching unit 104, and the attribute information 112b included in the form definition data 110b is output to the attribute matching unit 106.

すなわち、帳票定義データベース１０３は、帳票定義データ１１０ａの作成時には、帳票定義データ１１０ａと、帳票定義データ１１０ａを作成したときに利用した帳票画像データ１１３から特徴抽出部１０２で抽出された特徴量１１４ａを入力とし、帳票定義データ１１０ａと特徴量１１４ａを紐付けて記憶する。 That is, when creating the form definition data 110a, the form definition database 103 stores the form definition data 110a and the feature amount 114a extracted by the feature extraction unit 102 from the form image data 113 used when the form definition data 110a was created. As an input, the form definition data 110a and the feature amount 114a are stored in association with each other.

帳票定義データベース１０３は、帳票定義データ１１０ａと特徴量１１４ａが新たに入力される度に、帳票定義データ１１０ａと特徴量１１４ａを紐付けて追記することで、複数の帳票定義データ１１０ａを格納する。 The form definition database 103 stores a plurality of form definition data 110a by linking and adding the form definition data 110a and the feature amount 114a each time the form definition data 110a and the feature amount 114a are newly input.

また、帳票定義データベース１０３は、帳票の読取り時には、特徴量１１４ｂを入力とし、その入力された特徴量１１４ｂに基づいて、帳票定義データ１１０ａの中から帳票定義データを１つ以上抽出し、帳票定義データ１１０ｂとして出力する。 Further, the form definition database 103 receives the feature amount 114b as an input at the time of reading the form, and extracts one or more form definition data from the form definition data 110a based on the input feature amount 114b to form the form definition. The data 110b is output.

すなわち、帳票定義データベース１０３は、すべての記憶した特徴量１１４ａに対し、入力された特徴量１１４ｂとの距離Ｄを数式１に従って算出し、特徴量１１４ｂとの距離Ｄが予め設定された閾値ＴＤ以下の特徴量１１４ａに紐付けられた帳票定義データ１１０ａを帳票定義データ１１０ｂとして出力する。 That is, the form definition database 103 calculates, for every stored feature amount 114a, the distance D to the input feature amount 114b according to Formula 1, and outputs, as the form definition data 110b, the form definition data 110a associated with each feature amount 114a whose distance D to the feature amount 114b is equal to or less than a preset threshold TD.

ここで、Ｆａは特徴量１１４ａを示すベクトル、Ｆｂは特徴量１１４ｂを示すベクトル、・はドット積を示す。 Here, Fa is a vector indicating the characteristic amount 114a, Fb is a vector indicating the characteristic amount 114b, and · is a dot product.
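
The registration and threshold-based selection described above can be sketched in Python as follows. Formula 1 itself is not reproduced in this text, so the distance used here, a cosine distance built from the dot product of Fa and Fb, is only an assumption, and the class `FormDefinitionStore` with its methods is hypothetical.

```python
import math
from typing import Any, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(fa: Sequence[float], fb: Sequence[float]) -> float:
    """ASSUMED form of the distance D; the real Formula 1 may differ."""
    if len(fa) != len(fb):
        raise ValueError("feature vectors must have the same length")
    norm = math.sqrt(dot(fa, fa)) * math.sqrt(dot(fb, fb))
    if norm == 0.0:
        return 1.0  # treat a zero vector as maximally uninformative
    return 1.0 - dot(fa, fb) / norm


class FormDefinitionStore:
    """Hypothetical stand-in for the form definition database 103."""

    def __init__(self, threshold_td: float) -> None:
        self.threshold_td = threshold_td
        self._entries: List[Tuple[List[float], Any]] = []

    def register(self, feature_a: Sequence[float], definition: Any) -> None:
        # Append the definition together with the feature it is linked to.
        self._entries.append((list(feature_a), definition))

    def select(self, feature_b: Sequence[float]) -> List[Any]:
        # Return every definition whose stored feature lies within TD of the input.
        return [definition for feature_a, definition in self._entries
                if cosine_distance(feature_a, feature_b) <= self.threshold_td]
```

An empty list returned by `select` corresponds to the case in which no form definition data 110b could be obtained.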

このとき、帳票定義データベース１０３は、特徴量の観点で類似した帳票画像データ１１３に紐付けられた帳票定義データ１１０ｂを抽出することができる。 At this time, the form definition database 103 can extract the form definition data 110b associated with form image data 113 that is similar in terms of feature amounts.

帳票定義データベース１０３は、帳票定義データ１１０ｂを１つ以上抽出できなかった場合、帳票定義データ１１０ｂの取得に失敗したとみなしてもよい。帳票定義データ１１０ｂの取得に失敗した場合の動作は、図５のフローチャートを用いて後述する。 If the form definition database 103 cannot extract at least one set of form definition data 110b, it may regard the acquisition of the form definition data 110b as having failed. The operation when the acquisition of the form definition data 110b fails will be described later with reference to the flowchart of FIG. 5.

位置照合部１０４は、帳票画像データ１１３および帳票定義データ１１０ｂを入力とし、帳票定義データ１１０ｂに含まれる位置情報１１１ｂに基づいて、読取り対象文字列の候補となる文字列画像１１４を帳票画像データ１１３から抽出し、文字認識部１０５に出力する。 The position matching unit 104 receives the form image data 113 and the form definition data 110b as input, and based on the position information 111b included in the form definition data 110b, extracts the character string image 114 that is a candidate for the read target character string from the form image data 113. And output to the character recognition unit 105.

このとき、位置照合部１０４は、帳票画像データ１１３に含まれるすべての文字列を抽出する必要はなく、帳票定義データ１１０ｂで定義された読取り対象文字列を漏れなく抽出できればよい。言い換えれば、位置照合部１０４は、通常では読取り対象文字列にはならない帳票名、表中の項目名、各種注意書きに関する文字列画像などは抽出しなくてもよい。また、一般に帳票の端部付近に読取り対象文字列が含まれることは少ないため、位置照合部１０４は、帳票の端部以外から文字列画像１１４を抽出するようにしてもよい。 At this time, the position matching unit 104 does not need to extract all the character strings included in the form image data 113, and may just extract the read target character strings defined in the form definition data 110b without omission. In other words, the position matching unit 104 does not have to extract the form name, the item name in the table, the character string image related to various cautionary notes, and the like that are not normally read target character strings. Further, since the character string to be read is rarely included near the end of the form, the position matching unit 104 may extract the character string image 114 from a position other than the end of the form.

このように、読取り対象文字列の位置を定義することにより、帳票画像データ１１３から抽出される文字列画像１１４を減少させることができ、読取り精度を維持しつつ、処理の高速化を図ることができる。 By defining the position of the reading target character string in this way, the character string image 114 extracted from the form image data 113 can be reduced, and the processing speed can be increased while maintaining the reading accuracy. it can.

位置照合部１０４における帳票画像データ１１３からの文字列画像の抽出は任意の方法を用いることができる。例えば、特許２９９１７６１号公報に記載されている方法を用いるようにしてもよい。 Any method can be used to extract the character string image from the form image data 113 in the position matching unit 104. For example, the method described in Japanese Patent No. 2991761 may be used.

次に、位置照合部１０４は、帳票画像データ１１３から抽出した文字列画像に対し、位置情報１１１ｂで指定される位置から最短距離となる文字列画像１１４を数式２に従って決定し、文字認識部１０５に出力する。すなわち、位置照合部１０４は、位置情報１１１ｂで指定される位置の最も近くに存在する文字列画像１１４を抽出し、文字認識部１０５に出力する。 Next, the position matching unit 104 determines the character string image 114, which is the shortest distance from the position specified by the position information 111b, for the character string image extracted from the form image data 113 according to Formula 2, and the character recognition unit 105. Output to. That is, the position matching unit 104 extracts the character string image 114 existing closest to the position specified by the position information 111b, and outputs it to the character recognition unit 105.

ここで、Ｌａは、文字列画像の抽出元の中心座標、Ｌｂは、帳票定義データ１１０ｂに含まれる位置情報１１１ｂの中心座標である。この中心座標は、位置情報１１１ｂに含まれる左上座標Ｐ２−１と右下座標Ｐ２−２の平均値で求められる。なお、帳票画像データ１１３から抽出した文字列画像が複数存在する場合、それらの文字列画像を添字ｉで区別する。 Here, La is the center coordinate of the region from which a character string image was extracted, and Lb is the center coordinate of the position information 111b included in the form definition data 110b. This center coordinate is obtained as the average of the upper-left coordinate P2-1 and the lower-right coordinate P2-2 included in the position information 111b. When a plurality of character string images are extracted from the form image data 113, they are distinguished by the subscript i.

帳票定義データ１１０ｂに複数の位置情報１１１ｂが含まれている場合、位置照合部１０４は、それぞれの位置情報１１１ｂに対して、数式２に従って最短距離となる文字列画像１１４を決定し、文字認識部１０５に出力する。 When the form definition data 110b includes a plurality of position information 111b, the position matching unit 104 determines the shortest distance character string image 114 for each position information 111b according to Formula 2, and the character recognition unit Output to 105.
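
The nearest-candidate selection described above can be sketched in Python as follows. Formula 2 is not reproduced in this text; the sketch assumes it minimizes the Euclidean distance between the center La of each extracted character string image and the center Lb of the defined position. The names `center` and `nearest_candidate` and the box representation are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[Point, Point]  # (upper-left corner, lower-right corner)


def center(box: Box) -> Point:
    # Center coordinate = average of the upper-left and lower-right corners.
    (x1, y1), (x2, y2) = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def nearest_candidate(candidates: Sequence[Box], defined: Box) -> int:
    """Return the index i of the candidate whose center La_i is closest to Lb.

    ASSUMPTION: Euclidean distance between centers; the real Formula 2
    may use a different metric.
    """
    if not candidates:
        raise ValueError("no character string images were extracted")
    lb = center(defined)
    return min(range(len(candidates)),
               key=lambda i: math.dist(center(candidates[i]), lb))
```

When the form definition data contains several pieces of position information, the same function would simply be called once per defined rectangle.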

帳票定義データベース１０３から出力された帳票定義データ１１０ｂが複数存在するとき、位置照合部１０４は、前述の方法でそれぞれの帳票定義データ１１０ｂに対して最短距離となる文字列画像を決定する。 When there are a plurality of form definition data 110b output from the form definition database 103, the position matching unit 104 determines the character string image having the shortest distance to each form definition data 110b by the method described above.

次に、位置照合部１０４は、それぞれの帳票定義データ１１０ｂに対して数式３に従って評価値Ｅを算出し、評価値Ｅが最小となる帳票定義データ１１０ｂに基づいた文字列画像１１４を決定し、文字認識部１０５に出力する。 Next, the position matching unit 104 calculates an evaluation value E for each form definition data 110b according to Formula 3, and determines the character string image 114 based on the form definition data 110b having the smallest evaluation value E, It is output to the character recognition unit 105.

Here, L'a is the center coordinate of the character string image determined for each form definition data 110b, and Lb is the center coordinate of the position information 111b. This center coordinate is obtained as the average of the upper left coordinate P2-1 and the upper left coordinate P2-2 included in the position information 111b. When the form definition data 110b contains a plurality of pieces of position information 111b, they are distinguished by the subscript j.

The position matching unit 104 may treat the position matching as failed when the intermediate or final results of its processing are deemed unreliable. For example, the position matching unit 104 can treat the position matching as failed when the minimum evaluation value E exceeds a preset threshold TE. The operation when the position matching fails is described later with reference to the flowchart of FIG. 5.
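A minimal sketch of choosing among several form definitions and rejecting the match by threshold is shown below. Formula 3 is not reproduced in this text, so the sketch assumes E is the sum over j of the Euclidean distances between L'a and Lb; `evaluation_value`, `select_definition`, and the sample data are hypothetical.

```python
from math import hypot

def evaluation_value(matched_centers, defined_centers):
    """Assumed form of E: sum over j of the distance between L'a_j and Lb_j."""
    return sum(hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in zip(matched_centers, defined_centers))

def select_definition(candidates, threshold_te):
    """candidates maps a definition name to (matched_centers, defined_centers).

    Returns the name with the smallest E, or None when even the smallest E
    exceeds TE, which is treated here as a position matching failure.
    """
    scores = {name: evaluation_value(a, b) for name, (a, b) in candidates.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= threshold_te else None

candidates = {
    "form_A": ([(10, 10), (50, 50)], [(13, 14), (50, 50)]),
    "form_B": ([(10, 10), (50, 50)], [(40, 50), (90, 80)]),
}
chosen = select_definition(candidates, threshold_te=20)
```

In this example `form_A` has the smaller evaluation value and is chosen; lowering the threshold below that value would make the function return `None`.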

文字認識部１０５は、文字列画像１１４と入力とし、その文字列画像１１４に基づいて文字認識処理を実施し、文字認識結果を文字コード１１５として出力する。 The character recognition unit 105 receives the character string image 114 as input, performs character recognition processing based on the character string image 114, and outputs the character recognition result as a character code 115.

すなわち、文字認識部１０５は、文字列画像１１４を文字単位に切り出した後、それぞれの文字を特徴量に変換し、図示しない文字データベースに登録されている文字の特徴量との距離を計算し、最短距離の文字コード１１５を属性照合部１０６に出力する。 That is, the character recognition unit 105 cuts out the character string image 114 for each character, converts each character into a feature amount, and calculates a distance from the feature amount of the character registered in a character database (not shown), The character code 115 of the shortest distance is output to the attribute matching unit 106.

The character recognition unit 105 may treat the character recognition as failed when the intermediate or final results of its processing are deemed unreliable. For example, when the shortest distance between feature amounts exceeds a preset threshold, the character recognition unit 105 can conclude that highly reliable character recognition was not possible and treat the character recognition as failed. The operation when the character recognition fails is described later with reference to the flowchart of FIG. 5.
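The nearest-feature lookup with a rejection threshold can be sketched as below. The feature vectors, the Euclidean distance, and the names `recognize_char` and `char_db` are assumptions for illustration; the source does not specify the feature type or the metric.

```python
def recognize_char(feature, char_db, threshold):
    """Return the character code whose registered feature is nearest.

    Returns None when the shortest distance exceeds the threshold, which is
    treated here as a character recognition failure.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    code, shortest = min(
        ((c, distance(feature, f)) for c, f in char_db.items()),
        key=lambda pair: pair[1],
    )
    return code if shortest <= threshold else None

# Hypothetical character database: character code -> registered feature
char_db = {"0": (0.0, 1.0), "1": (1.0, 0.0)}
accepted = recognize_char((0.9, 0.1), char_db, threshold=0.5)
rejected = recognize_char((5.0, 5.0), char_db, threshold=0.5)
```

Here `accepted` is the code of the nearest registered character, while `rejected` is `None` because no registered feature lies within the threshold.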

属性照合部１０６は、文字コード１１５と帳票定義データ１１０ｂを入力とし、簡易帳票定義データ１１０ｂに含まれる属性情報１１２ｂを用いた文字コード１１５に対する照合結果に基づいて読取結果１１６を決定し、読取結果データベース１０７に出力する。 The attribute matching unit 106 receives the character code 115 and the form definition data 110b as input, determines the reading result 116 based on the matching result for the character code 115 using the attribute information 112b included in the simplified form definition data 110b, and reads the reading result. Output to the database 107.

すなわち、属性照合部１０６は、読取り対象文字列の属性の種類に応じた検証処理を予め用意しておき、属性情報１１２ｂに従って文字コード１１５の照合処理を行い、文字コード１１５が示す属性が確からしい場合には、その文字コード１１５を読取結果１１６として出力する。 That is, the attribute matching unit 106 prepares a verification process according to the type of the attribute of the read target character string, performs the matching process of the character code 115 according to the attribute information 112b, and the attribute indicated by the character code 115 seems to be certain. In that case, the character code 115 is output as the reading result 116.

For example, as the verification process for the attribute "bank name", the attribute matching unit 106 can provide the following: query a bank name database (not shown) that stores the names of banks nationwide with the character string, and, if the database contains the character string, regard the character string as plausible for the attribute "bank name".

As another example, as the verification process for the attribute "account number", the attribute matching unit 106 can provide the following: if the character string consists of a number of seven digits or fewer, regard the character string as plausible.

The attribute matching unit 106 may treat the attribute matching as failed when the intermediate or final results of its processing are deemed unreliable. For example, in the verification process for the attribute "bank name", the attribute matching unit 106 can treat the attribute matching as failed when the bank name database does not contain the character string. The operation when the attribute matching fails is described later with reference to the flowchart of FIG. 5.
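The two verification processes described above can be sketched as a table of per-attribute checks. The in-memory set `BANK_NAMES` stands in for the bank name database, and all names and sample values are hypothetical.

```python
import re

# Hypothetical stand-in for the bank name database (not shown in the source)
BANK_NAMES = {"Example Bank", "Sample Trust Bank"}

def verify_bank_name(text):
    """Plausible if the string is registered in the bank name database."""
    return text in BANK_NAMES

def verify_account_number(text):
    """Plausible if the string consists of one to seven ASCII digits."""
    return re.fullmatch(r"[0-9]{1,7}", text) is not None

# One verification process prepared per attribute type
VERIFIERS = {
    "bank_name": verify_bank_name,
    "account_number": verify_account_number,
}

def collate_attribute(attribute, text):
    """Return text as the read result if plausible, else None (matching failure)."""
    return text if VERIFIERS[attribute](text) else None
```

Further attributes such as branch name or transfer date would be handled by registering additional checks in `VERIFIERS`.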

読取結果データベース１０７は、読取結果１１６を入力とし、その読取結果１１６を記憶する。また、読取結果データベース１０７は、読取結果１１６をユーザに提示し、ユーザによって修正された読取結果を読取結果データベース１０７に格納する読取結果修正部（図示しない）から、読取結果１１６の呼び出しあるいは格納の要求があれば、それぞれの要求に従う。 The reading result database 107 receives the reading result 116 and stores the reading result 116. Further, the read result database 107 presents the read result 116 to the user and calls or stores the read result 116 from a read result correction unit (not shown) that stores the read result corrected by the user in the read result database 107. If requested, follow each request.

図３は、第１実施形態に係る帳票処理装置で用いられる簡易帳票定義データのデータ構造の一例を示す図である。
図３において、帳票定義データ１１０ａは、読取り対象文字列の位置情報１１１ａと属性情報１１２ａの組を複数個格納したデータ構造を持つ。例えば、図２の帳票画像データ１１３の各位置Ｐ１〜Ｐ４に記入されている文字列を読取り対象文字列１〜４とすることができる。このとき、帳票定義データ１１０ａには、読取り対象文字列１〜４ごとに、位置情報１１１ａと属性情報１１２ａの組が登録される。 FIG. 3 is a diagram showing an example of the data structure of simplified form definition data used in the form processing apparatus according to the first embodiment.
In FIG. 3, the form definition data 110a has a data structure in which a plurality of sets of position information 111a and attribute information 112a of a read target character string are stored. For example, the character strings written in the respective positions P1 to P4 of the form image data 113 in FIG. 2 can be set as the read target character strings 1 to 4. At this time, in the form definition data 110a, a set of position information 111a and attribute information 112a is registered for each of the read target character strings 1 to 4.

位置情報１１１ａは、読取り対象文字列の左上座標Ｐ２−１および左上座標Ｐ２−２をそれぞれ帳票画像データ１１３の解像度における１ピクセル単位で表現することができる。 The position information 111a can represent the upper left coordinate P2-1 and the upper left coordinate P2-2 of the read target character string in units of one pixel in the resolution of the form image data 113, respectively.

However, the position information 111a is not limited to units of one pixel. For example, the position information 111a may hold the upper left coordinate P2-1 and the upper left coordinate P2-2 in units coarser than the resolution of the form image data 113, such as units of 2 pixels or 5 pixels. In that case, the screen 301 may display the rectangle indicating the position of the read target character string in the units used to hold the position information 111a.

The position information 111a may also be held in millimeters relative to the form input to the scanner 101 of FIG. 1. In this case, the screen 301 may apply appropriate scaling to display the rectangle indicating the position of the read target character string. The choice of unit affects the reading accuracy of the form reading apparatus as a whole: a coarser unit allows the form definition to be made more quickly but lowers the reading accuracy, so the unit should be chosen with this trade-off in mind.
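As an illustration of the unit choices above, the following sketch snaps pixel coordinates to a coarser grid and converts millimeters to pixels. The rounding rules and the function names are assumptions; the source does not say how coordinates are rounded or scaled.

```python
def snap_to_unit(coord_px, unit_px):
    """Snap a pixel coordinate down to a multiple of unit_px (e.g. 2 or 5)."""
    return (coord_px // unit_px) * unit_px

def mm_to_px(coord_mm, dpi):
    """Convert millimeters to the nearest pixel for a scan at the given dpi."""
    return round(coord_mm * dpi / 25.4)

coarse_x = snap_to_unit(103, 5)   # stored in 5-pixel units
scaled_x = mm_to_px(25.4, 200)    # one inch at 200 dpi
```

Storing coordinates in 5-pixel units means any position within the same 5-pixel cell maps to the same stored value, which is the loss of precision the trade-off refers to.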

このように読取り対象文字列の位置情報１１１ａを設定することにより、帳票定義データ１１０ａの情報量を低減し、記録量が低減できるだけでなく、帳票定義データ生成部１０８において、必要以上に高い精度で手間をかけて帳票定義データ１１０ａが生成されるのを抑制できる。 By setting the position information 111a of the read target character string in this way, not only the information amount of the form definition data 110a can be reduced and the recording amount can be reduced, but also in the form definition data generation unit 108, with higher precision than necessary. It is possible to suppress the time and effort to generate the form definition data 110a.

属性情報１１２ａは、読取り対象文字列の種類を示す情報である。例えば、金融分野の帳票では、銀行名、支店名、口座種別、口座番号、振込日、名義および住所などを属性情報１１２ａとして設定することができる。 The attribute information 112a is information indicating the type of the read target character string. For example, in a financial report, a bank name, a branch name, an account type, an account number, a transfer date, a name and an address can be set as the attribute information 112a.

Thus, the form definition data 110a contains the position information 111a and the attribute information 112a, and does not contain the information found in ordinary definition data, such as frame shape, number of characters, handwritten or printed type, pre-printed characters, and character pitch. The form definition data creation unit 108 can therefore create the form definition data 110a easily, which reduces the labor and cost of creating it.
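A data structure holding only position and attribute pairs might look like the following sketch. The class and field names are hypothetical, and the coordinates are sample values.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[int, int]

@dataclass(frozen=True)
class ReadTarget:
    """One read target character string: position information plus attribute."""
    p1: Point        # coordinate P2-1
    p2: Point        # coordinate P2-2
    attribute: str   # attribute information, e.g. "bank_name"

@dataclass
class FormDefinition:
    """Simplified form definition: a list of position/attribute pairs only."""
    targets: List[ReadTarget] = field(default_factory=list)

definition = FormDefinition(targets=[
    ReadTarget((100, 40), (300, 80), "bank_name"),
    ReadTarget((100, 100), (300, 140), "account_number"),
])
```

Nothing about frame shape, character count, or character pitch appears in the structure, which is what keeps a definition quick to author.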

図４は、帳票画像データから抽出された特徴量と帳票定義データとの対応関係を示す図である。なお、図４の例では、個々の特徴量１１４ａ、１１４ｂが２つの特徴量Ａ、Ｂのから構成される場合を示した。
図４において、帳票定義データベース１０３は、各帳票定義データ１１０ａを、その帳票定義データ１１０ａの作成に用いた帳票画像データ１１３の特徴量１１４ａと紐付けて記憶する。そして、帳票定義データベース１０３は、特徴抽出部１０２から特徴量１１４ｂが入力されると、特徴量１１４ｂとの距離Ｄが閾値ＴＤ以下の特徴量１１４ａに紐付けられた帳票定義データ１１０ａを選択し、その選択した帳票定義データ１１０ａを帳票定義データ１１０ｂとして出力する。 FIG. 4 is a diagram showing a correspondence relationship between the feature amount extracted from the form image data and the form definition data. In addition, in the example of FIG. 4, the case where each feature amount 114a, 114b is composed of two feature amounts A, B is shown.
In FIG. 4, the form definition database 103 stores each form definition data 110a in association with the feature amount 114a of the form image data 113 used to create the form definition data 110a. When the feature amount 114b is input from the feature extraction unit 102, the form definition database 103 selects the form definition data 110a associated with the feature amount 114a whose distance D from the feature amount 114b is equal to or less than the threshold TD, The selected form definition data 110a is output as the form definition data 110b.
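The lookup in FIG. 4 can be sketched as a threshold search over stored feature amounts. The two-component features follow the example of feature amounts A and B; the Euclidean distance, the function name `select_definitions`, and the sample data are assumptions.

```python
from math import dist

def select_definitions(stored, query_feature, threshold_td):
    """stored is a list of (feature_114a, definition_110a) pairs.

    Returns every definition whose stored feature lies within distance TD of
    the query feature 114b; an empty list means no definition was obtained.
    """
    return [definition
            for feature, definition in stored
            if dist(feature, query_feature) <= threshold_td]

stored = [((0.2, 0.8), "definition_1"), ((0.9, 0.1), "definition_2")]
matches = select_definitions(stored, (0.25, 0.75), threshold_td=0.2)
no_match = select_definitions(stored, (5.0, 5.0), threshold_td=0.2)
```

Because more than one stored feature can fall within TD, the function returns a list, which matches the case where several form definition data 110b are output.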

図５は、第１実施形態に係る帳票処理装置の処理を示すフローチャートである。
図５において、ステップＳ１０１では、スキャナ１０１は、帳票をスキャンし、帳票画像データ１１３に変換し、帳票画像データ１１３を特徴抽出部１０２、位置照合部１０４および帳票定義データ作成部１０８に出力する。 FIG. 5 is a flowchart showing the processing of the form processing apparatus according to the first embodiment.
In FIG. 5, in step S101, the scanner 101 scans a form, converts it into form image data 113, and outputs the form image data 113 to the feature extraction unit 102, the position matching unit 104, and the form definition data creation unit 108.

次に、ステップＳ１０２において、特徴抽出部１０２は、帳票画像データ１１３から特徴量１１４ａ、１１４ｂを抽出し、帳票定義データベース１０３に出力する。 Next, in step S102, the feature extraction unit 102 extracts the feature amounts 114a and 114b from the form image data 113 and outputs them to the form definition database 103.

次に、ステップＳ１０３ａにおいて、帳票定義データベース１０３は、特徴量１１４ｂに対応する帳票定義データ１１０ｂを１つ以上抽出する。 Next, in step S103a, the form definition database 103 extracts one or more form definition data 110b corresponding to the feature amount 114b.

次に、ステップＳ１０３ｂにおいて、帳票定義データベース１０３は、ステップＳ１０３ａにおける帳票定義データ１１０ｂの取得に失敗したかどうかを判断する。帳票定義データ１１０ｂの取得に成功した場合、ステップＳ１０４ａに進み、帳票定義データ１１０ｂの取得に失敗した場合、ステップＳ１０８ａに進む。 Next, in step S103b, the form definition database 103 determines whether acquisition of the form definition data 110b in step S103a has failed. If the form definition data 110b is successfully acquired, the process proceeds to step S104a. If the form definition data 110b is not successfully acquired, the process proceeds to step S108a.

次に、ステップＳ１０４ａにおいて、位置照合部１０４は、帳票定義データ１１０ｂに含まれる位置情報１１１ｂに基づいて、帳票画像データ１１３から文字列画像１１４を抽出する。 Next, in step S104a, the position matching unit 104 extracts the character string image 114 from the form image data 113 based on the position information 111b included in the form definition data 110b.

次に、ステップＳ１０４ｂにおいて、位置照合部１０４は、ステップＳ１０４ａにおける位置の照合に失敗したかどうかを判断する。位置の照合に成功した場合、ステップＳ１０５ａに進み、位置の照合に失敗した場合、ステップＳ１０８ａに進む。 Next, in step S104b, the position matching unit 104 determines whether the position matching in step S104a has failed. When the position matching is successful, the process proceeds to step S105a, and when the position matching is unsuccessful, the process proceeds to step S108a.

次に、ステップＳ１０５ａにおいて、文字認識部１０５は、文字列画像１１４を文字認識し、文字コード１１５へ変換する。 Next, in step S105a, the character recognition unit 105 character-recognizes the character string image 114 and converts it into the character code 115.

次に、ステップＳ１０５ｂにおいて、文字認識部１０５は、ステップＳ１０５ａにおける文字の認識に失敗したかどうかを判断する。文字の認識に成功した場合、ステップＳ１０６ａに進み、文字の認識に失敗した場合、ステップＳ１０８ａに進む。 Next, in step S105b, the character recognition unit 105 determines whether or not the character recognition in step S105a has failed. If the character recognition is successful, the process proceeds to step S106a. If the character recognition is unsuccessful, the process proceeds to step S108a.

次に、ステップＳ１０６ａにおいて、属性照合部１０６は、帳票定義データ１１０ｂに含まれる属性情報１１２ｂに基づいて文字コード１１５を検証し、文字コード１１５の読取結果１１６を得る。 Next, in step S106a, the attribute matching unit 106 verifies the character code 115 based on the attribute information 112b included in the form definition data 110b, and obtains the reading result 116 of the character code 115.

Next, in step S106b, the attribute matching unit 106 determines whether the attribute matching in step S106a has failed. If the attribute matching succeeded, the process proceeds to step S107; if the attribute matching failed, the process proceeds to step S108a.

次に、ステップＳ１０７において、読取結果データベース１０７は、属性照合部１０６から出力された読取り結果１１６を格納する。 Next, in step S107, the read result database 107 stores the read result 116 output from the attribute matching unit 106.

次に、ステップＳ１０８ａにおいて、帳票定義作成部１０８は、帳票定義データ１１０ａを作成する必要があるか判断する。帳票定義データ１１０ａを作成する必要がある場合、ステップＳ１０８ｂに進み、帳票定義データ１１０ａを作成する必要がない場合、処理を終了する。 Next, in step S108a, the form definition creating unit 108 determines whether the form definition data 110a needs to be created. If the form definition data 110a needs to be created, the process proceeds to step S108b, and if the form definition data 110a does not need to be created, the process ends.

The form definition data needs to be created when it is presumed that the form could not be read correctly. Specifically, this applies in the following cases.
(1) Acquisition of the form definition data 110b fails in step S103a.
(2) The position matching fails in step S104a.
(3) The character recognition fails in step S105a.
(4) The attribute matching fails in step S106a.
(5) The form definition data 110b output in step S103a is unreliable. That is, with a threshold TD' satisfying TD' < TD set in advance, the relationship TD' < distance D < TD holds.
(6) The character string image 114 output in step S104a is unreliable. That is, with a threshold TE' satisfying TE' < TE set in advance, the relationship TE' < evaluation value E < TE holds.
(7) The read result 116 stored in the read result database 107 is corrected in a read result correction step (not shown).
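The decision in step S108a can be sketched as a single predicate over the outcome of one read. The `ReadOutcome` structure and its field names are hypothetical; the conditions follow cases (1) to (7) above, with cases (1) to (4) collapsed into one flag.

```python
from dataclasses import dataclass

@dataclass
class ReadOutcome:
    step_failed: bool            # any of cases (1) to (4)
    distance_d: float            # feature distance D from step S103a
    evaluation_e: float          # evaluation value E from step S104a
    corrected_by_user: bool = False   # case (7)

def needs_new_definition(outcome, td_prime, td, te_prime, te):
    """Return True when new form definition data should be created."""
    if outcome.step_failed or outcome.corrected_by_user:
        return True
    if td_prime < outcome.distance_d < td:       # case (5): matched, but barely
        return True
    return te_prime < outcome.evaluation_e < te  # case (6): located, but barely
```

The inner thresholds TD' and TE' mark results that passed the failure thresholds TD and TE but only by a small margin, so a fresh definition is likely to help the next time a similar form arrives.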

次に、ステップＳ１０８ｂにおいて、帳票定義作成部１０８は、帳票画像データ１１３に基づいて、帳票定義データ１１０ａを生成する。 Next, in step S108b, the form definition creation unit 108 generates the form definition data 110a based on the form image data 113.

次に、ステップＳ１０３ｃにおいて、帳票定義データベース１０３は、帳票定義データ１１０ａを格納し、処理を終了する。 Next, in step S103c, the form definition database 103 stores the form definition data 110a, and the process ends.
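The overall flow of FIG. 5 can be sketched as a chain of stages in which the first failure diverts to definition creation. This is a simplification under stated assumptions: each stage is a hypothetical callable that returns `None` on failure, and the judgment of step S108a is reduced to "create a definition whenever any stage failed".

```python
def process_form(image, lookup, locate, recognize, collate, create_definition):
    """Run steps S103a to S106a in order; on the first failure, create a definition.

    All six callables are hypothetical stand-ins for the units in FIG. 1.
    """
    result = None
    definition = lookup(image)                       # S103a / S103b
    if definition is not None:
        strings = locate(image, definition)          # S104a / S104b
        if strings is not None:
            codes = recognize(strings)               # S105a / S105b
            if codes is not None:
                result = collate(codes, definition)  # S106a / S106b
    if result is None:
        create_definition(image)                     # S108b, then S103c
    return result                                    # stored in S107 when not None
```

A caller would pass the same form image again after `create_definition` has run if the re-read variant described later is wanted.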

なお、ステップＳ１０４ｂあるいはステップＳ１０５ｂあるいはステップＳ１０６ｂに関しては、それぞれ省略することも可能である。この動作を採用したとき、読取り対象文字列が複数ある場合に、一部の文字列の読取に失敗しても、処理を完遂させることが可能である。 Note that step S104b, step S105b, or step S106b can be omitted. When this operation is adopted, when there are a plurality of reading target character strings, the processing can be completed even if reading of a part of the character strings fails.

As described above, according to the first embodiment, highly accurate form reading can be realized at low cost even in an environment in which many kinds of forms are mixed. Specifically, generating the form definition data 110a, which consists only of the position information 111a and the attribute information 112a, by a simple method reduces the cost of creating the form definition data 110a. In addition, reading accuracy is ensured by extracting candidate character strings from the form based on the position information 111b of the form definition data 110b and verifying them based on the attribute information 112b of the form definition data 110b. Because the read result 116 of a character string image extracted from the form can be determined from the position information 111b and the attribute information 112b defined in the form definition data 110b, form reading is possible even when the form carries no item names. Furthermore, when the reading process fails, newly generating the form definition data 110a raises the likelihood that reading succeeds the next time a similar form is input.

また、ステップＳ１０３ｂあるいはステップＳ１０４ｂあるいはステップＳ１０５ｂあるいはステップＳ１０６ｂにおいて処理に失敗した場合、ステップＳ１０３ｃにおいて、新たな帳票定義データ１１０ａを帳票定義データベース１０３に格納した後、ステップＳ１０２に進み、同じ帳票画像データ１１３に対して処理するようにしてもよい。 If the process fails in step S103b, step S104b, step S105b, or step S106b, the new form definition data 110a is stored in the form definition database 103 in step S103c, and then the process proceeds to step S102 and the same form image data 113 is stored. May be processed.

本動作によって、帳票定義データ１１０ａを新たに作成するだけで、一度読取りに失敗した帳票に対して自動的に再読取り処理が実行されるため、失敗した読取結果をユーザが手作業で修正する手間を削減することができ、さらに新たに作成した帳票定義データ１１０ａによって帳票が読めるようになったのか即座に確認することが可能になる。また、帳票が読めるようになるまで、帳票定義データ１１０ａを修正することも容易にできるようになるため、帳票定義データ１１０ａの品質の向上も図ることができる。 By this operation, by simply creating a new form definition data 110a, the re-reading process is automatically executed for the form that has once failed to be read. Therefore, it is troublesome for the user to manually correct the failed read result. Can be reduced, and it is possible to immediately confirm whether the form can be read by the newly created form definition data 110a. Further, it becomes possible to easily correct the form definition data 110a until the form can be read, so that the quality of the form definition data 110a can be improved.

図６は、第２実施形態に係る帳票処理装置の構成を示すブロック図である。
図６の帳票処理装置には、図１の帳票処理装置に定義レス読取部２０１が追加されている。図５のステップＳ１０３ｂ、Ｓ１０４ｂ、Ｓ１０５ｂ、Ｓ１０６ｂのいずれかで処理に失敗したと判断された時に、定義レス読取部２０１は動作する。 FIG. 6 is a block diagram showing the configuration of the form processing apparatus according to the second embodiment.
In the form processing apparatus of FIG. 6, a definitionless reading unit 201 is added to the form processing apparatus of FIG. When it is determined that the processing has failed in any of steps S103b, S104b, S105b, and S106b in FIG. 5, the definitionless reading unit 201 operates.

The definitionless reading unit 201 receives the form image data 113 as input, converts predetermined read items in the form into character codes, and outputs them as the read result 116. The definitionless reading unit 201 estimates the read items without using the form definition data 110b, by using keywords such as item names in the form or by matching character recognition results against dictionaries prepared in advance for each attribute. The definitionless reading unit 201 can be realized, for example, by the method described in Japanese Patent No. 5621169.

The definitionless reading unit 201 may treat the reading of the form as failed when the intermediate or final results of its processing are deemed unreliable. The operation when the reading of the form fails is described later with reference to the flowchart of FIG. 7.

例えば、特許第５６２１１６９号公報に記載の方法で定義レス読取部２０１を実現した場合、定義レス読取部２０１は、特許第５６２１１６９号公報のステップＳ１７０で計算された評価値が、予め定めた閾値以下であったときに、信頼できる帳票読取りができなかったものとし、帳票の読取に失敗したとみなすことができる。 For example, when the definitionless reading unit 201 is realized by the method described in Japanese Patent No. 5621169, the definitionless reading unit 201 determines that the evaluation value calculated in step S170 of Japanese Patent No. 5621169 is less than or equal to a predetermined threshold value. If it is, it can be considered that the reliable reading of the form could not be performed and that the reading of the form failed.

図７は、第２実施形態に係る帳票処理装置の処理を示すフローチャートである。
図７の処理には、図５の処理にステップＳ２０１ａ、Ｓ２０１ｂが追加されている。 FIG. 7 is a flowchart showing the processing of the form processing apparatus according to the second embodiment.
In the processing of FIG. 7, steps S201a and S201b are added to the processing of FIG. 5.

ステップＳ１０３ｂにおいて、帳票定義データベース１０３は、ステップＳ１０３ａに帳票定義データの取得に失敗したとき、ステップＳ２０１ａに進む。ステップＳ１０４ｂにおいて、位置照合部１０４は、ステップＳ１０４ａにおける位置の照合に失敗したとき、ステップＳ２０１ａに進む。ステップＳ１０５ｂにおいて、文字認識部１０５は、ステップＳ１０５ａにおける文字の認識に失敗したとき、ステップＳ２０１ａに進む。ステップＳ１０６ｂにおいて、属性照合部１０６は、ステップＳ１０６ａにおける属性の照合に失敗したとき、ステップＳ２０１ａに進む。 In step S103b, the form definition database 103 proceeds to step S201a when acquisition of form definition data fails in step S103a. In step S104b, when the position verification unit 104 fails in the position verification in step S104a, the process proceeds to step S201a. In step S105b, when the character recognition unit 105 fails in character recognition in step S105a, the process proceeds to step S201a. In step S106b, the attribute matching unit 106 proceeds to step S201a when the attribute matching in step S106a fails.

次に、ステップＳ２０１ａにおいて、定義レス読取部２０１は、帳票画像データ１１３の読取項目を文字コード化し、読取結果１１６を得る。 Next, in step S201a, the definitionless reading unit 201 character-codes the read items of the form image data 113 and obtains the read result 116.

次に、ステップＳ２０１ｂにおいて、定義レス読取部２０１は、ステップＳ２０１ａにおける帳票の読取りに失敗したかどうかを判断する。帳票の読取りに成功した場合、ステップＳ１０７に進み、帳票の読取りに失敗した場合、ステップＳ１０８ａに進む。 Next, in step S201b, the definitionless reading unit 201 determines whether the reading of the form in step S201a has failed. If the form has been successfully read, the process proceeds to step S107. If the form has not been successfully read, the process proceeds to step S108a.

The form definition data needs to be created when it is presumed that the form could not be read correctly. Specifically, this applies in the following cases.
(1) Acquisition of the form definition data 110b fails in step S103a.
(2) The position matching fails in step S104a.
(3) The character recognition fails in step S105a.
(4) The attribute matching fails in step S106a.
(5) The form definition data 110b output in step S103a is unreliable. That is, with a threshold TD' satisfying TD' < TD set in advance, the relationship TD' < distance D < TD holds.
(6) The character string image output in step S104a is unreliable. That is, with a threshold TE' satisfying TE' < TE set in advance, the relationship TE' < evaluation value E < TE holds.
(7) The read result 116 stored in the read result database 107 is corrected in a read result correction step (not shown).
(8) The reading of the form fails in step S201a.

以上説明したように、上述した第２実施形態によれば、定義レス読取部２０１を設けることにより、処理対象の帳票に対応する帳票定義データ１１０ａが作成されていなくても、帳票の読取りが行うことが可能となる。例えば、帳票読取装置の運用初期に帳票定義データ１１０ａが十分に作成できていなくても、帳票読取装置の運用を開始することができる。あるいは、帳票読取装置の運用中に、新たな種類の帳票が入力されても、その帳票の読取りを行うことができる。 As described above, according to the second embodiment described above, by providing the definitionless reading unit 201, the form is read even if the form definition data 110a corresponding to the form to be processed is not created. It becomes possible. For example, even if the form definition data 110a has not been sufficiently created at the beginning of the operation of the form reading apparatus, the operation of the form reading apparatus can be started. Alternatively, even if a new type of form is input during operation of the form reading device, the form can be read.

また、ステップＳ１０３ｂおよびステップＳ１０４ｂおよびステップＳ１０５ｂおよびステップＳ１０６ｂおよびステップＳ２０１ｂを省略し、ステップＳ２０１ａをステップＳ１０６ａの後、ステップＳ１０７の前に実行するように動作を変更してもよい。 Further, step S103b, step S104b, step S105b, step S106b, and step S201b may be omitted, and the operation may be changed so that step S201a is executed after step S106a and before step S107.

このとき、位置照合部１０４は、位置の照合結果の評価値に相当する情報を算出し、文字列画像１１４とともに出力するように構成を変更する。文字認識部１０５は、文字の認識結果の評価値に相当する情報を算出し、入力された評価値と合算して、文字コード１１５とともに出力するように構成を変更する。属性照合部１０６は、属性の照合結果の評価値に相当する情報を算出し、入力された評価値と合算して、読取結果１１６として出力するように構成を変更する。定義レス読取部２０１は、定義レス読取りの評価値に相当する情報を算出し、読取結果１１６とともに出力するように構成を変更する。読取結果データベース１０７は、属性照合部１０６から出力された評価値と、定義レス読取部２０１から出力された評価値を比較し、評価値が高い方の読取結果１１６を採用して、読取結果データベース１０７に格納するように構成を変更する。 At this time, the position matching unit 104 changes the configuration so that information corresponding to the evaluation value of the position matching result is calculated and output together with the character string image 114. The character recognition unit 105 calculates the information corresponding to the evaluation value of the character recognition result, adds the information to the input evaluation value, and outputs the information together with the character code 115. The attribute matching unit 106 calculates the information corresponding to the evaluation value of the attribute matching result, adds the information to the input evaluation value, and outputs the read result 116. The definitionless reading unit 201 calculates the information corresponding to the evaluation value of the definitionless reading and changes the configuration so as to output the information together with the reading result 116. The reading result database 107 compares the evaluation value output from the attribute matching unit 106 with the evaluation value output from the definitionless reading unit 201, adopts the reading result 116 with the higher evaluation value, and reads the reading result database. The configuration is changed so as to be stored in 107.
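The score-passing arrangement described above can be sketched as follows. The source does not say how the evaluation values are scaled, so this sketch assumes every stage reports a score where higher is better and that the summed definition-based score is directly comparable with the definitionless score; `accumulate`, `choose_result`, and the sample values are hypothetical.

```python
def accumulate(*stage_scores):
    """Sum the scores passed along by position matching, recognition, and attribute matching."""
    return sum(stage_scores)

def choose_result(definition_based, definitionless):
    """Each argument is a (read_result, score) pair; the higher score is adopted.

    On a tie, the definition-based result is kept because it is listed first.
    """
    return max(definition_based, definitionless, key=lambda pair: pair[1])[0]

definition_based = ("1234567", accumulate(0.9, 0.8, 1.0))
definitionless = ("1234561", 2.1)
adopted = choose_result(definition_based, definitionless)
```

In practice the two scores would need normalizing before comparison, for instance because a smaller evaluation value E indicates a better position match; that conversion is left out here.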

この変更により、帳票定義データ１１０ａを活用して読取りを行った場合と、帳票定義データ１１０ａを用いずに読取りを行った場合で、より正しいと考えられる読取結果を採用することが可能となる。 By this change, it is possible to adopt the reading result which is considered to be more correct, when the reading is performed by utilizing the form definition data 110a and when the reading is performed without using the form definition data 110a.

なお、上述した実施形態では、医療分野における帳票読取装置を例にとり説明したが、それ以外の各種読取装置に適用してもよい。例えば、領収書、医療レセプト、小切手または伝票などの読取装置に適用してもよい。 In the above-described embodiment, the form reading device in the medical field has been described as an example, but it may be applied to various other reading devices. For example, it may be applied to a reading device such as a receipt, medical receipt, check or slip.

図８は、第３実施形態に係る帳票処理装置のハードウェア構成を示すブロック図である。
図８において、帳票処理装置１１０は、プロセッサ１１、通信制御デバイス１２、通信インターフェース１３、主記憶デバイス１４および外部記憶デバイス１５を備える。プロセッサ１１、通信制御デバイス１２、通信インターフェース１３、主記憶デバイス１４および外部記憶デバイス１５は、内部バス１６を介して相互に接続されている。主記憶デバイス１４および外部記憶デバイス１５は、プロセッサ１１からアクセス可能である。 FIG. 8 is a block diagram showing the hardware configuration of the form processing apparatus according to the third embodiment.
In FIG. 8, the form processing apparatus 110 includes a processor 11, a communication control device 12, a communication interface 13, a main storage device 14, and an external storage device 15. The processor 11, the communication control device 12, the communication interface 13, the main storage device 14, and the external storage device 15 are connected to each other via an internal bus 16. The main storage device 14 and the external storage device 15 are accessible from the processor 11.

また、帳票処理装置１１０の外部には、ポインティングデバイス２０、ディスプレイ２１およびスキャナ２２が設けられている。ポインティングデバイス２０、ディスプレイ２１およびスキャナ２２は、入出力インターフェース１７を介して内部バス１６に接続されている。 Further, outside the form processing apparatus 110, a pointing device 20, a display 21 and a scanner 22 are provided. The pointing device 20, the display 21, and the scanner 22 are connected to the internal bus 16 via the input/output interface 17.

プロセッサ１１は、帳票処理装置１１０全体の動作制御を司るハードウェアである。主記憶デバイス１４は、例えば、ＳＲＡＭまたはＤＲＡＭなどの半導体メモリから構成することができる。主記憶デバイス１４には、プロセッサ１１が実行中のプログラムを格納したり、プロセッサ１１がプログラムを実行するためのワークエリアを設けたりすることができる。 The processor 11 is hardware that controls the operation of the entire form processing apparatus 110. The main storage device 14 can be composed of, for example, a semiconductor memory such as SRAM or DRAM. The main storage device 14 can store a program being executed by the processor 11 or can be provided with a work area for the processor 11 to execute the program.

外部記憶デバイス１５は、大容量の記憶容量を有する記憶デバイスであり、例えば、ハードディスク装置やＳＳＤ（ＳｏｌｉｄＳｔａｔｅＤｒｉｖｅ）である。外部記憶デバイス１５は、各種プログラムの実行ファイルやプログラムの実行に用いられるデータを保持することができる。外部記憶デバイス１５には、帳票処理プログラム１５Ａおよび帳票定義データ１５Ｂを格納することができる。帳票処理プログラム１５Ａは、帳票処理装置１１０にインストール可能なソフトウェアであってもよいし、帳票処理装置１１０にファームウェアとして組み込まれていてもよい。 The external storage device 15 is a storage device having a large storage capacity, and is, for example, a hard disk device or an SSD (Solid State Drive). The external storage device 15 can hold execution files of various programs and data used for executing the programs. The external storage device 15 can store a form processing program 15A and form definition data 15B. The form processing program 15A may be software that can be installed in the form processing apparatus 110, or may be incorporated in the form processing apparatus 110 as firmware.

通信制御デバイス１２は、外部との通信を制御する機能を有するハードウェアである。通信制御デバイス１２は、通信インターフェース１３を介してネットワーク１９に接続される。ネットワーク１９は、インターネットなどのＷＡＮ（ＷｉｄｅＡｒｅａＮｅｔｗｏｒｋ）であってもよいし、ＷｉＦｉなどのＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）であってもよいし、ＷＡＮとＬＡＮが混在していてもよい。 The communication control device 12 is hardware having a function of controlling communication with the outside. The communication control device 12 is connected to the network 19 via the communication interface 13. The network 19 may be a WAN (Wide Area Network) such as the Internet, a LAN (Local Area Network) such as WiFi, or a mixture of WAN and LAN.

入出力インターフェース１７は、ポインティングデバイス２０、ディスプレイ２１およびスキャナ２２から入力されるデータをプロセッサ１１が処理可能なデータ形式に変換したり、プロセッサ１１から出力されるデータをポインティングデバイス２０およびディスプレイ２１が処理可能なデータ形式に変換したりする。 The input/output interface 17 converts data input from the pointing device 20, the display 21 and the scanner 22 into a data format that the processor 11 can process, and converts data output from the processor 11 into a data format that the pointing device 20 and the display 21 can process.

プロセッサ１１が帳票処理プログラム１５Ａを主記憶デバイス１４に読み出し、帳票処理プログラム１５Ａを実行することにより、帳票定義データ１５Ｂに含まれる位置情報に基づいて、読取り対象文字列の候補となる文字列画像を帳票画像データから抽出し、その抽出した文字列画像の文字認識処理を実施し、簡易帳票定義データ１５Ｂに含まれる属性情報に基づいて文字認識結果の属性を照合することにより読取結果１１６を決定することができる。この時、帳票処理プログラム１５Ａは、図１の位置照合部１０４、文字認識部１０５および属性照合部１０６の機能を実現することができる。文字認識結果の属性の照合では、プロセッサ１１は、ネットワーク１９を介し、属性の検証処理に用いるデータベースにアクセスし、属性の検証処理に用いる情報を取得するようにしてもよい。 The processor 11 reads the form processing program 15A into the main storage device 14 and executes it. By doing so, the processor 11 can extract, from the form image data, a character string image that is a candidate for the character string to be read based on the position information included in the form definition data 15B, perform character recognition processing on the extracted character string image, and determine the reading result 116 by collating the attributes of the character recognition result based on the attribute information included in the simplified form definition data 15B. At this time, the form processing program 15A can realize the functions of the position matching unit 104, the character recognition unit 105, and the attribute matching unit 106 of FIG. 1. In the collation of the attributes of the character recognition result, the processor 11 may access, via the network 19, a database used for the attribute verification processing and acquire the information used for that processing.
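The three stages that the form processing program 15A realizes (position matching, character recognition, attribute matching) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not appear in the specification: the `Region` and `FieldDefinition` types, the tolerance-box position test, the use of a regular expression as the attribute information, and the `recognize` callable standing in for a real character recognizer.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Region:
    """A character string image cut out of the form image (hypothetical type)."""
    x: float
    y: float
    image: str  # stand-in for pixel data


@dataclass
class FieldDefinition:
    """One read target in the form definition data (hypothetical layout)."""
    x: float
    y: float
    tolerance: float        # allowed positional deviation, same unit as x and y
    attribute_pattern: str  # regular expression the recognized text must satisfy


def read_field(regions: List[Region], field: FieldDefinition,
               recognize: Callable[[str], str]) -> Optional[str]:
    """Return the reading result for one field, or None if any stage fails."""
    # Stage 1, position matching: keep regions close to the defined position.
    candidates = [r for r in regions
                  if abs(r.x - field.x) <= field.tolerance
                  and abs(r.y - field.y) <= field.tolerance]
    for region in candidates:
        # Stage 2, character recognition on the candidate image.
        text = recognize(region.image)
        # Stage 3, attribute matching against the defined attribute.
        if re.fullmatch(field.attribute_pattern, text):
            return text
    return None


# Usage with a fake recognizer that simply returns the stored string.
regions = [Region(100, 40, "2018-11-28"), Region(300, 400, "TOTAL")]
date_field = FieldDefinition(x=105, y=42, tolerance=10,
                             attribute_pattern=r"\d{4}-\d{2}-\d{2}")
result = read_field(regions, date_field, recognize=lambda image: image)
```

In this sketch a field is read only when all three stages succeed, so a `None` result is the point at which a fallback path could be attempted.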

なお、帳票処理プログラム１５Ａの実行は、複数のプロセッサやコンピュータに分担させてもよい。あるいは、プロセッサ１１は、ネットワーク１９を介してクラウドコンピュータなどに帳票処理プログラム１５Ａの全部または一部の実行を指示し、その実行結果を受け取るようにしてもよい。 The execution of the form processing program 15A may be shared by a plurality of processors or computers. Alternatively, the processor 11 may instruct a cloud computer or the like via the network 19 to execute all or part of the form processing program 15A and receive the execution result.

なお、本発明は上記した実施例に限定されるものではなく、様々な変形例が含まれる。例えば、上記した実施例は本発明を分かりやすく説明するために詳細に説明したものであり、必ずしも説明した全ての構成を備えるものに限定されるものではない。また、ある実施例の構成の一部を他の実施例の構成に置き換えることが可能であり、また、ある実施例の構成に他の実施例の構成を加えることも可能である。また、各実施例の構成の一部について、他の構成の追加・削除・置換をすることが可能である。また、上記の各構成、機能、処理部、処理手段等は、それらの一部又は全部を、例えば集積回路で設計する等によりハードウェアで実現してもよい。 It should be noted that the present invention is not limited to the above-described embodiments, but includes various modifications. For example, the above-described embodiments have been described in detail in order to explain the present invention in an easy-to-understand manner, and are not necessarily limited to those having all the configurations described. Further, a part of the configuration of a certain embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of a certain embodiment. Further, with respect to a part of the configuration of each embodiment, other configurations can be added/deleted/replaced. Further, the above-described respective configurations, functions, processing units, processing means, etc. may be realized by hardware by designing a part or all of them, for example, with an integrated circuit.

１０１スキャナ、１０２特徴抽出部、１０３帳票定義データベース、１０４位置照合部、１０５文字認識部、１０６属性照合部、１０７読取結果データベース、１０８帳票定義データ作成部

101 Scanner, 102 Feature Extraction Unit, 103 Form Definition Database, 104 Position Matching Unit, 105 Character Recognition Unit, 106 Attribute Matching Unit, 107 Read Result Database, 108 Form Definition Data Creation Unit

Claims

A form processing apparatus comprising:
a position matching unit that determines a character string image that is a candidate for a read target character string, based on a result of matching position information of a character string image extracted from form image data against position information defined for the read target character string;
a character recognition unit that performs character recognition based on the character string image that is the candidate for the read target character string; and
an attribute matching unit that determines a reading result of the character string image that is the candidate for the read target character string, based on a result of matching an attribute of the character recognition result from the character recognition unit against attribute information defined for the read target character string.

The form processing apparatus according to claim 1, wherein the position matching unit determines the character string image that is the candidate for the read target character string based on the distance between the position of the character string image extracted from the form image data and the position defined for the read target character string.

The form processing apparatus, wherein, when a plurality of the character string images exist in the form image, the position matching unit determines the character string image having the shortest distance to the read target character string as the candidate for the read target character string.
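The shortest-distance rule described above can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: positions are taken to be 2-D points, the distance is taken to be Euclidean, and the function name `nearest_candidate` is hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def nearest_candidate(region_positions: List[Point],
                      defined_position: Point) -> Optional[int]:
    """Return the index of the region closest to the defined position.

    Returns None when no character string image was extracted.
    """
    if not region_positions:
        return None
    # Euclidean distance is an assumption; the text only says "distance".
    distances = [math.dist(p, defined_position) for p in region_positions]
    return min(range(len(distances)), key=distances.__getitem__)


# Three extracted regions; the defined position lies nearest to the second.
positions = [(100.0, 40.0), (110.0, 45.0), (300.0, 400.0)]
best = nearest_candidate(positions, (108.0, 44.0))
```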

The form processing apparatus, further comprising a form definition data creation unit that creates form definition data in which position information and attribute information of the read target character string are defined, based on a position and an attribute of the read target character string.

The form processing apparatus according to claim 4, further comprising:
a form definition database that stores form definition data, in which position information and attribute information of the read target character string are defined, in association with a feature amount of form image data; and
a feature extraction unit that extracts the feature amount of the form image data,
wherein the form definition database selects the form definition data based on the feature amount of the form image data and outputs the selected form definition data to the position matching unit.

The form processing apparatus according to claim 5, wherein the form definition database selects the form definition data corresponding to a second feature amount for which the distance between a first feature amount of the form image data and the second feature amount associated with the form definition data is equal to or less than a threshold value, and outputs the selected form definition data to the position matching unit.
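The threshold-based selection of form definition data can be sketched as below. The feature amounts are assumed to be fixed-length numeric vectors compared by Euclidean distance; the vector contents, the identifiers such as `invoice_a`, and the function name `select_definitions` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def select_definitions(form_feature: Sequence[float],
                       stored_features: Dict[str, Sequence[float]],
                       threshold: float) -> List[str]:
    """Return the IDs of definitions whose stored feature is within the threshold."""
    return [definition_id
            for definition_id, feature in stored_features.items()
            if math.dist(form_feature, feature) <= threshold]


# Stored (second) feature amounts keyed by a hypothetical definition ID.
stored = {
    "invoice_a": [0.10, 0.90, 0.30],
    "invoice_b": [0.12, 0.88, 0.33],
    "receipt": [0.80, 0.10, 0.50],
}
# The first feature amount, extracted from the scanned form image.
selected = select_definitions([0.11, 0.89, 0.31], stored, threshold=0.1)
```

More than one definition can fall within the threshold, as happens here for the two invoice layouts, and an empty list corresponds to a failure to select any form definition data.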

The form processing apparatus according to claim 6, wherein, when a plurality of pieces of the form definition data are selected, the position matching unit calculates an evaluation value for each piece of form definition data based on the distance between the position of the character string image extracted from the form image data and the position defined for the read target character string, and determines a character string image based on the form definition data having the smallest evaluation value as the candidate for the read target character string.
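One way to compute such an evaluation value is sketched below. The text only states that the value is based on the distance, so the particular formula used here (the sum, over all read targets, of the Euclidean distance to the nearest extracted region) is an assumption, as are the names `evaluation_value` and `best_definition`.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def evaluation_value(region_positions: List[Point],
                     defined_positions: List[Point]) -> float:
    """Sum, over all read targets, of the distance to the nearest extracted region.

    Assumes at least one region was extracted from the form image.
    """
    return sum(min(math.dist(target, region) for region in region_positions)
               for target in defined_positions)


def best_definition(region_positions: List[Point],
                    definitions: Dict[str, List[Point]]) -> str:
    """Return the name of the definition with the smallest evaluation value."""
    return min(definitions,
               key=lambda name: evaluation_value(region_positions, definitions[name]))


regions = [(100.0, 40.0), (300.0, 400.0)]
definitions = {
    "layout_a": [(103.0, 44.0), (300.0, 400.0)],  # distances 5 and 0
    "layout_b": [(130.0, 80.0), (300.0, 410.0)],  # distances 50 and 10
}
chosen = best_definition(regions, definitions)
```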

The form processing apparatus according to claim 5, further comprising a definitionless reading unit that determines a reading result of the character string image based on a character recognition result of a character string image including an item name extracted from the form image data.

The form processing apparatus according to claim 8, wherein the definitionless reading unit executes reading of the form image data
when the position matching unit fails to match the position information,
when the character recognition unit fails in the character recognition,
when the attribute matching unit fails to match the attribute information,
when the form definition database fails to select the form definition data.

The form processing apparatus according to claim 8, wherein the form definition data creation unit executes creation of the form definition data
when the position matching unit fails to match the position information,
when the character recognition unit fails in the character recognition,
when the attribute matching unit fails to match the attribute information,
when the form definition database fails to select the form definition data,
when the definitionless reading unit fails to read the form image data.
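The failure handling above can be read in more than one way. The following sketch shows only one possible reading, in which definition-based reading is tried first, definitionless reading is tried if it fails, and creation of form definition data is triggered if that also fails. The function `process_form` and its callable parameters are hypothetical, and a return value of `None` is used here to represent failure of a stage.

```python
from typing import Callable, Optional

Reader = Callable[[str], Optional[str]]


def process_form(form_image: str,
                 read_with_definition: Reader,
                 read_without_definition: Reader,
                 create_definition: Callable[[str], None]) -> Optional[str]:
    """Try each reading path in turn; request a new definition if all fail."""
    # Definition-based path: definition selection, position matching,
    # character recognition and attribute matching are folded into one call.
    result = read_with_definition(form_image)
    if result is not None:
        return result
    # Fallback: reading based on item names, without form definition data.
    result = read_without_definition(form_image)
    if result is not None:
        return result
    # Last resort: hand the image over for creation of form definition data.
    create_definition(form_image)
    return None


# Usage with stand-in readers that always fail.
created = []
outcome = process_form("scan-001", lambda image: None, lambda image: None,
                       created.append)
```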

A form processing method executed by a processor, wherein the processor:
determines a character string image that is a candidate for a read target character string, based on a result of matching position information of a character string image extracted from form image data against position information defined for the read target character string;
performs character recognition based on the character string image that is the candidate for the read target character string; and
determines a reading result of the character string image that is the candidate for the read target character string, based on a result of matching an attribute of the character recognition result against attribute information defined for the read target character string.