JP2009223612A

JP2009223612A - Image recognition device and program

Info

Publication number: JP2009223612A
Application number: JP2008067266A
Authority: JP
Inventors: Teruka Saito; 照花斎藤
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2008-03-17
Filing date: 2008-03-17
Publication date: 2009-10-01

Abstract

PROBLEM TO BE SOLVED: To suppress the decline in recognition rate when recognizing handwritten characters entered on a fixed-form sheet, compared with the case where every possibility regarding the presence of an image around an entry field is considered.

SOLUTION: The device includes: a recognition dictionary 120 that stores, for each category, one or more class candidates corresponding to that category; an application class determination part 112 that selects as application classes, from the class candidates stored in the recognition dictionary 120 for each category, one or more classes specified based on the positional relationship between an entry field and an image around the entry field; a pattern recognition part 116 that executes pattern recognition processing on the image entered in the entry field and recognizes, from among the application classes, the class to which the entered image belongs; and a category determination part 118 that specifies, based on the recognition dictionary 120, the category corresponding to the class recognized by the pattern recognition part 116.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an image recognition apparatus and a program.

One application of handwritten character recognition is the recognition of handwritten entries on fixed-form sheets (also called forms) such as business forms, answer sheets, and questionnaires. To recognize handwritten entries on such a sheet, for example, the difference between an image of the sheet after entry and an image of the blank sheet is computed, and pattern recognition is performed on the resulting difference image. With this approach, if a handwritten entry overlaps an image that already exists on the sheet, such as an entry frame, a photograph, or a graphic figure, a complete image of the entry cannot be obtained, and recognition accuracy may deteriorate.

In contrast, the apparatus disclosed in Patent Document 1 creates a large number of frame-touching characters, in which an entry frame and a character are superimposed, while varying the position, size, or inclination of the character relative to the entry frame. These frame-touching characters are registered in a knowledge table and used for pattern recognition.

Japanese Patent Laid-Open No. 10-154204 (see in particular paragraphs 211 to 228 and FIGS. 41 to 44)

When various images such as photographs and figures may exist at various positions around an entry field, one conceivable approach is to perform pattern recognition that takes all of those possibilities into account. However, the more patterns are considered during pattern recognition, the more misrecognitions are likely to occur and the lower the recognition rate may become.

An object of the present invention is to suppress the decline in recognition rate compared with the case where every possibility regarding the presence of an image around an entry field is considered.

The present invention is an image recognition apparatus comprising: candidate storage means for storing, for each of a plurality of categories, one or more class candidates corresponding to that category; selection means for selecting, for each of the plurality of categories, from among the one or more class candidates stored in the candidate storage means for that category, one or more classes specified based on the positional relationship between an entry field and image elements around the entry field, as the classes to be used for determination of that category; pattern recognition means for executing pattern recognition processing on the image of an entry made in the entry field, thereby recognizing, from among the classes selected by the selection means for each of the plurality of categories, the class to which the image of the entry belongs; and output means for specifying, based on the candidate storage means, the category corresponding to the class recognized by the pattern recognition means, and outputting the specified category as the category to which the image of the entry made in the entry field belongs.

In one aspect, the image recognition apparatus further comprises extraction means for extracting the image of an entry made in the entry field from an input image, and end point detection means for detecting end points of the entry image extracted by the extraction means. The selection means specifies one or more image elements that touch any of the end points detected by the end point detection means, and selects the classes to be used for the determination based on the positional relationship between the specified image elements and the entry field.

In another aspect, the image recognition apparatus further comprises: entry field information storage means for storing, for each entry field of a fixed-form document containing one or more image elements and one or more entry fields, those of the class candidates stored in the candidate storage means for each category that are specified based on the positional relationship between the entry field and the image elements around it, as the classes to be used for determination for that entry field; and extraction means for extracting the image of the entry made in each entry field from an input target image. For each entry image extracted by the extraction means, the selection means obtains the classes to be used for determination for the corresponding entry field from the entry field information storage means.

Embodiments of the present invention (hereinafter, "embodiments") are described below with reference to the drawings.

The image recognition apparatus of this embodiment recognizes the characters, symbols, and the like (hereinafter collectively called "characters") that a user has entered on a fixed-form sheet (form) such as a business form or an answer sheet.

In this embodiment, the patterns considered when recognizing the content entered in an entry field are limited based on the positional relationship between the entry field and the image objects around it (printed images that exist before any entry is made, such as photographs, figures, entry frames, and character strings). If an image object could exist at any position relative to the entry field, the number of patterns to consider would be enormous, but once the positions of the image objects are limited, the number of patterns to consider can be reduced.

In general, pattern recognition prepares a recognition dictionary that holds, for example, feature information for each character to be identified, and refers to that dictionary to determine which character a given character image corresponds to. In this embodiment, the recognition dictionary holds not only feature information for characters in their complete state but also feature information for characters partially hidden by surrounding image objects. Even for the same character, a different relative positional relationship with a surrounding image object means that a different part is hidden, so for each character the dictionary holds feature information for the cases where the character is hidden under various positional relationships. The positional relationship here includes, for example, the relative direction, the distance, or both. During pattern recognition, the feature information that corresponds to the positional relationship between the character being recognized and the image objects around it is selected from the dictionary and used.

In the following, the type of a character is called a "category", and each of the variations of the same character type that differ in how the character is hidden by surrounding image objects is called a "class". For example, if the circle mark is one category, then the shape of a circle mark partly hidden by a surrounding image object is one class. Various classes arise within the single category "circle mark" depending on the shape and size of the surrounding image object and on its positional relationship to the circle mark, for example whether it lies above, below, left, or right.
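The relationship between categories and classes can be pictured as a small data structure. The following Python sketch is illustrative only: the class IDs, the field names, and the helper `classes_of` are assumptions introduced here and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassVariant:
    class_id: str     # hypothetical identifier, e.g. "C01"
    category: str     # character type the class belongs to
    hidden_part: str  # which part is hidden: "none", "bottom", ...


# Hypothetical variants: several classes inside the same category.
VARIANTS = [
    ClassVariant("C01", "circle", "none"),
    ClassVariant("C02", "circle", "bottom"),
    ClassVariant("C03", "circle", "upper_left"),
    ClassVariant("X01", "cross", "none"),
    ClassVariant("X02", "cross", "bottom"),
]


def classes_of(category):
    """Return the IDs of all classes that belong to one category."""
    return [v.class_id for v in VARIANTS if v.category == category]
```

Here one category ("circle") maps to several classes, one per way of being hidden, which is the one-to-many relationship the text describes.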

Considering all of these classes for each category would make the pattern recognition computation very costly. However, if it can be determined by some means whether an image object that partially hides the character exists around the character being recognized and, if so, what its positional relationship to the character is, the classes to consider within each category can be narrowed down. This embodiment is based on that idea.

FIG. 1 shows an example configuration of the image recognition apparatus of this embodiment. In this example, a filled-in document, obtained when a user writes on a fixed-form sheet, is read by an image reading device such as a scanner to generate an electronic filled-in image 10. The filled-in image 10 is input to the entry extraction unit 102 of the image recognition apparatus.

The entry extraction unit 102 extracts the image of the user's entries from the input filled-in image 10. The entries can be extracted, for example, by subtracting the original image from the filled-in image 10. As another example, the image of the entries (hereinafter the "entry content image") can be extracted by picking out from the filled-in image 10 the pixels that have the color of the writing instrument used. This method is applicable, for example, when entries are made with a writing instrument whose color does not appear in the image of the fixed-form sheet. A further method subtracts the original image from the filled-in image 10 and then extracts, from the result, the pixels that have the color of the writing instrument of interest. This can be used, for example, to separate answers written in pencil by an examinee from scoring marks such as circles and crosses written in red pen by a grader. When the filled-in image 10 contains several entry content images, the entry extraction unit 102 may also extract, for each of them, its position coordinates within the filled-in image 10. The position coordinates of an entry content image may be expressed, for example, as the coordinates of a predetermined reference point (such as the intersection of the diagonals, or the upper right corner) of the rectangle circumscribing the entry content image (for example, one whose sides are parallel to the x and y axes).
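The subtraction-based extraction and the reference point of the circumscribing rectangle can be sketched as follows. This is a minimal sketch under stated assumptions: images are binary nested lists (1 = ink), and the function names are hypothetical; a practical implementation would also need registration and noise handling, which the text leaves to known techniques.

```python
def subtract_original(filled, original):
    """Keep pixels that are ink in the filled-in image but not in the original."""
    return [
        [1 if f and not o else 0 for f, o in zip(filled_row, original_row)]
        for filled_row, original_row in zip(filled, original)
    ]


def bounding_box(image):
    """Return (left, top, right, bottom) of the ink pixels, or None if empty."""
    coords = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def diagonal_intersection(box):
    """Reference point: where the diagonals of the rectangle cross."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


# The original has a printed bar in the last row; the entry overlaps it.
original = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1]]
filled = [[0, 1, 1, 0], [0, 1, 1, 0], [1, 1, 1, 1]]
entry = subtract_original(filled, original)
```

In this example the part of the entry that falls on the printed bar is lost by the subtraction, which is the partial hiding that the later class selection is meant to handle.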

The recognition processing unit 110 performs pattern recognition on the entry content image extracted by the entry extraction unit 102 and determines the character that the image represents. The recognition dictionary 120 is used in this determination.

The recognition processing unit 110 includes an application class determination unit 112, a determination information storage unit 114, a pattern recognition unit 116, and a category determination unit 118.

The application class determination unit 112 determines the classes to apply when performing pattern recognition on the extracted entry content image. In this embodiment, it refers to the determination information storage unit 114 to specify the classes to apply to pattern recognition of the entry content image (hereinafter "application classes").

In a fixed-form document such as a questionnaire, the position of each entry field is fixed, and it is known what image objects exist at which positions around each field. An entry content image written in an entry field is not hidden if, for example, no image object exists in the vicinity of that field. In that case there is no need to consider the possibility that the entry content image is a partially hidden character. Conversely, if a large, high-density image object lies just below an entry field, the possibility that the lower part of the entry content image for that field is partially hidden must be considered.

For example, in the fixed-form document illustrated in FIG. 2, a group of image objects, including a text article 11, photographic images 12-1 and 12-2, and a graphic figure 13, and entry fields 14-1, 14-2, ... are each placed at predetermined positions. Because the photographic image 12-1 lies just below the entry field 14-1, the possibility that a wide area of the lower part of the entry content image in the entry field 14-1 is hidden by the photographic image 12-1 must be considered. To the left, right, and above the entry field 14-1, however, there are only the line drawing of the brackets that mark the field and the text article 11. Even if these hide the entry content image, they hide it only with thin lines, so it is unlikely that an area wide enough to change the shape features of the entry content image is hidden. A part hidden only by a thin line can be almost fully restored by known image processing such as interpolation of missing line segments, so such a case can be treated the same as the unhidden case. Therefore, when recognizing the entry in the entry field 14-1, it suffices to consider, for each category such as "circle mark" or "cross mark", the class for the case where the character is not hidden at all (that is, where the user wrote the character so that it fits properly inside the field) and the class for the case where its lower part is hidden. For the entry field 14-2, only the bracket lines that mark the field lie nearby (the photographic image 12-1 above it is assumed to be too far from the field), so only the classes for the case where the character is not hidden at all need to be considered.

In this way, the layout or image information of a fixed-form document reveals in advance which hidden-part classes should be considered when recognizing the entry in each entry field. In this case, therefore, the determination information storage unit 114 may store determination information that indicates, for each entry field, which of the many classes in each category to apply.
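Per-field determination information of this kind can be sketched as a lookup table. The Python sketch below is an assumption-laden illustration: the field IDs follow FIG. 2 as described in the text, but the class names, the table layout, and the function `classes_for_field` are hypothetical.

```python
# Hypothetical determination information: entry field -> category -> classes.
FIELD_CLASSES = {
    # Field 14-1 has a photograph just below it: unhidden and lower-hidden classes.
    "14-1": {
        "circle": ["circle_full", "circle_lower_hidden"],
        "cross": ["cross_full", "cross_lower_hidden"],
    },
    # Field 14-2 has only thin bracket lines nearby: unhidden classes only.
    "14-2": {
        "circle": ["circle_full"],
        "cross": ["cross_full"],
    },
}


def classes_for_field(field_id):
    """Flatten the per-category lists into the classes to apply for one field."""
    per_category = FIELD_CLASSES[field_id]
    return [c for classes in per_category.values() for c in classes]
```

With this layout, recognition for field 14-2 compares against two classes instead of four, which is the narrowing effect the text aims for.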

As another example, the determination information storage unit 114 may store, for each combination of position and type of a peripheral image object relative to the entry content image, information indicating the application classes for that combination. In this example, the application class determination unit 112 checks whether an image object exists in the vicinity of the entry content image extracted from the filled-in image 10, determines the position and type of the image object if one exists, and obtains the application classes for that result from the information in the determination information storage unit 114. This example is also applicable when nothing is known in advance about the image objects around the entry content image.

The pattern recognition unit 116 performs known pattern recognition processing to determine, from among the application classes determined by the application class determination unit 112, the class that the entry content image corresponds to. The recognition dictionary 120 is referred to in this determination.

FIG. 3 shows an example of the data held in the recognition dictionary 120. This example is a recognition dictionary for distinguishing the circle, triangle, and cross marks used in scoring a test. For each of the circle, triangle, and cross categories, it contains the class ID (identification information) and the class determination information of each class in that category. The class determination information serves as the criterion for deciding, during pattern recognition, whether an entry content image belongs to the class. For a circle mark, one example is a feature quantity that represents the characteristics of circle marks, learned from samples written by various people. For the class in which the lower part of the circle mark is hidden, a feature quantity learned from many samples of circle marks with the lower part hidden may be used as the class determination information. Such a feature quantity is called a "prototype". Methods that consider the variance of the training samples in addition to the prototype are also known; in that case, information on the variance is also included in the class determination information. Using a prototype as class determination information is only one example, and other determination information used in known pattern recognition methods may be used instead. For example, the set of sample images belonging to a class may itself be used as the class determination information for that class.
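Prototype-based matching restricted to the application classes can be sketched as a nearest-prototype search. This is only one possible reading: the specification leaves the recognition method to known techniques, so the Euclidean distance, the four-element feature vectors, and every ID and value below are assumptions made for illustration.

```python
import math

# Hypothetical recognition dictionary: class ID -> category and prototype.
RECOGNITION_DICTIONARY = {
    "C01": {"category": "circle", "prototype": (1.0, 1.0, 1.0, 1.0)},  # unhidden
    "C02": {"category": "circle", "prototype": (1.0, 1.0, 0.0, 0.0)},  # lower part hidden
    "X01": {"category": "cross", "prototype": (0.5, 0.5, 0.5, 0.5)},   # unhidden
    "X02": {"category": "cross", "prototype": (0.5, 0.5, 0.0, 0.0)},   # lower part hidden
}


def nearest_class(features, application_classes):
    """Return the application class whose prototype is closest to the features."""
    return min(
        application_classes,
        key=lambda class_id: math.dist(
            features, RECOGNITION_DICTIONARY[class_id]["prototype"]
        ),
    )


# Features of an entry whose lower part is missing (made-up values).
features = (0.9, 1.0, 0.1, 0.0)
with_hidden_classes = nearest_class(features, ["C01", "C02", "X01", "X02"])
unhidden_only = nearest_class(features, ["C01", "X01"])
```

In this toy data the same features match "C02" when lower-hidden classes are among the application classes, and "X01" when only unhidden classes are offered, so the choice of application classes directly affects the result.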

To create the class determination information for a class that corresponds to a partially hidden state, each sample image of an unhidden character is partially hidden in the manner corresponding to that class, and learning is then performed on the result. In addition to the class of sample images that are simply partially hidden, images in which the end points produced by the hiding are connected to each other may also be created and learned, generating determination information for a separate class that corresponds to the state after end point connection.
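Generating partially hidden training samples from unhidden ones can be sketched as a masking step. The sketch below assumes binary nested-list images and a rectangular hidden region; the function name `hide_region` is hypothetical, and the learning step itself is omitted.

```python
def hide_region(sample, region):
    """Blank the pixels inside region = (left, top, right, bottom), inclusive."""
    left, top, right, bottom = region
    return [
        [0 if left <= x <= right and top <= y <= bottom else v
         for x, v in enumerate(row)]
        for y, row in enumerate(sample)
    ]


# A tiny ring standing in for an unhidden circle mark sample.
circle_sample = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

# Hide the bottom row to obtain a sample for a "lower part hidden" class.
lower_hidden_sample = hide_region(circle_sample, (0, 3, 3, 3))
```

Applying the same mask to every unhidden sample of a category would yield the training set for the corresponding hidden class.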

Using such a recognition dictionary 120, the pattern recognition unit 116 specifies, from among the one or more application classes, the one or more classes closest to the target entry content image.

The category determination unit 118 refers to the recognition dictionary 120 to determine which category the class specified by the pattern recognition unit 116 belongs to. For example, when the pattern recognition unit 116 determines that the entry content image is closest to class C03, the category determination unit 118 obtains category C (circle mark), to which class C03 belongs, as the recognition result.
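The class-to-category lookup can be sketched by inverting a dictionary organized by category. The layout below and all class IDs other than C03 are assumptions for illustration; the specification does not fix a storage format.

```python
# Hypothetical dictionary contents grouped by category.
DICTIONARY_BY_CATEGORY = {
    "circle": ["C01", "C02", "C03"],
    "triangle": ["T01", "T02"],
    "cross": ["X01", "X02"],
}


def build_class_index(by_category):
    """Invert category -> classes into class -> category."""
    return {
        class_id: category
        for category, class_ids in by_category.items()
        for class_id in class_ids
    }


CLASS_INDEX = build_class_index(DICTIONARY_BY_CATEGORY)


def category_of(class_id):
    """Return the category that a recognized class belongs to."""
    return CLASS_INDEX[class_id]
```

Whichever hidden variant the pattern recognition step selects, the output is the same category, so callers never need to know how the character was hidden.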

The outline of this embodiment has been described above. Next, more specific examples of the system are described.

The first example is a preparation-in-advance type. It is used to recognize entries on documents, such as fixed-form documents, for which it is known in advance how entries may be hidden. It is called the preparation-in-advance type because the application classes for each entry field are obtained beforehand from the information known in advance and are then used during actual pattern recognition. This example is described with reference to FIGS. 4 to 10.

FIG. 4 shows an overall view of this preparation-in-advance system. As illustrated, the system includes a preparation processing system 130 and a recognition system 150. The preparation processing system 130 executes the preparation processing that builds the determination information storage unit 114 described above. In this example, the determination information storage unit 114 includes an entry field position storage unit 142 and an entry field/class correspondence storage unit 144 (both described in detail later). The preparation processing system 130 analyzes the image of the fixed-form document before entry, specifies the application classes for each entry field in the document, and registers them in the entry field/class correspondence storage unit 144. The entry field position storage unit 142 stores the position of each entry field on the fixed-form document. The user may determine the position of each entry field beforehand and register it in the entry field position storage unit 142.

The recognition system 150 refers to the determination information storage unit 114 built by the preparation processing system 130 and performs pattern recognition on the entries in a filled-in image.

Detailed examples of the preparation processing system 130 and the recognition system 150 are described below in turn.

FIG. 5 shows an example of the detailed configuration of the preparation processing system 130. The preparation processing system 130 reads the original image 30 of a fixed-form document and, for each entry field in the original image 30, examines the image objects that exist around it to determine the classes to apply when recognizing that field. The original image 30 is an image of the fixed-form document with no entries made at all.

The original image 30 is input to the image input unit 132 of this system. The image input unit 132 may be a scanner that reads the original image 30 printed on paper, or an interface through which electronic data of the original image 30 is input. The object extraction unit 134 uses known techniques to extract the image objects, such as character strings, photographs, and line drawings, that exist on the input original image 30. At this time, the object extraction unit 134 also obtains the position of each image object on the original image 30. The position of an image object may be expressed, for example, as the intersection of the diagonals of the rectangle circumscribing the image object (with sides parallel to the sides of the original image 30), or as the coordinates of two points on a diagonal of that rectangle.

As illustrated in FIG. 6, the entry field position storage unit 142 holds, for each entry field on the original image 30, the identification information of the field (entry field ID) and its position on the original image 30. In this example, the position of an entry field is expressed by the coordinates of two points, the upper right and lower left corners of the field's rectangular area.

For each entry field registered in the entry field position storage unit 142, the overlap determination unit 136 determines which of the image objects extracted by the object extraction unit 134 may overlap characters written in that field. In this processing, for example, the area of the entry field indicated by its position information is enlarged by a predetermined factor, and it is determined whether the enlarged area overlaps each image object in position. Enlarging the area handles the case where the user writes slightly outside the entry field. This determination identifies, for each entry field, the types of the image objects that overlap it in position (text, line drawing, dark photographic image, light photographic image, and so on) and the positional relationship of each such image object to the field.
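The enlarge-then-test step can be sketched with axis-aligned rectangles. The sketch assumes rectangles given as (left, top, right, bottom) in image coordinates with y increasing downward, enlargement about the rectangle's center, and a coarse position label; the function names and the 1.5 factor are hypothetical choices, not values from the specification.

```python
def enlarge(rect, factor):
    """Scale a rectangle about its center by the given factor."""
    left, top, right, bottom = rect
    cx, cy = (left + right) / 2, (top + bottom) / 2
    half_w = (right - left) / 2 * factor
    half_h = (bottom - top) / 2 * factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def rects_overlap(a, b):
    """True if two axis-aligned rectangles share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def relative_position(field, obj):
    """Coarse position of the object's center relative to the field."""
    ox, oy = (obj[0] + obj[2]) / 2, (obj[1] + obj[3]) / 2
    vertical = "upper" if oy < field[1] else "lower" if oy > field[3] else ""
    horizontal = "left" if ox < field[0] else "right" if ox > field[2] else ""
    return (vertical + " " + horizontal).strip() or "center"


field = (100, 100, 200, 140)   # entry field
photo = (90, 145, 210, 300)    # photograph 5 pixels below the field

touches_as_is = rects_overlap(field, photo)
touches_enlarged = rects_overlap(enlarge(field, 1.5), photo)
```

The photograph does not touch the field itself but does overlap the enlarged area, so an entry that strays slightly below the field would be flagged as possibly hidden in its lower part.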

The selection rule storage unit 139 stores, for each combination of image object type and positional relationship between entry field and image object, the application classes for that combination. FIG. 7 schematically shows an example of the information stored in the selection rule storage unit 139. Strictly speaking, the leftmost column of the table in FIG. 7 only illustrates concrete examples of overlap and is not stored in the selection rule storage unit 139. In that column, the range of the entry field is shown as a broken-line rectangle for convenience. What the selection rule storage unit 139 stores is the information in the "overlapping object" and "classes applied to recognition" (application classes) columns.

The "overlapping object" column holds the combination of the type and the position (positional relationship) of the overlapping object. For example, the first row from the top is a case in which the overlapping image objects are parentheses that mark the entry field. In this case the type of the overlapping object is "thin line segment", because the objects are parentheses, and the position is the left and right of the entry field. The second row corresponds to the case where a dark photographic image object exists at the upper right of the entry field. The "classes applied to recognition" column holds the class IDs of the application classes. In the figure, for convenience of explanation, the character shapes corresponding to the classes are shown instead of the class IDs.

For example, in the first row from the top, the only overlapping objects are thin line segments, so for each category such as the circle mark and the cross mark, class determination information is registered for the basic-form class, which is not occluded by any image object. In the second row, on the other hand, there is a dark photographic image area at the upper left of the entry field, so the possibility that the upper left of a character written in the entry field is hidden must be considered. The application classes for the second row therefore include, for each category, the basic-form class plus a class corresponding to the case where the upper left of the character is hidden and missing. The fourth row is an example in which a thick rectangular frame surrounds the entry field. The rectangular frame is treated as thick line segments above, below, to the left of, and to the right of the entry field. In this case, the entry content image and a thick line segment (an image object) may be recognized as a single connected character image. For example, when a circle mark crosses one side of the rectangular frame, the stroke of the circle and the side of the frame may merge, so that, for example, a half-moon-shaped image is recognized as the entry content image. A class corresponding to such a half-moon-shaped image is therefore added to the application classes.
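A selection rule table of this kind can be modeled as a mapping from (object type, position) to the extra classes that must be considered. The sketch below is illustrative only: the class IDs, the key strings, and the choice to represent each rule as "classes added on top of the basic forms" are assumptions, not the contents of FIG. 7.

```python
# Hypothetical class IDs: "C.." are circle variants, "X.." are cross variants.
BASIC_CLASSES = {"C01", "X01"}  # unoccluded basic forms, always considered

# (object type, position relative to the entry field) -> extra classes.
SELECTION_RULES = {
    ("thin line", "left and right"): set(),           # basic forms suffice
    ("dark photo", "upper left"): {"C02", "X02"},     # upper left missing
    ("thick line", "all sides"): {"C05"},             # half-moon shape
}


def application_classes(overlapping_objects):
    """Union of the basic classes and the classes required by each overlap.

    Combinations with no rule (for example a light photographic image that
    does not hide the entry) contribute nothing beyond the basic forms.
    """
    classes = set(BASIC_CLASSES)
    for key in overlapping_objects:
        classes |= SELECTION_RULES.get(key, set())
    return classes


print(sorted(application_classes([("dark photo", "upper left")])))
```

Keeping the basic forms outside the table means that an entry field with no overlapping object still receives a usable set of application classes.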

Note that if the object near the entry field is a photographic image object light enough that the entry content image can still be extracted even where it overlaps the object, there is no need to consider partial loss of the entry content image due to the overlap.

Type determination, such as whether an object is a dark or a light photographic image, can be performed by known techniques (for example, by calculating the average density).

Based on the determination result of the duplication determination unit 136 and the information stored in the selection rule storage unit 139, the class correspondence information registration unit 138 determines the application classes for each entry field and registers the correspondence between each entry field and its application classes in the entry field / class correspondence storage unit 144.

For example, when the duplication determination unit 136 determines that a dark photographic image object overlaps the upper right of a certain entry field, the class correspondence information registration unit 138 registers, in the entry field / class correspondence storage unit 144, the IDs of the classes listed in the application class column of the second row of the example in FIG. 7, in association with the ID of that entry field. FIG. 8 shows an example of the data stored in the entry field / class correspondence storage unit 144. "C01", "T02", and so on are class IDs.

By performing the above processing for every entry field on the original image 30, the application class information for each entry field is accumulated in the entry field / class correspondence storage unit 144.

The preparation processing system 130 has been described above. Next, a detailed example of the recognition system 150 is described with reference to FIGS. 9 and 10.

The recognition system 150 illustrated in FIG. 9 includes an image input unit 152, a positional distortion correction unit 154, a difference extraction unit 156, a preprocessing unit 160, a recognition processing unit 110, and a recognition dictionary 120.

The completed image 10 to be recognized is input to the image input unit 152. The completed image is the image obtained after the user has written on the original image 30. The image input unit 152 may be a scanner that reads a paper document or an interface that receives image data, and it may be shared with the image input unit 132 of the preparation processing system 130. The positional distortion correction unit 154 applies well-known positional deviation correction and distortion correction to the input completed image 10 (for alignment with the original image 30). The difference extraction unit 156 subtracts the original image from the completed image after position and distortion correction, thereby obtaining a difference image that shows only what the user has written. The preprocessing unit 160 applies preprocessing to the difference image obtained by the difference extraction unit 156 to make it suitable for the subsequent pattern recognition. In the illustrated example, the preprocessing unit 160 includes a predetermined color pixel extraction unit 162, a missing line segment interpolation unit 164, and a contact separation unit 166.
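The difference extraction step can be illustrated with a small sketch. It assumes already-aligned grayscale images stored as lists of rows with values from 0 (black) to 255 (white), and a hypothetical threshold of 40; the specification does not prescribe a particular subtraction method.

```python
def difference_image(completed, original, threshold=40):
    """Binary mask that is 1 where the completed page is noticeably darker
    than the blank form, i.e. where the user appears to have written."""
    return [[1 if o - c > threshold else 0 for c, o in zip(c_row, o_row)]
            for c_row, o_row in zip(completed, original)]


# The blank form has a dark printed bar in the second row (columns 0 and 1).
original = [[255, 255, 255, 255],
            [0, 0, 255, 255]]
# The user's stroke covers columns 1 and 2 of both rows.
completed = [[255, 30, 30, 255],
             [0, 0, 30, 255]]

print(difference_image(completed, original))
```

The stroke pixel that falls on the printed bar (second row, column 1) does not appear in the mask, which is the kind of partial loss of the entry that the application classes are meant to account for.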

The predetermined color pixel extraction unit 162 extracts only pixels of a predetermined color from the difference image. This assumes, for example, recognition of scoring marks on examination answers. In the typical case where an examinee writes answers on an answer sheet with a black pen and a grader writes scoring marks such as circles and crosses on it with a red pen, the predetermined color pixel extraction unit 162 extracts from the difference image the pixels whose color falls within the red range.
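As a rough illustration of extracting pixels in a red range, the sketch below works on RGB tuples. The thresholds (`min_red`, `max_other`) are hypothetical; a real implementation would likely tune them or work in another color space.

```python
def is_red(pixel, min_red=150, max_other=100):
    """True when the pixel has a strong red channel and weak green and blue."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def extract_color_mask(image, predicate=is_red):
    """Binary mask marking the pixels accepted by the color predicate."""
    return [[1 if predicate(p) else 0 for p in row] for row in image]


image = [[(255, 255, 255), (200, 30, 40)],   # white paper, red ink
         [(20, 20, 20), (220, 60, 50)]]      # black ink, red ink

print(extract_color_mask(image))
```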

The missing line segment interpolation unit 164 performs missing line segment interpolation on the image composed of the pixels of the predetermined color extracted by the predetermined color pixel extraction unit 162. Missing line segment interpolation is a well-known process that fills in the missing portions of line segments. As a result, even if part of a line segment is lost because of scanning noise or an intersection with a thin line, that part is interpolated.

The contact separation unit 166 processes the image after missing line segment interpolation so as to separate images that are in contact with each other but should be treated as separate objects. A known method (see, for example, Japanese Patent Application Laid-Open No. 2007-241357) can be used for this contact separation process, so its description is omitted.

With the functional modules described above, the preprocessing unit 160 obtains the image of each individual scoring mark in the completed image 10 (the entry content image mentioned earlier) and outputs each of these images in association with information indicating its position on the completed image 10. The recognition processing unit 110 refers to the recognition dictionary 120 to determine which category each entry content image belongs to.

FIG. 10 shows an example of the detailed configuration of the recognition processing unit 110. In the example of FIG. 10, the normalization unit 172 normalizes each entry content image input from the preprocessing unit 160. Normalization is a well-known process that adjusts the size, position, inclination, line width, and so on of an image to predetermined standards. The feature quantity extraction unit 174 then obtains the feature quantity of each normalized entry content image. This process compresses information by converting the entry content image from the image itself into a set of parameters (feature quantities) that represent the features of the image. Conventionally known feature quantities and extraction methods may be used.

In parallel with this processing of the entry content images, the entry field specifying unit 176 identifies, based on the position information of each entry content image input from the preprocessing unit 160, which entry field each entry content image was written in. The information in the entry field position storage unit 142 is referred to for this identification. For example, the entry field closest to the position of the entry content image may be taken as the entry field corresponding to that image, and the ID of that entry field obtained. For each entry content image, the application class determination unit 112 retrieves from the entry field / class correspondence storage unit 144 the IDs of the application classes corresponding to the ID of the entry field identified by the entry field specifying unit 176. The application class IDs obtained for each entry content image are input to the pattern recognition unit 116.
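The nearest-field lookup followed by retrieval of the application classes can be sketched as below. The field IDs, the use of field center points, and the stored class lists are hypothetical stand-ins for the contents of the entry field position storage unit 142 and the entry field / class correspondence storage unit 144.

```python
import math

# Hypothetical stand-in for unit 142: entry field ID -> center coordinates.
ENTRY_FIELD_POSITIONS = {"F001": (120.0, 80.0), "F002": (120.0, 200.0)}

# Hypothetical stand-in for unit 144: entry field ID -> application class IDs.
FIELD_CLASSES = {
    "F001": ["C01", "T01", "X01"],
    "F002": ["C01", "C02", "T01", "T02", "X01", "X02"],
}


def nearest_field(position, fields=ENTRY_FIELD_POSITIONS):
    """ID of the entry field whose center is closest to the given position."""
    return min(fields, key=lambda field_id: math.dist(position, fields[field_id]))


def classes_for_image(position):
    """Application class IDs for an entry content image at the given position."""
    return FIELD_CLASSES[nearest_field(position)]


print(nearest_field((118.0, 190.0)), classes_for_image((118.0, 190.0)))
```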

The pattern recognition unit 116 determines, by pattern recognition processing, which class each entry content image corresponds to. More specifically, for each entry content image, it obtains from the recognition dictionary 120 the class determination information for the IDs of the application classes that the application class determination unit 112 found for that image, and performs well-known pattern recognition based on that class determination information and the feature quantity of the image obtained by the feature quantity extraction unit 174. In this way, the one or more classes closest to the entry content image are identified from among the application classes.

The category determination unit 118 obtains from the recognition dictionary 120 (see FIG. 3) the category to which the identified class belongs. The obtained category (in the example of FIG. 3, one of circle, triangle, and cross) is the result of recognizing the entry content image by the recognition system 150.
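One simple way to picture recognition restricted to the application classes is nearest-prototype matching. The sketch below is an assumption-laden illustration, not the method of the specification: the three-element feature vectors, the class IDs, and the use of Euclidean distance are all hypothetical, and only the single closest class is returned.

```python
import math

# Hypothetical recognition dictionary: class ID -> (category, prototype features).
RECOGNITION_DICTIONARY = {
    "C01": ("circle", [1.0, 1.0, 0.0]),    # basic circle
    "C02": ("circle", [0.5, 1.0, 0.0]),    # circle with part missing
    "T01": ("triangle", [0.0, 1.0, 1.0]),  # basic triangle
    "X01": ("cross", [0.4, 0.9, 0.1]),     # basic cross
}


def recognize(features, application_class_ids, dictionary=RECOGNITION_DICTIONARY):
    """Return (class ID, category) of the closest prototype, searching only
    the classes admitted as application classes."""
    best = min(application_class_ids,
               key=lambda class_id: math.dist(features, dictionary[class_id][1]))
    return best, dictionary[best][0]


features = [0.48, 0.98, 0.02]
print(recognize(features, ["C01", "T01", "X01"]))         # basic forms only
print(recognize(features, ["C01", "C02", "T01", "X01"]))  # occluded form admitted
```

With these made-up numbers the same feature vector is assigned to different categories depending on whether the occluded-circle class is admitted, which illustrates why the set of application classes affects the recognition result.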

The description so far covers processing up to the recognition of individual entry content images, but this may be followed by post-processing such as tallying the recognition results (for example, counting the circle marks to assign a score).

The advance-preparation type system has been described above. That system applies when it is known in advance what image objects lie in the vicinity of the entry content image. In contrast, an example of a system without advance preparation, which is applicable even when no such prior knowledge is available, is described with reference to FIGS. 11 and 12.

This system does not need a mechanism such as the preparation processing system 130 illustrated in FIG. 4; it only needs a recognition system. The overall configuration of the recognition system in this case may be the same as that shown in FIG. 9, for example, but the configuration of the recognition processing unit 110 within it differs from that of the advance-preparation type system. FIG. 11 shows an example of the configuration of the recognition processing unit 110 in the system without advance preparation.

Within the internal configuration of the recognition processing unit 110 illustrated in FIG. 11, the functional modules other than the end point extraction unit 180, the contact determination unit 182, the application class determination unit 186, and the contact end point / class correspondence storage unit 146 are the same as the modules with the same reference numerals in the recognition processing unit 110 of the advance-preparation type system illustrated in FIG. 10.

Broadly speaking, the system without advance preparation performs, at the time the entry content image is recognized (rather than in advance), processing similar to what the preparation processing system 130 performs in the advance-preparation type system. In advance processing, however, no actual entry content image exists, so the overlap between the entry field and the image objects is examined; in this example, the overlap between the actual entry content image and the image objects can be examined instead. When the entry field is examined, every partial occlusion that the image objects could cause must be covered by the application classes so that recognition does not fail even if the actual entry content image is somewhat displaced from the field. In this example, by contrast, only the classes that match the actual overlap between the entry content image and the surrounding image objects need to be examined as application classes, so the candidates can be narrowed down further.

In this example, the end point extraction unit 180 extracts end points from each entry content image supplied by the preprocessing unit 160. Methods of extracting end points from an image are well known, so their description is omitted.
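The specification leaves the end point extraction method open. One common approach, shown here purely as an illustration, assumes the stroke has already been thinned to a one-pixel-wide line and treats any stroke pixel with exactly one 8-connected neighbor as an end point; the function name and the binary-mask representation are assumptions.

```python
def find_endpoints(mask):
    """Return (x, y) of stroke pixels that have exactly one 8-connected neighbor.

    `mask` is a list of rows holding 0 (background) or 1 (stroke) and is
    assumed to contain a one-pixel-wide (thinned) stroke.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    endpoints = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            neighbors = sum(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbors == 1:
                endpoints.append((x, y))
    return endpoints


stroke = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
print(find_endpoints(stroke))
```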

The contact determination unit 182 identifies, among the end points of the entry content image extracted by the end point extraction unit 180, those that are in contact with another image object. Whether a detected end point is in contact with another image object can be determined by comparing the position of the end point with the position of each image object obtained from the original image 30.

Based on the determination result of the contact determination unit 182, the application class determination unit 186 determines the classes to be considered when recognizing the entry content image, that is, the application classes. For this determination it refers to the information stored in the contact end point / class correspondence storage unit 146.

The contact end point / class correspondence storage unit 146 stores application classes in association with the positions of the end points of an entry content image that are in contact with other image objects. FIG. 12 schematically shows an example of the data stored in the storage unit 146. Strictly speaking, the leftmost column of the table in FIG. 12 only shows concrete examples of entry content images for the case represented by each row and is not stored in the contact end point / class correspondence storage unit 146. What the storage unit 146 stores is the information in the "contact between end point and other figure" and "classes applied to recognition" (application class) columns. The "contact between end point and other figure" column holds the positions of the end points in the entry content image that are in contact with other image objects. The position of a contacting end point means its position within the entry content image, for example, whether the end point lies on the upper, lower, left, or right side of the entry content image.

For example, the leftmost column of the table in FIG. 12 shows examples of entry content images supplied by the preprocessing unit 160. The black-line image inside each broken-line rectangle is the entry content image. In the example in the top row, the whole handwritten circle mark has been cut out as the entry content image. This image has two end points at the bottom, but neither is in contact with another image object (the image objects may be obtained from the original image 30). If an entry content image were partially hidden by another image object, the entry content image would have an end point in contact with that object. Conversely, therefore, the entry content image in the top row of FIG. 12 is not hidden by any other image object. In such a case, as shown in the rightmost column of the table in FIG. 12, only the classes for the basic form of the characters of each category (with nothing missing) need to be used as application classes.

In the second row of FIG. 12, on the other hand, the entry content image supplied by the preprocessing unit 160 has four end points. If there is an image object above the circumscribed rectangle of the entry content image (the broken-line rectangle in the figure; see the second row of the center column of the table in FIG. 12) and the two upper end points are found to touch the area of that object, those two end points are extracted by the end point extraction unit 180. In this case it is known that the two end points are the upper end points of the entry content image or, from another viewpoint, that the image object overlapping the entry content image lies above the circumscribed rectangle of the image. Because the upper part of the entry content image is thus known to be hidden, the application classes in this case include, for each category, the basic-form class plus the classes corresponding to the state in which the upper part is hidden. For the circle category, for example, classes such as an arc with the upper part missing and a semicircle with the upper part missing (to allow for the case where the end points have been connected) are added.
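The two cases above can be sketched as a lookup keyed by the side on which a contacting end point lies. Everything concrete here is an assumption made for illustration: the class IDs, the one-pixel contact tolerance, the rectangle tuples `(left, top, right, bottom)`, and deciding the side by the nearest edge of the circumscribed rectangle.

```python
BASIC_CLASSES = {"C01", "T01", "X01"}  # hypothetical basic-form class IDs

# Hypothetical stand-in for unit 146: side of the contacting end point ->
# extra classes (here, circle variants with the upper part missing).
EXTRA_CLASSES_BY_SIDE = {"top": {"C_TOP_ARC", "C_TOP_HALF"}}


def touches(point, rect, tolerance=1):
    """True when the point lies inside rect or within `tolerance` pixels of it."""
    x, y = point
    left, top, right, bottom = rect
    return (left - tolerance <= x <= right + tolerance
            and top - tolerance <= y <= bottom + tolerance)


def side_of(point, bbox):
    """Side of the circumscribed rectangle that the point is closest to."""
    x, y = point
    left, top, right, bottom = bbox
    distances = {"left": x - left, "right": right - x,
                 "top": y - top, "bottom": bottom - y}
    return min(distances, key=distances.get)


def application_classes(endpoints, bbox, image_objects):
    """Basic classes plus the classes for every side with a contacting end point."""
    classes = set(BASIC_CLASSES)
    for point in endpoints:
        if any(touches(point, obj) for obj in image_objects):
            classes |= EXTRA_CLASSES_BY_SIDE.get(side_of(point, bbox), set())
    return classes


bbox = (10, 20, 50, 60)             # circumscribed rectangle of the entry image
image_objects = [(0, 0, 100, 19)]   # an object directly above the entry image
whole_mark = [(25, 60), (35, 60)]   # two free end points at the bottom
cut_mark = [(20, 20), (40, 20), (25, 60), (35, 60)]  # upper two touch the object

print(sorted(application_classes(whole_mark, bbox, image_objects)))
print(sorted(application_classes(cut_mark, bbox, image_objects)))
```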

The application classes determined by the application class determination unit 186 are supplied to the pattern recognition unit 116.

The pattern recognition unit 116 and the category determination unit 118 may perform the same processing as the corresponding elements in the example of FIG. 10.

The embodiments of the present invention have been described above.

If pattern recognition of an entry content image took into account every case of partial image loss that other image objects could cause, the possibility of misrecognition would increase. In the present embodiment, by contrast, the cases to be considered are narrowed down based on the positional relationship between the entry content image (or the entry field in which it is written) and the image objects in its vicinity, so the possibility of misrecognition can be reduced.

Each system described above is realized, for example, by causing a general-purpose computer to execute a program that describes the processing of the functional modules described above. As shown in FIG. 13, the computer has, as hardware, a circuit configuration in which, for example, a microprocessor such as a CPU 1000, memory (primary storage) such as a random access memory (RAM) 1002 and a read-only memory (ROM) 1004, an HDD controller 1008 that controls an HDD (hard disk drive) 1006, various I/O (input/output) interfaces 1010, and a network interface 1012 that controls connection to a network such as a local area network are connected via, for example, a bus 1014. A disk drive 1016 for reading from and/or writing to portable disk recording media such as CDs and DVDs, a memory reader/writer 1018 for reading from and/or writing to portable nonvolatile recording media of various standards such as flash memory, and the like may also be connected to the bus 1014, for example via the I/O interface 1010. A program describing the processing of each functional module exemplified above is stored in a fixed storage device such as a hard disk drive, via a recording medium such as a CD or DVD or via communication means such as a network, and is installed on the computer. The functional modules exemplified above are realized when the program stored in the fixed storage device is read into the RAM 1002 and executed by a microprocessor such as the CPU 1000. Some or all of these functional modules may instead be configured as hardware circuits such as a dedicated LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field Programmable Gate Array).

FIG. 1 is a diagram showing an example of the image recognition apparatus of the embodiment.
FIG. 2 is a diagram showing an example of a completed image of a standard document and an example of the image obtained by extracting the scoring marks from it.
FIG. 3 is a diagram schematically showing an example of the data content of the recognition dictionary.
FIG. 4 is a diagram showing an overall view of the advance-preparation type system.
FIG. 5 is a diagram showing an example of the detailed configuration of the preparation processing system in the advance-preparation type system.
FIG. 6 is a diagram showing an example of the data stored in the entry field position storage unit.
FIG. 7 is a diagram showing an example of the selection rules stored in the selection rule storage unit.
FIG. 8 is a diagram showing an example of the data stored in the entry field / class correspondence storage unit.
FIG. 9 is a diagram showing an example of the overall configuration of the recognition system.
FIG. 10 is a diagram showing an example of the internal configuration of the recognition processing unit of the recognition system.
FIG. 11 is a diagram showing an example of the configuration of the recognition processing unit in the system without advance preparation.
FIG. 12 is a diagram showing an example of the data stored in the contact end point / class correspondence storage unit.
FIG. 13 is a diagram showing an example of the hardware configuration of a computer.

Explanation of symbols

10: completed image; 102: entry extraction unit; 110: recognition processing unit; 112: application class determination unit; 114: determination information storage unit; 116: pattern recognition unit; 118: category determination unit; 120: recognition dictionary.

Claims

An image recognition apparatus comprising:
candidate storage means for storing, for each of a plurality of categories, one or more class candidates corresponding to the category;
selection means for selecting, for each of the plurality of categories, from among the one or more class candidates corresponding to the category stored in the candidate storage means, one or more classes specified based on the positional relationship between an entry field and image elements around the entry field, as the classes to be used in the determination for the category;
pattern recognition means for recognizing, by executing pattern recognition processing on the image entered in the entry field, the class to which the entered image belongs from among the classes selected by the selection means for each of the plurality of categories; and
output means for identifying, based on the candidate storage means, the category corresponding to the class recognized by the pattern recognition means, and outputting the identified category as the category to which the image entered in the entry field belongs.

The image recognition apparatus according to claim 1, further comprising:
extraction means for extracting, from an input image, the image entered in the entry field; and
end point detection means for detecting end points of the entered image extracted by the extraction means,
wherein the selection means identifies one or more image elements in contact with any of the end points detected by the end point detection means, and selects the classes to be used for the determination based on the positional relationship between the identified one or more image elements and the entry field.

The image recognition apparatus according to claim 1, further comprising:
entry field information storage means for storing, for each entry field of a standard document containing one or more image elements and one or more entry fields, and for each category, one or more classes, from among the class candidates stored in the candidate storage means, that are specified based on the positional relationship between the image elements around the entry field and the entry field, as the classes to be used for determination for that entry field; and
extraction means for extracting, from an input target image, the image entered in each entry field,
wherein the selection means obtains from the entry field information storage means, for the entered image of each entry field extracted by the extraction means, the classes to be used for determination for that entry field.

A program for causing a computer to function as:
candidate storage means for storing, for each of a plurality of categories, one or more class candidates corresponding to the category;
selection means for selecting, for each of the plurality of categories, from among the one or more class candidates corresponding to the category stored in the candidate storage means, one or more classes specified based on the positional relationship between an entry field and image elements around the entry field, as the classes to be used for determination for the category;
pattern recognition means for recognizing, by executing pattern recognition processing on the image entered in the entry field, the class to which the entered image belongs from among the classes selected by the selection means for each of the plurality of categories; and
output means for identifying, based on the candidate storage means, the category corresponding to the class recognized by the pattern recognition means, and outputting the identified category as the category to which the image entered in the entry field belongs.