JP2010250387A

JP2010250387A - Image recognition device and program

Info

Publication number: JP2010250387A
Application number: JP2009096365A
Authority: JP
Inventors: Hiroshi Shinjo; 広新庄; Takeshi Eisaki; 健永崎; Kazuki Nakajima; 和樹中島; Kenji Shibata; 憲志柴田
Original assignee: Hitachi Computer Peripherals Co Ltd
Current assignee: Hitachi Information and Telecommunication Engineering Ltd
Priority date: 2009-04-10
Filing date: 2009-04-10
Publication date: 2010-11-04
Anticipated expiration: 2029-04-10
Also published as: JP5424694B2

Abstract

PROBLEM TO BE SOLVED: To provide a process for rapidly detecting the paper area in a color or gray-scale document image acquired by scanning a paper document.
SOLUTION: In an image obtained by scanning a paper document in color or gray scale, the outline of the paper area is calculated by deriving the boundary threshold between the foreground color and the background color from a distribution analysis of image density, even when a dark color is printed near the edge of the paper. Image edges are extracted at high speed by performing a binary search in the direction perpendicular to each side, close to the boundary of the estimated paper area, and the lines that form the four sides are then calculated in detail. This achieves four-side detection whose amount of calculation depends only weakly on the size of a high-resolution color or gray-scale image.
COPYRIGHT: (C)2011, JPO&INPIT

Description

本発明は、画像認識装置及びプログラムに関し、例えば、紙の文書をＯＣＲ装置等によって撮像し、撮像した画像から文書領域と思われる部分のみを切り出して保存する技術に関するものである。 The present invention relates to an image recognition apparatus and a program, for example, to a technique of capturing a paper document by an OCR apparatus or the like, and cutting out and storing only a portion that seems to be a document area from the captured image.

法令遵守や文書電子化などの社会的な要求から、様々な業務文書を電子画像として保存し、これを読取る文書読取システムが社会的に求められている。一般に、このようなプロセスを遂行するためにはＯＣＲ（Optical Character Recognition）装置が用いられる。すなわち、ＯＣＲ装置を用いて紙文書をスキャンし、画像上から紙文書のエリアを検出し、当該紙文書エリア内に書かれている文字を認識し、当該読取結果を修正インタフェース上に表示し、当該読取結果に存在する紙面領域の検出誤りや読取誤りや読取欠損を人手で修正する、という一連のプロセスによって、様々な業務文書を電子画像として保存する。 Due to social demands such as legal compliance and document digitization, there is a social demand for document reading systems that store various business documents as electronic images and read them. In general, an OCR (Optical Character Recognition) apparatus is used to perform such a process. That is, a paper document is scanned using an OCR device, an area of the paper document is detected from the image, characters written in the paper document area are recognized, and the reading result is displayed on the correction interface. Various business documents are stored as electronic images by a series of processes in which detection errors, reading errors, and reading defects in the paper area existing in the reading result are manually corrected.

このとき、ＯＣＲ装置及び文書読取システムにとっての重要な課題の一つは、様々なデザインや色を持つ文書に対して、容量をなるべく減らした上で、正しい文書画像を保存することである。そのためには画像中から文書の領域を推定して、当該領域のみの画像を抽出、さらに補正（回転補正や切り出しなど）して、これを保存することが必要となる。ここでは、紙領域と思われるエリアを推定する処理を、紙面の４辺（エッジ）を検出するという意味において、４辺検出と称することにする。 At this time, one of the important issues for the OCR apparatus and the document reading system is to store a correct document image while reducing the capacity as much as possible for documents having various designs and colors. For this purpose, it is necessary to estimate the area of the document from the image, extract an image of only the area, further correct (rotate correction, cut out, etc.), and save this. Here, the process of estimating an area that seems to be a paper region is referred to as four-side detection in the sense that four sides (edges) of the paper surface are detected.

Patent Document 1: JP-A-8-162190
Patent Document 2: JP 2005-285010 A

２値画像に対しての４辺検出の技術は従来からあるが、さまざまなデザイン、模様、色を持つ文書に対しての４辺検出は未だ確立した技術とはなっていない。従来の手法では２値画像を用いる、あるいはカラー画像上を走査してエッジと思われる箇所をフィルタ演算などで抽出し、紙面領域の輪郭を追跡することで４辺検出を行っている（例えば、特許文献１）。つまり、従来のＯＣＲスキャナでは紙面領域（前景色）が白、背景色が黒であることを前提として、２値画像を用いて４辺検出を行っている。この場合は、２値画像からランを作成し、白と黒の境界線をラン上で追跡して紙面領域を囲む輪郭を計算するなどにより、４辺検出を行っている。従って、２値画像の入力を前提とした場合、紙面上に濃い色が紙の端まで載っている場合に、正確に検知できないという問題が生じる。その対策として、カラー画像やグレー階調画像を用いて４辺検出を行うことが考えられる。 Although there has been a conventional technique for detecting four sides of a binary image, detection of four sides for a document having various designs, patterns, and colors has not yet been established. In the conventional method, a binary image is used, or a color image is scanned to extract a portion that seems to be an edge by a filter operation or the like, and a four-side detection is performed by tracing the outline of a paper area (for example, Patent Document 1). That is, the conventional OCR scanner performs four-side detection using a binary image on the assumption that the paper area (foreground color) is white and the background color is black. In this case, four sides are detected by creating a run from a binary image, tracking a white / black boundary line on the run, and calculating a contour surrounding the paper area. Therefore, when it is assumed that a binary image is input, there is a problem in that it cannot be detected accurately when a dark color is placed up to the edge of the paper. As a countermeasure, it is conceivable to perform four-side detection using a color image or a gray gradation image.

しかしながら、２値画像に比べてカラー画像の容量は４〜２４倍へと大きく増えるため、２値画像を踏襲した方法では処理時間が大幅に増えるという問題がある。 However, since the capacity of a color image is greatly increased to 4 to 24 times that of a binary image, there is a problem that the processing time greatly increases in the method that follows the binary image.

また、紙面の背景色となる黒色も、ＯＣＲ装置に付いた紙粉などの影響により、安定した輝度にならないため、紙面領域の推定誤りが生じやすいという問題もある。 In addition, black, which is the background color of the paper surface, does not have a stable luminance due to the influence of paper dust attached to the OCR device, and there is a problem that an estimation error of the paper surface region is likely to occur.

さらに、上述のように高解像度のカラー画像やグレー階調画像を全面走査することは、計算量が掛かるため、処理時間の点で課題が残る。 Furthermore, as described above, scanning the entire surface of a high-resolution color image or gray gradation image requires a calculation amount, and thus there remains a problem in terms of processing time.

本発明はこのような状況に鑑みてなされたものであり、高解像度のカラー画像が対象であっても正確かつ高速に４辺検出することができる技術を提供するものである。 The present invention has been made in view of such a situation, and provides a technique capable of accurately and rapidly detecting four sides even when a high-resolution color image is a target.

In order to solve the above problem, the present invention pre-samples the image at lattice points and estimates the boundary between the background color and the foreground color by analyzing the distribution of the image's constituent colors, which yields a rough estimate of the paper area. A detailed binary search is then performed in the vicinity of this rough area, so that edges are detected with a reduced amount of calculation. In addition, the validity of each detected edge side is tested and, if necessary, a more detailed edge-side detection is performed. The present invention thus executes edge detection in a multi-stage configuration.

より具体的には、本発明による画像認識装置（ＯＣＲ装置）は、格子点抽出処理部と、画像解析部と、エッジ推定処理部と、境界線取得部と、を備えている。格子点抽出処理部は、画像データから複数の格子点を抽出する。画像解析部は、抽出された複数の格子点を用いて画像濃度に関する第1のヒストグラムを生成し、当該第１のヒストグラムから背景と前景とを分離するための第１の分離閾値を算出し、複数の格子点について当該第１の分離閾値を適用し、背景と前景の境界近傍の格子点を抽出する。また、エッジ推定処理部は、境界近傍の格子点も用いてエッジ推定する。なお、この推定は、第１の分離閾値に最も近い前記前景及び背景に属する複数の格子点に対して二分検索処理を実行することによって実現できる。境界線取得部は、エッジ推定処理によって得られた複数のエッジ点を用いて直線近似処理を実行して境界線を取得する。 More specifically, an image recognition apparatus (OCR apparatus) according to the present invention includes a lattice point extraction processing unit, an image analysis unit, an edge estimation processing unit, and a boundary line acquisition unit. The lattice point extraction processing unit extracts a plurality of lattice points from the image data. The image analysis unit generates a first histogram relating to the image density using the extracted plurality of grid points, calculates a first separation threshold for separating the background and the foreground from the first histogram, The first separation threshold is applied to a plurality of grid points to extract grid points near the boundary between the background and the foreground. The edge estimation processing unit also performs edge estimation using lattice points near the boundary. This estimation can be realized by executing a binary search process on a plurality of grid points belonging to the foreground and background closest to the first separation threshold. The boundary line acquisition unit acquires a boundary line by executing a straight line approximation process using a plurality of edge points obtained by the edge estimation process.

The image recognition apparatus further includes a boundary evaluation unit. The boundary evaluation unit evaluates whether a boundary line is valid as a boundary separating the foreground from the background, using the difference between the image density in the foreground region and the image density in the background region. Specifically, the boundary evaluation unit determines whether the number of locations where the image-density difference is equal to or less than a first threshold is less than a predetermined value (first condition), and whether the variance of the image density in the background region is larger than a second threshold (second condition). If either the first or the second condition is satisfied, the acquired boundary line is judged to be valid. If neither condition is satisfied, the boundary line is judged to be inappropriate. In that case, the image analysis unit acquires a predetermined number of pixel samples in the vicinity of the acquired boundary line, generates a second histogram for those pixel samples, and calculates from the second histogram a second separation threshold for separating the background from the foreground. The edge estimation processing unit then executes a binary search using the second separation threshold to detect a plurality of edge points (corrected edge points). Further, the boundary line acquisition unit executes a straight-line approximation using the corrected edge points to acquire a corrected boundary line. In this way, multi-stage edge detection processing is realized.

The image recognition apparatus further includes an inclination correction unit. When there are a plurality of boundary lines or corrected boundary lines, the inclination correction unit calculates the difference in inclination between two opposing boundary lines or corrected boundary lines and, when this difference is equal to or greater than a predetermined value, corrects the opposing boundary lines or corrected boundary lines.

さらなる本発明の特徴は、以下本発明を実施するための最良の形態および添付図面によって明らかになるものである。 Further features of the present invention will become apparent from the best mode for carrying out the present invention and the accompanying drawings.

本発明により、紙文書をカラーまたはグレー階調でスキャンした画像について、紙面のエッジに暗い色が載っていても、画像構成色の分布解析によって背景色と前景色の分離境界を推定して、これを分離することができるようになる。また、詳細な境界の探索を二分探索で行うことで、高解像度の画像に対しても高速な４辺検出が見込める。 According to the present invention, for an image obtained by scanning a paper document with color or gray gradation, even if a dark color is placed on the edge of the paper, the separation boundary between the background color and the foreground color is estimated by the distribution analysis of the image constituent color, This can be separated. Further, by performing a detailed boundary search by binary search, high-speed four-side detection can be expected even for a high-resolution image.

FIG. 1 is a diagram showing the apparatus configuration.
FIG. 2 is a conceptual diagram of four-side detection.
FIG. 3 is a functional block diagram of the four-side detection processing.
FIG. 4 is an example of image density distribution analysis in four-side detection.
FIG. 5 is an example of area estimation and determination of the binary search area in four-side detection.
FIG. 6 is an example of image density distribution analysis in the second pass of four-side detection.
FIG. 7 is an example of correcting one side with low likelihood in four-side detection.
FIG. 8 is an example of an image for which second-pass processing is assumed.

本発明は、例えば、紙の文書をＯＣＲ装置等によって撮像し、撮像した画像から文書領域と思われる部分のみを切り出して保存する技術に関するものである。これを実現するために、本発明では、格子点状に画像からプレサンプリングを行い、画像構成色の分布解析によって背景色と前景色の境界を推定することにより大まかな紙面領域を推定し、更に大まかな領域近辺で詳細な二分探索を行うことで計算量を抑えてエッジの検出を行っている。 The present invention relates to, for example, a technique for capturing a paper document with an OCR device or the like, and cutting out and storing only a portion that is considered to be a document area from the captured image. In order to realize this, in the present invention, pre-sampling is performed from an image in a lattice point shape, a rough paper area is estimated by estimating the boundary between the background color and the foreground color by analyzing the distribution of image constituent colors, and By performing a detailed binary search in the vicinity of a rough region, the amount of calculation is suppressed and edge detection is performed.

以下、添付図面を参照して本発明の実施形態について説明する。ただし、本実施形態は本発明を実現するための一例に過ぎず、本発明の技術的範囲を限定するものではないことに注意すべきである。また、各図において共通の構成については同一の参照番号が付されている。 Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. However, it should be noted that this embodiment is merely an example for realizing the present invention, and does not limit the technical scope of the present invention. In each drawing, the same reference numerals are assigned to common components.

<Configuration of recognition device (OCR device)>
First, the hardware configuration to which the present embodiment is applied will be described. FIG. 1 is a diagram showing a schematic configuration of a character recognition device (an OCR device and document reading system) according to an embodiment of the present invention.

In the OCR device 0100, the image capturing device 0101, which serves as the image capturing unit, converts a paper document into electronic data. The data is stored in the external storage device 0105 and the memory 0106, which serve as storage units, and is read by the central arithmetic unit 0107, which serves as the central processing unit (CPU).

本実施形態に係わるＯＣＲプログラム及び証跡管理プログラムは、外部記憶装置０１０５またはメモリ０１０６に蓄えられているか、通信装置０１０９を介して装置に導入され、これら記憶部０１０５又は０１０６に記憶される。ＯＣＲプログラムは、撮像された電子データ画像に対して、中央演算装置０１０７が画像処理を行い、４辺検出を行い、必要であれば紙面領域のみを切り出した画像を出力する。これらの処理結果に対しては、操作端末装置０１０２を通して操作者である人間が操作（修正）可能となっており、処理結果及び修正結果は表示端末装置０１０３に表示される。処理結果などの情報は、必要に応じて外部記憶装置０１０５に蓄積または通信装置０１０９を通して外部接続装置にデータとして送信されるようにしてもよい。 The OCR program and the trail management program according to the present embodiment are stored in the external storage device 0105 or the memory 0106 or introduced into the device via the communication device 0109 and stored in the storage unit 0105 or 0106. In the OCR program, the central processing unit 0107 performs image processing on the captured electronic data image, performs four-side detection, and outputs an image obtained by cutting out only the paper area if necessary. These processing results can be operated (corrected) by a human operator through the operation terminal device 0102, and the processing results and the correction results are displayed on the display terminal device 0103. Information such as processing results may be stored in the external storage device 0105 or transmitted as data to the external connection device through the communication device 0109 as necessary.

上述の各処理部及び装置は、内部バス０１０８によって繋がっている。入力された伝票類は、伝票の大きさや種類毎に、ソータ装置０１０４によって定義された箱に分配・集積される。言い換えるなら、ＯＣＲ装置０１００は、画像撮像装置０１０１とソータ装置０１０４を除けば、通常のパーソナルコンピュータ（ＰＣ）などのコンピュータシステムで構成されうるものである。 The processing units and devices described above are connected by an internal bus 0108. The input slips are distributed and collected in a box defined by the sorter device 0104 for each size and type of slip. In other words, the OCR device 0100 can be configured by a computer system such as a normal personal computer (PC) except for the image capturing device 0101 and the sorter device 0104.

ＯＣＲプログラムは、上記ＯＣＲ装置０１００から出力された画像、及び認識結果を表示端末装置０１０３に表示する。画像の４辺検出は、操作端末装置０１０２を通して操作者である人間によって修正、チェックが行われる。そして、ＯＣＲプログラムは、その修正結果を外部記憶装置０１０５またはメモリ０１０６に蓄える。 The OCR program displays the image output from the OCR device 0100 and the recognition result on the display terminal device 0103. The detection of the four sides of the image is corrected and checked by the person who is the operator through the operation terminal device 0102. Then, the OCR program stores the correction result in the external storage device 0105 or the memory 0106.

<Specific processing contents of four-side detection>
FIG. 2 is a diagram illustrating the concept of four-side detection. FIG. 3 is a flowchart outlining the four-side detection process. FIGS. 4 to 8 show examples of the four-side detection processing and of images, to aid understanding of the processing.

Each processing step will now be described in detail with reference to FIGS. 3 to 8. Unless otherwise specified, each step is performed by the central arithmetic unit 0107, which functions as the corresponding processing unit while it executes that step. For example, while the pre-sampling process is running, the central arithmetic unit 0107 acts as the pre-sampling processing unit. The same applies to the other steps.

1) Pre-sampling process (S0301)
In this step, an Ln × Ln grid is fitted to the input image (for example, a form image), and the pixel information at the grid points is sampled. In other words, a plurality of grid points is defined on the input image, and the color (density) of the pixels near each grid point is estimated. Samples are taken not only on the paper but also on the black background.

The value of Ln is determined from the minimum paper size to be detected and the maximum area that can be imaged. For example, to place at least 4 × 4 sampling points on a sheet of the minimum paper size, the number of grid points is set to (4 × width of the maximum area) / (width of the minimum paper size). Ln can be increased to improve accuracy, but the processing time increases accordingly.
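The grid-count rule above can be written out as a small calculation. This is a minimal sketch: the function name, the parameter names, and the choice to round up to a whole number of grid points are assumptions made for illustration and are not stated in the text.

```python
import math


def grid_points_per_axis(max_area_width, min_paper_width, points_on_min_paper=4):
    """Number of grid points along one axis of the scan area.

    Applies (points_on_min_paper x max_area_width) / min_paper_width.
    Rounding up is an assumption, so that the smallest sheet never
    receives fewer sampling points than requested.
    """
    return math.ceil(points_on_min_paper * max_area_width / min_paper_width)


# Hypothetical sizes in millimetres: a 300 mm wide scan area and a
# 100 mm wide smallest sheet.
print(grid_points_per_axis(300, 100))
```

Both widths only need to be in the same unit, since the rule uses their ratio.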

In this step, the image is then sampled at each grid point. The median of a Pn × Pn pixel block is used as the pixel value. With the median of Pn × Pn pixels, line-shaped noise up to (Pn/2 - 1) pixels wide crossing the block does not affect the result.

Accordingly, Pn may be set to twice the noise width expected in the scanning direction. In general, a Pn corresponding to about 1 mm is sufficient, that is, Pn = 8 for a 200 dpi image.
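The median sampling described above can be sketched as follows. The image is assumed to be a list of rows of gray values, and the function name and the clipping at the image border are illustrative choices that do not come from the text.

```python
from statistics import median


def sample_grid_point(image, cx, cy, pn=8):
    """Median gray value of the pn x pn block around grid point (cx, cy).

    `image` is a list of rows of gray values. The block is clipped at
    the image border (an assumption for this sketch).
    """
    height, width = len(image), len(image[0])
    half = pn // 2
    block = [
        image[y][x]
        for y in range(max(0, cy - half), min(height, cy - half + pn))
        for x in range(max(0, cx - half), min(width, cx - half + pn))
    ]
    return median(block)


# A bright 16 x 16 patch crossed by a dark vertical line 3 pixels wide,
# which is Pn/2 - 1 for Pn = 8.
patch = [[0 if 7 <= x <= 9 else 200 for x in range(16)] for _ in range(16)]
print(sample_grid_point(patch, 8, 8, pn=8))
```

In this example the dark line covers 24 of the 64 pixels in the block, fewer than half, so the median stays at the bright value.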

2) Threshold determination by image density distribution analysis (S0302)
In this step, a histogram of the image's constituent colors is generated from the Ln × Ln sample points obtained by pre-sampling.

FIG. 4 shows an example of such a histogram. In the histogram 0401 illustrated in FIG. 4, the data section 50 holds the background-color data and forms the first peak. Setting the threshold Tb1 to 100, the value following this peak, therefore separates the background color from the form color. Specifically, the horizontal axis of the image density histogram is scanned in ascending order from 0 (black) to find the first peak, and the point where the count falls below the fraction Tp of the count at the first peak is taken as the boundary between the background color and the foreground color. A value of 0.1 is generally suitable for Tp.
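The threshold search described above can be sketched in a few lines. The bin width, the way the first peak is identified (the first non-empty bin that is not exceeded by its right-hand neighbor), and the choice to return the lower edge of the bin where the count drops are all assumptions for illustration; the text fixes only the ascending scan, the first peak, and the ratio Tp.

```python
def separation_threshold(samples, bin_width=50, tp=0.1, max_value=255):
    """Estimate the background/foreground threshold from gray samples.

    Scans the histogram upward from 0 (black), takes the first local
    peak as the background peak, and returns the lower edge of the
    first later bin whose count is below tp times the peak count.
    Returns None if no such bin exists.
    """
    n_bins = max_value // bin_width + 1
    hist = [0] * n_bins
    for value in samples:
        hist[min(int(value) // bin_width, n_bins - 1)] += 1

    peak = None
    for i in range(n_bins):
        right = hist[i + 1] if i + 1 < n_bins else 0
        if hist[i] > 0 and hist[i] >= right:
            peak = i
            break
    if peak is None:
        return None

    for j in range(peak + 1, n_bins):
        if hist[j] < tp * hist[peak]:
            return j * bin_width
    return None


# Hypothetical samples: 40 dark background points, 2 mid-gray points,
# and 58 bright paper points.
samples = [20] * 40 + [60] * 2 + [230] * 58
print(separation_threshold(samples))
```

A coarse bin width keeps the first peak stable when the background luminance varies slightly from sample to sample.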

3) Edge estimation process (S0303)
The edge estimation process executed in this step consists of two stages.

まず、プレサンプリングによって得られたＬｎ×Ｌｎの格子点の情報を用いて、帳票のエッジ（４辺）の位置の推定処理が実行される。これは、格子点上の画素値（上記Ｐｎ×Ｐｎ画素の中央値）についてオペレータ演算により、エッジがどのブロック（格子で囲まれた小領域）にあるかを計算する。具体的には、上下左右の４つの境界線を識別するために、上下方向、ならびに左右方向の２種類の擬似Ｓｏｂｅｌオペレータを使用する。図５Ａはサンプリングとエッジ推定区間の例を示す図である。サンプリング点を丸で、推定された領域を灰色の丸（０５０１）で、更に詳細な判定を行う区間を太線（０５０１）で示している。 First, using the information of Ln × Ln lattice points obtained by pre-sampling, the process of estimating the position of the edge (four sides) of the form is executed. This is to calculate which block (small region surrounded by the grid) the edge is by operator calculation for the pixel value on the grid point (the median value of the Pn × Pn pixels). Specifically, two types of pseudo Sobel operators in the vertical direction and the horizontal direction are used in order to identify four boundary lines in the vertical and horizontal directions. FIG. 5A is a diagram illustrating an example of sampling and edge estimation intervals. Sampling points are indicated by circles, estimated regions are indicated by gray circles (0501), and sections for further detailed determination are indicated by bold lines (0501).
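The coarse, grid-level stage can be illustrated as below. The text does not define its "pseudo-Sobel" operators, so this sketch substitutes the standard 3 × 3 Sobel kernels applied to the grid of median values; the kernels, the function names, and the rule of picking the strongest response per grid row are assumptions.

```python
# Standard 3 x 3 Sobel kernels, used here as a stand-in for the
# unspecified pseudo-Sobel operators.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # left-right changes
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # up-down changes


def apply_kernel(grid, r, c, kernel):
    """Kernel response at interior grid position (r, c)."""
    return sum(
        kernel[dr + 1][dc + 1] * grid[r + dr][c + dc]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
    )


def strongest_columns(grid):
    """For each interior grid row, return (left_col, right_col).

    left_col has the strongest dark-to-bright response and right_col
    the strongest bright-to-dark response, so each lies within one
    grid step of the corresponding paper edge.
    """
    result = {}
    for r in range(1, len(grid) - 1):
        responses = {
            c: apply_kernel(grid, r, c, SOBEL_X)
            for c in range(1, len(grid[0]) - 1)
        }
        left = max(responses, key=responses.get)
        right = min(responses, key=responses.get)
        result[r] = (left, right)
    return result
```

The upper and lower edges would be handled the same way with `SOBEL_Y` applied along grid columns. The grid cells found here only bound the search; the exact edge position comes from the binary search that follows.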

Next, the edge is located precisely by a binary search within the small region obtained by the edge estimation. The binary search is performed, on each of the four sides, over the portion where the edge was estimated starting from the edge of the bitmap, and detects the edge by closing in on it from both ends. Because the binary search compares the pixel values at both ends with the value at their midpoint, edges can be detected even on forms with a dark ground color. That is, with pixel values G1 and G2 at the two end points and G3 at the midpoint, the edge is judged not only from the absolute difference |G1 - G2| but also from the magnitude of the change in pixel value. This is explained with reference to the example of FIG. 5B.

FIG. 5B shows how the pixel value changes over a range in which the binary search is performed. The pixel value (0503) of the point first judged to lie inside the boundary (on the paper) only barely exceeds the threshold Tb1, but if a somewhat darker color with a gradation is printed on the paper from there, the paper area actually extends to the neighboring point. In other words, deciding the boundary between the paper (bright region) and the background (dark region) from the threshold Tb1 alone can give a wrong result. The binary search therefore looks for the place where the sequence of pixel values G1 → G3 → G2 changes most steeply. At the start of the search the end points are (0504, 0505) and the midpoint is (0506), and the steepness of the change in pixel value can be measured by the angle (0507). The calculation is repeated, each time taking whichever of |G1 - G3| and |G3 - G2| is larger to define the end points of the next search step so as to obtain a steeper change, until finally the end points (0508, 0509), the midpoint (0510), and the largest change (0511) are obtained. The portion of the edge estimation section that can be regarded as the edge is thus estimated to lie between point (0508) and point (0510).
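The interval-halving rule described above can be sketched as follows. The pixel values along the search direction are assumed to be available as a list, and the function name and the tie-breaking toward the first half are illustrative choices.

```python
def locate_steepest_change(profile, lo, hi):
    """Narrow [lo, hi] to two adjacent samples around the edge.

    `profile` holds pixel values along a line perpendicular to the
    expected edge. At each step the midpoint value G3 is compared with
    the end values G1 and G2, and the half with the larger change is
    kept. Ties keep the first half (an arbitrary choice).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        g1, g3, g2 = profile[lo], profile[mid], profile[hi]
        if abs(g1 - g3) >= abs(g3 - g2):
            hi = mid
        else:
            lo = mid
    return lo, hi


# Hypothetical profile: dark background, then paper whose margin carries
# a dark-to-bright gradation starting right at the paper edge.
profile = [10, 10, 10, 10, 10, 10, 120, 125, 130, 135, 140,
           150, 160, 170, 180, 190, 200]
print(locate_steepest_change(profile, 0, len(profile) - 1))
```

Each step costs one new pixel read, so an interval of 2^k samples is resolved in k steps rather than by scanning every pixel.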

4) Boundary line approximation (S0304)
In this step, edges are detected at up to Ne locations on each side, and a straight line is fitted to the resulting group of edges by the least squares method. In other words, a straight line is fitted to the points that the edge estimation process has identified as likely boundary points.

直線近似において、近似した直線からの乖離が大きいものに関しては近似から外し、これを数回繰り返すことでコーナーカットやノイズによる誤認識の影響を除去する。予備実験の結果では、一箇所につき５〜７回の比較で収束し、各辺ではＮｅ×７回の比較、4辺合計でも最大４×Ｎｅ×７回の比較でエッジを検出する。一般にはＮｅとして１０を用いる。 In the linear approximation, those having a large deviation from the approximated straight line are excluded from the approximation, and this is repeated several times to eliminate the influence of erroneous recognition due to corner cuts or noise. As a result of the preliminary experiment, convergence is achieved by 5 to 7 comparisons per location, and edges are detected by Ne × 7 comparisons for each side, and a maximum of 4 × Ne × 7 comparisons for the total of the four sides. Generally, 10 is used as Ne.
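The fit-and-reject loop described above can be sketched as follows. The text does not say how "large deviation" is measured or how many points are dropped per round, so this sketch assumes a fixed residual limit `max_dev` and removes only the single worst point per round; both are illustrative choices, as are the function names.

```python
def fit_line(points):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def robust_fit(points, max_dev=2.0, max_rounds=5):
    """Fit a line, dropping the worst point while it deviates too much.

    Returns (a, b, kept_points). Suited to near-horizontal sides; for
    the left and right sides, swap x and y before calling.
    """
    pts = list(points)
    a, b = fit_line(pts)
    for _ in range(max_rounds):
        worst = max(pts, key=lambda p: abs(p[1] - (a * p[0] + b)))
        if abs(worst[1] - (a * worst[0] + b)) <= max_dev or len(pts) <= 2:
            break
        pts.remove(worst)
        a, b = fit_line(pts)
    return a, b, pts


# Hypothetical edge points on a top side at y = 5, with one point
# pulled away by a cut corner.
edge_points = [(x, 5.0) for x in range(9)] + [(9, 50.0)]
a, b, kept = robust_fit(edge_points)
print(a, b, len(kept))
```

Removing one point at a time is slower than thresholding all residuals at once, but it avoids discarding good points when a single outlier has tilted the initial fit.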

5) Evaluation of the approximate boundary line (S0305)
In this step, Ne points are sampled in each of the two regions separated by the approximate boundary line, and the validity of the boundary line is evaluated from the difference in image density between the inside and the outside of the paper area. In other words, this step judges whether the fitted line is plausible as a boundary line.

If the difference between the pixel values inside and outside the boundary line is small, the line is inappropriate as a boundary separating the regions. Likewise, if the variance of the pixel values outside the boundary line is large, various patterns are judged to be present and the line is judged inappropriate. That is, for a boundary line that does not satisfy condition a or b below, the processing from step S0306 onward is executed. If the conditions below are satisfied, the approximate boundary line is judged to be appropriate, and the process proceeds to step S0309.
a) For the sampling positions along the boundary line, the difference in pixel value between the inside and the outside of the boundary line is calculated, and the number of locations where the difference is equal to or less than the threshold Te1 is below the threshold Te2 (a count).
or
b) The variance of the image density outside the paper area is judged to be larger than the threshold Te3 (this assesses how much the color of the region judged to lie outside the paper varies).
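The two quantities that conditions a) and b) are tested on can be computed as in the sketch below. The function name, the pairing of inside and outside samples by position, and the use of the population variance are assumptions. The function deliberately returns only the two measurements, which are then compared with Te2 and Te3 as the conditions state.

```python
from statistics import pvariance


def boundary_statistics(inside, outside, te1):
    """Measurements used to evaluate an approximate boundary line.

    `inside` and `outside` are pixel values sampled at matching
    positions just inside and just outside the line. Returns the
    number of positions whose difference is at most te1, and the
    variance of the outside samples.
    """
    low_contrast = sum(1 for i, o in zip(inside, outside) if abs(i - o) <= te1)
    return low_contrast, pvariance(outside)


# Hypothetical samples at Ne = 10 positions: at two of them the outside
# is almost as bright as the paper.
inside = [200] * 10
outside = [10] * 8 + [190, 195]
count, variance = boundary_statistics(inside, outside, te1=20)
print(count, variance)
```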

この２つの条件の何れかに当てはまる境界線は、当該境界線についての第二パス処理により境界線の修正を行う。第二パスの処理は以下の６）〜８）で説明する。第二パス処理は、Ｓ０３０２よりももっと部分的に（細かく）画像濃度を解析して近似境界直線を求める処理である。 For a boundary line that meets either of these two conditions, the boundary line is corrected by a second pass process for the boundary line. The processing of the second pass will be described in the following 6) to 8). The second pass process is a process for obtaining an approximate boundary straight line by analyzing the image density more partially (finely) than in S0302.

6) Image density distribution analysis 2 (S0306)
This step is the first process of the second pass; here the threshold is determined again by image density distribution analysis.

まず、境界外の画素サンプル、および縁取りを検出するために境界線のすぐ外側の画素サンプルについて、各境界線に関してＮｏ個の画素サンプルを得て、これらの画素サンプルのヒストグラムが作成される。 First, No pixel samples are obtained for each boundary line for pixel samples outside the boundary and pixel samples just outside the boundary line in order to detect borders, and a histogram of these pixel samples is created.

図６は、このヒストグラムの例を示す図である。図６Ａの斜線エリアがサンプリングエリアとなり、そこから得られたヒストグラムが図６Ｂのようになる。 FIG. 6 is a diagram showing an example of this histogram. The hatched area in FIG. 6A becomes the sampling area, and the histogram obtained therefrom is as shown in FIG. 6B.

図６Ｂのヒストグラムにおいては、横軸の左側が暗い部分（背景）、右側が明るい部分となり、その間に谷ができていることが見てとれる。この谷を背景色と前景色との閾値Ｔｂ２とする。より具体的には、閾値の計算はデータ区間を０から昇順に走査して最初のピークから、頻度がＴｐ比率に下がった区間を閾値とする。閾値Ｔｂ１と閾値Ｔｂ２の違いは、ヒストグラムを作る際に使用する領域の広さである。閾値Ｔｂ２を計算する元となるヒストグラムは、当初境界線と推定された領域に限定されるため、より背景色と前景色の境界が出やすくなる。 In the histogram of FIG. 6B, it can be seen that the left side of the horizontal axis is a dark part (background), the right side is a bright part, and a valley is formed between them. This valley is defined as a threshold value Tb2 between the background color and the foreground color. More specifically, the threshold value is calculated by scanning the data section in ascending order from 0 and setting the section where the frequency has decreased to the Tp ratio from the first peak. The difference between the threshold value Tb1 and the threshold value Tb2 is the size of an area used when creating a histogram. Since the histogram from which the threshold value Tb2 is calculated is limited to the region estimated as the initial boundary line, the boundary between the background color and the foreground color is more likely to appear.

7) Edge detection process (S0307)
In this step, edges are detected by the same binary search as in the process of S0303. The threshold used here is the one obtained in S0306.
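A binary search for an edge along one scan line, perpendicular to a side, might look like the sketch below. It assumes that one end of the scan line is known to be background (density below the threshold), that the other end is known to be foreground, and that the density crosses the threshold only once in between. The name `find_edge` and the callable `density_at` are hypothetical helpers introduced for the example.

```python
def find_edge(density_at, outer, inner, threshold):
    """Locate the first foreground position between two scan-line positions.

    density_at -- callable returning the density at an integer position
    outer      -- position assumed to lie in the background
    inner      -- position assumed to lie in the foreground
    threshold  -- separation threshold (foreground means density >= threshold)
    """
    if density_at(outer) >= threshold or density_at(inner) < threshold:
        raise ValueError("outer must be background and inner must be foreground")
    background, foreground = outer, inner
    # Invariant: `background` is below the threshold, `foreground` is not.
    while abs(foreground - background) > 1:
        mid = (background + foreground) // 2
        if density_at(mid) >= threshold:
            foreground = mid
        else:
            background = mid
    return foreground
```

Each scan line costs a number of density lookups that grows with the logarithm of its length, which is why the search stays cheap on high-resolution images. The same function works when scanning from the right or bottom, because only the order of `outer` and `inner` changes.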

8) Boundary line approximation process (S0308)
In this step, the boundary line is approximated by a straight line from the edge points obtained in S0307. The approximation uses the same processing as that executed in S0304.

9) Boundary line parallelism evaluation process (S0309)
In this step, the slopes of two opposing boundary lines are compared, and if the difference in slope is equal to or greater than a certain value, the boundary line correction process is executed in S0310. That is, when the directions of the two boundary lines (left and right, or top and bottom) are expressed as two-dimensional vectors L1 and L2, the two are judged not to be parallel if the absolute cosine of the angle between the vectors, |<L1, L2>| / (|L1| × |L2|), is larger than the threshold Tr.
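The absolute cosine used in this evaluation can be computed as in the sketch below. Because the absolute cosine equals 1 for exactly parallel lines and decreases as the lines diverge, the example expresses the decision in terms of the angle recovered from it. That choice, the function names, and the tolerance `max_angle_deg` (a stand-in, not the threshold Tr itself) are assumptions made for illustration.

```python
import math


def abs_cosine(l1, l2):
    """Return |<l1, l2>| / (|l1| * |l2|) for two 2-D direction vectors."""
    dot = l1[0] * l2[0] + l1[1] * l2[1]
    n1 = math.hypot(l1[0], l1[1])
    n2 = math.hypot(l2[0], l2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("direction vectors must be non-zero")
    return abs(dot) / (n1 * n2)


def angle_between_lines(l1, l2):
    """Angle in degrees (0 to 90) between two undirected lines."""
    # Clamp to 1.0 to guard against rounding slightly above 1.
    return math.degrees(math.acos(min(1.0, abs_cosine(l1, l2))))


def needs_correction(l1, l2, max_angle_deg=1.0):
    """True when the opposing lines deviate from parallel by more than the tolerance."""
    return angle_between_lines(l1, l2) > max_angle_deg
```

Taking the absolute value of the dot product makes the test independent of the direction in which each boundary line was traced, so a left side running upward and a right side running downward still compare as parallel.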

10) Boundary line correction process (S0310)
In this step, for the two boundary lines judged in S0309 not to form parallel lines, the angle at which each boundary line intersects the adjacent, perpendicular boundary lines is obtained. If this angle deviates from a right angle by the error Ta or more, the boundary line is regarded as wrong, and it is re-estimated from the opposing boundary line. This processing recovers from boundary detection failures in cases such as the one shown in FIG. 7, where a pattern about as dark as the background color lies on the paper and occupies a large area. In this case, the straight line (0701) was initially estimated as the right boundary line. However, since the left, top, and bottom boundary lines have high likelihood, these three sides are regarded as accurate and the ambiguous right side is obtained again. The processing consists of the following steps.

i) Find the sample point that is farthest from the left side and has a density equal to or higher than the foreground threshold.
ii) Take as the right side the line segment that has the same slope as the left side and passes through that farthest sample point.
This processing yields the new boundary line 0702.
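Steps i) and ii) can be sketched as follows. The representation of a line as a point plus a direction vector, the `(x, y, density)` sample tuples, and the name `reestimate_opposite_side` are assumptions for the example. The distance is the ordinary perpendicular distance to the trusted side.

```python
import math


def reestimate_opposite_side(line_point, line_dir, samples, foreground_threshold):
    """Re-estimate the side opposite a trusted side.

    line_point, line_dir -- a point on the trusted side and its direction vector
    samples              -- iterable of (x, y, density) sample points
    foreground_threshold -- samples with density >= this value count as foreground

    Returns (point, direction) describing the new side, or None when no
    foreground sample exists.
    """
    px, py = line_point
    dx, dy = line_dir
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("line_dir must be non-zero")

    farthest, farthest_dist = None, -1.0
    for x, y, density in samples:
        if density < foreground_threshold:
            continue  # step i) only considers foreground samples
        # Perpendicular distance from (x, y) to the trusted side.
        dist = abs(dx * (y - py) - dy * (x - px)) / norm
        if dist > farthest_dist:
            farthest, farthest_dist = (x, y), dist

    if farthest is None:
        return None
    # Step ii): same slope as the trusted side, passing through the farthest point.
    return farthest, (dx, dy)
```

For a vertical left side through `(10, 0)` and foreground samples at `(50, 5)`, `(120, 40)`, and `(90, 70)`, the function returns a vertical line through `(120, 40)`. A single bright outlier beyond the paper would pull the result outward, so a practical version would likely filter isolated samples first.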

On the other hand, if the two boundary lines judged in S0309 not to form parallel lines both have an angle error of Ta or more, the angle of the other boundary line is estimated using the more boundary-like of the two as the reference. Here, as in condition a of S0305, sampling points are chosen along each boundary line, the difference between the pixel values inside and outside the boundary is taken at each point, and the line with the larger mean difference is judged to be the more boundary-like.
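The "more boundary-like" comparison amounts to a mean inside/outside difference per candidate line. The sketch below assumes a callable `density_at(x, y)`, a list of sampling points already chosen along the line, a normal vector pointing into the document, and a sampling `offset` in pixels. All of these names and defaults are hypothetical.

```python
def boundary_likeness(density_at, points, inward_normal, offset=2):
    """Mean absolute difference between densities just inside and just outside a line.

    density_at    -- callable (x, y) -> density
    points        -- sampling points (x, y) along the candidate boundary line
    inward_normal -- (nx, ny) step pointing from the boundary into the document
    offset        -- how many steps along the normal to sample on each side
    """
    if not points:
        raise ValueError("at least one sampling point is required")
    nx, ny = inward_normal
    total = 0.0
    for x, y in points:
        inside = density_at(x + offset * nx, y + offset * ny)
        outside = density_at(x - offset * nx, y - offset * ny)
        total += abs(inside - outside)
    return total / len(points)


def more_boundary_like(score_a, score_b):
    """Return 'a' or 'b' for the candidate with the larger mean difference."""
    return "a" if score_a >= score_b else "b"
```

A line that really separates dark background from bright paper scores high, while a line lying inside a uniform region scores near zero, so the higher-scoring line is kept as the reference.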

This completes the four-side detection process. The roles of the first pass and the second pass differ as follows.

R1) In the first pass, the recognition process is executed with a threshold determined, using the internal set values, from all of the data obtained by the predefined sampling.

R2) In the second pass, the internal set values are changed using the statistical information collected in the first pass, a threshold is determined from the data outside the region determined in this way, and the recognition process is executed.

In preliminary experiments on document images of various designs, the first pass determines accurate sides (edges) for 90% of the image samples. For the remaining roughly 10%, re-estimation is performed in the second pass for reasons such as a dark border on the form. For these images, the sides can be re-estimated correctly by using the image density distribution information to calculate a threshold that distinguishes the background color from the foreground color.

Images expected to be handled by the second pass include, for example, those shown in FIG. 8: documents whose paper edge is not straight but distorted or torn, and documents with a dark color designed at the paper edge.

<Summary of Embodiment>
As described above, in the embodiment of the present invention, a histogram is created for a plurality of grid points obtained by sampling the input image, the approximate region where an edge exists is estimated from the distribution, and a binary search process is performed on that region. As a result, for an image obtained by scanning a paper document in color or gray scale, the boundary separating the background color from the foreground color can be estimated by distribution analysis of the colors making up the image, and the two can be separated, even if a dark color lies on the edge of the paper.

If the boundary line approximated from the edges obtained by the first binary search process is not valid, a further histogram is generated for the pixels near that boundary line. A threshold for separating the foreground and the background is obtained from this histogram, and a more detailed boundary search is performed with it by a second binary search process. This enables high-speed four-side detection even for high-resolution images.

The present invention can also be realized by software program code that implements the functions of the embodiment. In this case, a storage medium on which the program code is recorded is provided to a system or apparatus, and the computer (or CPU or MPU) of that system or apparatus reads the program code stored on the storage medium. The program code read from the storage medium then itself implements the functions of the embodiment described above, and the program code itself and the storage medium storing it constitute the present invention. Storage media for supplying such program code include, for example, flexible disks, CD-ROMs, DVD-ROMs, hard disks, optical disks, magneto-optical disks, CD-Rs, magnetic tape, nonvolatile memory cards, and ROMs.

Alternatively, an OS (operating system) or the like running on the computer may perform part or all of the actual processing based on the instructions of the program code, and the functions of the embodiment described above may be realized by that processing. Furthermore, after the program code read from the storage medium is written into memory on the computer, the CPU or the like of the computer may perform part or all of the actual processing based on the instructions of the program code, and the functions of the embodiment may be realized by that processing.

The software program code that implements the functions of the embodiment may also be distributed via a network and stored in a storage means such as a hard disk or memory of a system or apparatus, or on a storage medium such as a CD-RW or CD-R, so that at the time of use the computer (or CPU or MPU) of that system or apparatus reads and executes the program code stored in the storage means or on the storage medium.

0100 ... OCR device
0101 ... Image pickup device
0102 ... Operation terminal device
0103 ... Display terminal device
0104 ... Sorter device
0105 ... External storage device
0106 ... Memory
0107 ... Central processing unit
0108 ... Internal bus
0109 ... Communication device

Claims

1. An image recognition apparatus that recognizes image data obtained by scanning and having a background and a foreground, the apparatus comprising:
a grid point extraction processing unit that extracts a plurality of grid points from the image data;
an image analysis unit that generates a first histogram relating to image density using the extracted grid points, calculates from the first histogram a first separation threshold for separating the background and the foreground, applies the first separation threshold to the grid points, and extracts grid points in the vicinity of the boundary between the background and the foreground;
an edge estimation processing unit that estimates edges using the grid points in the vicinity of the boundary; and
a boundary line acquisition unit that acquires a boundary line by performing a straight line approximation process using a plurality of edge points obtained by the edge estimation processing unit.

2. The image recognition apparatus according to claim 1, wherein the edge estimation processing unit detects edges by performing a binary search process on a plurality of grid points that belong to the foreground and the background and are closest to the first separation threshold.

3. The image recognition apparatus according to claim 2, further comprising a boundary evaluation unit that evaluates, using the difference value between the image density in the foreground region and the image density in the background region, whether the boundary line acquired by the boundary line acquisition unit is valid as a boundary for distinguishing the foreground from the background.

4. The image recognition apparatus according to claim 3, wherein the boundary evaluation unit determines whether the number of locations where the difference value of the image density is equal to or less than a first threshold is less than a predetermined value (first condition) and whether the variance of the image density in the background region is greater than a second threshold (second condition), and determines that the acquired boundary line is valid if the first or second condition is satisfied.

5. The image recognition apparatus according to claim 4, wherein
the boundary evaluation unit determines that the acquired boundary line is inappropriate if neither the first nor the second condition is satisfied,
in this case, the image analysis unit acquires a predetermined number of pixel samples in the vicinity of the acquired boundary line, generates a second histogram for the pixel samples, and calculates from the second histogram a second separation threshold for separating the background and the foreground,
the edge estimation processing unit detects a plurality of edge points (corrected edge points) by executing the binary search process using the second separation threshold, and
the boundary line acquisition unit acquires a corrected boundary line by executing the straight line approximation process using the corrected edge points.

6. The image recognition apparatus according to claim 5, further comprising an inclination correction unit that, when there are a plurality of the boundary lines or the corrected boundary lines, calculates the inclination difference between two opposing boundary lines or corrected boundary lines and, when the inclination difference is equal to or larger than a predetermined value, corrects the opposing boundary lines or corrected boundary lines.

7. A program for causing a computer to function as the image recognition apparatus according to claim 1.