JP5304553B2

JP5304553B2 - Image processing apparatus and image processing program

Info

Publication number: JP5304553B2
Application number: JP2009208573A
Authority: JP
Inventors: 景則長尾
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2009-09-09
Filing date: 2009-09-09
Publication date: 2013-10-02
Anticipated expiration: 2029-09-09
Also published as: JP2011059961A

Description

The present invention relates to an image processing apparatus and an image processing program.

To determine the similarity of two documents, a technique has conventionally been proposed in which each document is converted into a raster image, a feature amount is obtained for each image, similarity transformation parameters are estimated with reference to the feature amounts, one image is scaled using the estimated similarity transformation parameters, and the scaled image is then collated with the other image.

Japanese Patent No. 4140221
Japanese Patent No. 3693147
Japanese Patent Laid-Open No. 10-269357

An object of the present invention is to improve the estimation accuracy of the similarity transformation parameters obtained in the process of determining the similarity of two images.

An image processing apparatus according to the present invention includes: means for accumulating the pixel values of each image to be compared in at least one of the horizontal and vertical directions and acquiring feature amount information that includes information specifying the peak positions of the accumulated values; similarity transformation parameter estimation means for estimating, from the acquired feature amount information for each image, similarity transformation parameters, expressed as a scaling factor and a movement amount, for superimposing the peak positions of one image on the peak positions of the other image by applying at least one of scaling from a reference point and translation to the peak positions of the one image; and means for calculating an index indicating the similarity relationship between the images after the one image has been corrected with the similarity transformation parameters. The similarity transformation parameter estimation means generates linear functions on a scaling factor versus movement amount coordinate plane, each expressed in terms of the scaling factor and movement amount for superimposing one of the peak positions of the one image on one of the peak positions of the other image; divides a predetermined range on that plane, in which the similarity transformation parameters are expected to lie, into a plurality of regions; obtains the number of linear functions passing through each region and identifies, for each region, the pairs of peak positions of the images from which those linear functions were generated; when, among the pairs of peak positions identified for each region, there is a region through which linear functions based on the same peak position pass in duplicate, repeats a process of subdividing that region into a plurality of regions, obtaining the number of linear functions passing through each subdivided region, and identifying the pairs of peak positions from which those linear functions were generated, until no region with such duplicate passage remains; and obtains the similarity transformation parameters based on the linear functions passing through the region through which the largest number of linear functions pass.

Further, the similarity transformation parameter estimation means detects the region through which the largest number of linear functions pass and then, targeting only the detected region, detects regions through which linear functions based on the same peak position among the identified pairs of peak positions pass in duplicate.

Further, when a region is subdivided into a plurality of regions, the similarity transformation parameter estimation means obtains the intersections of the linear functions in the region being divided and, among the regions generated by the subdivision, targets only the regions in which an obtained intersection lies and the regions adjacent to them when obtaining the number of passing linear functions and identifying the pairs of peak positions from which those linear functions were generated.

An image processing program according to the present invention causes a computer to function as: means for accumulating the pixel values of each image to be compared in at least one of the horizontal and vertical directions and acquiring feature amount information that includes information specifying the peak positions of the accumulated values; similarity transformation parameter estimation means for estimating, from the acquired feature amount information for each image, similarity transformation parameters, expressed as a scaling factor and a movement amount, for superimposing the peak positions of one image on the peak positions of the other image by applying at least one of scaling from a reference point and translation to the peak positions of the one image; and means for calculating an index indicating the similarity relationship between the images after the one image has been corrected with the similarity transformation parameters. The similarity transformation parameter estimation means generates linear functions on a scaling factor versus movement amount coordinate plane, each expressed in terms of the scaling factor and movement amount for superimposing one of the peak positions of the one image on one of the peak positions of the other image; divides a predetermined range on that plane, in which the similarity transformation parameters are expected to lie, into a plurality of regions; obtains the number of linear functions passing through each region and identifies, for each region, the pairs of peak positions of the images from which those linear functions were generated; when, among the pairs of peak positions identified for each region, there is a region through which linear functions based on the same peak position pass in duplicate, repeats a process of subdividing that region into a plurality of regions, obtaining the number of linear functions passing through each subdivided region, and identifying the pairs of peak positions from which those linear functions were generated, until no region with such duplicate passage remains; and obtains the similarity transformation parameters based on the linear functions passing through the region through which the largest number of linear functions pass.

According to the inventions of claims 1 and 4, the estimation accuracy of the similarity transformation parameters obtained in the process of determining the similarity of two images can be improved compared with a case where this configuration is not provided.

According to the inventions of claims 2 and 3, the similarity of two images can be determined in a shorter time, or with less memory, than when the target regions are not narrowed down.

FIG. 1 is a block diagram showing an embodiment of an image processing apparatus according to the present invention.
FIG. 2 is a hardware configuration diagram of a computer forming the image processing apparatus in the present embodiment.
FIG. 3 is a flowchart showing the image collation process in the present embodiment.
FIG. 4 is a diagram used to explain the generation of projection waveforms and document image feature amounts in the present embodiment.
FIG. 5 is a flowchart showing the variation correction parameter estimation process in the present embodiment.
FIG. 6 is a diagram showing the correspondence between local peak coordinates in the reference document image and local peak coordinates in the input document image in the present embodiment.
FIG. 7 is a diagram showing an example of the s-k plane generated in the present embodiment.
FIG. 8 is a diagram showing an example of the s-k plane when the correction range set on the s-k plane of FIG. 7 is divided into a plurality of cells.
FIG. 9 is a diagram showing the positional relationship between local peak coordinates in the reference document image and local peak coordinates in the input document image for which duplicate votes can occur in the present embodiment.
FIG. 10 is a diagram showing an example of a cell in which duplicate votes have occurred in the present embodiment.
FIG. 11 is a diagram showing the state after the cell of FIG. 10 has been subdivided.
FIG. 12 is a diagram showing a portion where linear expressions intersect on the s-k plane in the present embodiment.
FIG. 13 is a diagram showing the main part of a cell after it has been subdivided in the present embodiment.

Preferred embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of an image processing apparatus 20 according to the present invention, and FIG. 2 is a hardware configuration diagram of the computer forming the image processing apparatus 20. The computer forming the image processing apparatus 20 in the present embodiment can be realized with an existing general-purpose hardware configuration. That is, as shown in FIG. 2, the computer is configured by connecting to an internal bus 11 a CPU 1, a ROM 2, a RAM 3, an HDD controller 5 to which a hard disk drive (HDD) 4 is connected, an input/output controller 9 to which a mouse 6 and a keyboard 7 provided as input means and a display 8 provided as a display device are connected, and a network controller 10 provided as communication means.

Returning to FIG. 1, the image processing apparatus 20 includes projection waveform generation units 21 and 22, feature amount generation units 23 and 24, a variation correction parameter estimation unit 25, a projection waveform correction unit 26, and a collation score calculation unit 27. The projection waveform generation unit 21 generates projection waveforms from the input document image, and the projection waveform generation unit 22 generates projection waveforms from the reference document image. The two projection waveform generation units differ only in the image they process; the processing itself may be the same. The feature amount generation unit 23 generates the feature amount of the input document image from the projection waveforms generated by the projection waveform generation unit 21, and the feature amount generation unit 24 generates the feature amount of the reference document image from the projection waveforms generated by the projection waveform generation unit 22. The two feature amount generation units differ only in the projection waveforms they process; the processing itself may be the same. On acquiring the feature amounts of the document images obtained by the feature amount generation units 23 and 24, the variation correction parameter estimation unit 25 estimates variation correction parameters from them as the similarity transformation parameters. The projection waveform correction unit 26 corrects the projection waveforms of the input document image with the variation correction parameters obtained by the variation correction parameter estimation unit 25. The collation score calculation unit 27 collates the reference document image with the projection waveforms of the input document image corrected by the projection waveform correction unit 26, and thereby calculates a collation score that serves as an index for judging the similarity relationship between the reference document and the input document.

The components 21 to 27 of the image processing apparatus 20 are realized by the cooperative operation of the computer forming the image processing apparatus 20 and a program running on the CPU 1 mounted in that computer.

The program used in the present embodiment can of course be provided via communication means, and it can also be provided stored on a computer-readable recording medium such as a CD-ROM or DVD-ROM. A program provided via communication means or a recording medium is installed on the computer, and the various processes are realized by the CPU of the computer sequentially executing the installed program.

The characteristic feature of the present embodiment is that duplicate votes are detected in the variation correction parameter estimation process of the variation correction parameter estimation unit 25. When a duplicate vote is detected, the cell in which it was detected is subdivided to eliminate the duplicate vote, and once no duplicate votes remain, the variation correction parameters are estimated from the cell with the maximum vote value. The remaining processing may be the same as that of Patent Document 1.

Next, the image collation process in the present embodiment is described with reference to the flowchart shown in FIG. 3.

When the images of the input document and the reference document to be compared are sent, the projection waveform generation units 21 and 22 receive the respective document images (step 100) and generate projection waveforms in the X and Y directions from each image (step 200). The feature amount generation units 23 and 24 then generate the feature amount of each image from the generated projection waveforms (step 300). The processing up to this point is described in detail with reference to FIG. 4.

The image of the reference document is the rasterized image data of the document that is referred to when collating against the image of the input document. The image of the input document is the rasterized image data of the document that is collated against the image of the reference document. In other words, the image processing apparatus 20 in the present embodiment provides a collation score as an index for judging whether the input document is in a similarity relationship with the reference document. The image data of each document specified by the user is input to the image processing apparatus 20 either by being read from an image database or by being read with image input means such as a scanner and converted into raster image data. Because the projection waveform and feature amount generation described here applies the same processing to the images of the input document and the reference document, the image of the input document is used as the representative in the description below.

When the image of the input document is input, the projection waveform generation unit 21 forms a projection waveform of the raster image data in at least one of the horizontal and vertical directions. For example, if it is known in advance whether the input document is written vertically or horizontally, and the projection waveform in that direction alone is known to represent the characteristics of each document sufficiently, a projection in only one direction may be formed. If the characteristics of the document image data are unclear, however, it is preferable to form projection waveforms in both directions. In the present embodiment, projection waveforms are generated in both the X and Y directions.

Here, projection refers to the process of accumulating, for raster image data, the pixel values of the pixels located on a path running along the X direction or the Y direction. When there are multiple paths along the X direction or the Y direction, the sequence of accumulated results forms the projection waveform. The projection waveform generation unit 21 obtains the accumulated pixel value for each path. In practice, it is preferable to correct any slight skew introduced at scanning time with a known skew correction technique before forming the projection waveforms.
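The accumulation described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the image is assumed to be a list of rows of numeric pixel values, and the function name `projection_waveforms` is hypothetical.

```python
def projection_waveforms(image):
    """Accumulate pixel values along every horizontal and every vertical path.

    `image` is assumed to be a non-empty list of equally long rows.
    Returns (sums_along_x, sums_along_y):
      sums_along_x[r] is the sum of the pixels on horizontal path (row) r,
      sums_along_y[c] is the sum of the pixels on vertical path (column) c.
    """
    sums_along_x = [sum(row) for row in image]        # one value per row
    sums_along_y = [sum(col) for col in zip(*image)]  # one value per column
    return sums_along_x, sums_along_y


# Hypothetical 2 x 3 binary image (1 = ink, 0 = background).
example = [
    [0, 1, 1],
    [0, 1, 0],
]
along_x, along_y = projection_waveforms(example)
```

Each returned list is one projection waveform: a sequence with one accumulated value per path.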

For paths in the X direction, the projection waveform generation unit 21 generates a projection waveform for the document image 12 by accumulating the pixel values of the pixels on each X-direction path. In the present embodiment, positions that represent characteristic portions satisfying a predetermined condition are then selected from among the paths. Examples of the predetermined condition for selecting such positions include differential waveform features such as extreme values (local peaks) and inflection points. In the present embodiment, the positions at which the accumulated value takes a local peak are selected. The feature amount generation unit 23 pairs the coordinate (x_n) of each selected local peak position with the projection peak value (px_n) at that position to form the feature amount (x_n, px_n) of that position, and generates the list of these feature amounts as the X-direction input document image feature amount. Likewise, for the Y direction, it pairs the coordinate (y_m) of each local peak position with the projection peak value (py_m) at that position to form the feature amount (y_m, py_m), and generates the list of these feature amounts as the Y-direction input document image feature amount. Because the local peak coordinates are extracted separately in the X and Y directions, the values of n and m do not necessarily match.
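A minimal sketch of this feature generation is given below. The text does not specify how a local peak is detected, so the rule used here (strictly greater than the left neighbour, not smaller than the right neighbour) is an assumption, and `local_peak_features` is a hypothetical name.

```python
def local_peak_features(waveform):
    """Return a list of (coordinate, peak value) pairs for one direction.

    Assumed peak rule: a sample is a local peak when it is strictly greater
    than its left neighbour and not smaller than its right neighbour.
    End points are never reported as peaks.
    """
    features = []
    for pos in range(1, len(waveform) - 1):
        if waveform[pos] > waveform[pos - 1] and waveform[pos] >= waveform[pos + 1]:
            features.append((pos, waveform[pos]))
    return features


# Hypothetical projection waveform with peaks at positions 1 and 4.
features = local_peak_features([0, 3, 1, 1, 5, 2])
```

The resulting list plays the role of the per-direction document image feature amount, with each entry corresponding to one pair such as (x_n, px_n).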

The projection waveform generation unit 22 and the feature amount generation unit 24 perform the same processing to generate the projection waveforms and the reference document image feature amounts in the X and Y directions. The number of local peak coordinates in the X direction of the input document image does not necessarily match that of the reference document image. The same applies to the Y direction.

For the reference document, the processing up to feature amount generation may be performed in advance and the result stored on the HDD 4, so that the stored result is read out and used in the subsequent processing.

Next, the variation correction parameter estimation unit 25 estimates, from the document image feature amounts of the reference document image and the input document image generated by the above processing, the variation correction parameters for which the X- and Y-direction projection waveforms of the two documents overlap best (step 400). When multiple sets of variation correction parameters are estimated, a list of variation correction parameters is generated. To collate the reference document with the input document, the variation correction parameter estimation unit 25 superimposes the local peak coordinates of the input document image on those of the reference document image, which involves enlarging or reducing the input document image or translating it. The variation correction parameters are therefore expressed, for each of the X and Y directions, as the scaling factor and the movement amount of the input document image when its local peak coordinates are superimposed on those of the reference document image. The variation correction parameter estimation process in the variation correction parameter estimation unit 25 is described with reference to the flowchart shown in FIG. 5.

The variation correction parameter estimation unit 25 performs the following processing on the feature amounts of the reference document image and the input document image, separately for the X and Y directions. First, for the X direction, suppose that, as illustrated in FIG. 6, the feature amount of the reference document image contains m local peak coordinates a_1 to a_m and the feature amount of the input document image contains n local peak coordinates b_1 to b_n. The variation correction parameter estimation unit 25 selects one local peak coordinate a_i (1 ≤ i ≤ m) from the reference document image feature amount and one local peak coordinate b_j (1 ≤ j ≤ n) from the input document image feature amount, and generates a linear expression giving the relationship between the scaling factor and the movement amount for which b_j is superimposed on a_i. On the s-k plane, this linear expression is the linear function a_i = k × b_j + s. Here k is the scaling factor applied to the projection waveform, with the origin as the reference point, in order to superimpose b_j on a_i, and s is the movement amount by which b_j is translated along with the scaling. Generating such a linear expression for every combination of local peak coordinates yields a list of linear expressions (step 401); the list therefore contains m × n linear expressions. FIG. 7 shows an example in which these linear expressions are drawn on the s-k plane. When generating the list, the variation correction parameter estimation unit 25 stores the pair of local peak coordinates (a_i and b_j) used to generate each linear expression in association with that expression.
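The generation of the m × n linear expressions in step 401 can be sketched as below. The representation is an assumption made for illustration: each expression a_i = k × b_j + s is stored as a named tuple `Line(a, b, i, j)`, the indices are 0-based rather than 1-based, and `build_lines` and `shift_at` are hypothetical names.

```python
from collections import namedtuple

# One linear expression a = k * b + s, together with the indices of the
# reference peak (i) and the input peak (j) it was generated from.
Line = namedtuple("Line", ["a", "b", "i", "j"])


def build_lines(ref_peaks, in_peaks):
    """Generate one Line for every (reference peak, input peak) combination."""
    return [
        Line(a, b, i, j)
        for i, a in enumerate(ref_peaks)
        for j, b in enumerate(in_peaks)
    ]


def shift_at(line, k):
    """Movement amount s that superimposes b on a for a given scaling factor k."""
    return line.a - k * line.b


# Hypothetical peak coordinates: m = 2 reference peaks, n = 3 input peaks.
lines = build_lines([10, 30], [8, 20, 31])
```

Keeping `i` and `j` with each line corresponds to storing the pair of local peak coordinates in association with the expression, which is what later allows duplicate votes to be recognised.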

The above processing is also performed for the Y direction. As a result, an s-k plane such as the one illustrated in FIG. 7 is generated for each of the X and Y directions.

Regarding the scaling factor k, k = 1 means no scaling, 0 < k < 1 means reduction, and k > 1 means enlargement. The range in which the scaling factor k, one of the variation correction parameters, is expected to lie is determined in advance, for example as a maximum of k_max = 10.0 and a minimum of k_min = 0.1. Similarly, the range of the movement amount s is determined in advance, for example as a maximum of s_max = 50 pixels and a minimum of s_min = -50 pixels. The correction range set in this way is then divided into a predetermined number of parts, for example 16 (step 402). Each small region obtained by this division is called a cell. FIG. 8 shows an example of the s-k plane when the correction range is divided into a plurality of cells. The predetermined number of divisions is explained later.
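The division of the correction range in step 402 can be sketched as follows, using the example limits given above. The text says only that the range is divided into, for example, 16 parts; treating this as a 4 × 4 grid of equally sized cells is an assumption, as are the name `split_range` and the cell representation (k_lo, k_hi, s_lo, s_hi).

```python
def split_range(k_min, k_max, s_min, s_max, nk=4, ns=4):
    """Divide the rectangle [k_min, k_max] x [s_min, s_max] into nk * ns cells.

    Each cell is a tuple (k_lo, k_hi, s_lo, s_hi). A uniform grid is assumed.
    """
    dk = (k_max - k_min) / nk
    ds = (s_max - s_min) / ns
    return [
        (k_min + p * dk, k_min + (p + 1) * dk,
         s_min + q * ds, s_min + (q + 1) * ds)
        for p in range(nk)
        for q in range(ns)
    ]


# Example limits from the text: k in [0.1, 10.0], s in [-50, 50] pixels.
cells = split_range(0.1, 10.0, -50, 50)
```

With the default 4 × 4 grid this produces 16 cells, each 2.475 wide in k and 25 pixels tall in s.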

The variation correction parameter estimation unit 25 then votes the lines into each cell (step 403) and detects the cell with the maximum vote value (step 404). Voting means counting the number of lines that pass through the cell. If more than one cell has the maximum vote value, a list of those cells is generated.
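Steps 403 and 404 can be sketched as follows. The sketch assumes that a line a = k × b + s is given as a pair (a, b), that a cell is a tuple (k_lo, k_hi, s_lo, s_hi), and that a line touching a cell border counts as passing through it; the function names are hypothetical.

```python
def passes_through(line, cell):
    """True if the line s = a - k * b crosses the rectangle of the cell."""
    a, b = line
    k_lo, k_hi, s_lo, s_hi = cell
    # s is linear in k, so its range over [k_lo, k_hi] is spanned by the
    # values at the two ends of the interval.
    s_a, s_b = a - k_lo * b, a - k_hi * b
    return min(s_a, s_b) <= s_hi and max(s_a, s_b) >= s_lo


def vote(lines, cells):
    """Vote value of each cell: the number of lines passing through it."""
    return [sum(passes_through(line, cell) for line in lines) for cell in cells]


def max_vote_cells(votes):
    """Indices of all cells that share the maximum vote value."""
    top = max(votes)
    return [idx for idx, v in enumerate(votes) if v == top]


# Hypothetical lines: (10, 10) and (20, 20) both pass through k = 1, s = 0.
lines = [(10, 10), (20, 20), (30, 10)]
cells = [(0.8, 1.2, -5, 5), (1.8, 2.2, -5, 5)]
votes = vote(lines, cells)
winners = max_vote_cells(votes)
```

Returning a list from `max_vote_cells` mirrors the case in which several cells share the maximum vote value.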

The variation correction parameter estimation unit 25 then detects, among the cells with the maximum vote value, cells that contain duplicate votes (step 405). Duplicate votes are explained next.

For example, as illustrated in FIG. 9, suppose that the local peak coordinates b2 and b3 in the input document image feature amount can each be superimposed on the local peak coordinate a2 in the reference document image feature amount by a slight scaling or a slight shift. FIG. 10 shows a cell through which linear expressions pass. As in this example, the linear expression generated from the local peak coordinates a2 and b2 and the linear expression generated from the local peak coordinates a2 and b3, both of which derive from the same local peak coordinate a2, may pass through the same cell. This can happen, for example, when the document image contains ruled lines drawn a few pixels apart. However, the local peak coordinate a2 cannot be superimposed on b2 and b3 at the same time; that is, one local peak coordinate cannot correspond to multiple local peak coordinates simultaneously. In the present embodiment, the situation in which multiple linear expressions generated from one and the same local peak coordinate are voted into a single cell is called a duplicate vote, and the vote count of a cell containing duplicate votes does not represent a correct value. In step 406, cells containing duplicate votes, such as the one illustrated in FIG. 10, are detected.
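A duplicate vote can be checked from the peak pairs associated with the lines passing through one cell, as sketched below. The pairs are assumed to be given as (i, j) index tuples, where i identifies the reference peak and j the input peak; checking both sides for repeats is an interpretation of "the same local peak coordinate", and `has_duplicate_vote` is a hypothetical name.

```python
from collections import Counter


def has_duplicate_vote(pairs):
    """True if any reference peak or any input peak is used by more than
    one of the lines voted into the same cell.

    `pairs` is the list of (i, j) peak index pairs of those lines.
    """
    ref_counts = Counter(i for i, _ in pairs)
    in_counts = Counter(j for _, j in pairs)
    return (any(c > 1 for c in ref_counts.values())
            or any(c > 1 for c in in_counts.values()))


# The situation of FIG. 10: lines from (a2, b2) and (a2, b3) share a2.
duplicate = has_duplicate_vote([(2, 2), (2, 3)])
# Lines from (a1, b1) and (a2, b2) share no peak.
clean = has_duplicate_vote([(1, 1), (2, 2)])
```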

The fluctuation correction parameter estimation unit 25 detects a duplicate vote by referring to the local peak coordinates that were associated with each linear expression passing through the cell when the list of linear expressions was generated. If no duplicate vote is detected (N in step 406), the cell with the maximum vote value is identified (step 407). Note that more than one cell may have the maximum vote value. The fluctuation correction parameter estimation unit 25 then estimates the fluctuation correction parameter (step 408); this process is described in detail later.

On the other hand, if a duplicate vote is detected (Y in step 406), the fluctuation correction parameter estimation unit 25 subdivides only the cells in which a duplicate vote was detected into a predetermined number of cells (step 409). Because only those cells are subdivided, the other cells need not be loaded into memory (RAM 3) for the subdivision. FIG. 11 shows an example in which the cell illustrated in FIG. 10 has been subdivided. If duplicate votes are detected in a plurality of cells, all of those cells are subdivided. The predetermined number used for subdivision is described later. Voting is then performed again in the subdivided cells (step 410). After voting for each subdivided cell, the process returns to the step of detecting the cell with the maximum vote value (step 404). The subdivision and voting steps (steps 409 and 410) are repeated recursively until no duplicate vote is detected.
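The loop formed by steps 404 to 410 might be sketched as follows. This is a simplified illustration, not the embodiment's implementation: the tuple layouts `(a, b, i, j)` for a linear expression a = k × b + s and `(s0, s1, k0, k1)` for a cell, the function names, the 2 × 2 subdivision, and the `max_rounds` guard are all assumptions introduced here.

```python
def crosses(line, cell):
    """True if the line s = a - b*k touches the rectangle (s0, s1, k0, k1)."""
    a, b, _i, _j = line
    s0, s1, k0, k1 = cell
    ends = (a - b * k0, a - b * k1)  # s on the two k-edges of the cell
    return min(ends) <= s1 and max(ends) >= s0


def split(cell, n=2):
    """Split a cell into n x n equal sub-cells."""
    s0, s1, k0, k1 = cell
    ds, dk = (s1 - s0) / n, (k1 - k0) / n
    return [(s0 + p * ds, s0 + (p + 1) * ds, k0 + q * dk, k0 + (q + 1) * dk)
            for p in range(n) for q in range(n)]


def has_duplicate(lines):
    """True if one peak index (either side) generated two lines in the list."""
    refs = [i for _a, _b, i, _j in lines]
    inps = [j for _a, _b, _i, j in lines]
    return len(set(refs)) < len(refs) or len(set(inps)) < len(inps)


def vote_with_subdivision(lines, cells, max_rounds=20):
    """Vote, then re-split only max-vote cells that hold duplicate votes."""
    votes = {c: [ln for ln in lines if crosses(ln, c)] for c in cells}
    for _ in range(max_rounds):
        top = max(len(v) for v in votes.values())
        best = [c for c, v in votes.items() if len(v) == top]
        dup = [c for c in best if has_duplicate(votes[c])]
        if not dup:
            return best, votes
        for c in dup:
            parent_lines = votes.pop(c)  # only this cell's lines are revisited
            for sub in split(c):
                votes[sub] = [ln for ln in parent_lines if crosses(ln, sub)]
    raise RuntimeError("duplicate votes remained after max_rounds subdivisions")


# Toy data: the input peaks map onto the reference peaks with k = 2.25, s = 1.
ref_peaks = [10, 19, 28]
inp_peaks = [4, 8, 12]
lines = [(a, b, i, j) for i, a in enumerate(ref_peaks)
         for j, b in enumerate(inp_peaks)]
best, votes = vote_with_subdivision(lines, split((-8.0, 8.0, 1.0, 3.0), n=4))
```

Because a subdivided cell is re-voted using only the lines that passed through its parent, cells without duplicate votes are never touched again, which mirrors the memory argument made above.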

Once a cell with the maximum vote value has been identified with no duplicate-voted cell remaining, the fluctuation correction parameter estimation unit 25 estimates the intersection of the linear expressions contained in that cell, that is, the scaling ratio k and the movement amount s that constitute the fluctuation correction parameter (step 408).

When the input document and the reference document are in a similarity relationship, basically only one pair of scaling ratio k and movement amount s is obtained even if a plurality of cells with the maximum vote value are identified. In other words, even if a given maximum-vote cell contains no intersection, the intersection computed from the linear expressions passing through that cell should coincide with the intersection located in another maximum-vote cell. However, owing to factors such as the reading accuracy of the scanner, it is theoretically possible for a plurality of intersections to be obtained. In that case, a plurality of fluctuation correction parameters are estimated. The process of estimating the fluctuation correction parameter (step 408) is now described in detail.

If the reference document and the input document are in a similarity relationship, then in theory exactly one correct pair of scaling ratio k and movement amount s exists, so the lines within a cell should all meet at the intersection (s*, k*) on the s-k plane given by that scaling ratio and movement amount. In practice, however, the reading accuracy of the scanner, expansion and contraction of the paper of each document, noise, and similar factors introduce some error, so the linear expressions that should meet at the intersection (s*, k*) may be slightly displaced, as illustrated in FIG. 12. Therefore, in the present embodiment, the parameter values of the scaling ratio k and the movement amount s are corrected as described below so that a single intersection (s*, k*) is obtained from the plurality of lines.

In the present embodiment, the coordinates of the point at the minimum distance from the group of lines are regarded as the coordinates of the intersection. That is, the distance d_{i,j} from a point (s, k) in the cell to the line a_i = k × b_j + s can be expressed as

$$ d_{i,j} = \frac{\left| k\,b_j + s - a_i \right|}{\sqrt{b_j^{2} + 1}} $$

Letting (s*, k*) be the coordinates of the point at the minimum distance from the group of lines passing through the cell, this point is obtained as the point that minimizes the sum of the squared distances from the lines. That is,

$$ (s^{*}, k^{*}) = \operatorname*{arg\,min}_{(s,\,k)} \sum_{(i,j) \in C} d_{i,j}^{2} $$

where C denotes the set of parameter subscripts (i, j) of the lines passing through the cell. Such a point (s*, k*) is obtained analytically by solving the following equations.

$$ \frac{\partial}{\partial s} \sum_{(i,j) \in C} d_{i,j}^{2} = 0, \qquad \frac{\partial}{\partial k} \sum_{(i,j) \in C} d_{i,j}^{2} = 0 $$

In this way, the fluctuation correction parameter estimation unit 25 obtains the point (s*, k*) in the cell, that is, the scaling ratio k and the movement amount s, as the fluctuation correction parameter.
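As an illustration of the analytic solution, the sketch below minimizes the sum of squared point-to-line distances for lines of the form a = k × b + s. Since the squared distance to each line is (s + b k - a)^2 / (1 + b^2), the problem is a weighted linear least-squares fit whose two stationarity conditions form a 2 × 2 linear system. The function name and the `(a, b)` tuple layout are assumptions made for this example.

```python
def least_squares_intersection(lines):
    """lines: iterable of (a, b) pairs, each describing a = k*b + s.

    Returns (s, k) minimizing sum((s + b*k - a)**2 / (1 + b*b)).
    """
    sw = swb = swbb = swa = swab = 0.0
    for a, b in lines:
        w = 1.0 / (1.0 + b * b)  # weight from the point-to-line distance
        sw += w
        swb += w * b
        swbb += w * b * b
        swa += w * a
        swab += w * a * b
    # Normal equations:  sw*s  + swb*k  = swa
    #                    swb*s + swbb*k = swab
    det = sw * swbb - swb * swb
    if det == 0.0:
        raise ValueError("lines are parallel; no unique intersection")
    s = (swa * swbb - swb * swab) / det
    k = (sw * swab - swb * swa) / det
    return s, k


# Three lines that meet exactly at s = 1, k = 2.25.
s_star, k_star = least_squares_intersection([(10, 4), (19, 8), (28, 12)])
```

With noisy peaks the lines no longer meet at one point, and the same function returns the compromise point closest to all of them in the least-squares sense.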

In the above description, voting is performed again in step 410 for all of the subdivided cells. However, when the intersection is considered to lie in cell A after subdivision, as in FIG. 13, voting for cell C in the same way as for cell A is wasted effort, because the intersection obtained from the linear expressions in cell C lies in cell A. Therefore, when a cell is subdivided, the cell A that contains the intersection may be determined before the voting in step 410, and any cell C that is not adjacent to cell A may be excluded from voting and all subsequent processing. In that case, the data for cell C need not be loaded into memory. A cell B adjacent to cell A may also be excluded from processing. However, to allow for the case in which an intersection that actually lies in cell B is erroneously estimated to lie in cell A because of error, the adjacent cell B may instead be kept as a processing target.
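The pruning of sub-cells can be sketched as below. The example assumes an n × n subdivision, a previously estimated intersection (s, k) inside the parent cell, and "adjacent" interpreted as the eight surrounding sub-cells; none of these details are fixed by the description, and the function name is hypothetical.

```python
def subcells_to_vote(parent, n, point):
    """Return the (p, q) grid indices of the sub-cell containing `point`
    and of the sub-cells adjacent to it (8-neighbourhood, an assumption).

    parent: (s0, s1, k0, k1); point: estimated intersection (s, k).
    """
    s0, s1, k0, k1 = parent
    s, k = point
    ds, dk = (s1 - s0) / n, (k1 - k0) / n
    # Index of the sub-cell that contains the point, clamped to the grid.
    p = min(max(int((s - s0) / ds), 0), n - 1)
    q = min(max(int((k - k0) / dk), 0), n - 1)
    return {(pp, qq)
            for pp in range(max(p - 1, 0), min(p + 1, n - 1) + 1)
            for qq in range(max(q - 1, 0), min(q + 1, n - 1) + 1)}


# Intersection near a corner of a 4 x 4 subdivision: only 4 sub-cells remain.
corner = subcells_to_vote((0.0, 4.0, 2.0, 3.0), 4, (0.5, 2.1))
# Intersection in the interior: the containing sub-cell plus 8 neighbours.
interior = subcells_to_vote((0.0, 4.0, 2.0, 3.0), 4, (2.5, 2.6))
```

Sub-cells whose indices are not returned correspond to cell C in FIG. 13 and would be skipped in the re-voting.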

The correction range and the division into cells are now described.

As described above, in the present embodiment the correction range is divided into a plurality of cells in step 402. If the correction range is divided coarsely (fewer cells), the number of cells in which duplicate votes occur increases, and the load of subdividing those cells and voting in the subdivided regions grows. If the correction range is divided finely (more cells), the processing load for voting the linear expressions and detecting duplicate votes grows. The number of divisions of the correction range is therefore subject to a trade-off. In the present embodiment, however, these problems can be mitigated because, when a duplicate vote occurs, only the cell in which it occurred is subdivided and voted on again. In other words, the predetermined number of divisions of the correction range can be chosen with considerable freedom, and even a roughly chosen value can be expected to improve processing speed and memory usage compared with the conventional approach. The same applies to the subdivision of cells.

Returning to FIG. 3, once the fluctuation correction parameter has been obtained as described above, the projection waveform correction unit 26 corrects the projection waveform of the input document image using the fluctuation correction parameter (scaling ratio k and movement amount s) obtained by the fluctuation correction parameter estimation unit 25 (step 500). If a plurality of fluctuation correction parameters have been obtained, the projection waveform of the input document image is corrected with each of them.

The collation score calculation unit 27 collates the projection waveform of the reference document image with the projection waveform of the input document image corrected by the projection waveform correction unit 26, and thereby calculates a collation score that serves as an index for judging the similarity relationship between the reference document and the input document (step 600). The collation score may be obtained for each of the X and Y directions, for example as a correlation coefficient.
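One way to compute a correlation coefficient between two projection waveforms is the Pearson coefficient, sketched below in plain Python. The description only says "a correlation coefficient", so the choice of Pearson correlation, the function name, and the convention of returning 0.0 for a constant waveform are assumptions of this example.

```python
import math


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length waveforms."""
    n = len(u)
    if n == 0 or n != len(v):
        raise ValueError("waveforms must be non-empty and of equal length")
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    var_u = sum((x - mean_u) ** 2 for x in u)
    var_v = sum((y - mean_v) ** 2 for y in v)
    if var_u == 0.0 or var_v == 0.0:
        return 0.0  # a flat waveform carries no shape information
    return cov / math.sqrt(var_u * var_v)


# Identical waveforms correlate fully; mirrored ones correlate negatively.
same = correlation([0, 3, 1, 5], [0, 3, 1, 5])
mirrored = correlation([1, 2, 3], [3, 2, 1])
```

The function would be applied once to the X-direction waveforms and once to the Y-direction waveforms to obtain the per-direction scores.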

By referring to the collation score calculated by the collation score calculation unit 27, an operator judges the similarity relationship between the input document and the reference document. Alternatively, another device determines whether a similarity relationship exists by comparing the collation score with a preset threshold. If a plurality of sets of collation scores have been calculated because a plurality of fluctuation correction parameters were obtained, the parameter yielding the larger collation score may be regarded as the correct fluctuation correction parameter and used for the determination.

If the collation score is defined as the sum of the correlation values in the X and Y directions, the score is 1.4 both when the X and Y correlation values are 0.7 and 0.7 and when they are 0.4 and 1.0. In the latter case, however, the reference document and the input document might be judged to be in a similarity relationship even though the correlation value in the X direction is as low as 0.4. Therefore, if the correlation value obtained first is smaller than a predetermined threshold, it may be determined that there is no similarity relationship between the input document and the reference document, and the subsequent processing may be skipped.
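The effect of the per-direction threshold can be illustrated with a small sketch. The threshold values 0.5 and 1.2 and the function name are hypothetical; the description does not specify them.

```python
def is_similar(corr_x, corr_y, min_corr=0.5, min_sum=1.2):
    """Judge similarity from per-direction correlation values.

    min_corr and min_sum are illustrative thresholds, not values from
    the description. The X value is checked first so that the Y-direction
    work could be skipped when X alone already rules out similarity.
    """
    if corr_x < min_corr:
        return False  # early exit: no need to look at the Y direction
    if corr_y < min_corr:
        return False
    return corr_x + corr_y >= min_sum


balanced = is_similar(0.7, 0.7)   # both directions agree reasonably well
lopsided = is_similar(0.4, 1.0)   # same sum, but X correlation is too low
```

Both inputs have the same sum, yet only the balanced pair passes, which is the distinction the paragraph above is concerned with.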

The similarity relationship between the input document and the reference document may also be determined without operating the collation score calculation unit 27. For example, when a document is scaled, the horizontal and vertical scaling ratios are normally almost equal. That is, a document is normally copied at the same magnification vertically and horizontally, and is not copied in an extremely elongated form. In the present embodiment, the fluctuation correction parameter estimation unit 25 estimates the fluctuation correction parameter (scaling ratio k and movement amount s) separately for the X and Y directions. If the fluctuation correction parameter values for the X and Y directions of either document image differ by more than a predetermined amount, for example if the scaling ratio in the X direction of the input document image is outside the range of 0.9 to 1.1 times that in the Y direction, it is concluded that an appropriate fluctuation correction parameter was not obtained and the subsequent processing is skipped. That is, when the scaling ratios k in the X and Y directions differ by the predetermined amount or more, it is determined that there is no similarity relationship between the input document and the reference document, without having the collation score calculation unit 27 calculate a collation score.
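The consistency check on the two scaling ratios is simple enough to state directly. The sketch below uses the 0.9 to 1.1 range given as an example above; the function name and the requirement that both ratios be positive are assumptions.

```python
def scales_are_consistent(k_x, k_y, low=0.9, high=1.1):
    """True if the X scaling ratio is within [low, high] times the Y ratio."""
    if k_x <= 0 or k_y <= 0:
        raise ValueError("scaling ratios are assumed to be positive")
    return low <= k_x / k_y <= high


uniform_copy = scales_are_consistent(1.41, 1.40)   # near-uniform enlargement
stretched = scales_are_consistent(1.41, 1.00)      # stretched in X only
```

When the check fails, the collation score calculation would be skipped and the documents treated as not similar.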

1 CPU, 2 ROM, 3 RAM, 4 hard disk drive (HDD), 5 HDD controller, 6 mouse, 7 keyboard, 8 display, 9 input/output controller, 10 network controller, 11 internal bus, 20 image processing apparatus, 21, 22 projection waveform generation unit, 23, 24 feature amount generation unit, 25 fluctuation correction parameter estimation unit, 26 projection waveform correction unit, 27 collation score calculation unit.

Claims

An image processing apparatus comprising:
means for accumulating the pixel values of each image to be compared in at least one of the horizontal and vertical directions and acquiring feature amount information including information specifying the peak positions of the accumulated values;
similarity transformation parameter estimation means for estimating, from the acquired feature amount information corresponding to each image, a similarity transformation parameter, expressed as a scaling ratio and a movement amount, for superimposing the peak positions of one image on the peak positions of the other image by at least one of scaling from a reference point and translation; and
means for calculating an index indicating the similarity relationship between the images after correcting the one image with the similarity transformation parameter,
wherein the similarity transformation parameter estimation means:
generates linear functions on scaling-ratio/movement-amount plane coordinates, each expressed by the scaling ratio and the movement amount for superimposing one of the peak positions of the one image on one of the peak positions of the other image;
divides a predetermined range on the scaling-ratio/movement-amount plane coordinates, in which the similarity transformation parameter is expected to exist, into a plurality of regions;
obtains the number of linear functions passing through each region and identifies, for each region, the pairs of peak positions of the images from which those linear functions were generated;
when, among the pairs of peak positions identified for each region, there is a region through which linear functions based on the same peak position pass in duplicate, repeats, until no such region remains, a process of subdividing that region into a plurality of regions, obtaining the number of linear functions passing through each subdivided region, and identifying the pairs of peak positions of the images from which those linear functions were generated; and
obtains the similarity transformation parameter based on the linear functions passing through the region through which the largest number of linear functions pass.

The image processing apparatus according to claim 1, wherein the similarity transformation parameter estimation means,
after detecting the region through which the largest number of linear functions pass, detects only within that detected region whether linear functions based on the same peak position pass through it in duplicate.

The image processing apparatus according to claim 1, wherein the similarity transformation parameter estimation means,
when subdividing a region into a plurality of regions, obtains the intersection of the linear functions in the region to be subdivided and, among the regions generated by the subdivision, obtains the number of passing linear functions and identifies the pairs of peak positions of the images from which those linear functions were generated only for the region containing the obtained intersection and the regions adjacent to it.

An image processing program causing a computer to function as:
means for accumulating the pixel values of each image to be compared in at least one of the horizontal and vertical directions and acquiring feature amount information including information specifying the peak positions of the accumulated values;
similarity transformation parameter estimation means for estimating, from the acquired feature amount information corresponding to each image, a similarity transformation parameter, expressed as a scaling ratio and a movement amount, for superimposing the peak positions of one image on the peak positions of the other image by at least one of scaling from a reference point and translation; and
means for calculating an index indicating the similarity relationship between the images after correcting the one image with the similarity transformation parameter,
wherein the similarity transformation parameter estimation means:
generates linear functions on scaling-ratio/movement-amount plane coordinates, each expressed by the scaling ratio and the movement amount for superimposing one of the peak positions of the one image on one of the peak positions of the other image;
divides a predetermined range on the scaling-ratio/movement-amount plane coordinates, in which the similarity transformation parameter is expected to exist, into a plurality of regions;
obtains the number of linear functions passing through each region and identifies, for each region, the pairs of peak positions of the images from which those linear functions were generated;
when, among the pairs of peak positions identified for each region, there is a region through which linear functions based on the same peak position pass in duplicate, repeats, until no such region remains, a process of subdividing that region into a plurality of regions, obtaining the number of linear functions passing through each subdivided region, and identifying the pairs of peak positions of the images from which those linear functions were generated; and
obtains the similarity transformation parameter based on the linear functions passing through the region through which the largest number of linear functions pass.