JP4192887B2

JP4192887B2 - Tamper detection device, watermarked image output device, watermarked image input device, watermarked image output method, and watermarked image input method

Info

Publication number: JP4192887B2
Application number: JP2004351883A
Authority: JP
Inventors: 昌彦須崎
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2004-12-03
Filing date: 2004-12-03
Publication date: 2008-12-10
Anticipated expiration: 2024-12-03
Also published as: JP2006165818A

Description

The present invention relates to detecting tampering in images such as document images.

When print data is printed on a print medium such as copy paper to produce printed matter, the printed content can be understood by looking at the text, images, and other items printed on it.

Meanwhile, "digital watermarking" embeds information for preventing copying or forgery, or confidential information, into digital data such as images and document data in a form that cannot be seen by eye. Techniques exist that embed such a digital watermark in document data or the like to prevent fraudulent acts such as forgery before they occur.

Applying this digital watermarking technique, a digital watermark corresponding to the print data can be printed together with the print data when the print data is printed on the print medium. Tampering with the printed matter, such as alteration of the printed characters, can then be discovered from the watermark information printed on it simply by having a person look at the printed matter (see, for example, Patent Document 1). Whether tampering has occurred is determined by comparing the printed result with the content printed as the digital watermark.

Patent Document 1: JP 2000-232573 A

However, to determine whether printed matter has been tampered with, the printed content extracted from the digital watermark has had to be compared visually with the content printed on the printed matter.

This has led to the following problems (1) and (2). (1) Because the determination is visual, it is difficult to check a large volume of printed matter for tampering in a short time. (2) Because the printed content must be read and compared character by character, tampering may be overlooked through human error.

The present invention has been made in view of the above problems. Its object is to provide a new and improved tamper detection device, watermarked image output device, watermarked image input device, watermarked image output method, and watermarked image input method that can automatically embed watermark information to generate an output image, extract the embedded watermark information from that output image, and automatically detect tampering with the output image.

To solve the above problems, a first aspect of the present invention provides a tamper detection device. The tamper detection device comprises: an image processing unit that receives an image and cuts out from it one or more specific regions, that is, regions to be checked for tampering; a watermark information embedding unit that embeds in the image, as watermark information, at least one of region information on the position and/or size of each specific region cut out by the image processing unit and the partial image of the image corresponding to the specific region, thereby generating an output image; a watermark information extraction unit that receives the output image and extracts the watermark information embedded in it; an image transformation unit that processes the received output image and transforms it into an input image of substantially the same size as the image; a pseudo image generation unit that generates a pseudo image at least similar to the partial images of the specific regions present in the image; and a tamper determination unit that generates a difference image from the difference between the input image and the pseudo image and, if a difference region is present in the difference image, determines that the output image from which the input image was derived has been tampered with.

According to the present invention, only the partial images corresponding to the specific regions are cut out from the image, and those partial images, the region information on the specific regions, and the like are embedded in the image as watermark information that serves, for example, as an electronic signature. The image with the embedded watermark information is generated as the output image. To determine whether the output image has been tampered with, the output image is input and an input image is generated by processing such as correction, binarization, or enlargement/reduction of the output image; a pseudo image is generated from the watermark information; and if a difference region is present in the difference image of the two images, the output image can be determined to have been tampered with. With this configuration, the device detects tampering automatically without relying on the human eye, and tamper detection can be performed quickly and accurately on a large number of output images. In addition, because the watermark information embedded in or extracted from the image covers only the images, region information, and the like of the regions subject to tamper detection (the specific regions), the amount of data is smaller than the image data of the entire image, which makes tamper detection more efficient and faster. A difference region is, for example, a region that appears where the input image and the pseudo image differ in image areas that correspond one to one.
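The difference-image test described above can be illustrated with a minimal Python sketch. It assumes binary images stored as lists of rows, and the function names and the `min_pixels` threshold are hypothetical illustrations, not taken from the patent. In the difference image of this sketch, 1 marks a pixel where the two images disagree.

```python
def difference_image(input_image, pseudo_image):
    """Pixel-wise difference of two equally sized binary images.

    Returns an image whose pixels are 1 where the inputs differ, 0 elsewhere.
    """
    if len(input_image) != len(pseudo_image) or any(
        len(a) != len(b) for a, b in zip(input_image, pseudo_image)
    ):
        raise ValueError("images must have the same size")
    return [
        [1 if a != b else 0 for a, b in zip(row_in, row_ps)]
        for row_in, row_ps in zip(input_image, pseudo_image)
    ]


def is_tampered(diff, min_pixels=1):
    """Report tampering when the difference region has at least min_pixels pixels.

    min_pixels is an assumed noise threshold, not a value from the patent.
    """
    return sum(sum(row) for row in diff) >= min_pixels
```

In practice a scanned page would contain noise, so a real system would likely need a larger threshold or some filtering of the difference image before deciding; that choice is outside what this passage specifies.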

To solve the above problems, another aspect of the present invention provides a watermarked image output device. The watermarked image output device comprises: an image processing unit that receives an image and cuts out from it one or more specific regions, that is, regions to be checked for tampering; and a watermark information embedding unit that embeds in the image, as watermark information, at least one of region information on the position and/or size of each specific region cut out by the image processing unit and the partial image of the image corresponding to the specific region, thereby generating an output image.

According to the present invention, only the specific regions are cut out from the image, and the partial images of those regions, the region information on the specific regions, and the like are embedded in the image as watermark information that serves, for example, as an electronic signature. The embedded watermark information acts like an electronic signature during tamper detection: tampering is detected automatically by comparing the pseudo image generated from the watermark information with the input image generated from the output image. With this configuration, embedding watermark information in the image allows a device to detect tampering automatically. In addition, narrowing down the parts of the image that are subject to tamper detection reduces the amount of watermark data embedded in the image.

The image processing unit may be configured to cut out the specific regions and further reduce the partial image corresponding to each cut-out specific region at a predetermined reduction ratio, and the watermark information embedding unit may be configured to embed, as watermark information, at least one of the region information, the partial image corresponding to the specific region, a reduced image obtained by reducing that partial image, and the reduction ratio of the reduced image. This configuration compresses and reduces the amount of data further than simply embedding watermark information on the specific regions in the image.

The image processing unit may be configured to cut out the specific regions and further reduce the partial image corresponding to each cut-out specific region at a reduction ratio determined according to the importance of that specific region. With this configuration, detection accuracy can be improved for particularly important regions among the specific regions subject to tamper detection, while processing efficiency can be improved for lower-priority specific regions by, for example, reducing them more strongly.
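As a rough illustration of importance-dependent reduction, the following Python sketch maps an importance level to a scale factor and computes the size of the reduced partial image. The importance labels and the scale values are hypothetical; the patent only states that the ratio depends on importance.

```python
from fractions import Fraction

# Hypothetical mapping: the scale factor kept for each importance level.
# A smaller scale means a stronger reduction and less watermark data.
SCALE_BY_IMPORTANCE = {
    "high": Fraction(1, 1),
    "medium": Fraction(1, 2),
    "low": Fraction(1, 4),
}


def reduced_size(width, height, importance):
    """Return (width, height) of a region after importance-based reduction."""
    scale = SCALE_BY_IMPORTANCE[importance]
    # Truncate to whole pixels but never collapse a region to zero size.
    return max(1, int(width * scale)), max(1, int(height * scale))


def payload_pixels(regions):
    """Total pixels to embed for a list of (width, height, importance) regions."""
    total = 0
    for width, height, importance in regions:
        w, h = reduced_size(width, height, importance)
        total += w * h
    return total
```

For example, a 120 x 40 region marked "low" shrinks to 30 x 10, so it contributes 300 pixels to the watermark payload instead of 4800.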

The image processing unit may be configured to change dynamically the importance set for each specific region. With this configuration, changing the importance dynamically according to, for example, the symbols present in a specific region allows tamper detection to be performed efficiently without fixing the importance in advance.

The specific region may be a character region containing at least one of characters, symbols, and figures. The region may contain one or more characters, and likewise one or more symbols or figures.

The image processing unit may be configured to gather the one or more reduced images into a single composite image. This configuration compresses the watermark information embedded in the image still further and reduces the amount of data.
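One simple way to gather several reduced images into one composite is to stack them vertically and pad narrower images with background pixels. The Python sketch below assumes this layout; the patent does not specify how the composite is arranged, and the function name and return format are hypothetical. Binary images are lists of rows with 1 for white background and 0 for black.

```python
def compose(images):
    """Stack binary images vertically into one composite image.

    Returns (composite, offsets), where offsets[i] is the row at which
    images[i] starts inside the composite. Narrow images are padded on
    the right with white (1) pixels.
    """
    width = max(len(img[0]) for img in images)
    composite, offsets = [], []
    for img in images:
        offsets.append(len(composite))
        for row in img:
            composite.append(list(row) + [1] * (width - len(row)))
    return composite, offsets
```

The recorded offsets, together with each image's own size, would be needed later to split the composite back into its parts.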

The watermarked image output device may further comprise a straight line/broken line processing unit that extracts regions of straight lines and/or broken lines of a predetermined length contained in the image, and the watermark information embedding unit may be configured to additionally embed in the image, as watermark information, line region information on the position and/or size of the straight-line and/or broken-line regions extracted by the straight line/broken line processing unit.

To solve the above problems, another aspect of the present invention provides a watermarked image input device. The watermarked image input device comprises: a watermark information extraction unit that receives an output image, generated by embedding in an image, as watermark information, at least one of region information on the position and/or size of one or more specific regions of the image to be checked for tampering and the partial images of the image in those specific regions, and that extracts the watermark information embedded in the received output image; an image transformation unit that processes the received output image and transforms it into an input image of substantially the same size as the image; a pseudo image generation unit that generates a pseudo image at least similar to the images of the specific regions present in the image area of the original image; and a tamper determination unit that generates a difference image from the difference between the input image and the pseudo image and, if a difference region is present in the difference image, determines that the output image from which the input image was derived has been tampered with.

According to the present invention, the output image is input and the watermark information embedded in it is extracted; an input image is generated by processing such as correction, binarization, or enlargement/reduction of the output image; a pseudo image is generated from the watermark information; and if a difference region is present in the difference image of the two images, the output image can be determined to have been tampered with. With this configuration, the device detects tampering automatically without relying on the human eye, and tamper detection can be performed quickly and accurately on a large number of output images. In addition, because the watermark information extracted from the image covers only the images, region information, and the like of the regions subject to tamper detection (the specific regions), the amount of data is smaller than for the entire image, which makes tamper detection more efficient and faster. A difference region is, for example, a region that appears where the input image and the pseudo image differ in image areas that correspond one to one.

The watermark information extraction unit may be configured to receive an output image generated by embedding in the image, as watermark information, at least one of the region information, the partial image of the image in the specific region, a reduced image obtained by reducing that partial image, and the reduction ratio of the reduced image, and to extract the watermark information embedded in the received output image.

The pseudo image generation unit may be configured to generate a background image of substantially the same size as the image and then build the pseudo image on it so that the pseudo image resembles the image. If the watermark information contains the partial image corresponding to a specific region, that partial image is placed on the background image as it is, according to the region information. If the watermark information contains a reduced image corresponding to a specific region, the reduced image is first enlarged by the reciprocal of the reduction ratio used to create it, and the enlarged image is then placed according to the region information.
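The pseudo image construction just described can be sketched in Python as follows. This is only an illustration under stated assumptions: images are binary lists of rows (1 for white, 0 for black), enlargement uses nearest-neighbour replication with an integer factor equal to the reciprocal of the reduction ratio, and the dictionary keys `x`, `y`, `factor`, and `image` are hypothetical names for the region information.

```python
def enlarge(image, factor):
    """Nearest-neighbour enlargement of a binary image by an integer factor."""
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def make_pseudo_image(width, height, regions):
    """Build a pseudo image on a white background.

    Each region is a dict with the top-left position ("x", "y"), the
    enlargement factor (1 for an unreduced partial image), and the
    embedded partial or reduced image ("image").
    """
    canvas = [[1] * width for _ in range(height)]
    for region in regions:
        patch = enlarge(region["image"], region["factor"])
        for dy, row in enumerate(patch):
            for dx, pixel in enumerate(row):
                y, x = region["y"] + dy, region["x"] + dx
                # Ignore pixels that would fall outside the canvas.
                if 0 <= y < height and 0 <= x < width:
                    canvas[y][x] = pixel
    return canvas
```

A partial image that was embedded without reduction is handled by the same path with `factor` set to 1.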

The pseudo image generation unit may be configured to generate the pseudo image, so that it resembles the image, based on at least one of the following items contained in the extracted watermark information: the region information on the position and/or size of the specific region, the image of the specific region, and the reduced image obtained by reducing the image of the specific region.

The specific region may be a character region containing at least one of characters, symbols, and figures. The region may contain one or more characters, and likewise one or more symbols or figures.

The image transformation unit may be configured to cut out, from the received output image, the area matching each specific region based on the region information on the position and/or size of the specific region contained in the watermark information, reduce the image of the cut-out area at the reduction ratio contained in the watermark information for that specific region, enlarge it again by the reciprocal of that reduction ratio, and then place it back at its original position in the output image. This configuration applies the same conditions used to create the pseudo image, so that the pseudo image and the input image have nearly the same image quality, which improves the accuracy of tamper detection and allows tampering to be judged more reliably.
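The reduce-then-enlarge step applied to the scanned image can be sketched in Python as below. The block rule used for reduction (a block becomes black when at least half of its pixels are black) and the integer `factor` are assumptions made for the example; the patent does not fix a particular resampling method. Images are binary lists of rows with 1 for white and 0 for black, and the region is assumed to lie inside the image.

```python
def reduce_blocks(image, factor):
    """Reduce a binary image by an integer factor using a per-block vote."""
    height, width = len(image), len(image[0])
    reduced = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [
                image[yy][xx]
                for yy in range(y, min(y + factor, height))
                for xx in range(x, min(x + factor, width))
            ]
            # Black (0) wins when at least half of the block is black.
            row.append(0 if 2 * block.count(0) >= len(block) else 1)
        reduced.append(row)
    return reduced


def enlarge(image, factor):
    """Nearest-neighbour enlargement by an integer factor."""
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def normalize_region(image, x, y, w, h, factor):
    """Cut out a region, reduce and re-enlarge it, and put it back in place."""
    crop = [row[x:x + w] for row in image[y:y + h]]
    rebuilt = enlarge(reduce_blocks(crop, factor), factor)
    result = [row[:] for row in image]
    for dy in range(h):
        for dx in range(w):
            result[y + dy][x + dx] = rebuilt[dy][dx]
    return result
```

Running the scanned region through the same lossy round trip as the embedded reduced image means that any remaining differences are more likely to come from tampering than from resampling.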

The image transformation unit may be configured to reduce the image of the cut-out area at a reduction ratio determined according to the importance of each specific region, enlarge it again by the reciprocal of that reduction ratio, and then place it back at its original position in the output image.

The watermark information extraction unit of the watermarked image input device may be configured to extract from the output image, as watermark information, straight line/broken line region information on the position and/or size of regions of straight lines and/or broken lines of a predetermined length contained in the image, and the watermarked image input device may further comprise a straight line/broken line removal unit that removes the straight-line and/or broken-line regions present in the received output image based on the straight line/broken line region information contained in the watermark information.
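Removing line regions from the scanned image can be pictured as painting the listed rectangles with the background color. The Python sketch below assumes each line region is given as a hypothetical `(x, y, width, height)` tuple lying inside the image and that background pixels have the value 1; neither detail comes from the patent text.

```python
def remove_line_regions(image, line_regions):
    """Return a copy of a binary image with the given rectangles set to white.

    line_regions is a list of (x, y, width, height) tuples describing
    straight-line or broken-line areas taken from the watermark information.
    """
    result = [row[:] for row in image]
    for x, y, width, height in line_regions:
        for yy in range(y, y + height):
            for xx in range(x, x + width):
                result[yy][xx] = 1
    return result
```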

To solve the above problems, another aspect of the present invention provides a watermarked image output method comprising: an image processing step of receiving an image and cutting out from it one or more specific regions, that is, regions to be checked for tampering; and a watermark information embedding step of embedding in the image, as watermark information, at least one of region information on the position and/or size of each specific region cut out in the image processing step and the partial image of the image corresponding to the specific region, thereby generating an output image.

To solve the above problems, another aspect of the present invention provides a watermarked image input method comprising: a watermark information extraction step of receiving an output image, generated by embedding in an image, as watermark information, at least one of region information on the position and/or size of one or more specific regions of the image to be checked for tampering and the partial images of the image in those specific regions, and extracting the watermark information embedded in the received output image; an image transformation step of processing the received output image and transforming it into an input image of substantially the same size as the image; a pseudo image generation step of generating a pseudo image at least similar to the partial images of the specific regions present in the image; and a tamper determination step of generating a difference image from the difference between the input image and the pseudo image and, if a difference region is present in the difference image, determining that the output image from which the input image was derived has been tampered with.

As described above, according to the present invention, the tamper detection device automatically detects whether a document image has been tampered with, without a person checking for tampering by eye. Because tampering is detected by a device rather than by the human eye, the accuracy of detecting whether a document image has been tampered with can be improved. Furthermore, although watermark information is embedded in the document image for tamper detection, the watermark information is limited to information on those regions of the document image that are to be checked intensively for tampering, rather than the entire document image, so the amount of data can be reduced.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the following description and the drawings, components having substantially the same function and configuration are given the same reference numerals, and redundant description is omitted.

The document images described below are assumed to be black and white (white background, black characters), but the invention is not limited to this case; it can be implemented with any combination of background and character colors as long as the two colors differ.

(First embodiment)
First, the document output unit 10 of the tamper detection device according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a block diagram showing the schematic configuration of the document output unit 10 according to the first embodiment.

As shown in FIG. 1, the document output unit 10 receives a document image (image) 11 as input and outputs a watermarked output document image 14. The document image 11 is image data obtained by rendering, as an image, document data created with document creation software such as a word processor.

The output document image 14 is image data in which the features of the document image 11 are embedded as watermark information. Normally the output document image 14 is printed on a print medium such as paper and then stored or distributed, but it may also be stored or distributed as image data without being printed. The output document image 14 is output by an output device (not shown) such as a printer.

The document output unit 10 includes a document feature data conversion unit (image processing unit) 12, which extracts from the document image 11 feature quantities that serve as indicators of its features and converts them into data, and a watermark information synthesis unit (watermark information embedding unit) 13, which embeds the resulting feature data in the document image 11 as watermark information.

The document feature data conversion unit 12 extracts only the images (partial images) corresponding to the character regions in the document image 11, reduces each of these partial images, and uses an image that collects the reduced character regions as the feature data of the document image. Compared with reducing the entire document image 11 as it is at the same reduction ratio, this reduces the amount of data. In the following description, a "character region" in the document image 11 may refer either to the region itself, that is, its size and extent, or to the part of the document image (partial image) corresponding to that region.

The watermark information synthesis unit 13 superimposes the watermark information on the document image 11 to create a watermarked document image. It digitizes the watermark information into numerical values, encodes them as an N-ary code (N is 2 or more), and assigns each symbol of the resulting code to a signal prepared in advance. Each signal represents a wave with an arbitrary direction and wavelength by arranging dots in a rectangular area of arbitrary size, and symbols are assigned to the wave directions and wavelengths. The watermarked document image is obtained by arranging these signals on the image according to a certain rule.
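The N-ary coding step can be illustrated with a short Python sketch that turns watermark bytes into a sequence of symbols in the range 0 to N-1 and back. Encoding each byte into a fixed number of base-N digits is an assumption made for the example, as are the function names; the patent does not specify the coding scheme, and the mapping from symbols to dot-pattern signals is not shown here.

```python
def digits_per_byte(n):
    """Smallest number of base-n digits that can represent one byte."""
    count, capacity = 1, n
    while capacity < 256:
        capacity *= n
        count += 1
    return count


def encode_symbols(data, n):
    """Encode bytes as base-n symbols, most significant digit first."""
    if n < 2:
        raise ValueError("N must be 2 or more")
    width = digits_per_byte(n)
    symbols = []
    for byte in data:
        digits = []
        for _ in range(width):
            byte, digit = divmod(byte, n)
            digits.append(digit)
        symbols.extend(reversed(digits))
    return symbols


def decode_symbols(symbols, n):
    """Inverse of encode_symbols for the same n."""
    width = digits_per_byte(n)
    out = bytearray()
    for i in range(0, len(symbols), width):
        value = 0
        for digit in symbols[i:i + width]:
            value = value * n + digit
        out.append(value)
    return bytes(out)
```

With N = 4, for instance, each symbol could correspond to one of four wave patterns, and every byte of watermark data would occupy four signal cells on the page.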

Next, the document input unit 20 of the tamper detection device according to the first embodiment is described with reference to FIG. 2. FIG. 2 is a block diagram showing an example of the schematic configuration of the document input unit according to the first embodiment.

The document input unit 20 determines whether tampering has occurred from the input document image 21, which is generated by directly or indirectly capturing the output document image 14 in which the watermark is embedded, and outputs the result as a tamper detection result 26.

When the output document image 14 is printed matter, the input document image 21 is obtained by reading the printed matter into a computer (the tamper detection device) with an input device (not shown) such as a scanner. When the output document image 14 remains digital data, it is taken in directly from the input device over a wired or wireless connection. In other words, the input document image 21 and the output document image 14 are the same image.

As shown in FIG. 2, the document input unit 20 comprises: a watermark information extraction unit 22 that extracts the watermark information from the input document image 21; a pseudo document image generation unit 23 that restores the extracted watermark information to the feature information, embedded by the document output unit 10, that describes the features of the document image 11; an input image transformation unit (image transformation unit) 24 that transforms the input document image 21 based on the restored feature information so that it can be compared with the features of the document image 11; and a tamper detection unit (tamper determination unit) 25 that determines tampering by comparing the images computed by the pseudo document image generation unit 23 and the input image transformation unit 24. The input image transformation unit 24 transforms the input document image 21 so that the input document image 21 can be checked for tampering.

Next, the sequence of operations performed by the document output unit 10 according to the first embodiment will be described. The following description assumes that the document in the document image 11 is written horizontally, but the invention is not limited to this case and also applies to vertically written documents. For a vertically written document, the document image 11 can, for example, be rotated 90 degrees to the left or right and then processed in the same way as a horizontally written document.

First, the processing of the document feature data conversion unit 12 will be described with reference to FIG. 3. FIG. 3 is a flowchart showing an example of an outline of the processing of the document feature data conversion unit according to the first embodiment.

As shown in FIG. 3, the document feature data conversion unit 12 first executes a character region extraction process (S31), which extracts only the character regions from the document image 11.

(Document image 11)
The document image 11 according to the present embodiment will now be described with reference to FIG. 4. FIG. 4 is an explanatory diagram showing an example of the schematic configuration of a document image according to the present embodiment.

The document image 11 is data that includes font information and layout information, and is assumed to be created with document creation software or the like. The document image 11 can be created page by page as an image of the document as it would appear printed on paper. The document image 11 is assumed to be a black-and-white binary image in which white pixels (pixels with value 1) are background and black pixels (pixels with value 0) form the character regions (regions to which ink is applied).

As shown in FIG. 4, an example of the document image 11 is an important document, such as a certificate, whose confidentiality and integrity must be preserved. Each location in the document image 11 where text is printed corresponds to a character region. In the present embodiment the characters are black, so the character regions are black pixel regions, but the invention is not limited to this example.

(Character region extraction process (S31))
The character region extraction process (S31) will be described more specifically with reference to FIGS. 5 to 8. FIGS. 5 to 8 are explanatory diagrams showing an outline of the character region extraction process.

In the character region extraction process (S31), character regions are extracted according to Step 1 to Step 3 below.

(Step 1)
First, a dilation process is applied to the image (partial image) corresponding to the character regions (black pixel regions) of the document image 11. A general image processing method is used for the dilation. FIG. 5 shows an example of the result of dilating the character regions of the document image 11 shown in FIG. 4.

(Step 2)
After dilating the partial image corresponding to the character regions of the document image 11 shown in FIG. 4, the document feature data conversion unit 12 labels the dilated partial image and extracts the minimum rectangle enclosing each character region.

Labeling here is a grouping of black pixels: each set of contiguous black pixels is defined as one group. Labeling makes it possible to extract character regions, each a minimum rectangle containing one or two characters.

When the document feature data conversion unit 12 dilates and then labels the character region 51, it recognizes character regions that are minimum rectangles containing as few characters as possible, about one or two each, as shown in FIG. 6. FIG. 6 is an explanatory diagram showing an example of an outline of the labeling process for the character region 51 of the document image 11 shown in FIG. 5.

The document feature data conversion unit 12 performs the dilation to make the labeling more efficient; dilation shortens the processing time of the labeling. On a tampering detection device with sufficiently high computation speed, Step 1 may be omitted and the document image 11 may be labeled directly.
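Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a list of rows using the pixel values defined above (0 for black, 1 for white), the dilation uses a square neighborhood, and labeling treats 8-connected black pixels as one group. The function names `dilate` and `label_bounding_boxes` and the `radius` parameter are hypothetical; the patent only refers to "a general image processing method".

```python
from collections import deque

BLACK, WHITE = 0, 1  # pixel values as defined for the document image 11


def dilate(img, radius=1):
    """Grow every black pixel into a (2*radius+1)-pixel square (assumed kernel)."""
    h, w = len(img), len(img[0])
    out = [[WHITE] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x] == BLACK:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = BLACK
    return out


def label_bounding_boxes(img):
    """Group 8-connected black pixels; return one (x0, y0, x1, y1) box per group."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != BLACK or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:  # breadth-first flood fill over one group
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if img[ny][nx] == BLACK and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Labeling the dilated image yields fewer, larger groups than labeling the raw image, which is the efficiency gain the text attributes to Step 1.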

(Step 3)
Next, the document feature data conversion unit 12 determines the position of each labeled minimum-rectangle character region and merges the partial images of minimum-rectangle character regions that lie to the left and right of one another.

As shown in FIG. 6, suppose, for example, that in the character region 51-1 the rectangle A (the third from the right end, the middle rectangle) has upper-left vertex (X_a0, Y_a0) and lower-right vertex (X_a1, Y_a1). If the rectangle B immediately to its left has lower-right vertex (X_b1, Y_b1) and X_a0 − X_b1 < Tw holds, rectangle A and rectangle B are merged. Likewise, if the rectangle C immediately to the right of rectangle A has upper-left vertex (X_c0, Y_c0) and X_c0 − X_a1 < Tw holds, rectangle A and rectangle C are merged.
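The merging rule can be illustrated as follows. This is a sketch, not the patent's implementation: it assumes the rectangles all belong to one horizontal text line, that each is given as (x0, y0, x1, y1) with upper-left and lower-right vertices, and that Tw is a threshold on the horizontal gap in pixels. The name `merge_adjacent` is hypothetical.

```python
def merge_adjacent(boxes, tw):
    """Merge left-to-right neighbors whose horizontal gap is smaller than tw.

    boxes: (x0, y0, x1, y1) rectangles assumed to lie on one text line.
    """
    merged = []
    for box in sorted(boxes):  # sort by x0 so neighbors are consecutive
        if merged and box[0] - merged[-1][2] < tw:
            # Gap to the previous rectangle is below Tw: extend that rectangle.
            px0, py0, px1, py1 = merged[-1]
            merged[-1] = (px0, min(py0, box[1]), max(px1, box[2]), max(py1, box[3]))
        else:
            merged.append(box)
    return merged
```

With `tw=5`, the rectangles `(0, 0, 10, 12)` and `(13, 1, 22, 12)` are joined (gap 3), while `(40, 0, 50, 11)` stays separate (gap 18).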

As a result, the document feature data conversion unit 12 can extract character regions 51 that contain multiple characters, such as the character regions 51-1 and 51-2 shown in FIG. 7. After merging, a character region 51 may still contain only one character. Merging the minimum rectangles allows subsequent processing, such as reduction of the partial image corresponding to each character region, to be carried out efficiently in batches rather than repeated for each small rectangle. FIG. 7 is an explanatory diagram schematically showing the result of merging the minimum rectangles shown in FIG. 6.

The document feature data conversion unit 12 executes the character region extraction process (S31), consisting of Step 1 to Step 3, not only on the character region 51 shown in FIG. 6 but also on the other character regions contained in the document image 11.

As shown in FIG. 8, once the document feature data conversion unit 12 has executed the character region extraction process (S31), the document image 11 is seen to contain multiple character regions, such as region 1 to region 9 and so on. The document feature data conversion unit 12 manages the position of each character region by, for example, coordinates. FIG. 8 is an explanatory diagram showing the schematic configuration of the document image 11 after the character region extraction process (S31).

(Character region cutout and reduction process (S32))
Next, the document feature data conversion unit 12 cuts out of the document image 11 the character regions extracted by the character region extraction process (S31) and reduces each cut-out character region (S32).

The process of cutting out and reducing the character regions (S32) will be described with reference to FIG. 9. FIG. 9 is an explanatory diagram showing an example of an outline of the cutout and reduction process (S32) for the character regions region 1 and region 2 shown in FIG. 8.

As shown in FIG. 9, the document feature data conversion unit 12 first cuts out the character regions region 1 and region 2 from the document image 11 shown in FIG. 8. The image data of the cut-out regions 1 and 2 is shown on the left side of FIG. 9.

Next, the document feature data conversion unit 12 reduces each of the images (image data) of the character regions region 1 and region 2 (the images on the right side of FIG. 9). The reduction decreases the amount of image data for region 1 and region 2. The description here assumes a uniform reduction ratio fixed in advance, but the invention is not limited to this example; the reduction ratio may, for example, be changed dynamically for each character region. Details are given later.

The document feature data conversion unit 12 also executes the reduction process (S32) on all the other character regions extracted by the character region extraction process (S31), not only region 1 and region 2. A character region reduced in this way is hereinafter called a "reduced character region".

When extracting the character regions, the document feature data conversion unit 12 assigns each an identification number according to the coordinates of its upper-left vertex. The assignment method is shared between the document output unit 10 and the document input unit 20. For example, the identification numbers in FIG. 8 are "1", "2", ... for region 1, region 2, and so on. They are assigned in ascending order with the upper left of the document image 11 as the origin (0, 0): regions whose upper-left vertex has a smaller y coordinate come first, and among those, regions with a smaller x coordinate come first. The assignment is not limited to this example.
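The numbering rule amounts to a sort on (y, x) of the upper-left vertex. A minimal sketch, assuming each region is an (x0, y0, x1, y1) tuple and using the hypothetical name `assign_region_ids`:

```python
def assign_region_ids(boxes):
    """Number regions from 1, ordered by upper-left y first and then upper-left x."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    return {region_id: box for region_id, box in enumerate(ordered, start=1)}
```

Because the rule depends only on coordinates, the document input unit 20 can reproduce the same numbering from the recovered position information without any numbers being transmitted.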

(Reduced character region composition (S33))
When the character region cutout and reduction process (S32) is complete, the document feature data conversion unit 12 combines the reduced character regions into a single image (composite image).

The composite image according to the present embodiment will now be described with reference to FIGS. 10 and 11. FIGS. 10 and 11 are explanatory diagrams showing examples of the schematic configuration of a composite image of reduced character regions according to the present embodiment.

The document feature data conversion unit 12 arranges the reduced character regions in a predetermined order so that they join seamlessly. When there is only one reduced character region, the composition process (S33) is skipped.

When all the reduced character regions have been arranged, the document feature data conversion unit 12 generates a composite image that holds the arranged reduced character regions as a single piece of image data.

The composite image shown in FIG. 10 is the image data obtained by reducing region 1 to region n shown in FIG. 8 (n being a positive integer) to reduced character regions 1 to n and stacking them vertically and seamlessly in ascending order (1, 2, 3, ...) starting from reduced character region 1.

For the composite image shown in FIG. 10, the width of the composite image is determined by the widest of the reduced character regions 1 to n, and its height is the sum of the heights of all the reduced character regions.

The composite image is not limited to the one shown in FIG. 10; the composite image shown in FIG. 11 is one variation. In FIG. 11, the width of the composite image is fixed at a predetermined value, and the reduced character regions 1 to n obtained by reducing region 1 to region n of FIG. 8 are arranged seamlessly in ascending order starting from reduced character region 1. If space remains across the width of the composite image, the next reduced character region is placed there, which reduces the blank area.

One way to build the composite image of FIG. 11 is to determine its width W_s in advance, for example from the total area of the reduced character regions, place reduced character regions side by side as long as W_s is not exceeded, and, when W_s would be exceeded, place the next reduced character region at the left end of the next row of the composite image.
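The two layouts can be compared with a small sketch. It works only on region sizes, not pixels, and makes assumptions the text does not state: each row of the FIG. 11 layout is as tall as its tallest region, and a region wider than W_s is still placed alone on its own row. The names `stack_vertically` and `pack_rows` are hypothetical.

```python
def stack_vertically(sizes):
    """FIG. 10 style: composite width is the widest region, height is the sum."""
    return max(w for w, _ in sizes), sum(h for _, h in sizes)


def pack_rows(sizes, ws):
    """FIG. 11 style: fill rows left to right, wrapping when ws would be exceeded.

    sizes: (width, height) of reduced character regions in identification order.
    Returns the upper-left position of each region and the composite size.
    """
    positions = []
    x = y = row_h = 0
    for w, h in sizes:
        if x > 0 and x + w > ws:  # region does not fit in the current row
            x, y, row_h = 0, y + row_h, 0
        positions.append((x, y))
        x += w
        row_h = max(row_h, h)  # assumed: row height follows its tallest region
    return positions, (ws, y + row_h)
```

For sizes `[(40, 10), (50, 12), (30, 8), (90, 10)]`, vertical stacking gives a 90 × 40 composite, while packing with `ws=100` gives a 100 × 30 composite.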

The composition pattern may be fixed in advance, or it may be selected for each document image 11 by computing which pattern gives the smallest composite image, or the smallest size after the composite image is compressed. In the latter case, the composition pattern that was used is recorded in a field separate from the image feature data.

The composite image may be embedded in the document image 11 as it is, or after being compressed with a known image compression method.

(Processing of the watermark information synthesis unit 13)
The watermark information synthesis unit 13 embeds data for tampering detection (tampering detection data Data) in the document image 11 as watermark information. The tampering detection data Data consists of the following.

Data0: Header information such as the dimensions of the document image 11 and the data size
Data1: The image data of the composite image, or the compressed image data of the composite image
Data2: Position information of the character regions (as extracted by the character region extraction process (S31))
Data3: Composite image creation method ID (used when the composite image creation method is changed for each document image 11)

The tampering detection data Data0, Data1, and Data2 are required for tampering detection, whereas Data3 is embedded only when needed. The table shown in FIG. 12 relates to the tampering detection data Data2 and is an example of a table in which each region number corresponds one-to-one to the coordinates giving the position of a character region in the document image 11. Data2 records, for each character region rectangle, the x and y coordinates of the upper-left vertex and the x and y coordinates of the lower-right vertex. FIG. 12 is an explanatory diagram showing an example of the schematic configuration of the table of tampering detection data Data2 embedded in the document image 11.

The processing by which the watermark information synthesis unit 13 embeds the watermark information will now be described with reference to FIG. 13. FIG. 13 is a flowchart showing an example of the schematic flow of the processing of the watermark information synthesis unit 13.

As shown in FIG. 13, the tampering detection data Data is first converted into an N-ary code (step S101). N is arbitrary, but N = 2 is used in the present embodiment for ease of explanation. The code generated in step S101 is therefore a binary code, represented as a bit string of 0s and 1s. In step S101 the data may be encoded as it is, or it may be encrypted first and then encoded.
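As a rough illustration of step S101, the sketch below turns a byte string into a sequence of N-ary symbols. How Data0 to Data3 are serialized into bytes is not specified in the text, so the byte-string input, the big-endian digit order, and the helper name `to_symbols` are all assumptions; only N = 2, 4, and 16 are handled because they divide a byte evenly.

```python
def to_symbols(data: bytes, n: int = 2) -> list[int]:
    """Return the base-n digits of each byte, most significant digit first."""
    digits_per_byte = {2: 8, 4: 4, 16: 2}[n]  # supported values of N (assumed)
    symbols = []
    for byte in data:
        digits = []
        for _ in range(digits_per_byte):
            byte, digit = divmod(byte, n)  # peel off the least significant digit
            digits.append(digit)
        symbols.extend(reversed(digits))
    return symbols
```

With N = 2 each byte becomes eight bits; with N = 4 each byte becomes four symbols in the range 0 to 3.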

Next, a watermark signal is assigned to each symbol of the codeword (step S102). A watermark signal represents a wave with an arbitrary wavelength and direction by means of an arrangement of dots (black pixels). Watermark signals are described further below.

Then, the signal units corresponding to the bit string of the encoded data are placed on the document image 11 (step S103).

The watermark signals assigned to the symbols of the codeword in step S102 will now be described. FIG. 14 is an explanatory diagram showing an example of watermark signals.

Let the width and height of a watermark signal be Sw and Sh, respectively. Sw and Sh may differ, but Sw = Sh is used in the present embodiment for ease of explanation. Lengths are measured in pixels; in the example of FIG. 14, Sw = Sh = 12. The printed size of these signals depends on the resolution of the watermark image. For example, if the watermark image is a 600 dpi image (dpi, dots per inch, is the unit of resolution: the number of dots per inch), the width and height of the watermark signal of FIG. 14 on the printed document are 12/600 = 0.02 inch.

Hereinafter, a rectangle of width Sw and height Sh is called a "signal unit", the unit of one signal. In FIG. 14(1), the dots are densely spaced in the direction arctan(3) with respect to the horizontal axis (arctan is the inverse function of tan), and the propagation direction of the wave is arctan(−1/3). This signal unit is hereinafter called unit A. In FIG. 14(2), the dots are densely spaced in the direction arctan(−3) with respect to the horizontal axis, and the propagation direction of the wave is arctan(1/3). This signal unit is hereinafter called unit B.

FIG. 15 is a cross-sectional view of the change in pixel values of FIG. 14(1) seen from the direction arctan(1/3). In FIG. 15, the parts where dots are arranged are the antinodes at the minimum of the wave (points of maximum amplitude), and the parts where no dots are arranged are the antinodes at the maximum of the wave.

Because each unit contains two regions in which dots are densely arranged, the frequency per unit is 2 in this example. The propagation direction of the wave is perpendicular to the direction in which the dots are densely arranged, so the wave of unit A runs at arctan(−1/3) to the horizontal and the wave of unit B at arctan(1/3). Note that when the direction arctan(a) and the direction arctan(b) are perpendicular, a × b = −1.

In the present embodiment, symbol 0 is assigned to the watermark signal represented by unit A and symbol 1 to the watermark signal represented by unit B. These are called symbol units.

Besides the watermark signals shown in FIGS. 14(1) and (2), dot arrangements such as those shown in FIGS. 16(3) to (5) are also possible. In FIG. 16(3), the dots are densely spaced in the direction arctan(1/3) with respect to the horizontal axis, and the propagation direction of the wave is arctan(−3). This signal unit is hereinafter called unit C.

In FIG. 16(4), the dots are densely spaced in the direction arctan(−1/3) with respect to the horizontal axis, and the propagation direction of the wave is arctan(3). This signal unit is hereinafter called unit D. In FIG. 16(5), the dots are densely spaced in the direction arctan(1) with respect to the horizontal axis, and the propagation direction of the wave is arctan(−1). FIG. 16(5) can equally be regarded as having dots densely spaced in the direction arctan(−1) and a propagation direction of arctan(1). This signal unit is hereinafter called unit E.

Thus, besides the assignment given above, several other combinations of units can be assigned to symbol 0 and symbol 1. By keeping secret which watermark signal is assigned to which symbol, the embedded signal can be made difficult for a third party (an unauthorized person) to decode.

Furthermore, when the tampering detection data Data has been encoded with a quaternary code, it is possible in step S102 shown in FIG. 13 to assign, for example, codeword symbol 0 to unit A, symbol 1 to unit B, symbol 2 to unit C, and symbol 3 to unit D.

In the example watermark signals shown in FIGS. 14 and 16, every unit contains the same number of dots, so arranging the units without gaps gives a uniform apparent density. On the printed page, the result therefore looks like a gray image of a single density embedded as the background.

To obtain this effect, unit E, for example, is defined as a background unit (a signal unit to which no symbol is assigned) and is tiled without gaps to form the background of the document image 11. When symbol units (unit A, unit B) are embedded in the document image 11, the background unit (unit E) at each embedding position is replaced with a symbol unit (unit A or unit B).

FIG. 17(1) is an explanatory diagram showing the case where unit E is defined as the background unit and tiled without gaps to form the background of the document image 11. FIG. 17(2) shows an example in which unit A is embedded in the background image of FIG. 17(1), and FIG. 17(3) shows an example in which unit B is embedded in it. The present embodiment describes the method that uses background units as the background of the document image 11, but the background may instead be formed by placing symbol units only.
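The background-and-replace scheme can be pictured at the level of unit names rather than dots. In this sketch each grid cell holds the letter of the signal unit placed there; the function name `build_unit_grid` and the dictionary input are hypothetical, and rendering each letter into its 12 × 12 dot pattern is left out.

```python
def build_unit_grid(cols, rows, symbols_at):
    """Tile background unit E, then swap in symbol units at the given cells.

    symbols_at: {(x, y): symbol} with symbol 0 -> unit A and symbol 1 -> unit B.
    """
    grid = [["E"] * cols for _ in range(rows)]  # background units only
    for (x, y), symbol in symbols_at.items():
        grid[y][x] = "A" if symbol == 0 else "B"  # replace E with a symbol unit
    return grid
```

Since every unit carries the same number of dots, swapping E for A or B leaves the apparent gray level of the background unchanged.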

Next, a method of embedding one symbol of the codeword in the document image 11 will be described with reference to FIG. 18.

FIG. 18 is an explanatory diagram showing an example of a method of embedding symbols in the document image 11. Here, embedding the bit string "0101" is taken as an example.

As shown in FIGS. 18(1) and (2), the same symbol unit is embedded repeatedly. This prevents a symbol from going undetected at signal detection time when characters in the document overlap an embedded symbol unit. The number of repetitions and the arrangement pattern of the symbol units (hereinafter called a unit pattern) are arbitrary.

As examples of unit patterns, the number of repetitions can be 4 (four symbol units in one unit pattern) as in FIG. 18(1), 2 (two symbol units in one unit pattern) as in FIG. 18(2), or 1 (only one symbol unit in one unit pattern).
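The repetition idea can be sketched by mapping each bit of "0101" to a unit pattern made of identical symbol units. The spatial arrangement inside a unit pattern (for example 2 × 2 for four repetitions) is not modeled here; each pattern is simply written as a string of unit letters, and the names `UNIT_FOR_SYMBOL` and `unit_patterns` are hypothetical.

```python
UNIT_FOR_SYMBOL = {0: "A", 1: "B"}  # assignment used in the present embodiment


def unit_patterns(bits, repeat):
    """Return one unit pattern per bit, each holding `repeat` identical symbol units."""
    return [UNIT_FOR_SYMBOL[int(bit)] * repeat for bit in bits]
```

For example, `unit_patterns("0101", 4)` gives `["AAAA", "BBBB", "AAAA", "BBBB"]`, and a repetition count of 1 degenerates to one symbol unit per bit.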

In FIGS. 18(1) and (2) one symbol is given to each symbol unit, but a symbol may instead be given to an arrangement pattern of symbol units, as in FIG. 18(3).

The number of bits of information that can be embedded in one page depends on the size of the signal unit, the size of the unit pattern, and the size of the image (or original image). The number of signals embedded horizontally and vertically in the image may be treated as known at signal detection time, or may be calculated back from the size of the image received from the input device and the size of the signal unit.

Suppose that Pw unit patterns can be embedded horizontally and Ph unit patterns vertically in one page. The unit pattern at an arbitrary position in the image is written U(x, y), x = 1 to Pw, y = 1 to Ph, and U(x, y) is called the "unit pattern matrix". The number of bits that can be embedded in one page is called the "number of embedded bits", and it equals Pw × Ph.
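A back-of-the-envelope capacity estimate follows directly from these definitions. The sketch assumes that unit patterns tile the page from the upper left with no margins, that partial patterns at the right and bottom edges are discarded, and that each unit pattern carries one bit (N = 2). The page size of 4960 × 7016 pixels (roughly A4 at 600 dpi) and the 2 × 2 unit pattern are illustrative values, not figures from the patent, and the name `capacity` is hypothetical.

```python
def capacity(image_w, image_h, sw, sh, units_x, units_y):
    """Return (Pw, Ph, number of embedded bits) for one page.

    units_x, units_y: signal units per unit pattern, horizontally and vertically.
    """
    pw = image_w // (sw * units_x)  # whole unit patterns across the width
    ph = image_h // (sh * units_y)  # whole unit patterns down the height
    return pw, ph, pw * ph


# Illustrative values only: 12-pixel signal units, four units per pattern (2 x 2).
print(capacity(4960, 7016, 12, 12, 2, 2))
```

The same division, run on the size of the scanned image, corresponds to the "calculated back" option mentioned for signal detection.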

FIG. 19 is a flowchart showing a method of embedding the tampering detection data Data in the document image 11. Here, the case where the same information is embedded repeatedly in one document image 11 (one page, in terms of printed matter) is described. Embedding the same information repeatedly makes it possible to recover the embedded information even when some of it is lost, for example because an entire unit pattern is blacked out when the document image 11 and the tampering detection data Data are superimposed.

First, the falsification detection data Data is converted into an N-ary code (step S201). This is the same as step S101 in FIG. 13. Hereinafter, the encoded data is referred to as the data code, and the data code expressed by a combination of unit patterns is referred to as the data code unit Du.

Next, from the code length of the data code (here, the number of bits) and the number of embedded bits, the number of times the data code unit can be embedded repeatedly in one image is calculated (step S202). In this embodiment, the code length data of the data code is inserted into the first row of the unit pattern matrix. Alternatively, the code length of the data code may be fixed so that no code length data is embedded.

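The number of times Dn that the data code unit is embedded is calculated by the following equation, where Cn is the data code length.

Dn = floor(Pw × (Ph - 1) / Cn)

Here floor(A) is the largest integer not exceeding A, and Ph - 1 reflects the first row being reserved for the code length data.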

Here, letting the remainder be Rn (Rn = Pw × (Ph - 1) - Cn × Dn), the unit pattern matrix holds Dn repetitions of the data code unit followed by unit patterns corresponding to the first Rn bits of the data code. The Rn bits of the remainder do not necessarily have to be embedded.

In the description of FIG. 20, the size of the unit pattern matrix is 9 × 11 (11 rows, 9 columns) and the data code length is 12 (the items numbered 0 to 11 in the figure represent the individual code words of the data code).
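
As a check on the FIG. 20 example, the repetition count and remainder can be computed as in the following minimal sketch. The function name repetition_count is hypothetical, and the sketch assumes, as in this embodiment, that exactly one row of the unit pattern matrix is reserved for the code length data.

```python
def repetition_count(pw: int, ph: int, cn: int) -> tuple:
    """Return (Dn, Rn) for a Pw x Ph unit pattern matrix and code length Cn.

    Assumes the first row is reserved for the code length data, so only
    Pw * (Ph - 1) unit patterns are available for the data code.
    """
    capacity = pw * (ph - 1)
    dn, rn = divmod(capacity, cn)  # full repetitions, leftover unit patterns
    return dn, rn


# FIG. 20 example: 9 x 11 matrix, data code length 12
print(repetition_count(9, 11, 12))  # (7, 6)
```

The result agrees with the seven repetitions and six leading bits described for FIG. 20.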

Next, the code length data is embedded in the first row of the unit pattern matrix (step S203). The example of FIG. 20 expresses the code length as 9-bit data and embeds it only once, but if the width Pw of the unit pattern matrix is sufficiently large, the code length data can be embedded repeatedly in the same way as the data code.

Further, the data code unit is embedded repeatedly in the second and subsequent rows of the unit pattern matrix (step S204). As shown in FIG. 20, the data code is embedded in the row direction in order from its MSB (most significant bit) or LSB (least significant bit). The example of FIG. 20 embeds the data code unit seven times, followed by the first 6 bits of the data code.
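
The row-direction layout of steps S203 and S204 can be sketched as follows. This is an illustration only: layout_matrix is a hypothetical name, the matrix is held as a list of rows, and the remainder symbols are embedded even though the text allows them to be left out.

```python
def layout_matrix(pw, ph, length_bits, data_code):
    """Fill a Ph x Pw matrix: row 1 holds the code length, the rest the data code.

    The data code is written row by row and repeated until the matrix is full,
    so the last repetition may be cut short (the remainder).
    """
    if len(length_bits) > pw:
        raise ValueError("code length data does not fit in the first row")
    matrix = [[None] * pw for _ in range(ph)]
    for x, bit in enumerate(length_bits):
        matrix[0][x] = bit
    for k in range(pw * (ph - 1)):
        row, col = divmod(k, pw)
        matrix[1 + row][col] = data_code[k % len(data_code)]
    return matrix


# FIG. 20 example: code words are shown by their index 0..11, as in the figure
length_bits = [int(b) for b in format(12, "09b")]  # code length 12 as 9 bits
matrix = layout_matrix(9, 11, length_bits, list(range(12)))
print(matrix[10])  # last row: [9, 10, 11, 0, 1, 2, 3, 4, 5]
```

In the last row, columns 4 to 9 hold code words 0 to 5, which is the six-symbol remainder.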

The data may be embedded so that it is continuous in the row direction, as in FIG. 20, or so that it is continuous in the column direction.

The superposition of the document image 11 and the falsification detection data Data in the watermark information synthesis unit 15 has been described above.

As described above, the watermark information synthesis unit 15 superimposes the document image 11 and the falsification detection data Data. The value of each pixel of the watermarked document image is calculated by a logical AND of the corresponding pixel values of the document image 11 and the falsification detection data Data. That is, if either the document image 11 or the falsification detection data Data is 0 (black), the pixel value of the watermarked original image is 0 (black); otherwise it is 1 (white).
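
The pixel-wise AND can be written as the short sketch below. It assumes both images are binary (0 = black, 1 = white), have the same size, and are stored as lists of rows; superimpose is a hypothetical name.

```python
def superimpose(document, watermark):
    """Pixel-wise logical AND of two binary images of equal size."""
    return [
        [d & w for d, w in zip(doc_row, wm_row)]
        for doc_row, wm_row in zip(document, watermark)
    ]


document = [[1, 0], [1, 1]]    # 0 = black text pixel
watermark = [[1, 1], [0, 1]]   # 0 = black dot of the watermark pattern
print(superimpose(document, watermark))  # [[1, 0], [0, 1]]
```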

FIG. 21 is an explanatory diagram showing an example of a watermarked document image. FIG. 22 is an enlarged view of part of FIG. 21. Here, the pattern of FIG. 18(1) is used as the unit pattern. The watermarked document image 11 (output document image 14) is output through, for example, an interface (not shown) of the watermark information synthesis unit 15.

(Operation of the document input unit 20)
In the following description, the output document image 14 is assumed to be distributed after being printed on a print medium such as paper, and the input document image 21 is assumed to be the watermarked printed sheet of the output document image 14 converted into an image by an input device such as a scanner. The invention is not limited to this example, however, and the output document image 14 may remain unprinted digital data.

Specifically, as shown in FIG. 23, the output document image 14 is printed on a print medium such as paper and then distributed as printed matter, and the input document image 21 is the printed sheet of the watermarked document image converted into an image by an input device such as a scanner. The input document image 21 shown in FIG. 15 is assumed to have been tampered with. FIG. 23 is an explanatory diagram showing an example of the schematic configuration of the input document image 21.

First, the watermark information extraction unit 22 of the document input unit 20, which is described in detail later, extracts the watermark information (falsification detection data Data) from the input document image 21 and restores the falsification detection data Data (Data0, Data1, Data2, and Data3). If Data1 has been image-compressed, it is decompressed to restore the composite image. Data3 is optional; if it has been omitted, processing follows a creation method shared in advance by the document output unit 10 and the document input unit 20. If Data2 has been quantized, the data contains quantization error, and it is restored to the original values.

(Operation of the watermark information extraction unit 22)
Next, the processing of the watermark information extraction unit 22 is described with reference to FIG. 24. FIG. 24 is a flowchart showing an example of the general flow of processing of the watermark information extraction unit 22.

As shown in FIG. 24, the watermarked original image is first input to the memory of a computer or the like by an input device (not shown) such as a scanner (step S301). This image is referred to as the input image. The input image is a multi-valued image and is described below as a 256-level grayscale image, but it is not limited to this example and may be, for example, a full-color image. The resolution of the input image (the resolution at which it is read by the input device) may differ from that of the watermarked document image (output document image 14) created by the document output unit 10, but here it is assumed to be the same. The description also assumes that one unit pattern consists of one symbol unit.

<Signal detection filtering step (step S310)>
In step S310, the entire input document image 21 is filtered, and the filter output values are calculated and compared. The filter output value is calculated by convolution between the filter and the image at every pixel of the input document image 21, using a filter called a Gabor filter, shown below.

The Gabor filter G(x, y), x = 0 to gw - 1, y = 0 to gh - 1, is shown below. gw and gh are the filter size, which here is the same as the size of the signal unit embedded by the watermark information embedding apparatus 10.

The filter output value at an arbitrary position in the input image is calculated by convolution between the filter and the image. A Gabor filter has a real filter and an imaginary filter (the imaginary filter is the real filter shifted in phase by half a wavelength), so their mean square value is used as the filter output value. For example, if the convolution of the luminance values at a pixel (x, y) with the real filter of filter A is Rc and the convolution with the imaginary filter is Ic, the filter output value F(A, x, y) is calculated by the following equation.
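
The filter and output equations are not reproduced in this text, so the sketch below rests on assumptions: it uses a common complex Gabor form (a Gaussian envelope of widths a and b multiplied by a plane wave with frequency components u and v), takes the cosine and sine parts as the real and imaginary filters, and combines Rc and Ic as sqrt(Rc^2 + Ic^2). The names gabor_kernels and filter_output, the parameters a, b, u, v, and the choice to place the kernel with its top-left corner at the pixel are all hypothetical. The sum is computed without flipping the kernel, which changes only the sign of the imaginary part and not the magnitude.

```python
import math


def gabor_kernels(gw, gh, u, v, a, b):
    """Build real and imaginary Gabor kernels of size gw x gh (assumed form)."""
    x0, y0 = gw / 2, gh / 2
    real = [[0.0] * gw for _ in range(gh)]
    imag = [[0.0] * gw for _ in range(gh)]
    for y in range(gh):
        for x in range(gw):
            envelope = math.exp(-math.pi * (((x - x0) / a) ** 2 + ((y - y0) / b) ** 2))
            phase = 2 * math.pi * (u * (x - x0) + v * (y - y0))
            real[y][x] = envelope * math.cos(phase)
            imag[y][x] = -envelope * math.sin(phase)
    return real, imag


def filter_output(image, real, imag, px, py):
    """Combine the real and imaginary responses with the kernel placed at (px, py)."""
    rc = ic = 0.0
    for y, (real_row, imag_row) in enumerate(zip(real, imag)):
        for x, (r, i) in enumerate(zip(real_row, imag_row)):
            value = image[py + y][px + x]
            rc += value * r
            ic += value * i
    return math.hypot(rc, ic)  # assumed combination: sqrt(Rc^2 + Ic^2)


# A test patch whose wave runs horizontally responds more strongly to the
# horizontally tuned filter A than to the vertically tuned filter B.
patch = [[math.cos(math.pi * x / 2) for x in range(8)] for _ in range(8)]
filter_a = gabor_kernels(8, 8, 0.25, 0.0, 8.0, 8.0)
filter_b = gabor_kernels(8, 8, 0.0, 0.25, 8.0, 8.0)
print(filter_output(patch, *filter_a, 0, 0) > filter_output(patch, *filter_b, 0, 0))  # True
```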

After the filter output values have been calculated for all the filters corresponding to the signal units, the filter output values are compared at each pixel, and the maximum value F(x, y) is stored in the filter output value matrix. The number of the signal unit corresponding to the filter with the maximum value is stored in the filter type matrix (FIG. 25). Specifically, if F(A, x, y) > F(B, x, y) at a pixel (x, y), F(A, x, y) is set as the value at (x, y) of the filter output value matrix, and "0", indicating signal unit A, is set as the value at (x, y) of the filter type matrix (in this embodiment, signal units A and B are numbered "0" and "1").

In this embodiment there are two filters. When there are more, the maximum of the filter output values and the signal unit number corresponding to the filter that produced it are stored in the same way.
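
Building the filter output value matrix and the filter type matrix can be sketched as follows for any number of filters. The name select_max is hypothetical, and breaking ties in favour of the lower signal unit number is an assumption, since the text only states the case F(A, x, y) > F(B, x, y).

```python
def select_max(outputs):
    """Return (filter output value matrix, filter type matrix).

    outputs[k][y][x] is the output of the filter for signal unit number k.
    """
    height, width = len(outputs[0]), len(outputs[0][0])
    value = [[0.0] * width for _ in range(height)]
    kind = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # max() keeps the first maximum, so ties go to the lower unit number
            best = max(range(len(outputs)), key=lambda k: outputs[k][y][x])
            value[y][x] = outputs[best][y][x]
            kind[y][x] = best
    return value, kind


output_a = [[3.0, 1.0]]  # signal unit A, number 0
output_b = [[2.0, 5.0]]  # signal unit B, number 1
print(select_max([output_a, output_b]))  # ([[3.0, 5.0]], [[0, 1]])
```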

<Signal position search step (step S320)>
In step S320, the positions of the signal units are determined using the filter output value matrix obtained in step S310. Specifically, if the size of a signal unit is Sh × Sw, a signal position search template is created whose grid points are spaced Sh apart vertically and Sw apart horizontally and number Nh × Nw (FIG. 26). The size of the template thus created is Th (Sh * Nh) × Tw (Sw * Nw); values suited to searching for the signal unit positions may be used for Nh and Nw.

Next, the filter output value matrix is divided into regions of the template size. Then, in each region, the template is moved pixel by pixel over the filter output value matrix within a range that does not overlap the signal units of adjacent regions (±Sw/2 horizontally, ±Sh/2 vertically), the sum V of the filter output value matrix values F(x, y) at the template grid points is obtained using the following equation (FIG. 31), and the grid points of the template position with the largest sum are taken as the positions of the signal units in that region.
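
The equation for V is not reproduced in this text; the sketch below assumes it is the plain sum of F over the Nw × Nh grid points of the shifted template. The function name search_signal_positions, the region origin (ox, oy), and the decision to skip grid points that fall outside the matrix are hypothetical choices.

```python
def search_signal_positions(f, sw, sh, nw, nh, ox, oy):
    """Slide the template within +/-Sw/2, +/-Sh/2 and keep the shift with the largest V.

    f is the filter output value matrix (list of rows); (ox, oy) is the
    unshifted position of the template's first grid point.
    Returns (list of grid point positions, V).
    """
    height, width = len(f), len(f[0])
    best = None
    for dy in range(-(sh // 2), sh // 2 + 1):
        for dx in range(-(sw // 2), sw // 2 + 1):
            total = 0.0
            for v in range(nh):
                for u in range(nw):
                    x, y = ox + dx + sw * u, oy + dy + sh * v
                    if 0 <= x < width and 0 <= y < height:
                        total += f[y][x]
            if best is None or total > best[0]:
                best = (total, dx, dy)
    total, dx, dy = best
    points = [(ox + dx + sw * u, oy + dy + sh * v) for v in range(nh) for u in range(nw)]
    return points, total


# 8 x 8 matrix with peaks every 4 pixels at x = 1, 5 and y = 2, 6
f = [[1.0 if x % 4 == 1 and y % 4 == 2 else 0.0 for x in range(8)] for y in range(8)]
print(search_signal_positions(f, 4, 4, 2, 2, 2, 2))
# ([(1, 2), (5, 2), (1, 6), (5, 6)], 4.0)
```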

The above example is the case where filter output values are obtained for all pixels in step S310; filtering may instead be performed only on pixels at a fixed interval. For example, if filtering is performed on every second pixel, the grid point spacing of the signal position search template is also halved.

<Signal symbol determination step (step S330)>
In step S330, whether each signal unit is A or B is determined by referring to the value of the filter type matrix (the signal unit number corresponding to the filter) at the signal unit position determined in step S320.

The determination results for the signal units obtained in this way are stored as the symbol matrix.

<Signal boundary determination step (step S340)>
In step S320, the filtering process is performed on the entire image regardless of whether signal units are embedded, so it is necessary to determine in which part the signal units were embedded. Therefore, in step S340, the signal boundary is obtained by searching the symbol matrix for a pattern decided in advance when the signal units were embedded.

For example, if signal unit A is always embedded along the boundary of the area where signal units are embedded, the number of signal units A is counted along the horizontal direction of the symbol matrix determined in step S330, and the positions above and below the center with the largest number of signal units A are taken as the upper and lower ends of the signal boundary. In the example of FIG. 27, signal unit A is represented in the symbol matrix as "black" (value "0"), so the number of signal units A can be counted by counting the black pixels of the symbol matrix, and the upper and lower ends of the signal boundary can be obtained from the resulting frequency distribution. The left and right ends can be obtained in the same way; only the direction in which the units A are counted differs.
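
The counting approach can be sketched as follows, assuming signal unit A is stored as 0 in the symbol matrix. The name find_ends is hypothetical, and so is the tie handling (the first maximum found in each half is kept).

```python
def find_ends(symbols, unit_a=0):
    """Return the row indices above and below the center with the most A units."""
    counts = [row.count(unit_a) for row in symbols]
    mid = len(counts) // 2
    first = max(range(0, mid), key=lambda r: counts[r])
    last = max(range(mid, len(counts)), key=lambda r: counts[r])
    return first, last


symbols = [
    [1, 0, 1, 1],
    [0, 0, 0, 0],  # upper boundary row, all A
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],  # lower boundary row, all A
    [1, 1, 0, 1],
]
print(find_ends(symbols))  # (1, 4): upper and lower ends

# Left and right ends: count along columns by transposing the matrix
columns = [list(col) for col in zip(*symbols)]
print(find_ends(columns))  # (1, 3)
```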

The method for obtaining the signal boundary is not limited to the above; it suffices for the embedding side and the detecting side to decide in advance on a pattern that can be searched for in the symbol matrix.

Returning to the flowchart of FIG. 24, step S305 is described next. In step S305, the original information is restored from the part of the symbol matrix that lies inside the signal boundary. In this embodiment, one unit pattern consists of one symbol unit, so the unit pattern matrix is equivalent to the symbol matrix.

<Information decoding step (step S305)>
FIG. 28 is an explanatory diagram showing an example of information restoration. The steps of information restoration are as follows.
(1) The symbol embedded in each unit pattern is detected (FIG. 28(1)).
(2) The symbols are concatenated to restore the data code (FIG. 28(2)).
(3) The data code is decoded to extract the embedded information (FIG. 28(3)).

FIGS. 29 to 31 are explanatory diagrams showing an example of a method for restoring the data code. The restoration method is basically the reverse of the process of FIG. 20.

First, the code length data portion is extracted from the first row of the unit pattern matrix to obtain the code length of the embedded data code (step S401).

Next, the number of times Dn that the data code unit was embedded and the remainder Rn are calculated from the size of the unit pattern matrix and the code length of the data code obtained in step S401 (step S402).

Next, the data code units are extracted from the second and subsequent rows of the unit pattern matrix by the reverse of the method of step S203 (step S403). In the example of FIG. 30, the matrix is split into groups of 12 unit patterns in order, starting from U(1, 2) (row 2, column 1): U(1, 2) to U(3, 3), U(4, 3) to U(6, 4), and so on. Since Dn = 7 and Rn = 6, a group of 12 unit patterns (a data code unit) is extracted 7 times, and 6 unit patterns, U(4, 11) to U(9, 11), corresponding to the upper 6 elements of the data code unit, are extracted as the remainder.

Next, the embedded data code is reconstructed by performing a bit certainty calculation on the data code units extracted in step S403 (step S404). The bit certainty calculation is described below.

As shown in FIG. 31, the data code unit extracted first, starting from row 2, column 1 of the unit pattern matrix, is denoted Du(1, 1) to Du(12, 1), the next one Du(1, 2) to Du(12, 2), and so on. The remainder is denoted Du(1, 8) to Du(6, 8). The bit certainty calculation determines the value of each symbol of the data code by, for example, taking a majority vote over the corresponding elements of the data code units. As a result, even if the signal cannot be detected correctly from some unit in some data code unit (a bit inversion error, for example) because of overlap with a character area or dirt on the paper, the data code can ultimately be restored correctly.

Specifically, for example, the first bit of the data code is judged to be 1 if more of the signal detection results of Du(1, 1), Du(1, 2), ..., Du(1, 8) are 1, and 0 if more are 0. Similarly, the second bit of the data code is judged by majority vote over the signal detection results of Du(2, 1), Du(2, 2), ..., Du(2, 8), and the 12th bit by majority vote over Du(12, 1), Du(12, 2), ..., Du(12, 7) (only up to Du(12, 7), since Du(12, 8) does not exist).
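
Steps S403 and S404 can be sketched together as follows, assuming binary symbols, a row-direction layout, and a first row that holds only the code length data. The name decode_data_code is hypothetical, and resolving a tied vote to 0 is an assumption the text does not address.

```python
def decode_data_code(matrix, cn):
    """Majority-vote each of the Cn bits over all repetitions, remainder included.

    matrix is the unit pattern matrix as a list of rows of 0/1 symbols;
    row 1 (index 0) carries the code length data and is skipped.
    """
    stream = [symbol for row in matrix[1:] for symbol in row]
    votes = [[0, 0] for _ in range(cn)]  # votes[i] = [count of 0s, count of 1s]
    for k, symbol in enumerate(stream):
        votes[k % cn][symbol] += 1
    return [1 if ones > zeros else 0 for zeros, ones in votes]


# 9 x 11 matrix, code length 12, with three detection errors injected
code = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
stream = [code[k % 12] for k in range(90)]
for k in (0, 13, 30):
    stream[k] ^= 1  # simulated bit inversion errors
matrix = [[0] * 9] + [stream[i * 9:(i + 1) * 9] for i in range(10)]
print(decode_data_code(matrix, 12) == code)  # True
```

Each bit position has seven or eight copies here, so a single inverted copy is outvoted.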

The case of embedding the data code repeatedly has been described here, but a method that does not repeat the data code unit can also be realized, for example by using an error correction code when the data is encoded.

As described above, the entire input document image 21 is filtered, and the signal position search template is used to find the signal unit positions that maximize the sum of the filter output values. Therefore, even when the image is stretched or shrunk because of paper distortion or the like, the signal unit positions can be detected correctly, and the falsification detection data Data can be detected accurately from the watermarked original image (input document image 21).

(Processing of the pseudo document image generation unit 23)
Next, the processing of the pseudo document image generation unit 23 according to the present embodiment is described with reference to FIGS. 32 and 33. FIG. 32 is a flowchart showing an outline of the processing of the pseudo document image generation unit 23. FIG. 33 is an explanatory diagram showing an outline of the pseudo document image generation process performed by the pseudo document image generation unit 23.

Through the processing of the pseudo document image generation unit 23, an image in which the image features of the document image 11 are reproduced (a pseudo document image) is created from the falsification detection data Data extracted by the watermark information extraction unit 22.

(Reduced character area separation (S41))
As shown in FIG. 32, the pseudo document image generation unit 23 separates each reduced character area from the composite image by the reverse of the procedure of the reduced document area synthesis (S33) performed by the watermark information synthesis unit 13 of the document output unit 10 described above (S41). As shown in FIG. 33, the process corresponding to the reduced character area separation (S41) of FIG. 32 is S61.

(Reduced character area enlargement (S42))
After separating the reduced character areas, the pseudo document image generation unit 23 enlarges them (S42). The enlargement (S42) is performed for each reduced character area and returns it to the original character area size. For example, if a character area was reduced to 1/2, the reduced character area is separated and then enlarged by a factor of 2. As shown in FIG. 33, the process corresponding to the enlargement (S42) of FIG. 32 is S62.

(Character area reconstruction (S43))
After the separated reduced character areas have been enlarged (S42), the pseudo document image generation unit 23 generates a canvas image (background image) that consists only of background (a white image area) and has the same size as the document image 11.

Further, the pseudo document image generation unit 23 refers to the falsification detection data Data2 (position information of each character area and the like) and places the character areas obtained by enlarging the reduced character areas on the canvas image (S43). As shown in FIG. 33, the process corresponding to the character area reconstruction (S43) of FIG. 32 is S63.

The canvas image and the pseudo document image are substantially the same size, but they need not be the same size as the document image 11. If they are not, the enlargement ratio is changed in the enlargement of the reduced character areas (S42). For example, if the width of the canvas image is 1/Sw and its height 1/Sh of the document image 11, and the character areas were reduced to 1/N, the enlargement ratios are (1/Sw)/(1/N) for the width and (1/Sh)/(1/N) for the height.

The pseudo document image is complete once the pseudo document image generation unit 23 has placed all the character areas enlarged in S42 on the canvas image according to the falsification detection data Data2 (S43).
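
Steps S42 and S43 can be sketched as below for binary images. The names enlarge and rebuild are hypothetical, nearest-neighbour enlargement by an integer factor is an assumption (the text does not name an interpolation method), and the (left, top) tuple stands in for the position information carried by Data2.

```python
def enlarge(region, factor):
    """Nearest-neighbour enlargement of a binary region by an integer factor."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in region
        for _ in range(factor)
    ]


def rebuild(width, height, regions):
    """Paste enlarged character areas onto an all-white canvas.

    regions is a list of (pixels, factor, (left, top)) tuples.
    """
    canvas = [[1] * width for _ in range(height)]  # 1 = white background
    for pixels, factor, (left, top) in regions:
        for dy, row in enumerate(enlarge(pixels, factor)):
            for dx, pixel in enumerate(row):
                if 0 <= top + dy < height and 0 <= left + dx < width:
                    canvas[top + dy][left + dx] = pixel
    return canvas


# One black pixel reduced to 1/2, restored at position (1, 1) on a 4 x 3 canvas
print(rebuild(4, 3, [([[0]], 2, (1, 1))]))
# [[1, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 1]]
```

When the canvas is smaller than the document image, the factor passed in would be the adjusted ratio, for example (1/Sw)/(1/N) for the width; the sketch handles only a single integer factor for both directions.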

(Processing of the input image deformation unit 24)
Once the pseudo document image has been generated by the pseudo document image generation unit 23, the input image deformation unit 24 deforms the input document image 21. This deformation corrects the input document image 21 to an image of the same size as the pseudo document image against which it is compared for tamper detection, and then binarizes the corrected input document image 21. To improve the accuracy of tamper detection, after the correction to the same size and the binarization, the image is reduced at the same ratio as the reduction process (S32) and then enlarged at the same ratio as the enlargement process (S42). Subjecting the input image to the same processing that the pseudo document image has undergone further improves the accuracy of the tamper determination, but this step may be omitted.

The processing of the input image deformation unit 24 is now described with reference to FIG. 34. FIG. 34 is a flowchart of the input image deformation unit 24, and the description follows this flowchart.

<Signal unit position detection (step S610)>
FIG. 35 shows the signal unit positions detected in the first embodiment, displayed on the input image (watermarked original image 11) 21. In FIG. 35, the signal units are written U(x, y), x = 1 to Wu, y = 1 to Hu. U(1, y) to U(Wu, y) are the signal units in the same row (reference numeral 710 in FIG. 35), and U(x, 1) to U(x, Hu) are the signal units in the same column (reference numeral 720 in FIG. 35). In practice, U(1, y) to U(Wu, y) and U(x, 1) to U(x, Hu) do not lie exactly on a straight line but are displaced slightly up, down, left, and right.

The coordinates P of signal unit U(x, y) on the input image are written (Px(x, y), Py(x, y)), x = 1 to Wu, y = 1 to Hu (reference numerals 730, 740, 750, and 760 in FIG. 35). Here, the input image is assumed to be filtered every N pixels vertically and horizontally (N is a natural number). This filtering is performed in the same way as in the <signal position search step (step S320)> of the first embodiment. P is the coordinate value of each signal unit in the signal output value matrix simply multiplied by N vertically and horizontally.

<Straight-line approximation of signal unit positions (step S620)>
The positions of the signal units are approximated by straight lines in the row direction and the column direction. FIG. 36 is an example of straight-line approximation in the row direction. In FIG. 36, the positions of the signal units U(1, y) to U(Wu, y) in the same row are approximated by the straight line Lh(y). The approximating line is the line for which the sum of the distances between the signal unit positions and Lh(y) is smallest. Such a line can be obtained by a general method such as the least squares method or principal component analysis. The row-direction approximation is performed for all rows, and the column-direction approximation is likewise performed for all columns.

FIG. 37 is an example of the result of straight-line approximation in the row direction and the column direction. In FIG. 37, the signal units are written U(x, y), x = 1 to Wu, y = 1 to Hu. Lh(y) is the straight line approximating U(1, y) to U(Wu, y) (reference numeral 810 in FIG. 37), and Lv(x) is the straight line approximating U(x, 1) to U(x, Hu) (reference numeral 820 in FIG. 37).

<Equalization of straight lines (step S630)>
The straight lines obtained in step S620 are not uniform in slope or position when viewed individually, for example because the detected positions of a group of signal units are shifted together. In step S630, the slope and position of each line are therefore corrected so that the lines become uniform.

FIG. 38 shows an example of correcting the slope of the row-direction approximating line Lh(y). FIG. 38(a) shows the state before correction and FIG. 38(b) the state after correction. Let Th(y) be the slope of Lh(y) in FIG. 38(a). The slope of Lh(y) is corrected to the average of the slopes of the neighboring lines: Th(y) = AVERAGE(Th(y - Nh) to Th(y + Nh)), where AVERAGE(A to B) denotes the average of the values from A to B and Nh is an arbitrary natural number. When y - Nh < 1, Th(y) = AVERAGE(Th(1) to Th(y + Nh)); when y + Nh > Hu, Th(y) = AVERAGE(Th(y - Nh) to Th(Hu)). FIG. 38 is an example with Nh = 1, and FIG. 38(b) shows Lh(y) corrected by the average of the slopes of the lines Lh(y - 1) to Lh(y + 1).
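The slope averaging with its clamped window can be sketched as follows. This is only an illustration under stated assumptions: `smooth_slopes` is a hypothetical name, the list is 0-based (so `slopes[0]` holds Th(1)), and every average is taken over the uncorrected slopes, which the text does not state explicitly.

```python
def smooth_slopes(slopes, nh):
    """Replace each slope by the average of its neighbours within +/- nh rows.

    slopes[0] corresponds to Th(1) and slopes[-1] to Th(Hu). Near the first
    and last rows the window is clamped to the valid range, as in the text.
    """
    hu = len(slopes)
    smoothed = []
    for i in range(hu):
        lo = max(0, i - nh)        # clamp when y - Nh < 1
        hi = min(hu - 1, i + nh)   # clamp when y + Nh > Hu
        window = slopes[lo:hi + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Example with Nh = 1: the outlier in the middle row is pulled toward its neighbours.
result = smooth_slopes([0.0, 0.3, 0.0], 1)
```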

FIG. 39 shows an example of correcting the position of the row-direction approximating line Lh(y). FIG. 39(a) is before correction and FIG. 39(b) is after correction. In FIG. 39(a), an arbitrary vertical reference line 1130 is set, and Q(y) denotes the y coordinate of its intersection with Lh(y). Q(y) is corrected to the average of the positions of the neighboring lines: Q(y) = AVERAGE(Q(y - Mh) to Q(y + Mh)), where Mh is an arbitrary natural number. When y - Mh < 1 or y + Mh > Hu, no change is made. FIG. 39 is an example with Mh = 1, and FIG. 39(b) shows Lh(y) corrected by the midpoint (average) of the positions of the lines Lh(y - 1) and Lh(y + 1). This process can be omitted.
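A corresponding sketch for the position correction is given below. It follows the formula Q(y) = AVERAGE(Q(y - Mh) to Q(y + Mh)) literally, so the line's own position is included in the average. The FIG. 39(b) description mentions only the two neighbours, so this reading is an assumption. The name `smooth_positions` and the 0-based list are likewise illustrative.

```python
def smooth_positions(q, mh):
    """Average each intersection coordinate Q(y) over +/- mh neighbouring rows.

    q[0] corresponds to Q(1). Rows whose window would leave the valid range
    are left unchanged, as the text specifies for this correction.
    """
    hu = len(q)
    out = list(q)
    for i in range(hu):
        if i - mh < 0 or i + mh > hu - 1:
            continue  # too close to the first or last row: no change
        window = q[i - mh:i + mh + 1]
        out[i] = sum(window) / len(window)
    return out

# Example with Mh = 1: the two inner rows are moved to an even spacing.
positions = smooth_positions([0.0, 12.0, 18.0, 30.0], 1)
```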

<Calculation of line intersections (step S640)>
The intersections of the row-direction approximating lines and the column-direction approximating lines are calculated. FIG. 40 shows an example in which the intersections of the row-direction lines Lh(1) to Lh(Hu) and the column-direction lines Lv(1) to Lv(Wu) have been calculated. The intersections are calculated by a general mathematical method, and each intersection is taken as the corrected signal unit position. That is, the intersection of the row-direction line Lh(y) and the column-direction line Lv(x) is the corrected position (Rx(x, y), Ry(x, y)) of signal unit U(x, y). For example, the corrected position of signal unit U(1, 1) is the intersection of Lh(1) and Lv(1).
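One common way to compute such an intersection is sketched below. The point-plus-direction line representation and the name `intersect` are assumptions for this example; the text only says that a general mathematical method is used.

```python
def intersect(line_h, line_v):
    """Intersect two lines given as (px, py, dx, dy): a point and a direction.

    Solves p + t * d = q + s * e for t using 2-D cross products.
    """
    px, py, dx, dy = line_h
    qx, qy, ex, ey = line_v
    det = dx * ey - dy * ex
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel")
    t = ((qx - px) * ey - (qy - py) * ex) / det
    return px + t * dx, py + t * dy

# Example: a horizontal row line y = 2 and a vertical column line x = 3.
rx, ry = intersect((0.0, 2.0, 1.0, 0.0), (3.0, 0.0, 0.0, 1.0))
```

Calling this for every pair (Lh(y), Lv(x)) would yield the table of corrected positions (Rx(x, y), Ry(x, y)).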

<Creation of corrected image (step S650)>
A corrected image is created from the input image with reference to the signal unit positions calculated in step S640. Here, Dout is the resolution at which the watermarked original image output by the watermark image output unit 10 is printed, and Din is the resolution at which the input document image 21 supplied to the document input unit 20 is acquired. The corrected input document image 21 is assumed to have the same size as the pseudo document image.

If the signal unit in the watermark image output unit 10 has width Sw and height Sh, the signal unit in the input document image 21 has width Tw = Sw × Din / Dout and height Th = Sh × Din / Dout. Therefore, when there are Wu signal units horizontally and Hu signal units vertically, the corrected input document image 21 has width Wm = Tw × Wu and height Hm = Th × Hu. Let (Sx(x, y), Sy(x, y)) be the position of an arbitrary signal unit U(x, y) in the corrected input document image 21. Because the corrected image is built so that the signal units are evenly spaced, Sx = Tw × x and Sy = Th × y hold. The position of the upper-left signal unit U(1, 1) is (0, 0), which is the origin of the corrected input document image 21.
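The size arithmetic above can be checked with a short sketch. The function names are hypothetical. Because the text gives both Sx = Tw × x and an origin of (0, 0) for U(1, 1), the sketch assumes the offset form Tw × (x - 1) for 1-based indices, which is an interpretation rather than something the text states.

```python
def corrected_geometry(sw, sh, d_in, d_out, wu, hu):
    """Return (Tw, Th, Wm, Hm) for the corrected input document image."""
    tw = sw * d_in / d_out   # unit width on the scanned image
    th = sh * d_in / d_out   # unit height on the scanned image
    return tw, th, tw * wu, th * hu

def unit_position(x, y, tw, th):
    """Position of U(x, y) in the corrected image (1-based x, y).

    Assumption: an offset of one unit so that U(1, 1) falls on (0, 0).
    """
    return tw * (x - 1), th * (y - 1)

# Example: 18-pixel units printed at 600 dpi and scanned at 400 dpi.
tw, th, wm, hm = corrected_geometry(18, 18, 400, 600, 10, 5)
```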

The pixel value Vm at an arbitrary position (Xm, Ym) on the corrected image is obtained from the pixel value Vi at coordinates (Xi, Yi) on the input document image. FIG. 41 shows how these coordinates correspond: FIG. 41(a) shows the input image 1310 (input document image 21), and FIG. 41(b) shows the corrected image 1320 (corrected input document image 21). The relationship between (Xm, Ym) and (Xi, Yi) is described with reference to this figure.

In the corrected image 1320 of FIG. 41(b), let the nearest signal units in the upper-left, upper-right, and lower-left regions as seen from (Xm, Ym) be U(x, y) (coordinates (Sx(x, y), Sy(x, y)), reference numeral 1360), U(x+1, y) (1370), and U(x, y+1) (1380), and let their distances from (Xm, Ym) be E1, E2, and E3, respectively. Specifically, x is the largest integer not exceeding Xm/Tw + 1, and y is the largest integer not exceeding Ym/Th + 1. In the input image 1310 of FIG. 41(a), let D1, D2, and D3 be the distances from (Xi, Yi) to U(x, y) (coordinates (Rx(x, y), Ry(x, y)), reference numeral 1330), U(x+1, y) (1340), and U(x, y+1) (1350), respectively. The point (Xi, Yi) is chosen so that the ratio D1:D2:D3 equals E1:E2:E3, and the pixel value Vm at (Xm, Ym) is obtained from the pixel value Vi at (Xi, Yi) on the input image 1310.

FIG. 42 shows a specific method of calculating such (Xi, Yi). In the corrected image of FIG. 42(b), reference numeral 1450 is the point obtained by projecting (Xm, Ym) onto the straight line connecting U(x, y) and U(x+1, y), with Gx = Xm - Sx(x, y). Similarly, reference numeral 1460 is the point obtained by projecting (Xm, Ym) onto the straight line connecting U(x, y) and U(x, y+1), with Gy = Ym - Sy(x, y). In the input image 1410 of FIG. 42(a), reference numerals 1430 and 1440 are the corresponding projections of (Xi, Yi) onto the straight lines connecting U(x, y) with U(x+1, y) and with U(x, y+1), at offsets Fx = Xi - Rx(x, y) and Fy = Yi - Ry(x, y). Because the offsets keep the same proportion of the unit spacing in both images, Fx / (Rx(x+1, y) - Rx(x, y)) = Gx / Tw, so Fx = Gx / Tw × (Rx(x+1, y) - Rx(x, y)). Similarly, Fy = Gy / Th × (Ry(x, y+1) - Ry(x, y)). It follows that Xi = Fx + Rx(x, y) and Yi = Fy + Ry(x, y).
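The mapping from a corrected-image point to an input-image point can be sketched as below. This is an illustrative reading of the formulas, not the patent's code: `map_to_input` is a hypothetical name, the corrected unit positions are held in dictionaries keyed by 1-based (x, y), the unit origin is taken as Tw × (x - 1), and the index is clamped so that U(x+1, y) and U(x, y+1) always exist (which assumes Wu and Hu are at least 2).

```python
import math

def map_to_input(xm, ym, tw, th, rx, ry, wu, hu):
    """Map (Xm, Ym) on the corrected image to (Xi, Yi) on the input image.

    rx[(x, y)], ry[(x, y)] hold the corrected unit positions from step S640.
    """
    # Unit to the upper left of (Xm, Ym); clamped so a right/lower neighbour exists.
    x = min(int(math.floor(xm / tw)) + 1, wu - 1)
    y = min(int(math.floor(ym / th)) + 1, hu - 1)
    # Offsets inside the unit cell of the corrected image.
    gx = xm - tw * (x - 1)
    gy = ym - th * (y - 1)
    # Scale the offsets by the local unit spacing measured on the input image.
    fx = gx / tw * (rx[(x + 1, y)] - rx[(x, y)])
    fy = gy / th * (ry[(x, y + 1)] - ry[(x, y)])
    return fx + rx[(x, y)], fy + ry[(x, y)]

# Example: a 3 x 3 grid found on the scan at spacing 10 with offset (5, 7),
# mapped from a corrected image whose units are 8 pixels wide and high.
grid_rx = {(x, y): 5.0 + 10.0 * (x - 1) for x in range(1, 4) for y in range(1, 4)}
grid_ry = {(x, y): 7.0 + 10.0 * (y - 1) for x in range(1, 4) for y in range(1, 4)}
xi, yi = map_to_input(4.0, 12.0, 8.0, 8.0, grid_rx, grid_ry, 3, 3)
```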

In this way, the pixel value of an arbitrary point (Xm, Ym) on the corrected image 1420 (corrected input document image 21) of FIG. 42(b) is set to the pixel value of the point (Xi, Yi) on the input image (input document image 21). Because (Xi, Yi) is in general real-valued, either the pixel value at the coordinates nearest to (Xi, Yi) on the input image is used, or the pixel value is calculated from the values of the four neighboring pixels weighted according to their distances from (Xi, Yi).
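The second option, computing the value from the four neighbouring pixels, is commonly realized as bilinear interpolation. The sketch below assumes that technique and a plain list-of-rows image; the name `sample_bilinear` is hypothetical, and the text does not say which weighting is actually used.

```python
import math

def sample_bilinear(img, xi, yi):
    """Sample a row-major image (list of rows) at real-valued (xi, yi)."""
    h, w = len(img), len(img[0])
    # Keep the sample point inside the image.
    xi = min(max(xi, 0.0), w - 1.0)
    yi = min(max(yi, 0.0), h - 1.0)
    x0, y0 = int(math.floor(xi)), int(math.floor(yi))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = xi - x0, yi - y0
    # Interpolate along x on the two rows, then along y between them.
    top = img[y0][x0] * (1 - ax) + img[y0][x1] * ax
    bottom = img[y1][x0] * (1 - ax) + img[y1][x1] * ax
    return top * (1 - ay) + bottom * ay

# Example: the centre of a 2 x 2 image is the mean of its four pixels.
value = sample_bilinear([[0, 10], [20, 30]], 0.5, 0.5)
```

The nearest-pixel option corresponds to reading `img[round(yi)][round(xi)]` instead, which is cheaper but produces more jagged character outlines.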

The operation of the input image deformation unit 24 has been described above. In addition, after correcting and binarizing the input document image 21, the input image deformation unit 24 reduces the corrected input document image 21 and then enlarges it again by the same amount by which it was reduced.

FIG. 43 shows the input document image 21 after it has been corrected and binarized by the input image deformation unit 24 and then reduced and enlarged. FIG. 43 is an explanatory diagram showing an example of the schematic configuration of the input document image after this reduction and enlargement.

As described above, according to the present embodiment, the image captured from the printed document is corrected on the basis of the position information of the signals embedded in the input document image 21 (falsification detection data Data2). The image as it was before printing can therefore be restored from the captured image without distortion, stretching, or shrinking, so positions in the two images can be matched with high accuracy, which in turn enables high-performance falsification detection.

(Processing of the falsification detection unit 25)
When the input image deformation unit 24 has finished correcting the input document image 21, the falsification detection unit 25 compares the corrected input document image 21 with the pseudo document image to determine whether the input document image 21 has been falsified.

The processing of the falsification detection unit 25 according to the first embodiment is described with reference to FIG. 44. FIG. 44 is a flowchart showing an outline of the processing of the falsification detection unit according to the first embodiment.

図４４に示すように，まず，改ざん検出部２５は，上記擬似文書画像と，補正や縮小・拡大処理後の入力文書画像２１との差分を求め，差分画像を生成する（Ｓ５１）。 As shown in FIG. 44, first, the falsification detection unit 25 obtains a difference between the pseudo document image and the input document image 21 after the correction and reduction / enlargement processing, and generates a difference image (S51).
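For binarized images, the difference of step S51 can be sketched as a pixel-wise comparison. The representation (lists of rows holding 0 for white and 1 for black) and the name `difference_image` are assumptions for this example.

```python
def difference_image(pseudo, corrected):
    """Return a binary image that is 1 wherever the two binary images differ."""
    same_shape = len(pseudo) == len(corrected) and all(
        len(a) == len(b) for a, b in zip(pseudo, corrected)
    )
    if not same_shape:
        raise ValueError("images must have the same size")
    return [
        [1 if a != b else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pseudo, corrected)
    ]

# Example: only the lower-left pixel differs between the two images.
diff = difference_image([[0, 1], [1, 1]], [[0, 1], [0, 1]])
```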

FIG. 45 shows the difference image generated by the falsification detection unit 25. The difference image in FIG. 45 is an example of an image generated from the difference between the pseudo document image and the input document image 21 of FIG. 43, which was reduced and enlarged after correction. The black regions in the image are the differences. For example, many black regions can be seen near the outlines of the characters.

In the difference image between the pseudo document image and the corrected, reduced, and enlarged input document image 21, black regions (difference regions) may appear not only near the outlines of characters but also, for example, over an entire character; in the latter case the black region is judged to be falsification. Even so, difference regions can appear that would cause a part that has not been falsified to be judged as falsified, for example because the binarization threshold of the input image deformation unit 24 is not optimal.

To avoid such erroneous judgments, the falsification detection unit 25 performs noise removal processing (S52) on the difference image. In the noise removal processing, for example, the difference regions of the difference image are labeled, and if the total number of pixels in a group of difference regions with the same label is less than or equal to a threshold T_n, that group is judged to be noise and is removed from the difference image. A difference region can be removed, for example, by changing its black pixels to white pixels. This noise removal prevents erroneous judgments in cases where, for example, a black region merely lies along the outline of a character and is not falsification.
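The labeling-based noise removal can be sketched with a breadth-first flood fill. This is a minimal illustration under stated assumptions: 8-connectivity, a 0/1 list-of-rows image, and the hypothetical name `remove_small_regions`; the text does not specify the connectivity or the labeling algorithm.

```python
from collections import deque

def remove_small_regions(diff, t_n):
    """Clear every connected group of 1-pixels whose size is <= t_n."""
    h, w = len(diff), len(diff[0])
    out = [row[:] for row in diff]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if diff[sy][sx] != 1 or seen[sy][sx]:
                continue
            # Collect one connected region (one "label") by flood fill.
            region = []
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and diff[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            if len(region) <= t_n:
                for x, y in region:
                    out[y][x] = 0  # black pixel -> white pixel
    return out

# Example: the isolated pixel is removed, the 2 x 2 block survives.
cleaned = remove_small_regions([[1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], 1)
```

Whatever remains set after this pass would be reported as falsified in step S53.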

ノイズ除去処理（Ｓ５２）終了後，依然として差分領域として残った領域を改ざん検出部２５は，その差分領域の部分は改ざんであると判定する（Ｓ５３）。 After the noise removal process (S52), the alteration detection unit 25 determines that the area remaining as the difference area has been altered (S53).

As shown in FIGS. 45 and 46, the falsification detection unit 25 judges two difference regions to be falsified: the "11" written in the "Student ID number" field and the "A" written in the "Grade" field for "Psychology". In other words, "11" and "A" are the detected falsified regions (difference regions).

This concludes the description of the series of operations of the document input unit 20. According to the first embodiment, only the character areas of the document image 11 are cut out rather than the entire image area, and an image in which the character areas are gathered together is embedded in the document image 11 as watermark information, so the amount of falsification detection data Data (watermark information) can be reduced. As a result, embedding the falsification detection data Data in the document image and extracting the embedded data become faster and more efficient, and highly accurate falsification detection can be performed automatically with a small amount of watermark information.

(Second embodiment)
The first embodiment described the case in which all character areas in the document image 11 are reduced at the same reduction ratio. The second embodiment describes the case in which the document image 111 contains a plurality of character areas and the importance of the written content differs from one character area to another.

In the second embodiment, for example, character areas containing items of high importance are reduced less, so that the presence of falsification is judged in finer detail, while character areas containing items of low importance are reduced more, so that the presence of falsification is judged only roughly.

Next, the document output unit 110 of the falsification detection apparatus according to the second embodiment is described with reference to FIG. 47. FIG. 47 is a block diagram showing an example of the schematic configuration of the document output unit according to the second embodiment. The falsification detection apparatus includes the document output unit 110 and/or the document input unit 120.

The description of the document output unit 110 according to the second embodiment focuses on the points in which it differs from the document output unit 10 according to the first embodiment. The other points have substantially the same configuration, so their detailed description is omitted.

As shown in FIG. 47, the document output unit 110 according to the second embodiment differs from the document output unit 10 according to the first embodiment in two respects: a reduction ratio determination unit 115 is added, and a second document feature data conversion unit (image processing unit) 116 is provided in place of the document feature data conversion unit 12 of the document output unit 10.

上記縮小率判定部１１５は，各文字領域の重要度を参照し，文字領域から縮小文字画像に縮小する際の縮小率を決定する。 The reduction rate determination unit 115 refers to the importance of each character area and determines a reduction rate when reducing the character area to a reduced character image.

Next, the document input unit 120 of the falsification detection apparatus according to the second embodiment is described with reference to FIG. 48. FIG. 48 is a block diagram showing an example of the schematic configuration of the document input unit according to the second embodiment. The falsification detection apparatus includes at least one of the document output unit 110 and the document input unit 120.

The description of the document input unit 120 according to the second embodiment focuses on the points in which it differs from the document input unit 20 according to the first embodiment. The other points have substantially the same configuration, so their detailed description is omitted.

As shown in FIG. 48, the document input unit 120 according to the second embodiment differs from the document input unit 20 according to the first embodiment in two respects: a second pseudo document image generation unit 127 is provided in place of the pseudo document image generation unit 23, and a second input image deformation unit (image deformation unit) 228 is provided in place of the input image deformation unit 24.

(Operation of the second document feature data conversion unit 116)
Next, the series of operations of the second document feature data conversion unit 116 is described with reference to FIG. 49. FIG. 49 is a flowchart showing an outline of the processing of the second document feature data conversion unit 116 according to the second embodiment.

As shown in FIG. 49, the second document feature data conversion unit 116 first extracts character areas from the document image 111 (S131). The character area extraction processing (S131) of the second embodiment is substantially the same as the character area extraction processing (S31) of the first embodiment, so its detailed description is omitted.

次に，第２の文書特徴データ化部１１６は，文書画像１１１に予め設定されている記載事項ブロック別の重要度を参照し，その記載事項ブロックの重要度に対応するように各文字領域の縮小率を設定する（Ｓ１３２）。 Next, the second document feature data conversion unit 116 refers to the importance level for each description item block set in advance in the document image 111, and stores each character area so as to correspond to the importance level of the description item block. A reduction ratio is set (S132).

ここで，図５０を参照しながら，第２の実施の形態にかかる記載事項ブロック別の重要度について説明する。図５０は，第２の実施の形態にかかる記載事項ブロックの重要度の概略的な構成の一例を示す説明図である。 Here, with reference to FIG. 50, the importance of each description item block according to the second embodiment will be described. FIG. 50 is an explanatory diagram illustrating an example of a schematic configuration of importance levels of the description item blocks according to the second embodiment.

As shown in FIG. 50, one or more description item blocks 300 (description item blocks 300a to 300c) are set in the document image 111. The position and size of each description item block 300, and then its importance, are set in advance, before the watermark information is embedded, for example by the user who creates the document image 111.

In the document image 111 shown in FIG. 50, description item blocks 300a, 300b, and 300c are set. The importance of description item blocks 300a and 300c is set to "medium", and the importance of description item block 300b is set to "high". Areas in which no description item block 300 is set are regarded as having low importance, so falsification detection is not required for them.

The second embodiment has been described using an example in which the importance has two levels, "medium" and "high", but it is not limited to this example. For example, it can also be implemented with importance levels that increase in the order "medium", "high", "highest".

In the document image 111 shown in FIG. 50, among the character areas extracted in S131, the character areas belonging to description item blocks 300a and 300c are reduced more strongly (for example, to 1/4), and the character areas belonging to description item block 300b are reduced less strongly (for example, to 1/2).
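The relationship between importance and reduction can be sketched as a lookup table. The labels "medium" and "high" and the factors 4 and 2 come from the example values in the text; the table name, the function name, and the round-up rule are assumptions for illustration.

```python
# Example values from the text: "medium" -> reduce to 1/4, "high" -> reduce to 1/2.
REDUCTION_FACTOR = {"medium": 4, "high": 2}

def reduced_size(width, height, importance):
    """Return the size of a character area after reduction.

    Returns None for areas that belong to no description item block,
    which the text excludes from falsification detection.
    """
    factor = REDUCTION_FACTOR.get(importance)
    if factor is None:
        return None
    # Ceiling division so that a small area never collapses to zero pixels.
    return (-(-width // factor), -(-height // factor))

# Example: the same 100 x 40 area under the two importance levels.
medium_size = reduced_size(100, 40, "medium")
high_size = reduced_size(100, 40, "high")
```

A lower factor keeps more pixels per character, which is why the more important block can be checked in finer detail at the cost of more watermark data.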

The second embodiment has been described using an example in which the importance of each description item block 300 is set in advance, before the watermark information is embedded, but it is not limited to this example. For example, the falsification detection apparatus may include a keyword DB (database), perform character recognition each time a character area in a description item block 300 is extracted, and dynamically set the importance to "high" or "highest" when a character in the character area matches a keyword stored in the keyword DB.

After executing the character area extraction processing (S131) and referring to the importance set for each description item block 300, the second document feature data conversion unit 116 assigns an area number to each character area.

第２の文書特徴データ化部１１６が文字領域に番号を割振った結果の一例を，図５１に示す。図５１は，第２の実施の形態にかかる第２の文書特徴データ化部１１６による番号割振り処理の結果の一例を示す説明図である。 FIG. 51 shows an example of the result of assigning numbers to character areas by the second document feature data converting unit 116. FIG. 51 is an explanatory diagram showing an example of the result of the number allocation process by the second document feature data converting unit 116 according to the second embodiment.

図５１に示すように文字領域に番号が割り振られる対象としては，例えば，記載事項ブロック３００が設定され，かつその記載事項ブロック３００の重要度が「中」以上の記載事項ブロック３００に属する文字領域に対して領域の番号が割り振られるが，かかる例に限定されない。 As shown in FIG. 51, for example, a description item block 300 is set as a target to which a number is assigned to a character region, and a character region belonging to the description item block 300 whose importance level is “medium” or higher. An area number is assigned to each, but the present invention is not limited to such an example.

As shown in FIG. 51, the first priority in assigning numbers to character areas is the description item block 300 to which the area belongs (earlier blocks take precedence). Therefore, as shown in FIGS. 50 and 51, numbers are assigned first to the character areas belonging to description item block 300a, next to those belonging to description item block 300b, and finally to those belonging to description item block 300c.

次の優先順位としては，文字領域の矩形の左上頂点のｙ座標値を第２優先とする。なお，文書画像１１１の矩形の左上頂点を基準座標（０，０）とする。したがって，記載事項ブロック３００に属する文字領域のうち矩形の左上頂点のｙ座標値が基準座標（０，０）に近い文字領域から番号が昇順に割り振られる。 As the next priority, the y-coordinate value of the upper left vertex of the rectangle of the character area is given the second priority. It is assumed that the upper left vertex of the rectangle of the document image 111 is the reference coordinate (0, 0). Accordingly, among the character areas belonging to the entry block 300, numbers are assigned in ascending order from the character area in which the y-coordinate value of the upper left vertex of the rectangle is close to the reference coordinates (0, 0).

The third priority in assigning numbers to character areas is the x coordinate. Therefore, when character areas belonging to a description item block 300 have the same y coordinate, which is the second priority, numbers are assigned in ascending order starting from the area whose x coordinate is closest to the reference coordinates (0, 0).
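The three-level priority described above amounts to sorting on a composite key. In the sketch below, the dictionary fields (`block` as an integer block order, `x` and `y` as the top-left vertex) and the numbering from 1 are assumptions for illustration, not details given in the text.

```python
def number_regions(regions):
    """Assign area numbers by (block order, top-left y, top-left x).

    Each region is a dict with hypothetical keys 'block', 'x', and 'y'.
    Returns new dicts with an added 'number' key, numbered from 1.
    """
    ordered = sorted(regions, key=lambda r: (r["block"], r["y"], r["x"]))
    return [dict(r, number=i) for i, r in enumerate(ordered, start=1)]

# Example: two areas on the same text line in block 0, one area in block 1.
numbered = number_regions([
    {"name": "c", "block": 1, "x": 0, "y": 0},
    {"name": "b", "block": 0, "x": 50, "y": 10},
    {"name": "a", "block": 0, "x": 5, "y": 10},
])
order = [r["name"] for r in numbered]
```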

以上，番号の割り振り優先順位に従って，第２の文書特徴データ化部１１６が記載事項ブロック３００に属する文字領域に対して番号を割り振ると，図５１に図示の通りとなる。 As described above, when the second document feature data conversion unit 116 assigns numbers to the character areas belonging to the description item block 300 in accordance with the number assignment priority order, the result is as shown in FIG.

As shown in FIG. 51, areas 2 to 5 are character areas belonging to description item block 300a, so the second document feature data conversion unit 116 sets their reduction ratio to "4" (reduction to 1/4). Areas 8 and 9 belong to description item block 300c, so their reduction ratio is also set to "4" (reduction to 1/4). Finally, areas 10 and 11 belong to description item block 300b, so their reduction ratio is set to "2" (reduction to 1/2).

各文字領域に対する縮小率が決定すると（Ｓ１３２），第２の文書特徴データ化部１１６は，文書画像１１１から文字領域を切り出し，その文字領域に対応する設定済の縮小率で上記文字領域を縮小する（Ｓ１３２）。 When the reduction ratio for each character area is determined (S132), the second document feature data converting unit 116 cuts out the character area from the document image 111, and reduces the character area at a set reduction ratio corresponding to the character area. (S132).

図５２を参照しながら，第２の文書特徴データ化部１１６による文字領域の切り出し処理，文字領域の縮小処理について具体的に説明する。図５２は，文字領域の切り出し及び縮小処理（Ｓ１３２）の概略の一例を示す説明図である。 With reference to FIG. 52, the character region segmentation processing and the character region reduction processing by the second document feature data converting unit 116 will be specifically described. FIG. 52 is an explanatory diagram showing an example of an outline of the character region cutout and reduction processing (S132).

図５２に示すように，記載事項ブロック３００ａに属す領域２は，縮小率が“４”であるため，上記領域２を切り出した後，第２の文書特徴データ化部１１６は，当該領域２を１／４に縮小している（Ｓ１３２）。なお，記載事項ブロック３００ａに属すその他の領域（領域３〜領域５）についても領域２と同様に処理される。 As shown in FIG. 52, area 2, which belongs to the description item block 300a, has a reduction ratio of “4”. The second document feature data conversion unit 116 therefore cuts out area 2 and then reduces it to 1/4 (S132). The other areas belonging to the description item block 300a (areas 3 to 5) are processed in the same way as area 2.

また，図５２に示すように，記載事項ブロック３００ｂに属す領域１０，領域１１は，縮小率が“２”であるため，上記領域１０，領域１１を切り出した後，第２の文書特徴データ化部１１６は，当該領域１０，領域１１を１／２に縮小している（Ｓ１３２）。なお，縮小された文字領域を縮小文字領域とする。 Also as shown in FIG. 52, areas 10 and 11, which belong to the description item block 300b, have a reduction ratio of “2”. The second document feature data conversion unit 116 therefore cuts out areas 10 and 11 and then reduces them to 1/2 (S132). A character area that has been reduced is hereinafter called a reduced character area.
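
The cut-out and reduction step (S132) can be sketched roughly as follows. This is a minimal Python illustration, not the patent's implementation. It assumes the image is a list of rows of 0/1 pixels (1 = black) and that boxes are (left, top, right, bottom) with exclusive right and bottom edges. The text does not specify the resampling method, so the rule in `reduce_region` (an output pixel is black if any source pixel in its cell is black) is an assumption, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # 1 = black pixel, 0 = white pixel
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the character area described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def reduce_region(region: Image, ratio: int) -> Image:
    """Shrink a region to 1/ratio in both directions.

    Assumed rule: an output pixel is black when any pixel in the
    corresponding ratio x ratio cell of the input is black.
    """
    height, width = len(region), len(region[0])
    reduced = []
    for y in range(0, height, ratio):
        out_row = []
        for x in range(0, width, ratio):
            cell = [region[yy][xx]
                    for yy in range(y, min(y + ratio, height))
                    for xx in range(x, min(x + ratio, width))]
            out_row.append(1 if any(cell) else 0)
        reduced.append(out_row)
    return reduced


def cut_and_reduce(image: Image,
                   regions: Dict[int, Tuple[Box, int]]) -> Dict[int, Image]:
    """Map each area number to its reduced character area.

    regions maps an area number to (box, reduction ratio),
    e.g. ratio 4 for area 2 and ratio 2 for area 10.
    """
    return {number: reduce_region(crop(image, box), ratio)
            for number, (box, ratio) in regions.items()}
```

With ratio 4 an 8 x 4 pixel area becomes 2 x 1 pixels, while with ratio 2 the same area becomes 4 x 2 pixels, which is why the less reduced areas keep more detail.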

次に，第２の文書特徴データ化部１１６は，上記縮小文字領域を組合せるように縮小文字領域を各々配置し，１つの画像（合成画像）を生成する。なお，第１の実施の形態にかかる合成画像を生成する処理（Ｓ３４）と第２の実施の形態にかかる合成画像を生成する処理（Ｓ１３４）は実質的に同じであるため詳細な処理説明は省略する。 Next, the second document feature data conversion unit 116 arranges the reduced character areas so that they fit together and generates a single image (composite image). The process for generating a composite image in the first embodiment (S34) and the corresponding process in the second embodiment (S134) are substantially the same, so a detailed description is omitted.

ここで，図５３を参照しながら，第２の実施の形態にかかる合成画像について説明する。図５３は，第２の実施の形態にかかる合成画像の概略的な構成の一例を示す説明図である。 Here, a synthesized image according to the second embodiment will be described with reference to FIG. FIG. 53 is an explanatory diagram illustrating an example of a schematic configuration of a composite image according to the second embodiment.

図５３に示すように，合成画像には，縮小率が異なる１又は２以上の縮小文字領域が隙間のないように埋め込まれている。なお，図５３に示す合成画像は，図１０に示す合成画像と同様に，領域番号の小さい順に縮小文字領域が縦方向（垂直方向）に配置されることで生成される。図１０に示す合成画像と図５３に示す合成画像の相違点は，図５３に示す合成画像は縮小率の異なる縮小文字領域が混在する点で相違する。その他については，図１０に示す合成画像と実質的に同様である。 As shown in FIG. 53, one or more reduced character areas with different reduction ratios are packed into the composite image without gaps. Like the composite image shown in FIG. 10, the composite image shown in FIG. 53 is generated by arranging the reduced character areas in the vertical direction in ascending order of area number. The composite image shown in FIG. 53 differs from that of FIG. 10 only in that reduced character areas with different reduction ratios are mixed; in all other respects the two are substantially the same.
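
The vertical arrangement described above might look like the following Python sketch. It is an illustration under assumptions, not the method of S134 itself, whose details are deferred to the first embodiment: reduced areas are lists of rows of 0/1 pixels, they are stacked top to bottom in ascending order of area number, and narrower areas are padded on the right with background pixels so that every row has the same width. The padding rule and the function name `compose_vertically` are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def compose_vertically(reduced: Dict[int, Image],
                       background: int = 0) -> Tuple[Image, Dict[int, int]]:
    """Stack reduced character areas vertically in ascending area number.

    Returns the composite image and, for each area number, the row
    offset at which that area starts inside the composite.
    """
    width = max(len(area[0]) for area in reduced.values())
    composite: Image = []
    offsets: Dict[int, int] = {}
    for number in sorted(reduced):
        offsets[number] = len(composite)
        for row in reduced[number]:
            # Pad narrower areas so the composite stays rectangular (assumption).
            composite.append(row + [background] * (width - len(row)))
    return composite, offsets
```

The returned offsets are one way a reader of the composite could locate each area again; the patent instead relies on the position information and reduction ratios stored in the falsification detection data.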

（透かし情報合成部１１３の処理について） (Processing of the watermark information synthesis unit 113)
透かし情報合成部（透かし情報埋め込み部）１１３では改ざん検出用のデータ（改ざん検出データＤａｔａ）を文書画像１１１に透かし情報として埋め込む。なお，第２の実施の形態にかかる改ざん検出データＤａｔａとは以下に示すデータであるとする。 The watermark information synthesis unit (watermark information embedding unit) 113 embeds data for falsification detection (falsification detection data Data) in the document image 111 as watermark information. The falsification detection data Data according to the second embodiment consists of the following data.

Ｄａｔａ１００：文書画像１１１の大きさ，データサイズなどのヘッダ情報 Data100: Header information such as the size of the document image 111 and the data size
Ｄａｔａ１０１：合成画像の画像データあるいは合成画像を圧縮した画像データ Data101: Image data of the composite image, or image data obtained by compressing the composite image
Ｄａｔａ１０２：文字領域の位置情報（文字領域抽出処理（Ｓ１３１）により抽出したもの）および縮小率（又は，縮小レベル） Data102: Position information of the character areas (as extracted by the character area extraction process (S131)) and their reduction ratios (or reduction levels)
Ｄａｔａ１０３：合成画像の作成方法ＩＤ（合成画像の作成方法を文書画像１１１ごとに変更する場合） Data103: ID of the composite image creation method (when the creation method is changed for each document image 111)

上記改ざん検出データＤａｔａ１０２は，図５４に示すようなテーブル化されたデータ等を例示することができる。図５４は，第２の実施の形態にかかる改ざん検出データＤａｔａ１０２の概略的な構成を示す説明図である。 The falsification detection data Data102 can be exemplified by tabulated data as shown in FIG. FIG. 54 is an explanatory diagram illustrating a schematic configuration of the falsification detection data Data 102 according to the second embodiment.

図５４に示すように，例えば，「領域番号」が“１”など，「縮小レベル」が“０”の文字領域の場合，上記説明したように重要度が低い記載事項ブロック３００に属する文字領域であるため，改ざん検出データＤａｔａ１０１には，当該文字領域の画像データが記録されていない。 As shown in FIG. 54, a character area whose “reduction level” is “0”, such as the one whose “area number” is “1”, belongs to a description item block 300 of low importance, as described above. The image data of such a character area is therefore not recorded in the falsification detection data Data101.

また一方で，図５４に示すように，例えば，「領域番号」が“Ｎ”など，「縮小レベル」が“１”（例えば，縮小率が“２”等）の文字領域の場合，重要度が高い記載事項ブロック３００に属する文字領域であるため，改ざん検出データＤａｔａ１０１に当該文字領域の画像データが記録される。 On the other hand, as shown in FIG. 54, a character area whose “reduction level” is “1” (for example, a reduction ratio of “2”), such as the one whose “area number” is “N”, belongs to a description item block 300 of high importance. The image data of such a character area is therefore recorded in the falsification detection data Data101.

同様に，図５４に示すように，例えば，「領域番号」が“２”など，「縮小レベル」が“２”（例えば，縮小率が“４”等）の文字領域の場合，重要度が中程度の記載事項ブロック３００に属する文字領域であるため，改ざん検出データＤａｔａ１０１に当該文字領域の画像データは記録される。 Similarly, as shown in FIG. 54, a character area whose “reduction level” is “2” (for example, a reduction ratio of “4”), such as the one whose “area number” is “2”, belongs to a description item block 300 of medium importance. The image data of such a character area is therefore also recorded in the falsification detection data Data101.
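
A table such as the one in FIG. 54 could be modelled as in the following Python sketch. The patent names only the “area number” and “reduction level” columns; the position fields, the class name `RegionEntry`, and the helper `recorded_entries` are hypothetical. The level-to-ratio mapping follows the examples given in the text (level 0: not recorded, level 1: ratio 2, level 2: ratio 4).

```python
from dataclasses import dataclass
from typing import List, Optional

# Reduction level -> reduction ratio, following the examples in the text.
# None means the area is not recorded in the composite image (Data101).
LEVEL_TO_RATIO = {0: None, 1: 2, 2: 4}


@dataclass(frozen=True)
class RegionEntry:
    """One row of a Data102-style table (field layout is an assumption)."""
    number: int            # area number
    left: int
    top: int
    right: int
    bottom: int
    reduction_level: int   # 0, 1 or 2

    @property
    def ratio(self) -> Optional[int]:
        """Reduction ratio for this area, or None if it is not recorded."""
        return LEVEL_TO_RATIO[self.reduction_level]


def recorded_entries(table: List[RegionEntry]) -> List[RegionEntry]:
    """Return only the areas whose image data is stored in Data101."""
    return [entry for entry in table if entry.ratio is not None]
```

Keeping the level rather than the ratio in each row mirrors the table described above and lets both sides change the mapping without changing the stored data, provided they agree on it.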

なお，上記記載事項ブロック３００の重要度と縮小率との対応関係が，文書出力部１１０と文書入力部１２０の双方で共通であれば，改ざん検出データＤａｔａ１０２を，第１の実施の形態にかかる改ざん検出データＤａｔａ２と同じデータ構造に変更し，以下に示す改ざん検出データＤａｔａ１０４をさらに追加することも可能である。 If the correspondence between the importance of a description item block 300 and the reduction ratio is shared by the document output unit 110 and the document input unit 120, the falsification detection data Data102 may instead be given the same data structure as the falsification detection data Data2 of the first embodiment, and the falsification detection data Data104 shown below may be added.

Ｄａｔａ１０４：記載事項ブロック３００の位置・大きさ等の情報及び重要度 Data 104: Information such as the position / size of the entry block 300 and the importance level

ここで，図５５を参照しながら，上記改ざん検出データＤａｔａ１０４について説明する。図５５は，第２の実施の形態にかかる改ざん検出データＤａｔａ１０４の概略的な構成の一例を示す説明図である。 Here, the falsification detection data Data104 will be described with reference to FIG. FIG. 55 is an explanatory diagram showing an example of a schematic configuration of the falsification detection data Data 104 according to the second embodiment.

図５５に示すように，改ざん検出データＤａｔａ１０４は，記載事項ブロック３００の番号を示す「記載事項」と，記載事項ブロック３００の位置に関する情報である「Ｌｅｆｔ」と，「Ｔｏｐ」と，「Ｒｉｇｈｔ」と，「Ｂｏｔｔｏｍ」と，記載事項ブロック３００に属す文字領域の重要度を表わすための「重要度」とから構成される。 As shown in FIG. 55, the falsification detection data Data 104 includes “description item” indicating the number of the descriptive item block 300, “Left”, “Top”, and “Right”, which are information regarding the position of the descriptive item block 300. And “Bottom”, and “Importance” for indicating the importance of the character area belonging to the entry block 300.

なお，第２の実施の形態にかかる透かし情報合成部１１３による文書画像１１１に改ざん検出データＤａｔａを埋め込む等の処理については，第１の実施の形態にかかる透かし情報合成部１３による埋め込み処理とほぼ同様の構成であるため詳細な説明は省略する。 The processing by which the watermark information synthesis unit 113 according to the second embodiment embeds the falsification detection data Data in the document image 111 is almost the same as the embedding processing by the watermark information synthesis unit 13 according to the first embodiment, so a detailed description is omitted.

（文書入力部１２０の動作について） (Operation of the document input unit 120)
以降の説明では，出力文書画像１１４は紙等の印刷媒体に印刷された後に配布されるものとし，入力文書画像１２１はその出力文書画像１１４の透かし入りの印刷書面をスキャナ等の入力デバイスにより画像化したものとするが，かかる例に限定されず，出力文書画像１１４は印刷されないディジタルデータそのままの場合であってもよい。 In the following description, the output document image 114 is assumed to be distributed after being printed on a print medium such as paper, and the input document image 121 is assumed to be an image obtained by capturing the watermarked printed sheet of the output document image 114 with an input device such as a scanner. The invention is not limited to this example, however, and the output document image 114 may remain unprinted digital data.

また，第２の実施の形態にかかる文書出力部１１０において文書画像１１１に埋め込まれる改ざん検出データＤａｔａとして改ざん検出データＤａｔａ１０２を用いる場合について説明する。 A case will be described in which the falsification detection data Data102 is used as the falsification detection data Data embedded in the document image 111 in the document output unit 110 according to the second embodiment.

なお，第２の実施の形態にかかる文書入力部１２０の処理の説明は，第１の実施の形態にかかる文書入力部２０の処理と相違する点について説明し，その他の点については，ほぼ同様な構成であるため省略する。 The description of the processing of the document input unit 120 according to the second embodiment covers only the points that differ from the processing of the document input unit 20 according to the first embodiment; the remaining points are omitted because the configuration is almost the same.

（第２の擬似文書画像生成部１２７の処理について） (Processing of the second pseudo document image generation unit 127)
第２の擬似文書画像部１２７の動作は，第１の実施の形態にかかる擬似文書画像部２３の動作と実質的に同一であるが，第２の実施の形態では文字領域ごとに縮小率が異なるため，拡大率もそれに応じて文字領域ごとに変更する。なお，合成画像として改ざん検出データＤａｔａ１０１に記録されていない文字領域の画像データは，擬似文書画像にも反映されない。 The operation of the second pseudo document image unit 127 is substantially the same as that of the pseudo document image unit 23 according to the first embodiment. In the second embodiment, however, the reduction ratio differs for each character area, so the enlargement ratio is also changed for each character area accordingly. Image data of a character area that is not recorded in the falsification detection data Data101 as part of the composite image is not reflected in the pseudo document image either.

図５６に示すように，第２の擬似文書画像部１２７によって生成された擬似文書画像は，必ずしも文書画像１１１と完全一致ではなく，矢印に示すように，改ざん検出データＤａｔａ１０１に記録されていない文字領域は重要度が低いため，そのような文字領域についてはそもそも擬似文書画像に配置されない。 As shown in FIG. 56, the pseudo document image generated by the second pseudo document image unit 127 does not necessarily match the document image 111 exactly. As indicated by the arrows, character areas that are not recorded in the falsification detection data Data101 are of low importance, and such character areas are not placed in the pseudo document image in the first place.

したがって，擬似文書画像に配置される文字領域は，例えば重要度が「中」以上のものについては，第２の擬似文書画像生成部１２７によって配置されて擬似文書画像が生成される。 Accordingly, only character areas whose importance is, for example, “medium” or higher are placed by the second pseudo document image generation unit 127 when it generates the pseudo document image.

（第２の入力画像変形部１２８の動作について） (Operation of the second input image transformation unit 128)
第２の入力画像変形部１２８は，第１の実施の形態にかかる入力画像変形部２４で説明したように，二値化済みの入力文書画像１２１に対して，改ざん検出データＤａｔａ１０２に含む文字領域の位置を示す座標データを参照し，その位置・大きさに該当する文字領域を切り出した後，さらに改ざん検出データＤａｔａ１０２に含む「縮小レベル」を参照し，一旦その縮小率で文字領域を縮小し，縮小率の逆数で拡大を行った後，二値化済みの入力文書画像１２１の元の位置に配置する。 As described for the input image transformation unit 24 according to the first embodiment, the second input image transformation unit 128 operates on the binarized input document image 121. It refers to the coordinate data in the falsification detection data Data102 that indicates the position of each character area and cuts out the character area of that position and size. It then refers to the “reduction level” in the falsification detection data Data102, reduces the character area once at the corresponding reduction ratio, enlarges it by the reciprocal of that ratio, and places it back at its original position in the binarized input document image 121.

ここで，図５７を参照しながら，第２の入力画像変形部１２８の動作について説明する。なお，図５７は，第２の入力画像変形部１２８の動作の概略を示す説明図である。 Here, the operation of the second input image transformation unit 128 will be described with reference to FIG. FIG. 57 is an explanatory diagram showing an outline of the operation of the second input image deformation unit 128.

図５７に示すように，第２の入力画像変形部１２８は，入力文書画像１２１における領域２の文字領域を，改ざん検出データＤａｔａ１０２の位置情報（座標値）に従って切り出す（Ｓ１３１）。なお，その他の文字領域についても上記と同様に処理される。 As shown in FIG. 57, the second input image transformation unit 128 cuts out the character area of area 2 in the input document image 121 according to the position information (coordinate values) of the falsification detection data Data102 (S131). The other character areas are processed in the same manner as described above.

次に，第２の入力画像変形部１２８は，切り出された領域２に対応する改ざん検出データＤａｔａ１０２の「縮小レベル」を参照し，その縮小レベルが“２”であるため，上記領域２の文字領域を１／２に縮小する（Ｓ１３２）。なお，その他の文字領域についても上記と同様に処理される。 Next, the second input image transformation unit 128 refers to the “reduction level” of the falsification detection data Data 102 corresponding to the cut-out area 2 and the reduction level is “2”. The area is reduced to ½ (S132). The other character areas are processed in the same manner as described above.

次に，第２の入力画像変形部１２８は，縮小率“１／２”の逆数である“２”倍だけ，先程Ｓ１３２で縮小された領域２の文字領域を拡大処理する（Ｓ１３３）。 Next, the second input image deforming unit 128 enlarges the character area in the area 2 reduced in S132 by “2” times the reciprocal of the reduction ratio “1/2” (S133).

次に，第２の入力画像変形部１２８は，上記拡大処理により拡大された領域２の文字領域を入力文書画像１２１の元の場所に配置する（Ｓ１３４）。なお，改ざん検出データＤａｔａ１０２に含まれる「縮小レベル」が“０”の文字領域については，第２の入力画像変形部１２８は，当該文字領域を背景色（図５７の場合では白色）で塗りつぶす。 Next, the second input image transformation unit 128 arranges the character area of the area 2 enlarged by the enlargement process at the original location of the input document image 121 (S134). It should be noted that for the character area having “0” as the “reduction level” included in the alteration detection data Data 102, the second input image transformation unit 128 fills the character area with the background color (white in the case of FIG. 57).
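
The cut-out, reduce, enlarge, and paste-back sequence (S131 to S134) can be illustrated with the Python sketch below. It is a simplified illustration under assumptions rather than the patent's implementation: images are lists of rows of 0/1 pixels, reduction marks an output pixel black if any source pixel in its cell is black, enlargement is nearest-neighbour, and an area whose ratio is `None` (reduction level “0”) is filled with the background colour. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]          # 1 = black pixel, 0 = white pixel
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def reduce_region(region: Image, ratio: int) -> Image:
    """Shrink to 1/ratio; a cell is black if any of its pixels is black."""
    height, width = len(region), len(region[0])
    return [[1 if any(region[yy][xx]
                      for yy in range(y, min(y + ratio, height))
                      for xx in range(x, min(x + ratio, width))) else 0
             for x in range(0, width, ratio)]
            for y in range(0, height, ratio)]


def enlarge_region(region: Image, ratio: int, height: int, width: int) -> Image:
    """Nearest-neighbour enlargement back to the original height x width."""
    return [[region[y // ratio][x // ratio] for x in range(width)]
            for y in range(height)]


def deform_input(image: Image,
                 entries: List[Tuple[Box, Optional[int]]],
                 background: int = 0) -> Image:
    """Apply the reduce-then-enlarge round trip to each character area.

    entries holds (box, ratio); ratio None corresponds to reduction
    level 0, in which case the area is painted with the background.
    """
    result = [row[:] for row in image]
    for (left, top, right, bottom), ratio in entries:
        region = [row[left:right] for row in image[top:bottom]]   # cut out
        if ratio is None:
            restored = [[background] * (right - left)
                        for _ in range(bottom - top)]
        else:
            restored = enlarge_region(reduce_region(region, ratio), ratio,
                                      bottom - top, right - left)
        for dy, row in enumerate(restored):                        # paste back
            result[top + dy][left:right] = row
    return result
```

After this round trip the input image carries the same kind of blur as the pseudo document image rebuilt from the watermark, so the two can be compared on equal terms.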

なお，Ｓ１３４において，第２の入力画像変形部１２８は，改ざん検出データＤａｔａ１０２に含まれる「縮小レベル」が“０”の文字領域については背景色で塗りつぶす場合を例に挙げて説明したが，かかる例に限定されず，例えば，第２の入力画像変形部１２８は，予め入力文書画像１２１と同一サイズからなる背景画像（例えば白画素で構成された画像など）を生成し，その背景画像に上記拡大処理により拡大された領域２の文字領域を入力文書画像１２１の元の場所と同じ位置に貼り付ける場合等でも実施可能である。 In S134, the second input image transformation unit 128 has been described by taking as an example the case where the character area with the “reduction level” of “0” included in the falsification detection data Data102 is filled with the background color. For example, the second input image deforming unit 128 generates a background image (for example, an image composed of white pixels) having the same size as the input document image 121 in advance, and the background image includes the above-described background image. The present invention can be implemented even when the character area of the area 2 enlarged by the enlargement process is pasted at the same position as the original location of the input document image 121.

また，文書出力部１１０で生成する改ざん検出データＤａｔａ１０４を用いる場合，第２の入力画像変形部１２８は，改ざん検出データＤａｔａ２に含まれる各文字領域の位置情報（座標値など）が改ざん検出データＤａｔａ１０４のどの記載事項ブロック３００に属するかを判定し，その記載事項ブロック３００の重要度を参照し，縮小率および拡大率を決定する。 When the falsification detection data Data104 generated by the document output unit 110 is used, the second input image transformation unit 128 determines, from the position information (coordinate values and the like) of each character area contained in the falsification detection data Data2, which description item block 300 in the falsification detection data Data104 the character area belongs to. It then refers to the importance of that description item block 300 to determine the reduction ratio and the enlargement ratio.
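
The lookup described above might be sketched as follows in Python. The field names follow the columns of FIG. 55 (“Left”, “Top”, “Right”, “Bottom”, importance), but the string labels for importance, the containment test, and the function names are assumptions made for illustration. The importance-to-ratio mapping follows the examples in the text: high importance uses ratio 2, medium uses ratio 4, and low-importance areas are not recorded.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)

# Assumed shared mapping between the document output and input units.
IMPORTANCE_TO_RATIO = {"high": 2, "medium": 4, "low": None}


@dataclass(frozen=True)
class ItemBlock:
    """One row of a Data104-style table (importance labels are assumed)."""
    item: int
    left: int
    top: int
    right: int
    bottom: int
    importance: str


def find_block(region: Box, blocks: List[ItemBlock]) -> Optional[ItemBlock]:
    """Return the first block that fully contains the character area."""
    left, top, right, bottom = region
    for block in blocks:
        if (block.left <= left and block.top <= top
                and right <= block.right and bottom <= block.bottom):
            return block
    return None


def ratio_for_region(region: Box, blocks: List[ItemBlock]) -> Optional[int]:
    """Reduction ratio for the area; the enlargement factor is the same number.

    Returns None when the area lies in no block or in a low-importance block.
    """
    block = find_block(region, blocks)
    if block is None:
        return None
    return IMPORTANCE_TO_RATIO[block.importance]
```

For example, an area inside a block marked "high" is reduced to 1/2 and then enlarged two times, while an area inside a "medium" block is reduced to 1/4 and enlarged four times.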

第２の入力画像変形部１２８で入力文書画像１２１の文字領域を別々に縮小および拡大を行うのは，第２の実施の形態にかかる擬似文書画像の各文字領域が変形された状態を再現することで，改ざん検出の性能を向上させるためである。変形された状態とは，例えば，文字のつぶれ具合などを例示することができる。 The reason why the character area of the input document image 121 is separately reduced and enlarged by the second input image deformation unit 128 is to reproduce the state in which each character area of the pseudo document image according to the second embodiment is deformed. This is to improve the performance of tamper detection. The deformed state can be exemplified by, for example, how the characters are crushed.

また，各文字領域を例えば重要度などに応じて縮小率を変更すれば，重要な文字領域については詳細に改ざんされたか否かを検査することができ，重要度が低い文字領域については大まかな改ざん検出処理を施すことで改ざん検出処理の高速化を図ることができる。具体的には，１／４に縮小した入力文書画像１２１を４倍に拡大した画像よりも，１／２分に縮小した入力文書画像１２１を２倍に拡大した画像のほうが画像劣化が少ないため，詳細に改ざん検出処理を実行することができる。 Furthermore, if the reduction ratio of each character area is changed according to, for example, its importance, important character areas can be inspected for tampering in detail, while character areas of low importance receive only a rough check, which speeds up the falsification detection processing. Specifically, an input document image 121 reduced to 1/2 and then enlarged two times suffers less degradation than one reduced to 1/4 and then enlarged four times, so tampering can be detected in finer detail in the former case.

以上から，第２の実施の形態にかかる改ざん検出処理では，文字領域別に縮小率を変更するため，文書が記載された内容の重要度に応じて，改ざんされたか否かを検出する検出精度を動的に又は静的に変更することができる。 From the above, in the falsification detection processing according to the second embodiment, since the reduction ratio is changed for each character area, the detection accuracy for detecting whether or not the document has been falsified is determined according to the importance of the contents described in the document. It can be changed dynamically or statically.

（第３の実施の形態について） (Third embodiment)
第１の実施の形態および第２の実施の形態では，文書画像（原画像）１１および文書画像（原画像）１１１には，文書が記載された文字領域のみが存在する場合の改ざん検出処理について説明した。第３の実施の形態では，文書画像（原画像）２１１に文書と罫線が混在する場合の改ざん検出処理について説明する。 The first and second embodiments described falsification detection processing for the case where the document image (original image) 11 and the document image (original image) 111 contain only character areas in which text is written. The third embodiment describes falsification detection processing for the case where the document image (original image) 211 contains both text and ruled lines.

文書画像に罫線からなる画像領域が存在する場合，単純に罫線を画像データとして処理すると透かし情報として文書画像に埋め込むデータ量が大きくなってしまう。また，罫線を画像データではなく，罫線の存在位置を示す位置情報を座標値などで文書画像に埋め込み管理した場合，二重線や破線など罫線の線種を位置情報に記録する必要がある。しかしながら，文書画像から線種を自動的に判定するのは複雑な処理が必要で，そのため透かし情報を生成するための処理時間等が増大してしまう。 When an image region including ruled lines exists in the document image, if the ruled lines are simply processed as image data, the amount of data embedded in the document image as watermark information increases. In addition, if the ruled line is not image data but position information indicating the position of the ruled line is embedded and managed in the document image using coordinate values or the like, the line type of the ruled line such as a double line or a broken line needs to be recorded in the position information. However, automatically determining the line type from the document image requires complicated processing, which increases the processing time for generating watermark information.

そもそも罫線は機密情報などに該当する可能性が極めて低いと考えられ，罫線に対する微細な変更が改ざんに該当することは極めてまれである。そこで，第３の実施の形態では，上記罫線に関する事由を前提とし，罫線を除去した上で文書画像に透かし情報を埋め込み，さらに透かし情報を抽出して改ざん検出する際にも，罫線を除去した上で改ざん検出を行うようにする。以下，詳細に説明する。 In the first place, ruled lines are considered very unlikely to carry confidential information, and a minute change to a ruled line very rarely amounts to tampering. On that premise, the third embodiment removes the ruled lines before embedding the watermark information in the document image, and likewise removes the ruled lines before performing falsification detection when the watermark information is extracted. Details are given below.

なお，第３の実施の形態にかかる罫線は，黒色の直線の場合を例に挙げて説明するが，かかる例に限定されず，例えば，黒色以外の赤色，緑色などの場合でもよく，直線以外の例えば破線などの場合であっても実施可能である。 The ruled lines according to the third embodiment will be described by taking the case of a black straight line as an example. However, the ruled line is not limited to this example. For example, the ruled line may be red, green, or the like other than black. Even in the case of a broken line, it can be implemented.

次に，図５８を参照しながら，第３の実施の形態にかかる改ざん検出装置に備わる文書出力部２１０について説明する。図５８は，第３の実施の形態にかかる文書出力部の概略的な構成を示すブロック図である。なお，改ざん検出装置には，文書出力部２１０及び／又は文書入力部２２０が備わる。 Next, the document output unit 210 provided in the falsification detection apparatus according to the third embodiment will be described with reference to FIG. FIG. 58 is a block diagram illustrating a schematic configuration of a document output unit according to the third embodiment. The tampering detection apparatus includes a document output unit 210 and / or a document input unit 220.

また，第３の実施の形態にかかる改ざん検出装置に備わる文書出力部２１０についての説明は，第１の実施の形態にかかる文書出力部１０との相違する点について特に詳細に説明し，その他の点については，ほぼ同様の構成のため詳細な説明は省略する。 The description of the document output unit 210 provided in the falsification detection apparatus according to the third embodiment focuses on the points that differ from the document output unit 10 according to the first embodiment; a detailed description of the remaining points is omitted because the configuration is almost the same.

図５８に示すように，第３の実施の形態にかかる文書出力部２１０は，第１の実施の形態にかかる文書出力部１０と比較して，罫線処理部（直線／破線処理部，直線／破線除去部）２１５がさらに追加されている点で異なる。上記罫線処理部２１５については後程詳述する。 As shown in FIG. 58, the document output unit 210 according to the third embodiment differs from the document output unit 10 according to the first embodiment in that a ruled line processing unit (straight line/broken line processing unit, straight line/broken line removal unit) 215 is added. The ruled line processing unit 215 is described in detail later.

次に，図５９を参照しながら，第３の実施の形態にかかる文書入力部２２０について説明する。図５９は，第３の実施の形態にかかる文書入力部２２０の概略的な構成を示すブロック図である。 Next, a document input unit 220 according to the third embodiment will be described with reference to FIG. FIG. 59 is a block diagram illustrating a schematic configuration of the document input unit 220 according to the third embodiment.

なお，第３の実施の形態にかかる文書入力部２２０についての説明は，第１の実施の形態にかかる文書入力部２０との相違する点について特に詳細に説明し，その他の点については，ほぼ同様の構成のため詳細な説明は省略する。 Note that the description of the document input unit 220 according to the third embodiment will be described in detail with respect to differences from the document input unit 20 according to the first embodiment. Detailed description is omitted because of the same configuration.

図５９に示すように，第３の実施の形態にかかる文書入力部２２０は，第１の実施の形態にかかる文書入力部２０と比較して，入力画像変形部２４の代わりに第３の入力画像変形部（画像変形部）２２７を備える点で異なる。上記第３の入力画像変形部２２７については後程詳述する。 As illustrated in FIG. 59, the document input unit 220 according to the third embodiment has a third input instead of the input image deformation unit 24 as compared with the document input unit 20 according to the first embodiment. The difference is that an image deformation unit (image deformation unit) 227 is provided. The third input image deformation unit 227 will be described in detail later.

（文書出力部２１０の動作について） (Operation of the document output unit 210)
次に，図６０を参照しながら，第３の実施の形態に係る文書出力部２１０に備わる罫線処理部２１５の動作について説明する。図６０は，第３の実施の形態にかかる罫線処理部２１５の処理の概略を示すフローチャートである。なお，第３の実施の形態にかかる文書出力部２１０の動作の説明は，第１又は第２の実施の形態にかかる文書出力部との相違点について説明する。 Next, the operation of the ruled line processing unit 215 provided in the document output unit 210 according to the third embodiment will be described with reference to FIG. 60. FIG. 60 is a flowchart outlining the processing of the ruled line processing unit 215 according to the third embodiment. The description of the operation of the document output unit 210 according to the third embodiment covers only the differences from the document output units according to the first and second embodiments.

図６０に示すように，罫線処理部２１５は，まず，文書画像２１１から罫線を抽出する罫線抽出処理（Ｓ２３１）を実行する。次に，罫線処理部２１５は，Ｓ２３１で抽出した罫線を除去する罫線除去処理（Ｓ２３２）を実行する。 As shown in FIG. 60, the ruled line processing unit 215 first executes a ruled line extraction process (S231) for extracting a ruled line from the document image 211. Next, the ruled line processing unit 215 executes ruled line removal processing (S232) for removing the ruled lines extracted in S231.

ここで，図６１を参照しながら，第３の実施の形態にかかる文書画像２１１について説明する。図６１は，第３の実施の形態にかかる文書画像２１１の概略的な構成を示す説明図である。 Here, a document image 211 according to the third embodiment will be described with reference to FIG. FIG. 61 is an explanatory diagram illustrating a schematic configuration of a document image 211 according to the third embodiment.

図６１に示すように，第３の実施の形態にかかる文書画像２１１には１又は２以上の罫線が存在している。なお，第３の実施の形態に係る文書画像２１１は，図４等に示す第１の実施の形態にかかる文書画像１１と比較して，罫線が存在する点で異なり，記載内容を含め，その他の点については同様である。 As shown in FIG. 61, the document image 211 according to the third embodiment contains one or more ruled lines. The document image 211 differs from the document image 11 according to the first embodiment, shown in FIG. 4 and elsewhere, in that ruled lines are present; in all other respects, including the written content, the two are the same.

図６１に示すように，文書画像２１１の罫線には，水平方向の罫線（水平罫線２４１）と，垂直方向の罫線（垂直罫線２４２）と２種類の罫線がある。即ち，罫線処理部２１５は，かかる２種類の罫線を抽出（Ｓ２３１）または除去（Ｓ２３２）する。 As shown in FIG. 61, the ruled lines of the document image 211 include two types of ruled lines: a horizontal ruled line (horizontal ruled line 241) and a vertical ruled line (vertical ruled line 242). That is, the ruled line processing unit 215 extracts (S231) or removes (S232) these two types of ruled lines.

（罫線抽出処理（Ｓ２３１），罫線除去処理（Ｓ２３２）について） (Ruled line extraction process (S231) and ruled line removal process (S232))
ここで，図６２を参照しながら，上記罫線抽出処理（Ｓ２３１）と罫線除去処理（Ｓ２３２）について説明する。図６２は，罫線抽出処理（Ｓ２３１）と罫線除去処理（Ｓ２３２）の概略を示すフローチャートである。 Here, the ruled line extraction process (S231) and the ruled line removal process (S232) will be described with reference to FIG. 62. FIG. 62 is a flowchart outlining the ruled line extraction process (S231) and the ruled line removal process (S232).

図６２に示すように，罫線処理部２１５は，まず文書画像２１１をラスタースキャンし黒画素を検出する（Ｓ２４１）。 As shown in FIG. 62, the ruled line processing unit 215 first raster scans the document image 211 to detect black pixels (S241).

次に，罫線処理部２１５は，Ｓ２４１で検出した黒画素の位置から水平方向に連続して存在する黒画素を検索し，その黒画素の画素数をカウントする（Ｓ２４２）。 Next, the ruled line processing unit 215 searches for black pixels continuously present in the horizontal direction from the position of the black pixel detected in S241, and counts the number of black pixels (S242).

次に，罫線処理部２１５は，上記Ｓ２４２でカウントした画素数を判定する（Ｓ２４３）。罫線処理部２１５は，上記画素数が所定値Ｌ以上であれば統合処理（Ｓ２４４）を実行し，Ｌ未満であれば，Ｓ２４２で検索した連続する黒画素領域を削除（白画素に変更）する削除処理（Ｓ２４６）を実行する。 Next, the ruled line processing unit 215 evaluates the number of pixels counted in S242 (S243). If the number of pixels is equal to or greater than a predetermined value L, the ruled line processing unit 215 executes the integration process (S244); if it is less than L, the unit executes the deletion process (S246), which deletes the run of continuous black pixels found in S242 (changes it to white pixels).

次に，罫線処理部２１５は，Ｓ２４２で検索した領域の上下の近傍にＳ２４２で検索した長さに等しい連続する黒画素領域が存在すれば，それらを同じ罫線とみなして統合する（Ｓ２４４）。 Next, if there are continuous black pixel areas equal to the length searched in S242 near the top and bottom of the area searched in S242, the ruled line processing unit 215 regards them as the same ruled line and integrates them (S244).

最後に，罫線処理部２１５は，Ｓ２４４で統合された罫線情報（例えば，長さ，幅，位置などに関する情報）を記録し，検索した罫線を削除する（Ｓ２４６）。Ｓ２４６が終了すると，再び罫線処理部２１５は黒画素検出処理（Ｓ２４１）を実行する。 Finally, the ruled line processing unit 215 records the ruled line information integrated in S244 (for example, information on length, width, position, etc.) and deletes the searched ruled line (S246). When S246 ends, the ruled line processing unit 215 executes the black pixel detection process (S241) again.
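
The loop from S241 to S246 for horizontal ruled lines can be sketched as follows. This Python sketch is an interpretation under stated assumptions, not the patent's implementation. It assumes that the deletion in S246 acts on a working copy used only to avoid detecting the same run twice, so that short runs (characters) survive in the returned image; that a run exactly L pixels long counts as a ruled line; and that, because the raster scan proceeds from the top, only rows below the current run need to be checked for merging in S244. The names `Rule`, `extract_horizontal_rules`, and `min_length` (the predetermined value L) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


@dataclass(frozen=True)
class Rule:
    """Recorded ruled line information: position, length, and width."""
    left: int
    top: int
    length: int
    thickness: int


def _same_run(row: List[int], start: int, end: int) -> bool:
    """True if row has a black run covering exactly columns start..end-1."""
    if not all(row[start:end]):
        return False
    if start > 0 and row[start - 1] == 1:
        return False
    return not (end < len(row) and row[end] == 1)


def extract_horizontal_rules(image: Image,
                             min_length: int) -> Tuple[List[Rule], Image]:
    """Return the detected rules and a copy of the image without them."""
    work = [row[:] for row in image]      # scanning copy, erased as we go
    cleaned = [row[:] for row in image]   # result: only rules are erased
    height = len(work)
    width = len(work[0]) if height else 0
    rules: List[Rule] = []
    for y in range(height):               # S241: raster scan
        x = 0
        while x < width:
            if work[y][x] == 0:
                x += 1
                continue
            end = x                       # S242: count the horizontal run
            while end < width and work[y][end] == 1:
                end += 1
            thickness = 1
            if end - x >= min_length:     # S243: long enough to be a rule
                below = y + 1             # S244: merge equal runs below
                while below < height and _same_run(work[below], x, end):
                    below += 1
                thickness = below - y
                rules.append(Rule(x, y, end - x, thickness))
                for yy in range(y, below):
                    cleaned[yy][x:end] = [0] * (end - x)
            for yy in range(y, y + thickness):   # S246: erase processed run
                work[yy][x:end] = [0] * (end - x)
            x = end
    return rules, cleaned
```

Vertical ruled lines could be handled by running the same function on the transposed image and swapping the coordinates of the result, which matches the change of scan direction described for S242 and S244.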

以上で，水平罫線２４１を抽出（Ｓ２３１）及び除去（Ｓ２３２）する処理について説明は終了する。なお，垂直罫線２４２の場合は，図６２に示すＳ２４２で垂直方向に連続した黒画素領域を検索するように変更し，Ｓ２４４で左右に同じ長さの線分があれば同じ罫線として統合するように変更することで，罫線処理部２１５は罫線抽出処理（Ｓ２３１）と罫線除去処理（Ｓ２３２）を実行する。 This completes the description of the processing that extracts (S231) and removes (S232) the horizontal ruled lines 241. For the vertical ruled lines 242, the ruled line processing unit 215 executes the ruled line extraction process (S231) and the ruled line removal process (S232) with two changes to FIG. 62: S242 searches for runs of black pixels that are continuous in the vertical direction, and S244 merges line segments of the same length to the left and right into the same ruled line.

また，第３の実施に形態にかかる罫線処理部２１５は，罫線抽出処理（Ｓ２３１）と罫線除去処理（Ｓ２３２）を実行することで，罫線を抽出および除去する場合を例に挙げて説明したが，かかる例に限定されず，例えば，罫線の抽出方法は，ＯＣＲ等の文書構造解析で用いられる一般的な方法等を用いる場合でも実施可能である。 Also, the ruled line processing unit 215 according to the third embodiment has been described by taking as an example the case of extracting and removing ruled lines by executing the ruled line extraction process (S231) and the ruled line removal process (S232). For example, the ruled line extraction method can be implemented even when a general method used in document structure analysis such as OCR is used.

ここで，図６１に示す文書画像２１１に対して罫線処理部２１５によって罫線抽出処理（Ｓ２３１）が実行されると，図６３に示すように，図６１に示す文書画像２１１のうち罫線領域のみを抽出することができる。 When the ruled line processing unit 215 executes the ruled line extraction process (S231) on the document image 211 shown in FIG. 61, only the ruled line areas of that document image are extracted, as shown in FIG. 63.

また，罫線抽出処理（Ｓ２３１），罫線除去処理（Ｓ２３２）が実行されることで出力された上記罫線情報に基づいて罫線処理部２１５が罫線を白画素からなる背景画像に配置すると図６１に示す文書画像２１１のうち罫線のみを再現することができる。なお，図６３は，図６１に示す文書画像２１１のうち罫線だけを表示した文書画像の概略的な構成を示す説明図である。 In addition, when the ruled line processing unit 215 places ruled lines on a background image composed of white pixels according to the ruled line information output by the ruled line extraction process (S231) and the ruled line removal process (S232), only the ruled lines of the document image 211 shown in FIG. 61 are reproduced. FIG. 63 is an explanatory diagram showing a schematic configuration of a document image in which only the ruled lines of the document image 211 shown in FIG. 61 are displayed.
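
Reproducing a rules-only image such as FIG. 63 from the recorded ruled line information could look like this Python sketch. It assumes, for illustration only, that each ruled line is stored as a rectangle (left, top, width, height) in pixels, which covers both horizontal and vertical lines; the patent says only that the information concerns position, width, and length. The function name `render_rules` is hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]                 # 1 = black pixel, 0 = white pixel
RuleRect = Tuple[int, int, int, int]    # (left, top, width, height), assumed layout


def render_rules(rules: List[RuleRect], width: int, height: int) -> Image:
    """Draw the recorded ruled lines on a white background image."""
    canvas = [[0] * width for _ in range(height)]
    for left, top, rule_width, rule_height in rules:
        # Clip each rectangle to the canvas so bad data cannot raise IndexError.
        for y in range(max(top, 0), min(top + rule_height, height)):
            for x in range(max(left, 0), min(left + rule_width, width)):
                canvas[y][x] = 1
    return canvas
```

A horizontal line four pixels long and one pixel thick is then `(left, top, 4, 1)`, and a vertical line three pixels tall is `(left, top, 1, 3)`.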

さらに，罫線処理部２１５が罫線抽出処理（Ｓ２３１），罫線除去処理（Ｓ２３２）を実行することによって，図６１に示す文書画像２１１のうち水平罫線２４１及び垂直罫線２４２を除去した結果は，図６４に図示の通りとなる。なお，図６４は，図６１に示す文書画像のうち罫線を除去した文書画像の概略的な構成の一例を示す説明図である。 Further, the ruled line processing unit 215 executes the ruled line extraction process (S231) and the ruled line removal process (S232), thereby removing the horizontal ruled lines 241 and the vertical ruled lines 242 from the document image 211 shown in FIG. As shown in the figure. FIG. 64 is an explanatory diagram showing an example of a schematic configuration of a document image from which ruled lines are removed from the document image shown in FIG.

また，上記図６４に示す罫線が除去された文書画像２１１に対して文書特徴データ化部（画像加工部）２１２が文字領域の抽出処理等の処理を実行する。なお，第３の実施の形態にかかる文書特徴データ化部２１２の動作は，第１又は第２の実施の形態にかかる文書特徴データ化部１２，１１６とほぼ同様の構成であるため詳細な説明は省略する。 The document feature data conversion unit (image processing unit) 212 then executes processing such as character area extraction on the document image 211 of FIG. 64, from which the ruled lines have been removed. The operation of the document feature data conversion unit 212 according to the third embodiment is almost the same as that of the document feature data conversion units 12 and 116 according to the first and second embodiments, so a detailed description is omitted.

（透かし情報合成部２１３の動作について） (Operation of the watermark information synthesis unit 213)
第３の実施の形態にかかる透かし情報合成部（透かし情報埋め込み部）２１３は，第１の実施の形態にかかる透かし情報合成部１３と同様に改ざん検出データＤａｔａを透かし情報として文書画像２１１に埋め込む。 Like the watermark information synthesis unit 13 according to the first embodiment, the watermark information synthesis unit (watermark information embedding unit) 213 according to the third embodiment embeds the falsification detection data Data in the document image 211 as watermark information.

なお，第３の実施の形態にかかる改ざん検出データＤａｔａは，第１の実施の形態にかかる改ざん検出データＤａｔａに，以下に示すデータ（改ざん検出データＤａｔａ２０５）をさらに追加したものである。 Note that the falsification detection data Data according to the third embodiment is obtained by further adding the following data (falsification detection data Data205) to the falsification detection data Data according to the first embodiment.

Data205: Ruled line information (data on the position, width, length, etc. of the extracted ruled line areas)
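As an illustration of how ruled line information such as Data205 could be held as compact coordinate values before being embedded as watermark information, the following Python sketch packs each ruled line into a fixed-size record. The field names, the 8-byte record layout, and the one-byte line count are assumptions made for this example only. The description states only that the data concerns the position, width, length, etc. of the ruled line areas.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RuledLine:
    """One extracted ruled line area (field names are illustrative assumptions)."""
    x: int            # left edge of the line area, in pixels
    y: int            # top edge of the line area, in pixels
    length: int       # extent along the line direction, in pixels
    width: int        # line thickness, in pixels
    horizontal: bool  # True for a horizontal ruled line, False for a vertical one


# Assumed record layout: x, y, length as 16-bit values, width and orientation
# as single bytes, big-endian without padding (8 bytes per ruled line).
_RECORD = struct.Struct(">HHHBB")


def pack_ruled_lines(lines: List[RuledLine]) -> bytes:
    """Serialize ruled lines as a count byte followed by fixed-size records."""
    if len(lines) > 255:
        raise ValueError("this sketch stores the line count in a single byte")
    body = b"".join(
        _RECORD.pack(l.x, l.y, l.length, l.width, 1 if l.horizontal else 0)
        for l in lines
    )
    return bytes([len(lines)]) + body


def unpack_ruled_lines(payload: bytes) -> List[RuledLine]:
    """Rebuild the ruled line list from a payload made by pack_ruled_lines."""
    count = payload[0]
    lines = []
    for i in range(count):
        x, y, length, width, flag = _RECORD.unpack_from(payload, 1 + i * _RECORD.size)
        lines.append(RuledLine(x, y, length, width, bool(flag)))
    return lines
```

A payload built this way grows by a fixed number of bytes per ruled line, regardless of how long the line is on the page.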

(Operation of the input image transformation unit 227)
Next, the operation of the third input image transformation unit 227 provided in the document input unit 220 will be described. The third input image transformation unit 227 deletes ruled lines from the binarized input document image described for the input image transformation unit 24 of the first embodiment, referring to the ruled line information in the falsification detection data Data205.

As also described in the first embodiment, the third input image transformation unit 227 creates a corrected image from the input document image obtained by reading the output document image 214 with an input device such as a scanner (not shown), and then removes the ruled line areas contained in the corrected image.

FIG. 65 shows the input document image 221 acquired by the input device (not shown). The third input image transformation unit 227 creates a corrected image from the input document image 221 shown in FIG. 65 and then removes the ruled lines. FIG. 65 is an explanatory diagram showing a schematic configuration of the input document image 221 acquired by the input device.

When the third input image transformation unit 227 removes the ruled line area consisting of the horizontal ruled lines 241 and the vertical ruled lines 242, the result is as shown in FIG. 66. FIG. 66 is an explanatory diagram showing an example of a schematic configuration of an input document image obtained by deleting the ruled lines from an image produced by correcting (for example, binarizing) the input document image 221 shown in FIG. 65.

Except for the ruled line deletion process, the processing of the third input image transformation unit 227 is substantially the same as that of the input image transformation unit 24 according to the first embodiment, so a detailed description is omitted.
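The ruled line deletion on the input side can be pictured with the following minimal Python sketch, which clears the rectangles named by the ruled line information from a binarized image. The tuple layout of a ruled line entry, the function name, and the `margin` parameter (intended to absorb small residual misalignment after correction) are assumptions for illustration and are not taken from the description. Clearing whole rectangles also erases any character pixels that cross a ruled line, which a practical implementation would need to handle.

```python
from typing import List, Tuple

# (x, y, length, thickness, horizontal): a hypothetical layout for one
# ruled line entry recovered from the watermark information.
RuledLine = Tuple[int, int, int, int, bool]


def remove_ruled_lines(
    image: List[List[int]], lines: List[RuledLine], margin: int = 0
) -> List[List[int]]:
    """Return a copy of a binary image (1 = black) with ruled line areas set to white."""
    height = len(image)
    width = len(image[0]) if image else 0
    result = [row[:] for row in image]  # leave the caller's image untouched
    for x, y, length, thickness, horizontal in lines:
        # A horizontal line extends along x, a vertical line along y.
        w, h = (length, thickness) if horizontal else (thickness, length)
        for row in range(max(0, y - margin), min(height, y + h + margin)):
            for col in range(max(0, x - margin), min(width, x + w + margin)):
                result[row][col] = 0
    return result
```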

The falsification detection process has been described for the case where ruled lines exist in the document image 211 according to the third embodiment, but it can also be carried out when ruled lines exist in the document image 11 according to the first embodiment or the document image 111 according to the second embodiment. That is, the ruled line extraction process (S231) and the ruled line removal process (S232) by the ruled line processing unit 215 according to the third embodiment, as well as the ruled line removal process by the third input image transformation unit 227, can be applied to the first or second embodiment by performing substantially the same processing.

As described above, the falsification detection process according to the third embodiment is substantially the same as the falsification detection process (S53) and related processing performed by the falsification detection unit (falsification determination unit) 25 according to the first embodiment, so a detailed description is omitted.
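For orientation, the claims characterize the falsification determination as generating a difference image between the input image and the pseudo image and judging that tampering has occurred when a difference area exists. The following Python sketch shows one simple reading of that idea for binary images. The function names and the `min_pixels` noise threshold are assumptions for this example, and the sketch is not the procedure of step S53 itself.

```python
from typing import List

Image = List[List[int]]  # binary image, 1 = black pixel


def difference_image(input_image: Image, pseudo_image: Image) -> Image:
    """Pixel-wise XOR of two binary images of identical size."""
    if len(input_image) != len(pseudo_image) or any(
        len(a) != len(b) for a, b in zip(input_image, pseudo_image)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [p ^ q for p, q in zip(row_in, row_ps)]
        for row_in, row_ps in zip(input_image, pseudo_image)
    ]


def is_tampered(input_image: Image, pseudo_image: Image, min_pixels: int = 1) -> bool:
    """Report tampering when at least min_pixels pixels differ (threshold is an assumption)."""
    diff = difference_image(input_image, pseudo_image)
    return sum(sum(row) for row in diff) >= min_pixels
```

Raising `min_pixels` above 1 is one way to tolerate scanner noise, at the cost of missing very small alterations.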

The falsification detection process according to the third embodiment has been described above. With this configuration, even when ruled lines exist in the document image, the falsification detection device can automatically determine whether the input document image has been tampered with, without requiring any special processing or work by the user. In addition, because the position information and other data specifying the ruled lines are embedded in the document image as watermark information in the form of coordinate values rather than in image form, an extreme increase in the data amount of the watermark information can be prevented.
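The data amount argument can be made concrete with a small calculation. The Python sketch below compares a 1-bit-per-pixel ruled line mask covering the whole page with per-line coordinate records. The page size, the number of ruled lines, and the 8-byte record size are illustrative assumptions, not values from the description.

```python
def bitmap_mask_bytes(width_px: int, height_px: int) -> int:
    """Bytes needed for an uncompressed 1-bit-per-pixel mask of the page."""
    return (width_px * height_px + 7) // 8


def coordinate_bytes(num_lines: int, bytes_per_line: int = 8) -> int:
    """Bytes needed when each ruled line is stored as a fixed-size coordinate record."""
    return num_lines * bytes_per_line


# Assumed example: a 1000 x 1400 pixel page containing 10 ruled lines.
mask_size = bitmap_mask_bytes(1000, 1400)
coord_size = coordinate_bytes(10)
print(f"mask: {mask_size} bytes, coordinates: {coord_size} bytes")
```

Under these assumed figures the coordinate form is several orders of magnitude smaller than the uncompressed mask, and its size depends only on the number of ruled lines, not on the page resolution.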

The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to these examples. It is obvious that those skilled in the art can conceive of various changes or modifications within the scope of the technical idea described in the claims, and these are naturally understood to belong to the technical scope of the present invention.

In the above embodiments, the case where the falsification detection data Data is embedded in the document image as watermark information consisting of a background pattern or the like has been described as an example, but the present invention is not limited to this example. For example, the invention can also be carried out when the falsification detection data Data is embedded in the document image as a two-dimensional barcode.

The present invention is applicable to.

Block diagram showing a schematic configuration of the document output unit 10 according to the first embodiment.
Block diagram showing an example of a schematic configuration of the document input unit according to the first embodiment.
Flowchart showing an example of an outline of the processing of the document feature data conversion unit according to the first embodiment.
Explanatory diagram showing an example of a schematic configuration of a document image according to the present embodiment.
Explanatory diagram showing an outline of the character area extraction process.
Explanatory diagram showing an outline of the character area extraction process.
Explanatory diagram showing an outline of the character area extraction process.
Explanatory diagram showing an outline of the character area extraction process.
Explanatory diagram showing an example of an outline of the cutting-out and reduction process (S32) for the character areas of area 1 and area 2 shown in FIG. 8.
Explanatory diagram showing an example of a schematic configuration of a composite image of reduced character areas according to the present embodiment.
Explanatory diagram showing an example of a schematic configuration of a composite image of reduced character areas according to the present embodiment.
Explanatory diagram showing an example of a schematic configuration of the table of falsification detection data Data2 embedded in a document image.
Flowchart showing an example of a schematic flow of the processing of the watermark information synthesis unit.
Explanatory diagram showing an example of a watermark signal.
Cross-sectional view of the change in pixel values in FIG. 20(1), viewed from the direction arctan(1/3).
Explanatory diagram showing examples of watermark signals: (3) shows unit C, (4) shows unit D, and (5) shows unit E.
FIG. 23(1) is an explanatory diagram showing the case where unit E is defined as a background unit and arranged without gaps to form the background of the original image 11. FIG. 23(2) shows an example in which unit A is embedded in the background image of FIG. 23(1), and FIG. 23(3) shows an example in which unit B is embedded in the background image of FIG. 23(1).
Explanatory diagram showing an example of a method of embedding symbols in an original image.
Flowchart showing a method of embedding watermark information in an original image.
Explanatory diagram showing a method of embedding watermark information in an original image.
Explanatory diagram showing an example of a watermarked document image.
Explanatory diagram showing an enlarged part of FIG. 21.
Explanatory diagram showing an example of a schematic configuration of an input document image.
Explanatory diagram of the signal detection filtering step (step S310) in the first embodiment.
Explanatory diagram of the signal position search step (step S320) in the first embodiment.
Explanatory diagram of the signal boundary determination step (step S340) in the first embodiment.
Explanatory diagram showing an example of information restoration.
Explanatory diagram showing an example of a method of restoring a data code.
Explanatory diagram showing an example of a method of restoring a data code.
Explanatory diagram showing an example of a method of restoring a data code.
Explanatory diagram showing an example of a method of restoring a data code.
Flowchart showing an outline of the processing of the pseudo document image generation unit.
Explanatory diagram showing an outline of the pseudo document image generation process performed by the pseudo document image generation unit 23.
Flowchart showing an outline of the operation of the input image transformation unit.
Explanatory diagram in which the signal unit positions detected in the first embodiment are displayed on the input image.
Explanatory diagram showing an example of detection of an approximate straight line.
Explanatory diagram showing an example of the result of linear approximation.
Explanatory diagram showing tilt correction.
Explanatory diagram showing position correction.
Explanatory diagram showing an example of intersections of straight lines.
Explanatory diagram showing an example of the correspondence between positions in the input image and the corrected image.
Explanatory diagram showing an example of a method of associating the input image with the corrected image.
Explanatory diagram showing an example of a schematic configuration of an input document image obtained by further reducing and enlarging the corrected and binarized input document image.
Flowchart showing an outline of the processing of the falsification detection unit according to the first embodiment.
Explanatory diagram showing an example of a schematic configuration of a difference image.
Explanatory diagram showing an example of a schematic configuration of a difference image.
Block diagram showing an example of a schematic configuration of the document output unit according to the second embodiment.
Block diagram showing an example of a schematic configuration of the document input unit according to the second embodiment.
Flowchart showing an outline of the processing of the second document feature data conversion unit according to the second embodiment.
Explanatory diagram showing an example of a schematic configuration of the importance of the description item blocks according to the second embodiment.
Explanatory diagram showing an example of the result of the number assignment process by the second document feature data conversion unit according to the second embodiment.
Explanatory diagram showing an example of an outline of the character area cutting-out and reduction process (S132).
Explanatory diagram showing an example of a schematic configuration of a composite image according to the second embodiment.
Explanatory diagram showing a schematic configuration of the falsification detection data Data102 according to the second embodiment.
Explanatory diagram showing an example of a schematic configuration of the falsification detection data Data104 according to the second embodiment.
Explanatory diagram showing an example of a schematic configuration of a pseudo document image.
Explanatory diagram showing an outline of the operation of the second input image transformation unit 128.
Block diagram showing a schematic configuration of the document output unit according to the third embodiment.
Block diagram showing a schematic configuration of the document input unit according to the third embodiment.
Flowchart showing an outline of the processing of the ruled line processing unit according to the third embodiment.
Explanatory diagram showing a schematic configuration of a document image according to the third embodiment.
Flowchart showing an outline of the ruled line extraction process (S231) and the ruled line removal process (S232).
Explanatory diagram showing an example of a schematic configuration of the ruled line area in the document image shown in FIG. 61.
Explanatory diagram showing an example of a schematic configuration of the document image shown in FIG. 61 with the ruled lines removed.
Explanatory diagram showing a schematic configuration of the input document image 221 acquired by the input device.
Explanatory diagram showing an example of a schematic configuration of an input document image obtained by deleting the ruled lines from an image produced by correcting (for example, binarizing) the input document image 221 shown in FIG. 65.

Explanation of symbols

10 Document output unit
11 Document image
12 Document feature data conversion unit
13 Watermark information synthesis unit
14 Output document image
21 Input document image
22 Watermark information extraction unit
23 Pseudo document image generation unit
24 Input image transformation unit
25 Falsification detection unit
26 Falsification detection result

Claims

A watermarked image output apparatus comprising:
an image processing unit that receives an image and executes a process of cutting out, from the input image, one or more specific areas used to determine whether the image has been tampered with; and
a watermark information embedding unit that generates an output image by embedding in the image, as watermark information, at least one of area information on the position and/or size of the specific area cut out by the image processing unit and a partial image of the image corresponding to the specific area,
wherein the image processing unit, in addition to cutting out the specific area, executes a reduction process of reducing the partial image of the image corresponding to the cut-out specific area at a reduction ratio determined according to the importance of each specific area.

The watermarked image output apparatus according to claim 1, wherein the image processing unit dynamically changes the importance set for each specific area.

A watermarked image output apparatus comprising:
an image processing unit that receives an image and executes a process of cutting out, from the input image, one or more specific areas used to determine whether the image has been tampered with; and
a watermark information embedding unit that generates an output image by embedding in the image, as watermark information, at least one of area information on the position and/or size of the specific area cut out by the image processing unit and a partial image of the image corresponding to the specific area,
wherein the image processing unit collects the one or more reduced images and generates a single composite image.

The watermarked image output apparatus according to claim 1, wherein the image processing unit, in addition to cutting out the specific area, executes a reduction process of reducing the partial image of the image corresponding to the cut-out specific area at a predetermined reduction ratio, and
the watermark information includes the area information and further includes the partial image of the image corresponding to the specific area, or a reduced image obtained by reducing that partial image and the reduction ratio.

The watermarked image output apparatus according to claim 1, wherein the specific area is a character area including at least one of a character, a symbol, or a figure.

The watermarked image output apparatus according to claim 1, 2, or 3, further comprising a straight line / broken line processing unit that extracts straight line and/or broken line areas of a predetermined length contained in the image,
wherein the watermark information embedding unit further embeds in the image, as watermark information, straight line area information on the position and/or size of the straight line and/or broken line areas extracted by the straight line / broken line processing unit.

A watermarked image input apparatus comprising:
a watermark information extraction unit that receives an output image generated by embedding in an image, as watermark information, at least one of area information on the position and/or size of one or more specific areas used to determine whether the image has been tampered with and a partial image of the image corresponding to the specific area, and that extracts the watermark information embedded in the received output image;
an image transformation unit that processes the received output image and transforms it into an input image of substantially the same size as the image;
a pseudo image generation unit that generates a pseudo image similar at least to the partial image of the specific area existing in the image; and
a falsification determination unit that generates a difference image containing the difference between the input image and the pseudo image and, when a difference area exists in the difference image, determines that the output image from which the input image was transformed has been tampered with,
wherein the pseudo image generation unit generates a background image of substantially the same size as the image and, when the watermark information includes a partial image of the image corresponding to the specific area, arranges that partial image directly in the background image according to the area information, or, when the watermark information includes a reduced image corresponding to the specific area, first enlarges the reduced image at a magnification that is the reciprocal of the reduction ratio used to produce it and then arranges the enlarged image according to the area information, thereby generating the pseudo image similar to the image.

A watermarked image input apparatus comprising:
a watermark information extraction unit that receives an output image generated by embedding in an image, as watermark information, at least one of area information on the position and/or size of one or more specific areas used to determine whether the image has been tampered with and a partial image of the image corresponding to the specific area, and that extracts the watermark information embedded in the received output image;
an image transformation unit that processes the received output image and transforms it into an input image of substantially the same size as the image;
a pseudo image generation unit that generates a pseudo image similar at least to the partial image of the specific area existing in the image; and
a falsification determination unit that generates a difference image containing the difference between the input image and the pseudo image and, when a difference area exists in the difference image, determines that the output image from which the input image was transformed has been tampered with,
wherein the image transformation unit cuts out, from the received output image, an area matching the specific area based on the area information on the position and/or size of the specific area included in the watermark information, reduces the partial image of the cut-out area at the reduction ratio included in the watermark information, enlarges the partial image at a magnification that is the reciprocal of the reduction ratio, and then arranges it at the position corresponding to the specific area.

A watermarked image input apparatus comprising:
a watermark information extraction unit that receives an output image generated by embedding in an image, as watermark information, at least one of area information on the position and/or size of one or more specific areas used to determine whether the image has been tampered with and a partial image of the image corresponding to the specific area, and that extracts the watermark information embedded in the received output image;
an image transformation unit that processes the received output image and transforms it into an input image of substantially the same size as the image;
a pseudo image generation unit that generates a pseudo image similar at least to the partial image of the specific area existing in the image; and
a falsification determination unit that generates a difference image containing the difference between the input image and the pseudo image and, when a difference area exists in the difference image, determines that the output image from which the input image was transformed has been tampered with,
wherein the image transformation unit reduces the partial image of the cut-out area at a reduction ratio determined according to the importance of each specific area, enlarges the partial image at a magnification that is the reciprocal of the reduction ratio, and then arranges it at its original position in the output image.

A watermarked image input apparatus comprising:
a watermark information extraction unit that receives an output image generated by embedding in an image, as watermark information, at least one of area information on the position and/or size of one or more specific areas used to determine whether the image has been tampered with and a partial image of the image corresponding to the specific area, and that extracts the watermark information embedded in the received output image;
an image transformation unit that processes the received output image and transforms it into an input image of substantially the same size as the image;
a pseudo image generation unit that generates a pseudo image similar at least to the partial image of the specific area existing in the image; and
a falsification determination unit that generates a difference image containing the difference between the input image and the pseudo image and, when a difference area exists in the difference image, determines that the output image from which the input image was transformed has been tampered with,
wherein the watermark information extraction unit extracts from the output image, as watermark information, straight line / broken line area information on the position and/or size of straight line and/or broken line areas of a predetermined length contained in the image, and
the watermarked image input apparatus further comprises a straight line / broken line removal unit that removes the straight line and/or broken line areas present in the received output image based on the straight line / broken line area information included in the watermark information.

The watermarked image input apparatus according to claim 7, 8, 9, or 10, wherein the specific area is a character area including at least one of a character, a symbol, or a figure.