JP3980983B2

JP3980983B2 - Watermark information embedding method, watermark information detecting method, watermark information embedding device, and watermark information detecting device

Info

Publication number: JP3980983B2
Application number: JP2002289697A
Authority: JP
Inventors: 昌彦須崎; 甲子雄松井
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2002-10-02
Filing date: 2002-10-02
Publication date: 2007-09-26
Anticipated expiration: 2022-10-02
Also published as: JP2004128845A

Description

【０００１】
【発明の属する技術分野】
本発明は，文書画像に対して文字以外の形式で秘密情報を付加する方法と，印刷された秘密情報入り文書から秘密情報を検出する技術に関するものである。
【０００２】
【従来の技術】
画像や文書データなどにコピー・偽造防止のための情報や機密情報を人の目には見えない形で埋め込む「電子透かし」は，保存やデータの受け渡しがすべて電子媒体上で行われることを前提としており，透かしによって埋め込まれている情報の劣化や消失がないため確実に情報検出を行うことができる。これと同様に，紙媒体に印刷された文書に対しても，文書が不正に改ざんされたりコピーされることを防ぐために，文字以外の視覚的に目障りではない形式でかつ容易に改ざんが不可能であるような秘密情報を印刷文書に埋め込む方法が必要となっている。
【０００３】
印刷物として最も広く利用される白黒の二値の文書に対する情報埋め込み方法としては，以下のような技術が知られている。
【０００４】
［１］特開２００１−７８００６「白黒２値文書画像への透かし情報埋め込み・検出方法及びその装置」
任意の文字列を囲む最小矩形をいくつかのブロックに分割し，それらを２つのグループ（グループ１，グループ２）に分ける（グループの数は３つ以上でも良い）。例えば信号が１の場合はグループ１のブロック中の特徴量を増やしグループ２の各ブロック中の特徴量を減らす。信号が０の場合は逆の操作を行う。ブロック中の特徴量は，文字領域の画素数や文字の太さ，ブロックを垂直にスキャンして最初に文字領域にぶつかる点までの距離などである。
【０００５】
［２］特開２００１−５３９５４「情報埋め込み装置，情報読み出し装置，電子透かしシステム，情報埋め込み方法，情報読み出し方法及び記録媒体」
１つの文字を囲む最小矩形の幅と高さをその文字に対する特徴量として定め，２つ以上の文字間での特徴量の大小関係の分類パターンによりシンボルを表すものとする。例えば３つの文字からは６つの特徴量が定義でき，これらの大小関係のパターンの組合わせを列挙し，これらの組合わせを２つのグループに分類し，それぞれにシンボルを与える。埋め込む情報が“０”であって，これを表すために選択された文字の特徴量の組合わせパターンが“１”であった場合，６つの特徴量のうちいずれかを文字領域を膨らませるなどして変化させる。変化させるパターンは変化量が最小となるように選択する。
【０００６】
［３］特開平９−１７９４９４「機密情報記録方法」
４００ｄｐｉ以上のプリンタで印刷されることを想定する。情報を数値化し，基準点マークと位置判別マークとの距離（ドット数）により情報の表現を行う。
【０００７】
［４］特開平１０−２００７４３「文書処理装置」
万線スクリーン（細かい平行線で構成された特殊スクリーン）のスクリーン線を後方に移動させるかどうかにより情報を表現する。
【０００８】
【特許文献１】
特開２００１−７８００６号公報
【特許文献２】
特開２００１−５３９５４号公報
【特許文献３】
特開平９−１７９４９４号公報
【特許文献４】
特開平１０−２００７４３号公報
【０００９】
【発明が解決しようとする課題】
しかしながら，上記公知技術［１］，［２］では，文書画像の文字を構成する画素や文字間隔・行間隔に対する変更を伴うためフォントやレイアウトの変更が発生する。加えて，上記公知技術［３］，［４］においても，検出時には，スキャナ等の入力機器から読み取った入力画像の１画素単位の精密な検出処理が必要となるため，紙面の汚れや印刷時や読み取り時に雑音が付加された場合などには情報検出精度に大きな影響を与える。
【００１０】
このように，上記公知技術［１］〜［４］では，印刷された文書をスキャナなどの入力装置によって再びコンピュータに入力して埋め込まれた秘密情報を検出する場合に，印刷書類の汚れや入力の際に発生する回転などの画像変形が原因で，入力画像に多くの雑音成分が含まれるため，正確に秘密情報を取り出すことが困難であるという問題点があった。
【００１１】
本発明は，従来の透かし情報埋め込み／検出技術が有する上記問題点に鑑みてなされたものであり，本発明の目的は，正確に秘密情報を取り出すことの可能な，新規かつ改良された透かし情報埋め込み方法，透かし情報検出方法，透かし情報埋め込み装置，及び，透かし情報検出装置を提供することである。
【００１２】
さらに，本発明の他の目的は，信号検出時に紙の回転などで入力画像に歪みがある場合でも信号の復元を行うことができ，画像の回転補正を行う必要がなく，これにより，信号検出時の処理量を削減することの可能な，新規かつ改良された透かし情報埋め込み方法，透かし情報検出方法，透かし情報埋め込み装置，及び，透かし情報検出装置を提供することである。
【００１３】
【課題を解決するための手段】
上記課題を解決するため，本発明の第１の観点によれば，文書の背景としてドットを規則正しく配置し，その中に配置の規則の異なるドットパターンを挿入し，前記ドットパターンの配置の規則に対して情報を与える透かし情報埋め込み方法が提供される。本発明の透かし情報埋め込み方法において，前記ドットパターンは少なくとも第１，第２，第３のドットを含んで構成されており，前記第１，第２，第３のドットの相対的な位置関係によって定まる固有値に応じて前記ドットパターンに値を設定し，所定形状の情報領域内に，同じ値が設定された前記ドットパターンを繰り返し配置し，その情報領域全体に前記情報の１ビットを設定することを特徴とする。
【００１４】
前記第１，第２，第３のドットの相対的な位置関係によって定まる固有値は，例えば，前記第３のドットを始点とし前記第１のドットを終点とするベクトルと，前記第３のドットを始点とし前記第２のドットを終点とするベクトルとの内積とすることができる。
【００１５】
かかる方法によれば，数個の小さなドットから構成されるドットパターンを埋め込むことで，文書（紙）の背景に視覚的に違和感のない方法で情報を埋め込むことができる。そして，ドットのうち１つを始点，２つを終点とした仮想的な２つのベクトルの内積の違い（ベクトルのなす角度およびベクトルの大きさの違い）によって情報を表現している。すなわち，情報をＮ元符号化し，各符号語に対応して内積を割り当てることができる。かかる方法によれば，信号検出時に紙の回転などで入力画像に歪みがある場合でも，仮想ベクトル同士のなす角や仮想ベクトル同士の大きさの違いなどを検出するだけで信号の復元を行うことが可能となり，画像の回転補正を行う必要がない。このようにして，信号検出時の処理量を削減することができる。
【００１６】
また，前記第３のドットを，前記第２のドットを始点として，前記第１のドットを始点とし前記第２のドットを終点とするベクトルに略垂直な方向に位置するようにすることができる。後述するように，情報検出の際に，第３のドットを容易に検索することができる。
【００１７】
前記ドットパターンは，さらに，前記第１のドットを始点として，前記第１のドットを始点とし前記第２のドットを終点とするベクトルに略垂直な方向に位置する第４のドットを含んで構成することができる。ドットパターンを文書の背景に密に配置したときの濃度のバランスを保ち，ドットパターン検出時の補助とすることができる。すなわち，第１のドットを始点とし第２のドットを終点とするベクトルを水平基準ベクトルとし，第１のドットを始点とし第４のドットを終点とするベクトルを垂直基準ベクトルとして利用することができる。
【００１８】
また本発明では，所定形状の情報領域内に，同じ値が設定されたドットパターンを繰り返し配置し，その情報領域全体に１ビットの情報を設定することとしているが，ここで，情報領域の形状の一例としては矩形（正方形を含む）が挙げられる。なお，情報領域をドットパターンと同じサイズとし，１つの情報領域にドットパターンを１つのみ配置してもよい。
【００１９】
また，上記課題を解決するため，本発明の第２の観点によれば，上記本発明の第１の観点にかかる透かし情報埋め込み方法で印刷された文書から情報を検出する透かし情報検出方法が提供される。本発明の透かし情報検出方法は，以下の各ステップを含むことを特徴としている。
（ステップ１）前記文書を画像データに変換して，第１，第２，第３のドット候補点を探索する。
（ステップ２）前記第１，第２，第３のドット候補点の相対的な位置関係によって定まる固有値と，前記第１，第２，第３のドットの相対的な位置関係によって定まる固有値とが略同一の場合に，前記第１，第２，第３のドット候補点は前記ドットパターンを構成する第１，第２，第３のドットであると判定する。
（ステップ３）前記情報領域内における前記ドットパターンに設定された値の多数決により，その情報領域に設定された情報の１ビットを検出する。
【００２０】
上記（ステップ１）における第１，第２，第３のドット候補点の相対的な位置関係によって定まる固有値は，例えば，前記第３のドット候補点を始点とし前記第１のドット候補点を終点とするベクトルと，前記第３のドット候補点を始点とし前記第２のドット候補点を終点とするベクトルとの内積である。
【００２１】
上記（ステップ１）における第１，第２，第３のドット候補点の探索は，例えば，以下の各ステップを含む。
（ステップ１−１）前記第１のドット候補点を検索する。
（ステップ１−２）前記第１のドット候補点が見つかった場合に，その第１のドット候補点を中心に所定の半径を持つ探索円を設定して，その探索円上に第２のドット候補点を探索する。ここで探索円とは，半径の異なる２つの円に囲まれた所定幅のリング形状領域を含む概念である。
（ステップ１−３）前記第２のドット候補点が見つかった場合に，その第２のドット候補点を始点として，前記第１のドット候補点を始点としその第２のドット候補点を終点とするベクトルに略垂直な方向に前記第３のドット候補点を探索する。
【００２２】
かかる透かし情報検出方法によれば，上記優れた効果を有する透かし情報埋め込み方法により埋め込んだ情報を，容易かつ確実に検出することができる。
【００２３】
さらに本発明によれば，上記（ステップ１）において入力された文書を画像データに変換する際に必要な傾き補正を行うことができる。例えば，前記画像データ全体で検出された各ドットパターンにおける，前記第１のドットを始点とし前記第２のドットを終点とするベクトルの傾きの平均値を前記画像データの傾きであると判定し，前記画像データの傾き補正を行うことが可能である。
【００２４】
さらにまた本発明によれば，上記（ステップ３）において情報領域に設定された情報の１ビットを検出するにあたり，検出精度を上げるために例えば以下のような手法を採用することができる。
（ステップ３−１）前記画像データの傾きに応じた座標系を設定する。
（ステップ３−２）前記座標系の垂直軸と水平軸に沿い，隣り合う間隔を前記座標系における前記情報領域の大きさと略同一にした格子状の水平格子軸及び垂直格子軸を設定する。
（ステップ３−３）前記水平格子軸と前記垂直格子軸との交点を中心とし，隣接する情報領域との境界をまたがないような十分小さな判定領域を設定する。
（ステップ３−４）その判定領域内における前記ドットパターンの値の多数決により，その判定領域を含む前記情報領域に設定された前記情報の１ビットを検出する。
【００２５】
さらに，上記（ステップ３−１）における前記水平格子軸及び前記垂直格子軸の設定は，例えば，以下の各ステップにより行われる。
（ステップ３−１−１）任意に設定した水平格子軸と垂直格子軸により定まる判定領域の配置において，判定領域内に含まれるドットパターンが第１の値を表す場合には−１を，第２の値を表す場合は＋１をその判定領域の値として加算する。
（ステップ３−１−２）判定領域内の各ドットパターンについての合計値の絶対値をすべての判定領域について合計した値を，ここで設定した水平格子軸と垂直格子軸に対する評価値とする。
（ステップ３−１−３）水平格子軸と垂直格子軸の位置を微小に変化させた中で最も評価値が大きくなる位置をもって，前記水平格子軸及び前記垂直格子軸として設定する。
【００２６】
また，本発明によれば，上記本発明の第１の観点にかかる透かし情報埋め込み方法を実現可能な透かし情報埋め込み装置が提供される。
【００２７】
さらにまた，本発明によれば，上記本発明の第２の観点にかかる透かし情報検出方法を実現可能な透かし情報検出装置が提供される。
【００２８】
【発明の実施の形態】
以下に添付図面を参照しながら，本発明にかかる透かし情報埋め込み方法，透かし情報検出方法，透かし情報埋め込み装置，及び，透かし情報検出装置の好適な実施の形態について詳細に説明する。なお，本明細書及び図面において，実質的に同一の機能構成を有する構成要素については，同一の符号を付することにより重複説明を省略する。
【００２９】
図１は，本実施の形態にかかる透かし情報埋め込み装置，及び，透かし情報検出装置の構成を示す説明図である。
【００３０】
（透かし情報埋め込み装置１００１）
透かし情報埋め込み装置１００１は，文書データと文書に埋め込む機密情報をもとに文書画像を構成し，紙媒体に印刷を行う装置である。透かし情報埋め込み装置１００１は，図１に示したように，文書画像形成部１００５と，透かし画像形成部１００６と，透かし入り文書画像合成部１００７と，出力デバイス１００８により構成されている。文書データ１００３は文書作成ツール等により作成されたデータである。機密情報１００４は紙媒体に文字以外の形式で埋め込む情報（文字列や画像，音声データ）などである。
【００３１】
文書画像形成部１００５では，文書データ１００３を紙面に印刷した状態の画像が作成される。具体的には，文書画像中の白画素領域は何も印刷されない部分であり，黒画素領域は黒の塗料が塗布される部分である。なお，本実施の形態では，白い紙面に黒のインク（単色）で印刷を行うことを前提として説明するが，本発明はこれに限定されず，カラー（多色）で印刷を行う場合であっても，同様に本発明を適用可能である。
【００３２】
透かし画像形成部１００６は，機密情報１００４をディジタル化して数値に変換したものをＮ元符号化（Ｎは２以上）し，符号語の各シンボルをあらかじめ用意した信号に割り当てる。本実施の形態において，透かし画像形成部１００６においてあらかじめ用意した信号は，任意の大きさの矩形領域中にドットを配置することにより任意の２つ以上の仮想的なベクトルを構成し，そのベクトルの内積の違いに対してシンボルを割り当てたものである。透かし画像は，これらの信号がある規則に従って画像上に配置されたものである。
【００３３】
透かし入り文書画像合成部１００７は，文書画像と透かし画像を重ね合わせて透かし入りの文書画像を作成する。また，出力デバイス１００８は，プリンタなどの出力装置であり，透かし入り文書画像を紙媒体に印刷する。文書画像形成部１００５，透かし画像形成部１００６，透かし入り文書画像合成部１００７はプリンタドライバの中の一つの機能として実現されていても良い。
【００３４】
印刷文書１００９は，元の文書データ１００３に対して機密情報１００４を埋め込んで印刷されたものであり，物理的に保管・管理される。また，透かし情報検出装置１００２が透かし情報埋め込み装置１００１に対して遠隔の場所にある場合には，例えば配送などにより，印刷文書１００９のやりとりが可能となっている。
【００３５】
（透かし情報検出装置１００２）
透かし情報検出装置１００２は，紙媒体に印刷されている文書を画像として取り込み，埋め込まれている機密情報を復元する装置である。透かし情報検出装置１００２は，図１に示したように，入力デバイス１０１０と，透かし検出部１０１１とにより構成されている。
【００３６】
入力デバイス１０１０は，スキャナなどの入力装置であり，紙に印刷された文書１００９を多値階調のグレイ画像として計算機に取り込む。また，透かし検出部１０１１は，入力画像に対してフィルタ処理を行い，埋め込まれた信号を検出する。そして，検出された信号からシンボルを復元し，埋め込まれた機密情報を取り出す。
【００３７】
以上のように構成される透かし情報埋め込み装置１００１及び透かし情報検出装置１００２の動作について説明する。まず，図１〜図４を参照しながら，透かし情報埋め込み装置１００１の動作について説明する。
【００３８】
（文書画像形成部１００５）
文書データ１００３はフォント情報やレイアウト情報を含むデータであり，ワープロソフト等で作成されるものとする。文書画像形成部１００５は，この文書データ１００３を基に，文書が紙に印刷された状態の画像をページごとに作成する。この文書画像は白黒の二値画像であり，画像上で白い画素（値が１の画素）は背景であり，黒い画素（値が０の画素）は文字領域（インクが塗布される領域）であるものとする。
【００３９】
（透かし画像形成部１００６）
機密情報１００４は文字，音声，画像などの各種データであり，透かし画像形成部１００６ではこの情報から文書画像の背景として重ね合わせる透かし画像を作成する。
【００４０】
図２は，透かし画像形成部１００６の処理の流れを示す流れ図である。
まず，機密情報１００４をＮ元符号に変換する（ステップＳ１０１）。Ｎは任意であるが，本実施の形態では説明を容易にするためＮ＝２とする。従って，ステップＳ１０１で生成される符号は２元符号であり，機密情報１００４が０と１のビット列で表現されるものとする。このステップＳ１０１ではデータをそのまま符号化しても良いし，データを暗号化したものを符号化しても良い。
【００４１】
次いで，符号語の各シンボルに対してドットパターンを割り当てる（ステップＳ１０２）。ドットパターンは，数個のドット（黒画素）からなり，これらドットの相対的な位置関係によって定まる固有値を与えることができる。このステップＳ１０２では，ドットパターンの固有値と符号語の各シンボル（０，１）とを関連づける。この関連づけについては，さらに後述する。
【００４２】
そして，符号化された機密情報１００４のビット列に対応するドットパターンを透かし画像上に配置する（ステップＳ１０３）。
【００４３】
上記ステップＳ１０２において，符号語の各シンボルに対して割り当てるドットパターンについて説明する。図３はドットパターンの一例を示す説明図である。
【００４４】
ドットパターンの幅と高さをそれぞれＳｗ，Ｓｈとする。ＳｗとＳｈは異なっていても良いが，本実施の形態では説明を容易にするためＳｗ＝Ｓｈとする。長さの単位は画素数であり，図３の例ではＳｗ＝Ｓｈ＝１２である。これらの信号が紙面に印刷されたときの大きさは，透かし画像の解像度に依存しており，例えば透かし画像が６００ｄｐｉ（ｄｏｔｓ　ｐｅｒ　ｉｎｃｈ：解像度の単位であり，１インチ当たりのドット数）の画像であるとしたならば，図３のドットパターンの幅と高さは，印刷文書上で１２／６００＝０．０２（インチ）となる。
【００４５】
図３（ａ）に示したように，ドットパターンは，
・始点ドット２１０１（座標値（０，０））
・水平基準ドット２１０２（座標値（Ｓｗ／２，０））
・垂直基準ドット２１０３（座標値（０，Ｓｈ／２））
・変調ドット２１０４（座標値（Ｓｗ／２，Ｓｈ／２−１））
を含んで構成されている。なお，垂直基準ドット２１０３は，ドットパターンを文書の背景に密に配置したときの濃度のバランスを保ち，信号検出時の補助とするためのドットである。また，変調ドット２１０４は，水平基準ドット２１０２を始点として，始点ドット２１０１を始点とし水平基準ドット２１０２を終点とするベクトルに略垂直な方向に位置するように配置する。これら始点ドット２１０１，水平基準ドット２１０２，変調ドット２１０４の相対的な位置関係によって定まる固有値に応じてドットパターンに値を設定する。以下に，固有値について説明する。
【００４６】
図３（ａ）において，始点ドット２１０１，水平基準ドット２１０２，変調ドット２１０４による仮想的な２つのベクトル（変調ベクトル２１０６，基準ベクトル２１０５）を設定する。変調ベクトル２１０６は，変調ドット２１０４を始点とし始点ドット２１０１を終点としたベクトルである。基準ベクトル２１０５は，変調ドット２１０４を始点とし水平基準ドット２１０２を終点としたベクトルである。また，基準ベクトルと変調ベクトルのなす角を変調角２１０７（＝θ０）とする。また，基準ベクトルをＶｓ，変調ベクトルをＶｍとし，それぞれの大きさをｖｓ＝｜Ｖｓ｜，ｖｍ＝｜Ｖｍ｜とする。
【００４７】
本実施の形態では，基準ベクトル２１０５と変調ベクトル２１０６の内積によってドットパターンの特徴を表すものとし，これを信号固有値Ｓｖと称することにする。なお，ベクトルＡとベクトルＢの内積Ａ・Ｂは，
Ａ・Ｂ＝｜Ａ｜｜Ｂ｜ｃｏｓθ
で表される。ここで，｜Ａ｜はベクトルＡの大きさ，θはベクトルＡとベクトルＢのなす角である。
【００４８】
また，ベクトルＡとベクトルＢの内積は，Ａ＝（ｘ０，ｙ０），Ｂ＝（ｘ１，ｙ１）とするとＡ・Ｂ＝ｘ０×ｘ１＋ｙ０×ｙ１とも表される。図３（ａ）の例では始点ドット２１０１の座標値を（ｘ，ｙ）＝（０，０），水平基準ドット２１０２の座標値を（Ｓｗ／２，０），変調ドット２１０４の座標値を（Ｓｗ／２，Ｓｈ／２−１）としている。したがって，変調角２１０７はθ０であり，信号固有値はＳｖ０＝Ｖｓ・Ｖｍ＝（Ｓｗ／２）×（Ｓｗ／２−１）である。
【００４９】
図３（ｂ）においても同様に基準ベクトル２２０５と変調ベクトル２２０６を設定する。ただし，始点ドット２２０１，水平基準ドット２２０２の座標値は図３（ａ）と等しいが，変調ドット２２０４の座標値を（Ｓｗ／２，Ｓｈ／２＋１）としている。したがって，信号固有値Ｓｖ１＝Ｖｓ・Ｖｍ＝（Ｓｗ／２）×（Ｓｗ／２＋１）である。
【００５０】
以下では図３（ａ）が信号０を表し，図３（ｂ）が信号１を表すものとする。
【００５１】
このように始点ドット，水平基準ドット，変調ドットの相対的な位置関係によって信号固有値を様々に変更することができる。なお，基準ベクトルは変調ドットと垂直基準ドットを結ぶベクトルとしても良い。この場合，変調ドットの座標値として，信号０の場合は（Ｓｗ／２−１，Ｓｈ／２），信号１の場合は（Ｓｗ／２＋１，Ｓｈ／２）としても良い。
【００５２】
さらに，図３の例では信号０と信号１で，“基準ベクトル及び変調ベクトルの大きさ”及び“変調角の大きさ”の両方を変化させることで，２つの信号間の信号固有値を異なるものにしているが，“基準ベクトル及び変調ベクトルの大きさ”または“変調角の大きさ”のどちらか一方のみを変化させて信号固有値を異なるものにしても良い。
【００５３】
図２のステップＳ１０３では，符号化されたデータのビット列に対応するドットパターンを透かし画像上に配置するが，ビット（シンボル）を表現するために，本実施の形態では，図４に示すように同一のドットパターンをｌｓ×ｌｓ（個）の矩形状に配置する。これをユニットパターンと称する。図４の例ではｌｓ＝１０としており，信号０を表すドットパターンにより構成されたユニットパターンがシンボル０（図４（ａ））を，信号１を表すドットパターンにより構成されたユニットパターンがシンボル１（図４（ｂ））を表すものとする。以下，ユニットパターンの画像上での１辺の大きさをＬとする。ＳｗとＳｈが等しい場合，Ｌ＝ｌｓ×Ｓｗである。
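The construction of a dot pattern and a unit pattern can be sketched in Python as follows. This is only an illustration under stated assumptions: Sw = Sh = 12 and ls = 10 as in the examples of Figures 3 and 4, white = 1 and black = 0, and the modulation dot placed at row Sh/2 − 1 for signal 0 and Sh/2 + 1 for signal 1. The function names `dot_pattern` and `unit_pattern` are hypothetical and do not come from the specification.

```python
def dot_pattern(bit, sw=12, sh=12):
    """Return one Sw x Sh tile (list of rows); 1 = white, 0 = black dot."""
    tile = [[1] * sw for _ in range(sh)]
    dots = [
        (0, 0),                                    # start dot
        (sw // 2, 0),                              # horizontal reference dot
        (0, sh // 2),                              # vertical reference dot
        (sw // 2, sh // 2 + (1 if bit else -1)),   # modulation dot
    ]
    for x, y in dots:
        tile[y][x] = 0
    return tile


def unit_pattern(bit, ls=10, sw=12, sh=12):
    """Tile the same dot pattern ls x ls times to form one unit pattern."""
    tile = dot_pattern(bit, sw, sh)
    row_block = [row * ls for row in tile]          # repeat horizontally
    return [list(r) for _ in range(ls) for r in row_block]  # repeat vertically


unit0 = unit_pattern(0)
unit1 = unit_pattern(1)
```

With these assumed sizes each unit pattern is 120 × 120 pixels (L = ls × Sw) and contains 400 black dots, four per dot pattern.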
【００５４】
図５は，機密情報を透かし画像に埋め込む方法について示した流れ図である。ここでは１枚（１ページ分）の透かし画像に，同じ情報を繰り返し埋め込む場合について説明する。同じ情報を繰り返し埋め込むことにより，透かし画像と文書画像を重ね合わせたときに１つのユニットパターン全体が塗りつぶされるなどして埋め込み情報が消失するような場合でも，埋め込んだ情報を取り出すことが可能である。
【００５５】
まず，機密情報１００４をＮ元符号（本実施の形態では２元符号）に変換する（ステップＳ２０１）。図２のステップＳ１０１と同様である。以下では，符号化された情報をデータ符号と称し，ユニットパターンの組合わせによりデータ符号を表現したものをデータ符号ユニットＤｕと称する。
【００５６】
次いで，データ符号の符号長（ここではビット数）と埋め込みビット数から，１枚の画像にデータ符号ユニットを何度繰り返し埋め込むことができるかを計算する（ステップＳ２０２）。１ページ分の透かし画像の中に何ビットの情報量を埋め込むことができるかは，ドットパターンの大きさ，ユニットパターンの大きさ，文書画像の大きさに依存する。信号検出時においては，文書画像の水平方向と垂直方向にいくつの信号を埋め込んだかは，既知として信号検出を行っても良いし，入力装置から入力された画像の大きさと信号ユニットの大きさから逆算しても良い。
【００５７】
１ページ分の透かし画像の水平方向にＰｗ個，垂直方向にＰｈ個のユニットパターンが埋め込めるとする。水平方向にＰｗ個，垂直方向にＰｈ個のユニットパターンを「ユニットパターン行列」と称することにする。また，１ページに埋め込むことができるビット数を「埋め込みビット数」と称する。埋め込みビット数はＰｗ×Ｐｈである。
【００５８】
本実施の形態では，データ符号の符号長を表すビット列（以下，符号長データという）をユニットパターン行列の第１行に挿入するものとする。なお，データ符号の符号長を固定長として符号長データを透かし画像に埋め込まないようにしてもよい。１ページ分の透かし画像の水平方向にＰｗ個，垂直方向にＰｈ個のユニットパターンが埋め込めるとすると，データ符号ユニットを埋め込む回数Ｄｎは，データ符号長をＣｎとして以下の式で計算される。
【００５９】
【数１】
Ｄｎ＝⌊Ｐｗ×（Ｐｈ−１）／Ｃｎ⌋　（⌊　⌋は小数点以下の切り捨てを表す）
【００６０】
ここで剰余をＲｎ（Ｒｎ＝Ｐｗ×（Ｐｈ−１）−Ｄｎ×Ｃｎ）とすると，ユニットパターン行列にはＤｎ回のデータ符号ユニットおよびデータ符号の先頭Ｒｎビット分に相当するユニットパターンを埋め込むことになる。ただし，剰余部分のＲｎビットは必ずしも埋め込まなくても良い。
【００６１】
図６の説明では，ユニットパターン行列のサイズを９×１１（１１行９列），データ符号長を１２（図中で０〜１１の番号がついたものがデータ符号の各符号語を表す）とする。
【００６２】
ユニットパターン行列への符号長データの埋め込みについて具体的に説明すると，まず，ユニットパターン行列の第１行目に符号長データを埋め込む（ステップＳ２０３）。図６の例では符号長を９ビットのデータで表現して１度だけ埋め込んでいる例を説明しているが，ユニットパターン行列の幅Ｐｗが十分大きい場合，データ符号と同様に符号長データを繰り返し埋め込むこともできる。
【００６３】
さらに，ユニットパターン行列の第２行以降に，データ符号ユニットを繰り返し埋め込む（ステップＳ２０４）。図６で示すようにデータ符号のＭＳＢ（ｍｏｓｔ　ｓｉｇｎｉｆｉｃａｎｔ　ｂｉｔ）またはＬＳＢ（ｌｅａｓｔ　ｓｉｇｎｉｆｉｃａｎｔ　ｂｉｔ）から順に行方向に埋め込む。図６はデータ符号ユニットを７回，およびデータ符号の先頭６ビットを埋め込んでいる例を示している。データの埋め込み方法は図６のように行方向に連続になるように埋め込んでも良いし，列方向に連続になるように埋め込んでも良い。
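The numbers in the Figure 6 example can be checked with a short Python sketch. It assumes, as in this embodiment, that the first row of the unit pattern matrix is reserved for the code length data, so Pw × (Ph − 1) cells remain for the data code, and that the data code is written row by row. The function names are illustrative only.

```python
def repetitions(pw, ph, cn):
    """Return (Dn, Rn): whole repetitions of the data code and leftover cells."""
    capacity = pw * (ph - 1)      # cells left after the code-length row
    return divmod(capacity, cn)


def data_rows(pw, ph, cn):
    """Index of the data code bit stored in each cell of rows 2..Ph (row-major)."""
    cells = [k % cn for k in range(pw * (ph - 1))]
    return [cells[r * pw:(r + 1) * pw] for r in range(ph - 1)]


# Figure 6 example: 9 columns, 11 rows, 12-bit data code.
dn, rn = repetitions(9, 11, 12)
rows = data_rows(9, 11, 12)
```

For these values the sketch gives seven full repetitions followed by the first six bits of the data code, which agrees with the description of Figure 6.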
【００６４】
以上，透かし画像形成部１００６における，透かし画像について説明した。次いで，透かし情報埋め込み装置１００１の透かし入り文書画像合成部１００７について説明する。
【００６５】
（透かし入り文書画像合成部１００７）
透かし入り文書画像合成部１００７では，文書画像形成部１００５で作成した文書画像と，透かし画像形成部１００６で作成した透かし画像を重ね合わせる。図７は，透かし入り文書画像の一例を示す説明図である。透かし入り文書画像の各画素の値は，図７に示したように，文書画像と透かし画像の対応する画素値の論理積演算（ＡＮＤ）によって計算する。すなわち，文書画像と透かし画像のどちらかが０（黒）であれば，透かし入り文書画像の画素値は０（黒），それ以外は１（白）となる。
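The pixel-wise AND described above can be sketched in Python as follows, assuming both images are binary, have the same size, and are stored as lists of rows with 1 = white and 0 = black. The function name `overlay` and the sample data are hypothetical.

```python
def overlay(document, watermark):
    """Pixel-wise AND of two binary images of equal size (1 = white, 0 = black)."""
    if len(document) != len(watermark):
        raise ValueError("images must have the same height")
    out = []
    for doc_row, wm_row in zip(document, watermark):
        if len(doc_row) != len(wm_row):
            raise ValueError("images must have the same width")
        out.append([d & w for d, w in zip(doc_row, wm_row)])
    return out


document = [[1, 1, 0, 0],
            [1, 1, 0, 0]]
watermark = [[0, 1, 1, 0],
             [1, 1, 1, 1]]
combined = overlay(document, watermark)
```

A pixel of the result is black whenever either input pixel is black, so watermark dots appear in the white background while character pixels stay black.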
【００６６】
透かし入り文書画像は，出力デバイス１００８により出力される。
【００６７】
以上，透かし情報埋め込み装置１００１の動作について説明した。
次いで，図１，及び，図８〜図１７を参照しながら，透かし情報検出装置１００２の動作について説明する。
【００６８】
（透かし検出部１０１１）
図８は透かし検出部の処理の流れを示す説明図である。
ステップＳ３０１ではスキャナなどの入力デバイスによって透かし入り文書画像を透かし情報検出装置１００２に入力する。具体的には，計算機のメモリ等に入力する。この画像を入力画像と呼ぶ。入力画像は多値画像であり，以降では２５６階調のグレイ画像として説明する。また入力画像をスキャナ等で読み込むときの解像度は，上述の透かし情報埋め込み装置１００１で作成した透かし入り文書画像と異なっていても良いが，ここでは透かし情報埋め込み装置１００１で作成した画像と同じ解像度であるとして説明を行う。またスキャナなどで画像を取り込む際に，紙がスキャナの撮像面の座標軸に対して正確な位置合わせが行われないために透かし埋め込み領域の座標系が入力画像の座標系に対して回転している場合があるが，以降の説明では画像処理による透かし領域の回転補正は行わずに透かし検出を行う。
【００６９】
ステップＳ３０２では入力画像からドットパターンの検出を行う。ドットパターンの検出の詳細については，図９〜図１２を参照しながらさらに後述する。ドットパターンの検出は入力画像全体に対して行う。入力画像のサイズを幅Ｗ，高さＨとしたとき，入力画像の座標値Ｉ（ｘ，ｙ）（ｘ＝０〜Ｗ−１，ｙ＝０〜Ｈ−１）に信号０を表すドットパターンが存在すればＵ（ｘ，ｙ）＝−１とし，信号１を表すドットパターンが存在すればＵ（ｘ，ｙ）＝１とする。ドットパターンが存在しない場合はＵ（ｘ，ｙ）＝０とする。ただし，ステップＳ３０２においては，Ｕ（ｘ，ｙ）（ｘ＝０〜Ｗ−１，ｙ＝０〜Ｈ−１）は最初にすべて０で初期化されているものとする。
【００７０】
ステップＳ３０２では信号の判定結果を格納すると同時に，基準ベクトルの傾きをＲ（ｘ，ｙ）に格納する。ただし，Ｒ（ｘ，ｙ）の値はＵ（ｘ，ｙ）が０以外の場合のみ，すなわち，ドットパターンが存在した座標のみ有効であるものとする。
【００７１】
ステップＳ３０３ではＵ（ｘ，ｙ）およびＲ（ｘ，ｙ）を元にユニットパターンを検出する（図１３で説明）。ユニットパターンの検出の詳細については，図１３〜図１７を参照しながらさらに後述する。
【００７２】
ステップＳ３０４ではシンボル列（例えば０１１０００１０１００・・・）から情報を復元する。情報復号については，図１８を参照しながらさらに後述する。
【００７３】
以上，図８を参照しながら，本実施の形態にかかる透かし情報検出方法について概説した。次いで，各ステップＳ３０２〜Ｓ３０４について，順に詳説する。
【００７４】
（ドットパターン検出ステップＳ３０２）
図９は，図８のドットパターン検出ステップＳ３０２の詳細を示す説明図である。ドットパターン検出ステップＳ３０２は，図９に示したように，以下のステップＳ４０１〜Ｓ４０５からなる。
【００７５】
（始点ドット候補検索ステップＳ４０１）
ステップＳ４０１では，入力画像を図１０に示すようにラスタースキャンすることにより始点ドット候補Ｄ１を探索する。透かし画像形成部によって埋め込まれたドット（始点ドット，基準ドット，変調ドット）を検出する方法としては，以下のいずれの方法でも良い。
・入力画像の任意の位置Ｉ（ｘ，ｙ）における輝度値があらかじめ定めた閾値Ｔ以下であればそれをドットと判定する。
・入力画像の任意の位置Ｉ（ｘ，ｙ）における任意のエッジ検出フィルタの出力値があらかじめ定めた閾値Ｔ以上であればそれをドットと判定する。
【００７６】
（水平基準ドット候補検索ステップＳ４０２）
ステップＳ４０２では水平基準ドットの候補となるドットを探索する。ここではまず，図１１で示すように，探索領域Ｒを設定する。探索領域Ｒは始点ドット候補Ｄ１を中心とし，半径をそれぞれｒ１，ｒ２（ｒ１＜ｒ２）とする２つの半円，および探索開始軸Ａｓと探索終端軸Ａｅに囲まれた領域である。探索領域Ｒは，始点ドット候補Ｄ１が真の始点ドットであった場合に，水平基準ドットが存在するであろう範囲を示したものである。ｒ１，ｒ２は埋め込んだドットパターンの基準ベクトルＶｓの大きさ，および印刷の解像度と画像取り込みの解像度に依存している。
【００７７】
探索開始軸Ａｓと探索終端軸Ａｅはラスタースキャンの際に，元に戻る方向でドットを探索することを防ぐために設定する。これはドットの並びの方向性を保つために必要となる。探索の方向は探索開始軸Ａｓからドットの探索を始め，探索領域Ｒ内を時計回りに行うものとする。
【００７８】
ステップＳ４０２において探索領域Ｒ内に新たにドットが見つかった場合，これを水平基準ドット候補Ｄ２とする。新たなドットが見つからなかった場合，始点ドット候補Ｄ１は真の始点ドットではなかったものとしてステップＳ４０１に戻り，再び次の始点ドット候補を探索する。
【００７９】
（変調ドット候補検索ステップＳ４０３）
ステップＳ４０３では，変調ドット候補を検索する。図１２で示すように，水平基準ドット候補Ｄ２を通り，始点ドット候補Ｄ１と水平基準ドット候補Ｄ２を結んだ直線と垂直な直線上で，図１２の破線矢印で示した探索方向に沿ってステップＳ４０２と同様にドットを探索する。探索範囲は水平基準ドット候補Ｄ２からの距離が（Ｓｈ／２−１）±Δｄおよび（Ｓｈ／２＋１）±Δｄ（Δｄはあらかじめ定めた値）の範囲とし，新たなドットが見つかった場合に，これを変調ドット候補Ｄ３とする。新たなドットが見つからなかった場合，始点ドット候補Ｄ１は真の始点ドットではなかったものとしてステップＳ４０１に戻り，再び次の始点ドット候補を探索する。
【００８０】
（信号固有値（内積）計算ステップＳ４０４）
ステップＳ４０４では，変調ドット候補Ｄ３と水平基準ドット候補Ｄ２を結ぶ基準ベクトルＶｓと，変調ドット候補Ｄ３と始点ドット候補Ｄ１を結ぶ変調ベクトルＶｍによる信号固有値（内積）を計算し，これがＳｖ０±ｄｖまたはＳｖ１±ｄｖの範囲内であれば，ここで検出した３つのドットＤ１，Ｄ２，Ｄ３はドットパターンを構成するドットであると判定する。ただし，ｄｖは信号固有値に関する誤差成分であらかじめ定められた値とする。信号固有値がＳｖ０±ｄｖまたはＳｖ１±ｄｖの範囲外であれば，始点ドット候補Ｄ１は真の始点ドットではなかったものとしてステップＳ４０１に戻り，再び次の始点ドット候補を探索する。
【００８１】
（信号判定および信号結果蓄積ステップＳ４０５）
ステップＳ４０５ではステップＳ４０４で求めた信号固有値から，このドットパターンが信号０であるか信号１であるかを判定する。信号０であればＵ（ｘ，ｙ）＝−１とし，信号１であればＵ（ｘ，ｙ）＝１とする。また，基準ベクトルの傾きをＲ（ｘ，ｙ）に記録する。
【００８２】
以上，ドットパターン検出ステップＳ３０２の詳細について説明した。次いで，ユニットパターン検出ステップＳ３０３の詳細について説明する。
【００８３】
（ユニットパターン検出ステップＳ３０３）
図１３は，図８のユニットパターン検出ステップＳ３０３の詳細を示す説明図である。ユニットパターン検出ステップＳ３０３は，図１３に示したように，以下のステップＳ５０１〜Ｓ５０４からなる。
【００８４】
（信号領域の傾き検出ステップＳ５０１）
ステップＳ５０１では，基準ベクトルの傾きＲ（ｘ，ｙ）のうち，それに対応するＵ（ｘ，ｙ）の値が０以外の場合についての平均値Ｒａを計算する。この平均値Ｒａが入力画像に対する信号領域の傾き，例えばスキャナなどの撮像面に対する原稿の傾きであると判定する。以下，傾きＲａという。
【００８５】
（ユニットパターン判定領域設定ステップＳ５０２）
ステップＳ５０２ではユニットパターン判定領域の大きさを設定する。ユニットパターンは同じ情報を表すドットパターンをｌｓ×ｌｓ個ずつ並べたものであるが，信号領域の傾きのため，ラスタースキャンによりユニットパターンの判定を行った場合には，ユニットパターンの周辺部分の信号が，隣接するユニットパターンの信号と混ざり合うことになる（図１４のグレーの部分）。したがって，ユニットパターン判定領域の大きさをＬ×Ｌより小さ目に設定する。
【００８６】
図１５はユニットパターン判定領域の大きさの決定方法の例を示している。
図１５（ａ）は入力画像であり，図中の任意のユニットパターンを取り出したものが図１５（ｂ）である。図１５（ｂ）は一辺がＬの正方形を傾きＲａだけ回転させたものである。図のように，入力画像の座標系において回転のない矩形状の判定領域を設定する。判定領域はその４つの頂点すべてが回転した状態でのユニットパターンの内部に収まるようにする。図中でｄｍはマージンであり，ｄｍ＝０の場合は先の条件（４頂点がすべてユニットパターン内に収まる）を満たす最大の矩形となる。以下では判定領域の一辺の長さをＭとする。
【００８７】
（判定領域の最適設置位置確定ステップＳ５０３）
ステップＳ５０３では，判定領域を回転のある格子状に配置し，格子の軸を変動させることによって判定領域の最適な配置を求める。
【００８８】
図１６は判定領域を格子状に配置した例である。破線で示された直線が格子軸であり，信号埋め込み領域の座標系にしたがって「水平格子軸」と「垂直格子軸」に分類している。水平格子軸は信号領域における水平軸で，隣接する軸間の距離はＬ（ユニットパターンの大きさ）である。垂直格子軸は信号領域における垂直軸で，隣接する軸間の距離はＬである。入力画像の座標系から見た格子軸は，入力画像の水平軸，垂直軸をそれぞれ傾きＲａだけ回転させたものであり，隣り合う軸間の距離はＬ×ｃｏｓ（Ｒａ）である。
【００８９】
図のように水平格子軸と垂直格子軸の交点と判定領域の中心が重なるように，判定領域を配置する。また，格子軸の交点のうち最も左上にあるものを「格子軸の始点Ｃ（０）」と呼び，入力画像内に設置できるすべての判定領域をＣ（ｉ），ｉ＝０〜Ｃｎとする。Ｃ（ｉ）は格子軸の交点の並びを行列とみなしたときに，１行１列，１行２列，・・・，２行１列，２行２列の順であるものとする。
【００９０】
次に，Ｃｕ（ｉ），ｉ＝０〜ＣｎをＣ（ｉ）に含まれる領域に対応するＵ（ｘ，ｙ）の値の合計値の絶対値とし，格子軸の始点Ｃ（０）の座標が（ＣＸ，ＣＹ）であったときのＣｕ（ｉ），ｉ＝０〜Ｃｎの合計値をＰｕ（ＣＸ，ＣＹ）と表記する。
【００９１】
Ｃｕ（ｉ）は各判定領域が，ユニットパターンの中央に配置されたときに最も値が大きくなる。なぜなら，図１７で示すように，判定領域が２つ以上のユニットパターンにまたがって設置された場合，対応する領域のＵ（ｘ，ｙ）の値は＋１と−１が混在するため，それらの合計の絶対値は小さくなる。一方，判定領域が１つのユニットパターン内に収まる場合は，対応する領域のＵ（ｘ，ｙ）の値は，たとえ信号判定エラーがあった場合を考慮しても，＋１か−１のどちらかに偏るため，それらの合計の絶対値は大きくなる。
【００９２】
従って，ＣＸ，ＣＹをそれぞれ±ｄＬ／２の範囲で動かした場合のＰｕ（ＣＸ，ＣＹ）を計算し，これが最大となるＣＸ，ＣＹが判定領域の最適な配置であると判定する。
【００９３】
（ユニットパターン判定ステップＳ５０４）
ステップＳ５０４では，ステップＳ５０３で設定した配置におけるＣ（ｉ），ｉ＝０〜Ｃｎにおいて，Ｃｐ（ｉ），ｉ＝０〜ＣｎをＣ（ｉ）に含まれる領域に対応するＵ（ｘ，ｙ）の値の合計とし，Ｃｓ（ｉ），ｉ＝０〜ＣｎをＣｐ（ｉ）によって判定されたユニットパターンの値とする。
【００９４】
Ｃｐ（ｉ）が正であれば対応するユニットパターンの値は１であるものとしＣｓ（ｉ）＝１とする。また，Ｃｐ（ｉ）が負であれば対応するユニットパターンの値は０であるものとしＣｓ（ｉ）＝０とする。ただし，Ｃｐ（ｉ）の絶対値が閾値Ｔｓ以下であれば，その判定領域は信号埋め込み領域の外にあると判断しＣｓ（ｉ）＝−１とする。
【００９５】
Ｃｓ（ｉ），ｉ＝０〜Ｃｎのうち，Ｃｓ（ｉ）が負のものを除き，Ｃｓ（ｉ）の値を直列に連結してビット列を復元することによって，埋め込まれた情報を取り出す。
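The unit pattern decision and the concatenation into a bit string can be sketched in Python as follows. The sketch assumes that the threshold Ts is compared with the magnitude of Cp(i), so that weak sums of either sign are treated as lying outside the embedded area. The names `decide_unit` and `decode_bits` and the sample values are hypothetical.

```python
def decide_unit(cp, ts):
    """Return Cs: 1 or 0 for a decided unit pattern, -1 if the sum is too weak."""
    if abs(cp) <= ts:
        return -1                 # treated as outside the signal area
    return 1 if cp > 0 else 0


def decode_bits(cp_values, ts):
    """Concatenate the decided unit pattern values, skipping rejected regions."""
    cs = [decide_unit(cp, ts) for cp in cp_values]
    return "".join(str(v) for v in cs if v >= 0)


# Example sums Cp(i) for six decision regions, with Ts = 2.
bits = decode_bits([9, -7, 1, -8, 0, 6], ts=2)
```

In this example the third and fifth regions are rejected and the remaining four yield the bit string 1001.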
【００９６】
以上，ユニットパターン検出ステップＳ３０３の詳細について説明した。再び，図８の流れ図に戻り，以降のステップＳ３０４について説明する。ステップＳ３０４では，ユニットパターン行列のシンボルを連結してデータ符号を再構成し，元の情報を復元する。
【００９７】
（情報復号ステップＳ３０４）
図１８は情報復元の一例を示す説明図である。情報復元ステップＳ３０４は以下の通りである。
①各ユニットパターンに埋め込まれているシンボルを検出する（図１８①）。
②シンボルを連結してデータ符号を復元する（図１８②）。
③データ符号を復号して埋め込まれた情報を取り出す（図１８③）。
【００９８】
本実施の形態ではデータ符号を繰り返し埋め込む場合について説明したが，データを符号化する際に誤り訂正符号などを用いることにより，データ符号ユニットの繰り返しを行わないような方法も実現できる。
【００９９】
以上説明したように，本実施の形態によれば，数個の小さなドット（始点ドット２１０１，水平基準ドット２１０２，変調ドット２１０４，垂直基準ドット２１０３）から構成されるドットパターンを埋め込むことで，印刷文書（紙）１００９の背景に視覚的に違和感のない方法で情報を埋め込むことができる。そして，変調ドットを始点とし，始点ドットおよび水平基準ドットをそれぞれ終点とした仮想的な２つのベクトルの内積の違い（ベクトルのなす角度およびベクトルの大きさの違い）によって情報を表現している。かかる方法によれば，信号検出時に紙の回転などで入力画像に歪みがある場合でも，仮想ベクトル同士のなす角や仮想ベクトル同士の大きさの違いなどを検出するだけで信号の復元を行うことが可能となり，画像の回転補正を行う必要がない。このようにして，信号検出時の処理量を削減することができる。
【０１００】
以上，添付図面を参照しながら本発明にかかる透かし情報埋め込み方法，透かし情報検出方法，透かし情報埋め込み装置，及び，透かし情報検出装置の好適な実施形態について説明したが，本発明はかかる例に限定されない。当業者であれば，特許請求の範囲に記載された技術的思想の範疇内において各種の変更例または修正例に想到し得ることは明らかであり，それらについても当然に本発明の技術的範囲に属するものと了解される。
【０１０１】
【発明の効果】
以上説明したように，本発明によれば，印刷文書（紙）の背景に視覚的に違和感のない方法で情報を埋め込む場合において，情報を正確に取り出すことが可能である。また本発明によれば，信号検出時に紙の回転などで入力画像に歪みがある場合でも信号の復元を行うことができ，画像の回転補正を行う必要がなく，これにより，信号検出時の処理量を削減することが可能である。
【図面の簡単な説明】
【図１】透かし情報埋め込み装置及び透かし情報検出装置の構成を示す説明図である。
【図２】透かし画像形成部１００６の処理の流れを示す流れ図である。
【図３】ドットパターンの例を示す説明図である。
【図４】ユニットパターンの説明図である。
【図５】機密情報を透かし画像に埋め込む方法について示した流れ図である。
【図６】ユニットパターン行列へのデータ符号の埋め込みの一例を示す説明図である。
【図７】透かし入り文書画像の一例を示す説明図である。
【図８】透かし検出部１０１１の処理の流れを示す説明図である。
【図９】ドットパターン検出の処理の流れを示す説明図である。
【図１０】図９の始点ドット候補選択ステップＳ４０１の説明図である。
【図１１】図９の水平基準ドット候補検索ステップＳ４０２の説明図である。
【図１２】図９の変調ドット候補検索ステップＳ４０３の説明図である。
【図１３】図８のユニットパターン検出ステップＳ３０３の詳細な説明図である。
【図１４】図１３のユニットパターン判定領域設定ステップＳ５０２の説明図である。
【図１５】図１３のユニットパターン判定領域設定ステップＳ５０２の説明図であり，（ａ）は入力画像であり，（ｂ）は任意のユニットパターンを取り出したものである。
【図１６】図１３の判定領域の最適設置位置確定ステップＳ５０３の説明図である。
【図１７】図１３の判定領域の最適設置位置確定ステップＳ５０３の説明図である。
【図１８】情報復元の一例を示す説明図である。
【符号の説明】
１００１透かし情報埋め込み装置
１００２透かし情報検出装置
１００３文書データ
１００４機密情報
１００５文書画像形成部
１００６透かし画像形成部
１００７透かし入り文書画像合成部
１００８出力デバイス
１００９印刷文書
１０１０入力デバイス
１０１１透かし検出部
[0001]
[Technical Field of the Invention]
The present invention relates to a method for adding secret information to a document image in a form other than text, and to a technique for detecting the secret information from a printed document in which it has been embedded.
[0002]
[Prior art]
“Digital watermarking”, which embeds copy-protection or anti-counterfeiting information and confidential information in images, document data, and the like in a form invisible to the human eye, assumes that all storage and transfer of the data take place on electronic media. Because the embedded information does not deteriorate or disappear, it can be detected reliably. Similarly, for documents printed on paper, there is a need for a method of embedding secret information in the printed document in a non-text form that is not visually obtrusive and cannot easily be altered, in order to prevent the document from being tampered with or copied illegally.
[0003]
The following techniques are known for embedding information in black-and-white binary documents, the most widely used form of printed material.
[0004]
[1] Japanese Patent Laid-Open No. 2001-78006 “Method and apparatus for embedding / detecting watermark information in a monochrome binary document image”
A minimum rectangle surrounding an arbitrary character string is divided into several blocks, which are assigned to two groups, group 1 and group 2 (three or more groups may also be used). For example, when the signal is 1, the feature quantity in each block of group 1 is increased and the feature quantity in each block of group 2 is decreased. When the signal is 0, the reverse operation is performed. The feature quantity of a block is, for example, the number of pixels in the character region, the stroke thickness of the characters, or the distance from the start of a vertical scan of the block to the first point that hits the character region.
[0005]
[2] JP 2001-53954 A "Information embedding device, information reading device, digital watermark system, information embedding method, information reading method, and recording medium"
The width and height of the minimum rectangle enclosing one character are defined as the feature quantities of that character, and a symbol is represented by the classification pattern of the magnitude relationships among the feature quantities of two or more characters. For example, six feature quantities can be defined from three characters; the possible patterns of their magnitude relationships are enumerated and classified into two groups, and a symbol is assigned to each group. If the information to be embedded is “0” and the feature-quantity pattern of the characters selected to represent it corresponds to “1”, one of the six feature quantities is changed, for example by expanding the character region. The pattern to change is selected so that the amount of change is minimized.
[0006]
[3] Japanese Patent Laid-Open No. 9-179494 “Confidential Information Recording Method”
Printing with a printer of 400 dpi or higher is assumed. The information is converted into numerical values and expressed by the distance (number of dots) between a reference point mark and a position determination mark.
[0007]
[4] Japanese Patent Laid-Open No. 10-200743 “Document Processing Device”
Information is expressed by whether or not the screen lines of the line screen (special screen composed of fine parallel lines) are moved backward.
[0008]
[Patent Document 1]
JP 2001-78006 A
[Patent Document 2]
JP 2001-53954 A
[Patent Document 3]
JP 9-179494 A
[Patent Document 4]
Japanese Patent Laid-Open No. 10-200743
[0009]
[Problems to be solved by the invention]
However, the known techniques [1] and [2] modify the pixels that make up the characters of the document image, or the character and line spacing, so the fonts and layout are changed. In addition, the known techniques [3] and [4] require precise, pixel-level detection processing on the input image read from an input device such as a scanner, so detection accuracy is strongly affected by stains on the paper or by noise added during printing or reading.
[0010]
Thus, with the known techniques [1] to [4], when a printed document is fed back into a computer through an input device such as a scanner to detect the embedded secret information, the input image contains many noise components caused by stains on the printed document and by image deformation such as rotation during input, which makes it difficult to extract the secret information accurately.
[0011]
The present invention has been made in view of the above problems of conventional watermark information embedding and detection techniques. An object of the present invention is to provide a new and improved watermark information embedding method, watermark information detecting method, watermark information embedding device, and watermark information detecting device capable of extracting secret information accurately.
[0012]
Another object of the present invention is to provide a new and improved watermark information embedding method, watermark information detecting method, watermark information embedding device, and watermark information detecting device that can restore the signal even when the input image is distorted at the time of signal detection, for example by rotation of the paper, that require no rotation correction of the image, and that thereby reduce the amount of processing at the time of signal detection.
[0013]
[Means for Solving the Problems]
In order to solve the above problems, according to a first aspect of the present invention, there is provided a watermark information embedding method in which dots are regularly arranged as the background of a document, dot patterns having a different arrangement rule are inserted among them, and information is assigned to the arrangement rule of the dot patterns. In the watermark information embedding method of the present invention, the dot pattern includes at least first, second, and third dots; a value is set for the dot pattern according to an eigenvalue determined by the relative positional relationship of the first, second, and third dots; dot patterns set to the same value are repeatedly arranged in an information area of a predetermined shape; and one bit of the information is set for the information area as a whole.
[0014]
The eigenvalue determined by the relative positional relationship of the first, second, and third dots can be, for example, the inner product of a vector starting at the third dot and ending at the first dot and a vector starting at the third dot and ending at the second dot.
[0015]
According to this method, embedding dot patterns each composed of a few small dots makes it possible to embed information in the background of a document (paper) in a visually unobtrusive way. Information is expressed by differences in the inner product of two virtual vectors that start at one of the dots and end at the other two (that is, differences in the angle between the vectors and in their magnitudes). In other words, the information can be N-ary encoded and an inner product assigned to each codeword. With this method, even if the input image is distorted at the time of signal detection, for example by rotation of the paper, the signal can be restored simply by detecting the angle between the virtual vectors and the difference in their magnitudes, so no rotation correction of the image is needed. The amount of processing at the time of signal detection can thus be reduced.
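The rotation tolerance follows from the fact that an inner product depends only on the lengths of the two vectors and the angle between them. A minimal Python sketch illustrates this. The dot coordinates are chosen here purely for illustration and the function names `eigenvalue` and `rotate` are hypothetical.

```python
import math


def eigenvalue(d1, d2, d3):
    """Inner product of the vectors d3 -> d1 and d3 -> d2."""
    ax, ay = d1[0] - d3[0], d1[1] - d3[1]
    bx, by = d2[0] - d3[0], d2[1] - d3[1]
    return ax * bx + ay * by


def rotate(p, angle_deg):
    """Rotate point p about the origin by angle_deg degrees."""
    t = math.radians(angle_deg)
    return (p[0] * math.cos(t) - p[1] * math.sin(t),
            p[0] * math.sin(t) + p[1] * math.cos(t))


# (first dot, second dot, third dot) for two illustrative patterns.
pattern_a = [(0, 0), (8, 0), (8, 3)]
pattern_b = [(0, 0), (8, 0), (8, 5)]

# The same patterns as they might appear on a page scanned 17 degrees askew.
rotated_a = [rotate(p, 17.0) for p in pattern_a]
rotated_b = [rotate(p, 17.0) for p in pattern_b]
```

With these coordinates the two patterns give eigenvalues of 9 and 25, and rotating all three dots by the same angle leaves both values unchanged up to floating-point rounding, so a detector can compare the value against a tolerance without first de-rotating the image.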
[0016]
The third dot may be placed, starting from the second dot, in a direction substantially perpendicular to the vector that starts at the first dot and ends at the second dot. As described later, this makes the third dot easy to find during information detection.
[0017]
The dot pattern may further include a fourth dot placed, starting from the first dot, in a direction substantially perpendicular to the vector that starts at the first dot and ends at the second dot. The fourth dot keeps the density balanced when dot patterns are densely arranged in the background of the document, and it can also assist in detecting the dot pattern. That is, the vector from the first dot to the second dot can be used as a horizontal reference vector, and the vector from the first dot to the fourth dot as a vertical reference vector.
[0018]
In the present invention, dot patterns set to the same value are repeatedly arranged in an information area of a predetermined shape, and one bit of information is set for the information area as a whole. One example of the shape of the information area is a rectangle (including a square). The information area may also be the same size as a dot pattern, so that only one dot pattern is arranged in each information area.
[0019]
In order to solve the above problems, according to a second aspect of the present invention, there is provided a watermark information detecting method for detecting information from a document printed by the watermark information embedding method according to the first aspect of the present invention. The watermark information detecting method of the present invention includes the following steps.
(Step 1) The document is converted into image data, and first, second, and third dot candidate points are searched for.
(Step 2) If the eigenvalue determined by the relative positional relationship of the first, second, and third dot candidate points is substantially the same as the eigenvalue determined by the relative positional relationship of the first, second, and third dots, the first, second, and third dot candidate points are determined to be the first, second, and third dots constituting the dot pattern.
(Step 3) One bit of the information set in the information area is detected by a majority vote over the values set in the dot patterns within that information area.
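Step 3 can be sketched in Python as a simple majority vote over the values of the dot patterns detected in one information area. Returning `None` on a tie is an assumption made for this example, since the text does not say how ties are resolved, and the function name is hypothetical.

```python
def majority_bit(values):
    """Return the bit (0 or 1) held by most dot patterns, or None on a tie."""
    ones = sum(1 for v in values if v == 1)
    zeros = sum(1 for v in values if v == 0)
    if ones == zeros:
        return None               # undecided: no majority in this area
    return 1 if ones > zeros else 0


# Values of the dot patterns detected in one information area,
# including two patterns that were misread.
detected = [1, 1, 0, 1, 1, 0, 1]
bit = majority_bit(detected)
```

Because every dot pattern in an information area carries the same value, a few misread or obscured patterns do not change the decoded bit.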
[0020]
The eigenvalue determined by the relative positional relationship of the first, second, and third dot candidate points in (Step 1) is, for example, the inner product of a vector starting at the third dot candidate point and ending at the first dot candidate point and a vector starting at the third dot candidate point and ending at the second dot candidate point.
[0021]
The search for the first, second, and third dot candidate points in (Step 1) includes, for example, the following steps.
(Step 1-1) The first dot candidate point is searched for.
(Step 1-2) When the first dot candidate point is found, a search circle with a predetermined radius is set around it, and the second dot candidate point is searched for on that search circle. Here, the term search circle includes a ring-shaped region of predetermined width bounded by two circles of different radii.
(Step 1-3) When the second dot candidate point is found, the third dot candidate point is searched for, starting from the second dot candidate point, in a direction substantially perpendicular to the vector that starts at the first dot candidate point and ends at the second dot candidate point.
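Steps 1-1 to 1-3 can be sketched in Python as follows. The sketch assumes that individual dots have already been located and are given as a list of coordinates, that image coordinates are used (y increasing downward), and that the third dot is searched on one side of the perpendicular only. The ring radii, expected distances, tolerance, and the function name `find_triplet` are all hypothetical values chosen for the example.

```python
import math


def find_triplet(dots, d1, r_min, r_max, perp_dists, tol=0.75):
    """Return (d1, d2, d3) for the first matching candidates, or None.

    d2 must lie in the ring r_min..r_max around d1 (Step 1-2);
    d3 must lie on the perpendicular through d2 at one of perp_dists (Step 1-3).
    """
    for d2 in dots:
        if d2 == d1:
            continue
        vx, vy = d2[0] - d1[0], d2[1] - d1[1]
        r = math.hypot(vx, vy)
        if not (r_min <= r <= r_max):
            continue
        nx, ny = -vy / r, vx / r                 # unit normal to d1 -> d2
        for d3 in dots:
            if d3 in (d1, d2):
                continue
            wx, wy = d3[0] - d2[0], d3[1] - d2[1]
            along = (wx * vx + wy * vy) / r      # offset parallel to d1 -> d2
            across = wx * nx + wy * ny           # offset along the perpendicular
            if abs(along) <= tol and any(abs(across - d) <= tol
                                         for d in perp_dists):
                return d1, d2, d3
    return None


dots = [(0, 0), (6, 0), (0, 6), (6, 5)]
found = find_triplet(dots, dots[0], 5.0, 7.0, (5.0, 7.0))
```

Restricting the third-dot search to a line through the second candidate keeps the search small compared with scanning a full neighbourhood, which is the benefit the text attributes to the perpendicular placement.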
[0022]
According to such a watermark information detection method, information embedded by the watermark information embedding method having the excellent effect can be easily and reliably detected.
[0023]
Furthermore, according to the present invention, the tilt correction needed when the input document is converted into image data in (Step 1) can be performed. For example, the average, over all dot patterns detected in the entire image data, of the slope of the vector starting at the first dot and ending at the second dot can be taken as the tilt of the image data, and the tilt of the image data can be corrected accordingly.
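The tilt estimate can be sketched in Python as the mean direction of the detected first-dot to second-dot vectors. The sketch assumes small tilts, so a plain arithmetic mean of the angles is used and wrap-around near ±180 degrees is not handled. The function name and sample data are hypothetical.

```python
import math


def mean_tilt_deg(reference_vectors):
    """Average direction, in degrees, of the first-dot -> second-dot vectors."""
    angles = [math.degrees(math.atan2(y2 - y1, x2 - x1))
              for (x1, y1), (x2, y2) in reference_vectors]
    if not angles:
        raise ValueError("no dot patterns were detected")
    return sum(angles) / len(angles)


# Three detected patterns whose reference vectors point at 2.5, 3.0 and 3.5 degrees.
pairs = [((0.0, 0.0), (6 * math.cos(math.radians(a)), 6 * math.sin(math.radians(a))))
         for a in (2.5, 3.0, 3.5)]
tilt = mean_tilt_deg(pairs)
```

Averaging over many dot patterns smooths out the per-pattern measurement error, which is large for vectors only a few pixels long.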
[0024]
Furthermore, according to the present invention, for detecting one bit of information set in the information area in the above (step 3), for example, the following method can be adopted in order to increase detection accuracy.
(Step 3-1) A coordinate system corresponding to the inclination of the image data is set.
(Step 3-2) Horizontal grid axes and vertical grid axes are set in a grid along the horizontal axis and the vertical axis of the coordinate system, with the spacing between adjacent axes substantially equal to the size of the information area in that coordinate system.
(Step 3-3) A determination area centered on each intersection of a horizontal grid axis and a vertical grid axis is set, small enough that it does not cross the boundary between adjacent information areas.
(Step 3-4) One bit of the information set in the information area including the determination area is detected by majority decision of the value of the dot pattern in the determination area.
[0025]
Further, the setting of the horizontal grid axis and the vertical grid axis in (Step 3-1) is performed by the following steps, for example.
(Step 3-1-1) For the arrangement of determination regions defined by arbitrarily set horizontal and vertical grid axes, −1 is added to the value of a determination region for each dot pattern in it that represents the first value, and +1 is added for each dot pattern that represents the second value.
(Step 3-1-2) The sum, over all determination regions, of the absolute values of these per-region totals is taken as the evaluation value of the horizontal and vertical grid axes set here.
(Step 3-1-3) Among the positions obtained by slightly shifting the horizontal and vertical grid axes, the position with the largest evaluation value is adopted as the horizontal and vertical grid axes.
[0026]
The present invention also provides a watermark information embedding device that can implement the watermark information embedding method according to the first aspect of the present invention.
[0027]
Furthermore, according to the present invention, there is provided a watermark information detecting apparatus capable of realizing the watermark information detecting method according to the second aspect of the present invention.
[0028]
DETAILED DESCRIPTION OF THE INVENTION
Exemplary embodiments of a watermark information embedding method, a watermark information detecting method, a watermark information embedding device, and a watermark information detecting device according to the present invention will be described below in detail with reference to the accompanying drawings. In the present specification and drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.
[0029]
FIG. 1 is an explanatory diagram showing the configuration of a watermark information embedding device and a watermark information detection device according to this embodiment.
[0030]
(Watermark information embedding device 1001)
The watermark information embedding device 1001 is a device that forms a document image based on document data and confidential information embedded in the document and prints it on a paper medium. As shown in FIG. 1, the watermark information embedding device 1001 includes a document image forming unit 1005, a watermark image forming unit 1006, a watermarked document image synthesizing unit 1007, and an output device 1008. Document data 1003 is data created by a document creation tool or the like. The confidential information 1004 is information (character string, image, audio data) to be embedded in a paper medium in a format other than characters.
[0031]
The document image forming unit 1005 creates an image of the document data 1003 as it would appear when printed on paper. Specifically, a white pixel region in the document image is a portion where nothing is printed, and a black pixel region is a portion where black ink is applied. This embodiment is described on the assumption that printing is performed with black ink (single color) on white paper; however, the present invention is not limited to this and can be applied in the same way when printing is performed in color (multicolor).
[0032]
The watermark image forming unit 1006 digitizes the confidential information 1004, converts it into numerical values, performs N-ary coding (N is 2 or more), and assigns each symbol of the code word to a signal prepared in advance. In the present embodiment, each signal prepared in advance in the watermark image forming unit 1006 forms two or more virtual vectors by arranging dots in a rectangular area of arbitrary size, and symbols are assigned to differences in the inner product of these vectors. A watermark image is an image in which these signals are arranged according to a certain rule.
[0033]
The watermarked document image composition unit 1007 creates a watermarked document image by superimposing the document image and the watermark image. The output device 1008 is an output device such as a printer, and prints a watermarked document image on a paper medium. The document image forming unit 1005, the watermark image forming unit 1006, and the watermarked document image synthesizing unit 1007 may be realized as one function in the printer driver.
[0034]
The print document 1009 is printed by embedding confidential information 1004 in the original document data 1003, and is physically stored and managed. Further, when the watermark information detecting device 1002 is located at a location remote from the watermark information embedding device 1001, the print document 1009 can be exchanged by, for example, delivery.
[0035]
(Watermark information detection apparatus 1002)
The watermark information detection device 1002 is a device that takes in a document printed on a paper medium as an image and restores embedded confidential information. As shown in FIG. 1, the watermark information detection apparatus 1002 includes an input device 1010 and a watermark detection unit 1011.
[0036]
The input device 1010 is an input device such as a scanner, and captures a document 1009 printed on paper as a multi-value gray image in a computer. In addition, the watermark detection unit 1011 performs a filtering process on the input image and detects an embedded signal. Then, the symbol is restored from the detected signal, and the embedded confidential information is extracted.
[0037]
Operations of the watermark information embedding device 1001 and the watermark information detection device 1002 configured as described above will be described. First, the operation of the watermark information embedding device 1001 will be described with reference to FIGS.
[0038]
(Document Image Forming Unit 1005)
Document data 1003 is data including font information and layout information, and is created by word processing software or the like. Based on the document data 1003, the document image forming unit 1005 creates, for each page, an image of the document as printed on paper. This document image is a black-and-white binary image in which white pixels (pixels with a value of 1) are the background and black pixels (pixels with a value of 0) are the character regions (regions to which ink is applied).
[0039]
(Watermark image forming unit 1006)
The confidential information 1004 is various data such as characters, sounds, and images, and the watermark image forming unit 1006 creates a watermark image to be superimposed as a background of the document image from this information.
[0040]
FIG. 2 is a flowchart showing a processing flow of the watermark image forming unit 1006.
First, the confidential information 1004 is converted into an N-element code (step S101). N is arbitrary, but in the present embodiment, N = 2 for ease of explanation. Therefore, the code generated in step S101 is a binary code, and the confidential information 1004 is expressed by a bit string of 0 and 1. In this step S101, the data may be encoded as it is, or the encrypted data may be encoded.
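As a non-limiting illustration of step S101 with N = 2, the conversion of the confidential information into a bit string can be sketched as follows. The function name to_bit_string and the choice of emitting each byte most significant bit first are assumptions of this sketch; the embodiment does not fix a particular bit order.

```python
def to_bit_string(data: bytes) -> list:
    """Expand each byte into 8 bits, most significant bit first (an assumed order)."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


# The single byte 0x41 ("A") becomes an 8-symbol binary code word.
bits = to_bit_string(b"A")
```

If the data is encrypted before encoding, the same function would simply be applied to the encrypted bytes.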
[0041]
Next, a dot pattern is assigned to each symbol of the code word (step S102). A dot pattern consists of several dots (black pixels) and can be given an eigenvalue determined by the relative positional relationship of these dots. In step S102, the eigenvalue of a dot pattern is associated with each symbol (0, 1) of the code word. This association is described further below.
[0042]
Then, a dot pattern corresponding to the encoded bit string of the confidential information 1004 is arranged on the watermark image (step S103).
[0043]
The dot pattern assigned to each symbol of the code word in step S102 will be described. FIG. 3 is an explanatory diagram showing an example of a dot pattern.
[0044]
Let the width and height of the dot pattern be Sw and Sh, respectively. Sw and Sh may differ, but in the present embodiment Sw = Sh for ease of explanation. The unit of length is the pixel. In the example of FIG. 3, Sw = Sh = 12. The size of these signals when printed on paper depends on the resolution of the watermark image. For example, if the watermark image is a 600 dpi image (dpi: dots per inch, a unit of resolution), the width and height of the dot pattern in FIG. 3 are 12/600 = 0.02 inch on the printed document.
[0045]
As shown in FIG. 3A, the dot pattern includes the following dots:
Start point dot 2101 (coordinate value (0, 0))
Horizontal reference dot 2102 (coordinate value (Sw / 2, 0))
Vertical reference dot 2103 (coordinate value (0, Sh / 2))
Modulation dot 2104 (coordinate value (Sw / 2, Sh / 2 - 1))
The vertical reference dot 2103 is a dot for maintaining the balance of density when dot patterns are densely arranged on the background of the document and for assisting signal detection. The modulation dot 2104 is positioned, starting from the horizontal reference dot 2102, in a direction substantially perpendicular to the vector having the start point dot 2101 as its start point and the horizontal reference dot 2102 as its end point. A value is set for the dot pattern in accordance with the eigenvalue determined by the relative positional relationship among the start point dot 2101, the horizontal reference dot 2102, and the modulation dot 2104. The eigenvalue is described below.
[0046]
In FIG. 3A, two virtual vectors (modulation vector 2106 and reference vector 2105) based on the start point dot 2101, the horizontal reference dot 2102, and the modulation dot 2104 are set. The modulation vector 2106 is a vector having the modulation dot 2104 as its start point and the start point dot 2101 as its end point. The reference vector 2105 is a vector having the modulation dot 2104 as its start point and the horizontal reference dot 2102 as its end point. The angle formed by the reference vector and the modulation vector is the modulation angle θ0. Further, the magnitude of the reference vector Vs is written vs = | Vs |, and the magnitude of the modulation vector Vm is written vm = | Vm |.
[0047]
In this embodiment, the feature of the dot pattern is expressed by the inner product of the reference vector 2105 and the modulation vector 2106, which is referred to as the signal eigenvalue Sv. The inner product A · B of vector A and vector B is
A · B = | A || B | cos θ
where | A | and | B | are the magnitudes of vector A and vector B, and θ is the angle formed by vector A and vector B.
[0048]
The inner product of vector A and vector B is also expressed as A · B = x0 × x1 + y0 × y1, where A = (x0, y0) and B = (x1, y1). In the example of FIG. 3, the coordinate value of the start point dot 2101 is (x, y) = (0, 0), the coordinate value of the horizontal reference dot 2102 is (Sw / 2, 0), and the coordinate value of the modulation dot 2104 is (Sw / 2, Sh / 2 - 1). Therefore, with the modulation angle θ0, the signal eigenvalue is Sv0 = Vs · Vm = (Sw / 2) × (Sw / 2 - 1).
[0049]
Similarly in FIG. 3B, the reference vector 2205 and the modulation vector 2206 are set. However, the coordinate values of the start point dot 2201 and the horizontal reference dot 2202 are the same as those in FIG. 3A, but the coordinate value of the modulation dot 2203 is (Sw / 2, Sh / 2 + 1). Therefore, the signal eigenvalue Sv1 = Vs · Vm = (Sw / 2) × (Sw / 2 + 1).
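As a non-limiting illustration, the component form of the inner product can be evaluated directly from the dot coordinates, taking the reference vector from the modulation dot to the horizontal reference dot and the modulation vector from the modulation dot to the start point dot. The function name signal_eigenvalue is hypothetical. Because the sketch works from the coordinates alone, its outputs need not coincide with the closed-form expressions given above; the point it illustrates is only that the two modulation dot positions yield clearly separated values.

```python
def signal_eigenvalue(start, horizontal_ref, modulation):
    """Inner product of the reference vector (modulation -> horizontal reference)
    and the modulation vector (modulation -> start), in component form."""
    vs = (horizontal_ref[0] - modulation[0], horizontal_ref[1] - modulation[1])
    vm = (start[0] - modulation[0], start[1] - modulation[1])
    return vs[0] * vm[0] + vs[1] * vm[1]


Sw = Sh = 12
start = (0, 0)
horizontal_ref = (Sw // 2, 0)

# Modulation dot one pixel above / below the mid-height of the pattern.
value_upper = signal_eigenvalue(start, horizontal_ref, (Sw // 2, Sh // 2 - 1))
value_lower = signal_eigenvalue(start, horizontal_ref, (Sw // 2, Sh // 2 + 1))
```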
[0050]
In the following, it is assumed that FIG. 3A represents signal 0 and FIG. 3B represents signal 1.
[0051]
In this way, the signal eigenvalue can be variously changed according to the relative positional relationship of the start point dot, the horizontal reference dot, and the modulation dot. The reference vector may be a vector connecting the modulation dot and the vertical reference dot. In this case, the coordinate value of the modulation dot may be (Sw / 2-1, Sh / 2) for signal 0 and (Sw / 2 + 1, Sh / 2) for signal 1.
[0052]
Further, in the example of FIG. 3, the signal eigenvalue between the two signals is changed by changing both “the size of the reference vector and the modulation vector” and “the size of the modulation angle” between the signal 0 and the signal 1. However, the signal eigenvalue may be made different by changing only one of “the size of the reference vector and the modulation vector” or “the size of the modulation angle”.
[0053]
In step S103 of FIG. 2, the dot patterns corresponding to the bit string of the encoded data are arranged on the watermark image. In this embodiment, in order to express one bit (symbol), the same dot pattern is arranged in a square of ls × ls patterns, as shown in FIG. 4. This is called a unit pattern. In the example of FIG. 4, ls = 10; the unit pattern composed of dot patterns representing signal 0 represents symbol 0 (FIG. 4A), and the unit pattern composed of dot patterns representing signal 1 represents symbol 1 (FIG. 4B). Hereinafter, the length of one side of the unit pattern on the image is L. When Sw and Sh are equal, L = ls × Sw.
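As a non-limiting illustration, a dot pattern and a unit pattern can be rendered as nested lists of pixel values, using 1 for white and 0 for black as in the document image. The function names are hypothetical, and drawing each dot as a single black pixel is an assumption of this sketch. The modulation dot is placed at (Sw / 2, Sh / 2 - 1) for signal 0 and at (Sw / 2, Sh / 2 + 1) for signal 1, following paragraphs [0048] and [0049].

```python
def dot_pattern(signal, sw=12, sh=12):
    """Return an sh x sw tile (1 = white, 0 = black) for signal 0 or 1."""
    tile = [[1] * sw for _ in range(sh)]
    dy = -1 if signal == 0 else 1
    dots = [
        (0, 0),                  # start point dot
        (sw // 2, 0),            # horizontal reference dot
        (0, sh // 2),            # vertical reference dot
        (sw // 2, sh // 2 + dy), # modulation dot
    ]
    for x, y in dots:
        tile[y][x] = 0
    return tile


def unit_pattern(signal, ls=10, sw=12, sh=12):
    """Tile the same dot pattern ls x ls times; the side length is L = ls * sw."""
    tile = dot_pattern(signal, sw, sh)
    band = [row * ls for row in tile]  # one row of ls tiles side by side
    return [list(row) for _ in range(ls) for row in band]


symbol_0 = unit_pattern(0)
symbol_1 = unit_pattern(1)
```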
[0054]
FIG. 5 is a flowchart showing a method for embedding confidential information in a watermark image. Here, a case where the same information is repeatedly embedded in one watermark image (one page) will be described. Embedding the same information repeatedly makes it possible to extract the information even when part of it is lost, for example because an entire unit pattern is filled in when the watermark image and the document image are superimposed.
[0055]
First, the confidential information 1004 is converted into an N-ary code (a binary code in the present embodiment) (step S201). This is the same as step S101 in FIG. Hereinafter, the encoded information is referred to as a data code, and the data code expressed by a combination of unit patterns is referred to as a data code unit Du.
[0056]
Next, based on the code length of the data code (here, the number of bits) and the number of embedded bits, the number of times the data code unit can be embedded in one image is calculated (step S202). The number of bits of information that can be embedded in a watermark image for one page depends on the size of the dot pattern, the size of the unit pattern, and the size of the document image. At the time of signal detection, the numbers of signals embedded in the horizontal and vertical directions of the document image may be treated as known, or may be calculated back from the size of the image input from the input device and the size of the signal unit.
[0057]
It is assumed that Pw unit patterns in the horizontal direction and Ph unit patterns in the vertical direction can be embedded in a watermark image for one page. This arrangement of Pw × Ph unit patterns is referred to as the "unit pattern matrix". In addition, the number of bits that can be embedded in one page is referred to as the "number of embedded bits". The number of embedded bits is Pw × Ph.
[0058]
In the present embodiment, a bit string representing the code length of a data code (hereinafter referred to as code length data) is inserted into the first row of the unit pattern matrix. The code length of the data code may be fixed and the code length data may not be embedded in the watermark image. Assuming that Pw unit patterns in the horizontal direction and Ph units in the vertical direction can be embedded in a watermark image for one page, the number Dn of embedding data code units is calculated by the following equation with the data code length as Cn.
[0059]
[Expression 1]
Dn = ⌊ Pw × (Ph − 1) / Cn ⌋
[0060]
Here, letting the remainder be Rn (Rn = Pw × (Ph − 1) − Dn × Cn), the unit pattern matrix holds Dn repetitions of the data code unit followed by the unit patterns corresponding to the first Rn bits of the data code. However, the Rn bits of the remainder do not necessarily have to be embedded.
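As a non-limiting illustration, the repetition count and the remainder can be computed with an integer division, on the assumption that the first row of the unit pattern matrix is reserved for the code length data so that Pw × (Ph − 1) cells remain for the data code. The function name embedding_counts is hypothetical.

```python
def embedding_counts(pw, ph, cn):
    """Return (Dn, Rn): full repetitions of a cn-bit data code and leftover bits.

    Assumes the first of the ph rows carries the code length data,
    leaving pw * (ph - 1) unit patterns for the data code.
    """
    if cn <= 0:
        raise ValueError("data code length must be positive")
    capacity = pw * (ph - 1)
    return divmod(capacity, cn)


# A 9-column, 11-row unit pattern matrix and a 12-bit data code.
dn, rn = embedding_counts(9, 11, 12)
```

For a matrix of 9 columns and 11 rows and a 12-bit data code, this gives 7 full repetitions with 6 bits left over.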
[0061]
In the description of FIG. 6, the size of the unit pattern matrix is assumed to be 9 × 11 (11 rows and 9 columns), and the data code length is assumed to be 12 (the numbers 0 to 11 in the figure indicate the code words of the data code).
[0062]
Specifically, the embedding of the code length data in the unit pattern matrix will be described. First, the code length data is embedded in the first row of the unit pattern matrix (step S203). In the example of FIG. 6, the code length is expressed by 9-bit data and is embedded only once; however, if the width Pw of the unit pattern matrix is sufficiently large, the code length data can be embedded repeatedly.
[0063]
Further, the data code unit is repeatedly embedded in the second and subsequent rows of the unit pattern matrix (step S204). As shown in FIG. 6, the data code is embedded sequentially in the row direction, starting from its MSB (most significant bit) or its LSB (least significant bit). FIG. 6 shows an example in which the data code unit is embedded seven times, followed by the first 6 bits of the data code. The data may be embedded so as to be continuous in the row direction as shown in FIG. 6, or so as to be continuous in the column direction.
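As a non-limiting illustration of steps S203 and S204, the unit pattern matrix can be filled row by row as follows. The function name fill_unit_pattern_matrix is hypothetical, and the sketch assumes that the code length data occupies exactly one row of Pw bits and that the data code is laid out continuously in the row direction.

```python
def fill_unit_pattern_matrix(code_length_bits, data_code, pw, ph):
    """Row 0 holds the code length data; rows 1..ph-1 repeat the data code."""
    if len(code_length_bits) != pw:
        raise ValueError("code length data must fill exactly one row")
    if not data_code:
        raise ValueError("data code must not be empty")
    cells = pw * (ph - 1)
    # Repeat the data code cyclically over all remaining cells.
    stream = [data_code[i % len(data_code)] for i in range(cells)]
    matrix = [list(code_length_bits)]
    for r in range(ph - 1):
        matrix.append(stream[r * pw:(r + 1) * pw])
    return matrix


data_code = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]             # 12-bit example code
code_length_bits = [int(c) for c in format(len(data_code), "09b")]
matrix = fill_unit_pattern_matrix(code_length_bits, data_code, pw=9, ph=11)
```

With these sizes the last row ends with the first 6 bits of the data code, after seven complete repetitions.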
[0064]
The watermark image in the watermark image forming unit 1006 has been described above. Next, the watermarked document image composition unit 1007 of the watermark information embedding device 1001 will be described.
[0065]
(Watermarked document image composition unit 1007)
A watermarked document image composition unit 1007 superimposes the document image created by the document image forming unit 1005 and the watermark image created by the watermark image forming unit 1006. FIG. 7 is an explanatory diagram showing an example of a watermarked document image. As shown in FIG. 7, the value of each pixel of the watermarked document image is calculated by a logical product operation (AND) of the corresponding pixel values of the document image and the watermark image. That is, if either the document image or the watermark image is 0 (black), the pixel value of the watermarked document image is 0 (black), otherwise 1 (white).
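As a non-limiting illustration, the pixelwise logical product can be sketched as follows for two binary images of the same size held as nested lists (1 = white, 0 = black). The function name superimpose is hypothetical.

```python
def superimpose(document, watermark):
    """Pixelwise AND of two equally sized binary images (1 = white, 0 = black)."""
    return [
        [d & w for d, w in zip(doc_row, wm_row)]
        for doc_row, wm_row in zip(document, watermark)
    ]


document = [[1, 0], [1, 1]]
watermark = [[1, 1], [0, 1]]
combined = superimpose(document, watermark)
```

A pixel of the result is white only where both inputs are white, so neither the characters nor the dots are lost in the composition.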
[0066]
The watermarked document image is output by the output device 1008.
[0067]
The operation of the watermark information embedding device 1001 has been described above.
Next, the operation of the watermark information detection apparatus 1002 will be described with reference to FIGS. 1 and 8 to 17.
[0068]
(Watermark detection unit 1011)
FIG. 8 is an explanatory diagram showing the flow of processing of the watermark detection unit.
In step S301, a watermarked document image is input to the watermark information detection apparatus 1002, specifically into the memory of a computer, by the input device 1010 such as a scanner. This image is called the input image. The input image is a multi-valued image and is described here as a gray image with 256 gradations. The resolution at which the input image is read by the scanner or the like may differ from that of the watermarked document image created by the watermark information embedding device 1001, but the description here assumes the same resolution. In addition, when the image is captured by a scanner or the like, the coordinate system of the watermark embedding area may be rotated with respect to the coordinate system of the input image because the paper is not accurately aligned with the coordinate axes of the imaging surface of the scanner. In the following description, watermark detection is performed without rotation correction of the watermark area by image processing.
[0069]
In step S302, dot patterns are detected from the input image. Details of the dot pattern detection will be described later with reference to FIGS. The dot patterns are detected over the entire input image. When the input image has width W and height H, for each coordinate (x, y) (x = 0 to W−1, y = 0 to H−1) of the input image I (x, y), U (x, y) = −1 if a dot pattern representing signal 0 exists there, U (x, y) = 1 if a dot pattern representing signal 1 exists there, and U (x, y) = 0 if no dot pattern exists. In step S302, U (x, y) (x = 0 to W−1, y = 0 to H−1) is first initialized to all zeros.
[0070]
In step S302, the signal determination result is stored, and at the same time, the gradient of the reference vector is stored in R (x, y). However, the value of R (x, y) is valid only when U (x, y) is other than 0, that is, only the coordinates where the dot pattern exists are valid.
[0071]
In step S303, a unit pattern is detected based on U (x, y) and R (x, y) (described in FIG. 13). Details of the unit pattern detection will be described later with reference to FIGS.
[0072]
In step S304, information is restored from a symbol string (for example, 01100010100...). Information decoding will be further described later with reference to FIG.
[0073]
The watermark information detection method according to the present embodiment has been outlined above with reference to FIG. Next, each step S302 to S304 will be described in detail.
[0074]
(Dot pattern detection step S302)
FIG. 9 is an explanatory diagram showing details of the dot pattern detection step S302 of FIG. 8. As shown in FIG. 9, the dot pattern detection step S302 includes the following steps S401 to S405.
[0075]
(Starting point candidate search step S401)
In step S401, the starting point candidate D1 is searched by raster scanning the input image as shown in FIG. Any of the following methods may be used as a method for detecting the dots (start point dots, reference dots, modulation dots) embedded by the watermark image forming unit.
If the luminance value at an arbitrary position I (x, y) of the input image is equal to or less than a predetermined threshold value T, it is determined as a dot.
If the output value of an arbitrary edge detection filter at an arbitrary position I (x, y) of the input image is equal to or greater than a predetermined threshold T, it is determined as a dot.
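As a non-limiting illustration, the first of the two methods (luminance threshold) can be sketched as a raster scan over a gray image held as nested lists. The function name find_dot_candidates is hypothetical, and treating every dark pixel as a separate dot is a simplification of this sketch; a printed dot that spans several scanned pixels would need to be merged into one candidate.

```python
def find_dot_candidates(image, threshold):
    """Raster-scan a gray image and return (x, y) of pixels with luminance <= threshold."""
    return [
        (x, y)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value <= threshold
    ]


image = [
    [255, 255, 10],
    [255, 0, 255],
]
candidates = find_dot_candidates(image, threshold=128)
```

The candidates come out in raster order, which is the order in which step S401 visits start point candidates.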
[0076]
(Horizontal reference dot candidate search step S402)
In step S402, a candidate dot for a horizontal reference dot is searched. Here, first, a search area R is set as shown in FIG. The search area R is an area surrounded by two semicircles having a radius of r1 and r2 (r1 <r2), and a search start axis As and a search end axis Ae, centered on the starting point candidate D1. The search area R indicates a range where a horizontal reference dot will exist when the start point candidate D1 is a true start point dot. r1 and r2 depend on the size of the reference vector Vs of the embedded dot pattern, the printing resolution, and the image capturing resolution.
[0077]
The search start axis As and the search end axis Ae are set to prevent dots from being searched for in the direction opposite to the raster scan, which is necessary to preserve the orientation of the dot arrangement. The search for a dot starts from the search start axis As and proceeds clockwise through the search region R.
[0078]
If a new dot is found in the search region R in step S402, this is set as the horizontal reference dot candidate D2. If no new dot is found, the start point candidate D1 is determined not to be a true start point dot, and the process returns to step S401 to search for the next start point candidate again.
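As a non-limiting illustration of step S402, the search region R can be modelled as the half of the ring r1 to r2 that lies strictly to the right of the start point candidate D1, swept clockwise from an upward-pointing search start axis. The placement of the axes As and Ae is an assumption of this sketch, as is the function name find_horizontal_reference; image coordinates are taken with y increasing downward.

```python
import math


def find_horizontal_reference(d1, dots, r1, r2):
    """Return the first dot met when sweeping the right half-ring clockwise, or None."""
    hits = []
    for dot in dots:
        dx, dy = dot[0] - d1[0], dot[1] - d1[1]
        distance = math.hypot(dx, dy)
        if dx > 0 and r1 <= distance <= r2:
            # Angle measured clockwise on screen from the upward axis (0 .. pi).
            angle = math.atan2(dx, -dy)
            hits.append((angle, dot))
    return min(hits)[1] if hits else None


dots = [(0, 0), (6, 0), (0, 6), (-6, 0), (20, 0)]
d2 = find_horizontal_reference((0, 0), dots, r1=5, r2=7)
```

A return value of None corresponds to rejecting D1 and going back to step S401.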
[0079]
(Modulation dot candidate search step S403)
In step S403, a modulation dot candidate is searched for. As shown in FIG. 12, a dot is searched for, in the same manner as in step S402, on the straight line that passes through the horizontal reference dot candidate D2 and is perpendicular to the straight line connecting the start point dot candidate D1 and the horizontal reference dot candidate D2, along the search direction indicated by the dashed arrow in the figure. The search range is the range in which the distance from the horizontal reference dot candidate D2 is (Sh / 2 - 1) ± Δd or (Sh / 2 + 1) ± Δd (Δd is a predetermined value); when a new dot is found, it is taken as the modulation dot candidate D3. If no new dot is found, the start point candidate D1 is determined not to be a true start point dot, and the process returns to step S401 to search for the next start point candidate.
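As a non-limiting illustration of step S403, the search along the perpendicular through D2 can be sketched as follows. The sketch takes the expected distances from D2 to be Sh / 2 − 1 and Sh / 2 + 1, which is what the dot coordinates of FIG. 3 imply; this choice, the function name find_modulation_dot, and the tolerance parameter line_tol are assumptions of the sketch. Image coordinates are taken with y increasing downward.

```python
import math


def find_modulation_dot(d1, d2, dots, sh, delta_d, line_tol=1.0):
    """Search for D3 on the line through d2 perpendicular to d1 -> d2."""
    ux, uy = d2[0] - d1[0], d2[1] - d1[1]
    norm = math.hypot(ux, uy)
    if norm == 0:
        return None
    ux, uy = ux / norm, uy / norm
    px, py = -uy, ux  # perpendicular; points downward when d1 -> d2 points right
    for dot in dots:
        rx, ry = dot[0] - d2[0], dot[1] - d2[1]
        along = rx * px + ry * py   # signed distance from d2 along the perpendicular
        across = rx * ux + ry * uy  # offset away from the perpendicular line
        if abs(across) > line_tol:
            continue
        if any(abs(along - target) <= delta_d for target in (sh / 2 - 1, sh / 2 + 1)):
            return dot
    return None


d3 = find_modulation_dot((0, 0), (6, 0), [(0, 0), (6, 0), (0, 6), (6, 5)], sh=12, delta_d=0.5)
```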
[0080]
(Signal eigenvalue (inner product) calculation step S404)
In step S404, the signal eigenvalue (inner product) is calculated from the reference vector Vs connecting the modulation dot candidate D3 and the horizontal reference dot candidate D2 and the modulation vector Vm connecting the modulation dot candidate D3 and the start point dot candidate D1. If the result is within the range Sv0 ± dv or Sv1 ± dv, the three dots D1, D2, and D3 detected here are determined to be dots constituting a dot pattern. Here, dv is a value determined in advance as the error tolerance for the signal eigenvalue. If the signal eigenvalue is outside both the range Sv0 ± dv and the range Sv1 ± dv, the start point dot candidate D1 is determined not to be a true start point dot, and the process returns to step S401 to search for the next start point dot candidate.
[0081]
(Signal determination and signal result accumulation step S405)
In step S405, it is determined from the signal eigenvalue obtained in step S404 whether the dot pattern is signal 0 or signal 1. If it is signal 0, U (x, y) = −1, and if it is signal 1, U (x, y) = 1. The slope of the reference vector is also recorded in R (x, y).
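As a non-limiting illustration of steps S404 and S405, the tolerance test and the value stored in U (x, y) can be sketched as follows. The function name classify_signal is hypothetical; sv0 and sv1 stand for the nominal eigenvalues of signal 0 and signal 1, and the numbers in the example are arbitrary.

```python
def classify_signal(sv, sv0, sv1, dv):
    """Return the value to store in U: -1 for signal 0, +1 for signal 1, 0 otherwise."""
    if abs(sv - sv0) <= dv:
        return -1
    if abs(sv - sv1) <= dv:
        return 1
    return 0  # eigenvalue outside both ranges: not a dot pattern


u_value = classify_signal(sv=27, sv0=25, sv1=49, dv=5)
```

The sketch presumes that dv is small enough for the two ranges not to overlap; otherwise the first test would take precedence.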
[0082]
The details of the dot pattern detection step S302 have been described above. Next, details of the unit pattern detection step S303 will be described.
[0083]
(Unit pattern detection step S303)
FIG. 13 is an explanatory diagram showing details of the unit pattern detection step S303 of FIG. 8. As shown in FIG. 13, the unit pattern detection step S303 includes the following steps S501 to S504.
[0084]
(Signal Area Inclination Detection Step S501)
In step S501, the average value Ra of the reference vector slopes R (x, y) is calculated over the coordinates at which U (x, y) is other than 0. The average value Ra is taken to be the inclination of the signal area with respect to the input image, that is, for example, the inclination of the document with respect to the imaging surface of a scanner or the like. Hereinafter, it is referred to as the inclination Ra.
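As a non-limiting illustration of step S501, the averaging can be sketched as follows, with U and R held as nested lists of the same size. The function name signal_area_inclination is hypothetical, and returning 0.0 when no dot pattern was detected is an assumption of this sketch.

```python
def signal_area_inclination(U, R):
    """Average R(x, y) over the coordinates where U(x, y) is non-zero."""
    total, count = 0.0, 0
    for u_row, r_row in zip(U, R):
        for u, r in zip(u_row, r_row):
            if u != 0:
                total += r
                count += 1
    return total / count if count else 0.0


U = [[0, 1], [-1, 0]]
R = [[9.0, 0.5], [0.25, 9.0]]  # entries where U is 0 are ignored
ra = signal_area_inclination(U, R)
```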
[0085]
(Unit pattern determination area setting step S502)
In step S502, the size of the unit pattern determination area is set. A unit pattern is a pattern in which ls × ls dot patterns representing the same information are arranged. However, when the signal area is inclined and unit patterns are determined by raster scan, the signals in the peripheral part of a unit pattern become mixed with the signals of adjacent unit patterns (gray portions in FIG. 14). Therefore, the size of the unit pattern determination area is set smaller than L × L.
[0086]
FIG. 15 shows an example of a method for determining the size of the unit pattern determination area.
FIG. 15A shows an input image, and FIG. 15B shows an arbitrary unit pattern extracted from it. FIG. 15B shows a square with side L rotated by the inclination Ra. As shown in the figure, a rectangular determination area that is not rotated in the coordinate system of the input image is set. The determination area is set so that all four of its vertices lie within the rotated unit pattern. In the figure, dm is a margin; when dm = 0, the determination area is the largest rectangle that satisfies this condition. Hereinafter, the length of one side of the determination area is M.
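As a non-limiting illustration, the side M can be computed for a square determination area that shares its center with the unit pattern. For dm = 0 the largest axis-aligned square inside a square of side L rotated by an angle Ra has side L / (|cos Ra| + |sin Ra|). Treating Ra as an angle in radians, centering the area, and shrinking each side by dm at both ends are assumptions of this sketch, as is the function name determination_area_side.

```python
import math


def determination_area_side(unit_side, ra, dm=0.0):
    """Side M of a centered, axis-aligned square inside a unit pattern rotated by ra."""
    inscribed = unit_side / (abs(math.cos(ra)) + abs(math.sin(ra)))
    return max(0.0, inscribed - 2.0 * dm)


m_straight = determination_area_side(120, 0.0)
m_diagonal = determination_area_side(120, math.pi / 4)
```

With no rotation the determination area can be as large as the unit pattern itself, and it is smallest at a rotation of 45 degrees.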
[0087]
(Determination area optimum installation position determination step S503)
In step S503, the determination areas are arranged on a rotated grid, and the optimal arrangement of the determination areas is obtained by shifting the grid axes.
[0088]
FIG. 16 shows an example in which the determination areas are arranged in a grid pattern. A straight line indicated by a broken line is a lattice axis, which is classified into a “horizontal lattice axis” and a “vertical lattice axis” according to the coordinate system of the signal embedding area. The horizontal grid axis is the horizontal axis in the signal area, and the distance between adjacent axes is L (unit pattern size). The vertical grid axis is the vertical axis in the signal region, and the distance between adjacent axes is L. The lattice axis viewed from the coordinate system of the input image is obtained by rotating the horizontal axis and the vertical axis of the input image by the inclination Ra, and the distance between adjacent axes is L × cos (Ra).
[0089]
As shown in the figure, the determination areas are arranged so that each intersection of a horizontal grid axis and a vertical grid axis coincides with the center of a determination area. The top-left intersection of the grid axes is called the "grid axis start point C (0)", and all determination areas that can be set in the input image are denoted C (i), i = 0 to Cn. When the arrangement of grid axis intersections is regarded as a matrix, C (i) is ordered as row 1 column 1, row 1 column 2, ..., row 2 column 1, row 2 column 2, and so on.
[0090]
Next, Cu (i), i = 0 to Cn, is defined as the absolute value of the total of U (x, y) over the region included in C (i), and when the coordinates of the grid axis start point C (0) are (CX, CY), the total of Cu (i), i = 0 to Cn, is expressed as Pu (CX, CY).
[0091]
Cu (i) is largest when each determination region is placed at the center of a unit pattern. This is because, as shown in FIG. 17, when a determination area straddles two or more unit patterns, the U (x, y) values in the corresponding region are a mixture of +1 and −1, so the absolute value of their sum is small. On the other hand, when the determination area fits within one unit pattern, the values of U (x, y) in the corresponding region are predominantly either +1 or −1 even if there are some signal determination errors, so the absolute value of their sum is large.
[0092]
Therefore, Pu (CX, CY) is calculated when CX and CY are moved within a range of ± dL / 2, respectively, and it is determined that CX and CY that maximize this are the optimal arrangement of the determination regions.
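As a non-limiting illustration of step S503, the evaluation value Pu (CX, CY) and the search for its maximum can be sketched as follows. All function and parameter names are hypothetical. The sketch assumes that Ra is an angle in radians, that the grid pitch is passed in directly, that only intersections reached by non-negative steps from the start point are used, and that the start point is shifted in whole pixels within ±search.

```python
import math


def grid_centers(cx, cy, pitch, ra, width, height):
    """Grid axis intersections inside the image, starting at (cx, cy), rotated by ra."""
    cos_a, sin_a = math.cos(ra), math.sin(ra)
    steps = int(max(width, height) / pitch) + 1
    centers = []
    for j in range(steps):
        for i in range(steps):
            x = round(cx + pitch * (i * cos_a - j * sin_a))
            y = round(cy + pitch * (i * sin_a + j * cos_a))
            if 0 <= x < width and 0 <= y < height:
                centers.append((x, y))
    return centers


def evaluation(U, cx, cy, pitch, ra, m):
    """Pu: sum over all determination areas of |sum of U inside the area|."""
    height, width = len(U), len(U[0])
    half = m // 2
    total = 0
    for x0, y0 in grid_centers(cx, cy, pitch, ra, width, height):
        area_sum = sum(
            U[y][x]
            for y in range(max(0, y0 - half), min(height, y0 + half + 1))
            for x in range(max(0, x0 - half), min(width, x0 + half + 1))
        )
        total += abs(area_sum)
    return total


def best_grid_origin(U, cx, cy, pitch, ra, m, search):
    """Try every start point within +/- search pixels and keep the best one."""
    candidates = [
        (cx + dx, cy + dy)
        for dy in range(-search, search + 1)
        for dx in range(-search, search + 1)
    ]
    return max(candidates, key=lambda c: evaluation(U, c[0], c[1], pitch, ra, m))


# Toy 8 x 8 map made of four 4 x 4 unit patterns with alternating signals.
U = [[1 if (x // 4 + y // 4) % 2 == 0 else -1 for x in range(8)] for y in range(8)]
origin = best_grid_origin(U, cx=3, cy=3, pitch=4, ra=0.0, m=3, search=1)
```

In the toy map, a start point whose determination areas straddle two unit patterns scores lower than one whose areas each sit inside a single unit pattern, which is the behaviour the evaluation value is meant to capture.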
[0093]
(Unit pattern determination step S504)
In step S504, for C (i), i = 0 to Cn, in the arrangement set in step S503, Cp (i), i = 0 to Cn, is defined as the total of U (x, y) over the region included in C (i), and Cs (i), i = 0 to Cn, is the unit pattern value determined from Cp (i).
[0094]
If Cp (i) is positive, the value of the corresponding unit pattern is 1, and Cs (i) = 1. If Cp (i) is negative, the value of the corresponding unit pattern is 0, and Cs (i) = 0. However, if the absolute value of Cp (i) is equal to or less than the threshold value Ts, it is determined that the determination region is outside the signal embedding region, and Cs (i) = −1.
[0095]
Among Cs (i), i = 0 to Cn, the negative values are discarded, and the remaining values of Cs (i) are concatenated in order to recover the bit string, thereby extracting the embedded information.
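The decision rule of step S504 and the concatenation of the results can be sketched as follows. One point is an assumption made for this sketch: the text compares Cp (i) with the threshold Ts without saying how that combines with the sign test, so the code applies the threshold to the magnitude of Cp (i). The function names are hypothetical.

```python
def decide_unit_patterns(cp_values, ts):
    """Map each Cp(i) to Cs(i): 1, 0, or -1 (outside the signal embedding region)."""
    cs_values = []
    for cp in cp_values:
        if abs(cp) <= ts:        # assumption: threshold Ts applies to |Cp(i)|
            cs_values.append(-1)
        elif cp > 0:
            cs_values.append(1)
        else:
            cs_values.append(0)
    return cs_values


def recover_bits(cs_values):
    """Drop negative Cs(i) and concatenate the remaining values into a bit string."""
    return "".join(str(cs) for cs in cs_values if cs >= 0)
```

For example, with Ts = 3 the values Cp = 9, −7, 1, −2, 8 would be judged as 1, 0, outside, outside, 1 under this assumption, and only the three valid symbols would be concatenated.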
[0096]
The details of the unit pattern detection step S303 have been described above. Returning to the flowchart of FIG. 8 again, the following step S304 will be described. In step S304, the unit pattern matrix symbols are concatenated to reconstruct the data code, and the original information is restored.
[0097]
(Information decoding step S304)
FIG. 18 is an explanatory diagram showing an example of information restoration. The information restoration step S304 is as follows.
(1) The symbol embedded in each unit pattern is detected ((1) in FIG. 18).
(2) The symbols are concatenated to restore the data code ((2) in FIG. 18).
(3) The data code is decoded and the embedded information is extracted ((3) in FIG. 18).
[0098]
In this embodiment, the case of repeatedly embedding a data code has been described. However, a method that does not repeat a data code unit can be realized by using an error correction code or the like when encoding data.
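One way repeated copies of a data code might be combined is a per-position majority vote, sketched below. This is an illustration only: the section does not specify how the repetitions are merged, and the fixed code length, the end-to-end layout of the copies, the dropping of a trailing partial copy, and the tie rule (ties resolve to 0) are all assumptions, as is the name majority_decode.

```python
def majority_decode(bits, code_len):
    """Combine repeated copies of a data code by voting on each bit position."""
    # Assumption: copies are laid end to end; a trailing partial copy is ignored.
    copies = [bits[i:i + code_len]
              for i in range(0, len(bits) - code_len + 1, code_len)]
    if not copies:
        raise ValueError("bit string is shorter than one data code")
    decoded = []
    for pos in range(code_len):
        ones = sum(copy[pos] == "1" for copy in copies)
        decoded.append("1" if 2 * ones > len(copies) else "0")
    return "".join(decoded)
```

With an error correction code, as the paragraph above suggests, the repetition and the vote would be replaced by the decoder of that code.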
[0099]
As described above, according to this embodiment, information can be embedded in the background of the printed document (paper) 1009 in a way that is not visually obtrusive, by printing dot patterns each composed of a few small dots (start point dot 2101, horizontal reference dot 2105, modulation dot 2104, vertical reference dot 2103). Information is expressed by the difference in the inner product of two virtual vectors that both start at the modulation dot and end at the start point dot and the horizontal reference dot, respectively (that is, by the difference in the angle between the vectors and in their magnitudes). With this method, even if the input image is skewed at signal detection because the paper was rotated, the signal can be restored simply by detecting the angle between the virtual vectors and the difference in their magnitudes, so no rotation correction of the image is needed. The amount of processing at signal detection can therefore be reduced.
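The rotation invariance relied on here follows from the fact that an inner product depends only on the lengths of the two vectors and the angle between them. The sketch below illustrates this with made-up dot coordinates; the coordinates, the two modulation dot positions, and the function names are hypothetical and are not taken from the embodiment.

```python
import math


def inner_product(modulation, start, reference):
    """Inner product of the vectors modulation->start and modulation->reference."""
    ax, ay = start[0] - modulation[0], start[1] - modulation[1]
    bx, by = reference[0] - modulation[0], reference[1] - modulation[1]
    return ax * bx + ay * by


def rotate(point, degrees):
    """Rotate a point about the origin, imitating a skewed scan."""
    r = math.radians(degrees)
    c, s = math.cos(r), math.sin(r)
    return (point[0] * c - point[1] * s, point[0] * s + point[1] * c)


# Hypothetical dot pattern: two modulation dot positions stand for two values.
start_dot = (0.0, 0.0)
reference_dot = (4.0, 0.0)
modulation_a = (4.0, 3.0)
modulation_b = (4.0, 2.0)

value_a = inner_product(modulation_a, start_dot, reference_dot)
value_b = inner_product(modulation_b, start_dot, reference_dot)

# The same pattern after the page has been rotated by 17 degrees.
rotated_a = inner_product(rotate(modulation_a, 17), rotate(start_dot, 17),
                          rotate(reference_dot, 17))
```

In this sketch value_a and value_b differ, so the two modulation dot positions remain distinguishable, while rotated_a agrees with value_a up to floating-point rounding.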
[0100]
The preferred embodiments of the watermark information embedding method, the watermark information detecting method, the watermark information embedding device, and the watermark information detecting device according to the present invention have been described above with reference to the accompanying drawings. However, the present invention is not limited to this example. It will be obvious to those skilled in the art that various changes or modifications can be conceived within the scope of the technical idea described in the claims, and it is understood that these naturally belong to the technical scope of the present invention.
[0101]
[Effects of the invention]
As described above, according to the present invention, information embedded in the background of a printed document (paper) in a way that is not visually obtrusive can be extracted accurately. Further, according to the present invention, the signal can be restored even when the input image is skewed at signal detection because the paper was rotated, and no rotation correction of the image is needed, so the amount of processing at signal detection can be reduced.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram showing a configuration of a watermark information embedding device and a watermark information detection device.
FIG. 2 is a flowchart showing a processing flow of a watermark image forming unit 1005.
FIG. 3 is an explanatory diagram showing an example of a dot pattern.
FIG. 4 is an explanatory diagram of a unit pattern.
FIG. 5 is a flowchart showing a method of embedding confidential information in a watermark image.
FIG. 6 is a flowchart showing a process flow of the watermark detection unit 1011.
FIG. 7 is an explanatory diagram illustrating an example of a watermarked document image.
FIG. 8 is an explanatory diagram showing the processing flow of the watermark detection unit 1011.
FIG. 9 is an explanatory diagram showing a flow of dot pattern detection processing.
FIG. 10 is an explanatory diagram of the start dot candidate selection step S401 in FIG. 9.
FIG. 11 is an explanatory diagram of the horizontal reference dot candidate search step S402 in FIG. 9.
FIG. 12 is an explanatory diagram of the modulation dot candidate search step S403 in FIG. 9.
FIG. 13 is a detailed explanatory diagram of the unit pattern detection step S303 in FIG. 8.
FIG. 14 is an explanatory diagram of the unit pattern determination area setting step S502 of FIG.
FIGS. 15A and 15B are explanatory diagrams of the unit pattern determination area setting step S502 in FIG. 13, where FIG. 15A is an input image and FIG. 15B is an arbitrarily extracted unit pattern.
FIG. 16 is an explanatory diagram of the optimum installation position determination step S503 for the determination region in FIG. 13.
FIG. 17 is an explanatory diagram of the optimum installation position determination step S503 for the determination region in FIG. 13;
FIG. 18 is an explanatory diagram showing an example of information restoration.
[Explanation of symbols]
1001 Watermark information embedding device
1002 Watermark information detection device
1003 Document data
1004 Confidential information
1005 Document image forming unit
1006 Watermark image forming unit
1007 Watermarked document image composition unit
1008 Output device
1009 Printed document
1010 Input device
1011 Watermark detection unit

Claims

A watermark information embedding method in which dots are regularly arranged as the background of a document, dot patterns having a different arrangement rule are inserted among them, and information is carried by the arrangement rule of the dot patterns, wherein:
the dot pattern includes at least first, second, and third dots;
a value is set for the dot pattern according to a characteristic value determined by the relative positional relationship of the first, second, and third dots;
in an information area of a predetermined shape, dot patterns set to the same value are arranged repeatedly, so that one bit of the information is set for the information area as a whole; and
the characteristic value determined by the relative positional relationship of the first, second, and third dots is the inner product of a vector that starts at the third dot and ends at the first dot and a vector that starts at the third dot and ends at the second dot.

The watermark information embedding method according to claim 1, wherein the third dot is located, taking the second dot as the origin, in a direction substantially perpendicular to the vector that starts at the first dot and ends at the second dot.

The watermark information embedding method according to claim 1, wherein the dot pattern further includes a fourth dot located, taking the first dot as the origin, in a direction substantially perpendicular to the vector that starts at the first dot and ends at the second dot.

A watermark information detection method for detecting information from a document printed by the watermark information embedding method according to claim 1, comprising:
converting the document into image data and searching for first, second, and third dot candidate points;
determining, when the characteristic value determined by the relative positional relationship of the first, second, and third dot candidate points is substantially equal to the characteristic value determined by the relative positional relationship of the first, second, and third dots, that the first, second, and third dot candidate points are the first, second, and third dots constituting the dot pattern; and
detecting one bit of the information set in the information area by majority vote of the values set in the dot patterns in the information area,
wherein the characteristic value determined by the relative positional relationship of the first, second, and third dot candidate points is the inner product of a vector that starts at the third dot candidate point and ends at the first dot candidate point and a vector that starts at the third dot candidate point and ends at the second dot candidate point.

The watermark information detecting method according to claim 4, wherein the search for the first, second, and third dot candidate points is performed as follows:
the first dot candidate point is searched for;
when the first dot candidate point is found, a search circle of a predetermined radius is set around the first dot candidate point, and the second dot candidate point is searched for on the search circle; and
when the second dot candidate point is found, the third dot candidate point is searched for, taking the second dot candidate point as the origin, in a direction substantially perpendicular to the vector that starts at the first dot candidate point and ends at the second dot candidate point.

The watermark information detecting method according to claim 4, wherein, for the dot patterns detected over the entire image data, the average inclination of the vectors that start at the first dot and end at the second dot is taken as the inclination of the image data, and the image data is corrected for that inclination.

The watermark information detection method according to any one of claims 4 to 6, wherein:
a coordinate system is set according to the inclination of the image data;
horizontal grid axes and vertical grid axes are set in the form of a grid along the vertical and horizontal axes of the coordinate system, with the interval between adjacent axes being substantially the same as the size of the information area in the coordinate system;
a determination area is set that is centered on an intersection of a horizontal grid axis and a vertical grid axis and is small enough not to cross the boundary between adjacent information areas; and
one bit of the information set in the information area containing the determination area is detected by majority vote of the values of the dot patterns in the determination area.

The watermark information detection method according to claim 7, wherein the horizontal grid axes and the vertical grid axes are set as follows:
for the arrangement of determination areas defined by arbitrarily set horizontal and vertical grid axes, each dot pattern included in a determination area is given the value −1 when it represents the first value and +1 when it represents the second value;
the sum, over all determination areas, of the absolute value of the total of the dot pattern values in each determination area is used as the evaluation value for the horizontal and vertical grid axes so set; and
the positions of the horizontal and vertical grid axes are shifted in small steps, and the position at which the evaluation value is largest is adopted as the horizontal and vertical grid axes.

A watermark image embedding device comprising:
a document image forming unit that creates a document image for each page based on document data;
a watermark image forming unit that, based on confidential information, creates a watermark image in which the arrangement rule of dot patterns serves as the watermark, by the watermark information embedding method according to any one of claims 1 to 3; and
a watermarked image composition unit that creates a watermarked document image by superimposing the document image and the watermark image.

A watermark image detection device comprising a watermark detection unit that detects the information, by the watermark information detection method according to any one of claims 4 to 8, from a watermarked document image created by superimposing a document image and a watermark image in which a plurality of types of dot patterns are embedded.