JP4154252B2 - Image processing apparatus and method

Info

Publication number: JP4154252B2
Application number: JP2003027609A
Authority: JP
Inventors: 北洋金田; 健一太田; 進一加藤; 恵市岩村; 淳一林; 貴巳江口; 淳田丸
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2003-02-04
Filing date: 2003-02-04
Publication date: 2008-09-24
Anticipated expiration: 2023-02-04
Also published as: JP2004241938A

Description

【０００１】
【発明の属する技術分野】
本発明は画像処理装置およびその方法に関し、例えば、ディジタル文書などの画像処理に関する。
【０００２】
【従来の技術】
近年、オフィス文書全般のセキュリティは、ISO 15408を標準とする考え方が世界的に拡がり、このような観点から非常に重要な技術分野となりつつある。このような中、文書情報のセキュリティ管理方法の一つとして、いわゆる電子透かし技術が各種考案され、利用されるようになった。
【０００３】
セキュリティ管理の目的としては、データの不正コピー防止、重要情報の漏洩もしくは改竄の防止、文書情報の著作権保護、あるいは、画像データなどの利用に対する課金、など種々のものが考えられ、それぞれに対して様々な電子透かし方式が提案されている。例えば、ディジタル画像データに対して、人間が知覚できないように透かし情報を埋め込む技術としては、特願平10-278629号公報に開示された、画像データをウェーブレット変換し、周波数空間での冗長性を利用して透かし情報を埋め込むなどの方法が知られている。
【０００４】
また、文書画像のような二値画像は冗長度が少なく、電子透かし技術を実現するのが難しいが、文書画像特有の特徴を利用した電子透かし方式（以下「文書透かし」と呼ぶ）が幾つか知られている。例えば、行のベースラインを動かす方法（特許第3136061号）、単語間の空白長を操作する方法（米国特許第6,086,706号、特開平9-186603号公報）、文字間の空白長を操作する方法（King Mongkut大学「Electronic document data hiding technique using inter-character space」 The 1998 IEEE Asia-Pacific Conf. On Circuits and Systems、1998、419-422頁）、白黒二値のビットマップ画像として扱う方法（特開平11-234502号公報）などが挙げられる。
【０００５】
以上の方式は、画像中に透かし情報が埋め込まれていることをユーザが判別不能である（以降「不可視透かし」と呼ぶ）ことが特徴である。逆に、透かし情報が埋め込まれていることをユーザに明示して透かし情報を埋め込む方式（以降「可視透かし」と呼ぶ）も提案されている。例えば、特願平10-352619号公報には、原画像の画素位置と、埋め込むべき透かし画像の形状との比較により、原画像と埋込み系列との可逆演算を施した結果を、使用者に透かし情報がみえる形で埋め込む方法が開示されている。
【０００６】
【特許文献】
特願平10-278629号公報
特許第3136061号
米国特許第6,086,706号
特開平9-186603号公報
特開平11-234502号公報
特願平10-352619号公報
【０００７】
【非特許文献】
King Mongkut大学「Electronic document data hiding technique using inter-character space」 The 1998 IEEE Asia-Pacific Conf. On Circuits and Systems、1998、419-422頁
【０００８】
【発明が解決しようとする課題】
電子透かし方式は、基本的に、画像データそのものの中に何らかの付加的な情報を埋め込むことを目的とし、埋め込まれた付加情報を利用して不正使用の防止、著作権保護、データの改竄防止など、原画像の保護を図るものである。言い換えれば、原画像そのものをみることを禁止したり、所定の権限を持つユーザのみにコピーを許可したりする、といった目的は想定されていない。
【０００９】
また、原画像の保護は画像全体に適用される。そのため、保護された画像に含まれる保護不要の画像までもみることができない、コピーすることができない、などの問題がある。
【００１０】
本発明は、画像情報のテキスト領域についてユーザが選択した表示／非表示に応じて適応的に処理を制御することを目的とする。
【００１１】
また、テキスト領域について画像を保護することを他の目的とする。
【００１２】
【課題を解決するための手段】
本発明は、前記の目的を達成する一手段として、以下の構成を備える。
【００１３】
本発明は、画像処理装置が行う画像処理方法であって、
前記画像処理装置が有する入力手段が、画像情報を入力し、
前記画像処理装置が有する認識手段が、前記入力された画像情報に含まれる複数のテキスト領域を認識し、
前記画像処理装置が有する生成手段が、前記テキスト領域ごとに、前記テキスト領域に対する少なくとも表示、印刷、複製および送信の何れかの処理を制御するための認証情報を生成し、
前記画像処理装置が有する選択手段が、前記テキスト領域に対する少なくとも表示、印刷、複製および送信の何れかの処理を行った後に前記テキスト領域内のテキストの表示もしくは非表示のいずれかを選択し、
前記画像処理装置が有する埋め込み手段が、前記選択において表示が選択された場合には、前記テキスト領域に対して、前記テキスト領域内のテキストが可視な状態で、前記認証情報を電子透かしとして埋め込み、前記選択において非表示が選択された場合には、前記テキスト領域に対して、前記テキスト領域内のテキストが不可視な状態で、前記認証情報を電子透かしとして埋め込む
ことを特徴とする。
【００１４】
また、画像情報を入力する入力手段と、
前記入力された画像情報に含まれる複数のテキスト領域を認識する認識手段と、
前記テキスト領域ごとに前記テキスト領域に対する少なくとも表示、印刷、複製および送信の何れかの処理を制御するための認証情報を生成する生成手段と、
前記テキスト領域に対する少なくとも表示、印刷、複製および送信の何れかの処理を行った後に前記テキスト領域内のテキストの表示もしくは非表示のいずれかを選択する選択手段と、
前記選択手段において表示が選択された場合には、前記テキスト領域内のテキストが可視な状態で、前記認証情報を前記テキスト領域ごとに電子透かしとして埋め込み、
前記選択手段において非表示が選択された場合には、前記テキスト領域内のテキストが不可視な状態で、前記認証情報を電子透かしとして埋め込む埋込手段と
を有することを特徴とする。
【００１７】
【発明の実施の形態】
以下、図面を参照して本発明の実施形態を詳細に説明する。
【００１８】
【第１実施形態】
［構成］
図1は実施形態の画像処理システムの構成例を示すブロック図である。
【００１９】
この画像処理システムは、オフィス（のような複数の区分）10と20がインターネットのようなWAN 104で接続された環境で実現される。
【００２０】
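LAN 107, built within office 10, connects a multifunction machine (MFP: Multi-Function Processor) 100, a management PC 101 that controls MFP 100, a client PC 102, a document management server 106, a database 105 managed by the document management server, and other devices. Office 20 has almost the same configuration as office 10; LAN 108, built within office 20, connects at least a document management server 106 and a database 105 managed by that server. LAN 107 of office 10 and LAN 108 of office 20 are interconnected via the proxy server 103 connected to LAN 107, WAN 104, and the proxy server 103 connected to LAN 108.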
【００２１】
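MFP 100 reads the image of a paper document and handles part of the image processing applied to the read image. The image signal output from MFP 100 is input to management PC 101 via communication line 109. Management PC 101 is an ordinary personal computer (PC) with memory such as a hard disk for storing images, an image processing unit implemented in hardware or software, a monitor such as a CRT or LCD, and an input unit such as a mouse and keyboard; some of these components are integrated into MFP 100.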
【００２２】
図2はMFP 100の構成例を示すブロック図である。
【００２３】
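The image reading unit 110, which includes an auto document feeder (ADF), illuminates each image of a single document or of a stack of documents with a light source, focuses the light reflected from the document onto a solid-state image sensor through a lens, and obtains from the sensor an image reading signal in raster order (for example, 600 dpi). When a document is copied, the data processing unit 115 converts this image reading signal into a recording signal. When copies are made on multiple sheets of recording paper, one page of the recording signal is first stored in the storage unit 111, and the recording signal is then output repeatedly to the recording unit 112 to form the image on the multiple sheets.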
【００２４】
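Print data output from client PC 102, on the other hand, is input to the network interface (I/F) 114 via LAN 107, converted by the data processing unit 115 into recordable raster data, and then formed as an image on recording paper by the recording unit 112.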
【００２５】
MFP 100に対する操作者の指示は、MFP 100に装備されたキー操作部とマネージメントPC 101のキーボードやマウスからなる入力部113によって行われる。操作入力の表示および画像処理状態の表示などは表示部116によって行われる。
【００２６】
上記のMFP 100の動作は、データ処理部115内の図示しない制御部で制御される。
【００２７】
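The storage unit 111 can also be controlled from management PC 101. Data exchange and control between MFP 100 and management PC 101 take place via the network I/F 117 and the signal line 109 that directly connects the two.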
【００２８】
［処理］
図3から図6は上記の画像処理システムによる処理の概要を説明するフローチャートである。
【００２９】
画像読取部110により原稿を走査して、600dpi、8ビットの画像信号を得る（画像情報入力処理、S1201）。データ処理部115は、この画像信号にトリミング、斜行補正（向きの補正を含む）、ノイズ除去などの前処理を施し(S1202)、二値化処理により二値画像を生成して(S1203)、記憶部111に一頁分の画像データ（多値および二値画像データ）を保存する。
【００３０】
マネージメントPC 101のCPUは、記憶部111に格納された画像データにブロックセレクションを実行して、文字/線画部、階調画像部、および、文字/線画や画像が存在しない下地部を識別する(S1204)。さらに、文字/線画部を例えば段落単位の領域や、それ以外の構造物（罫線を有する表や線画）に分割して、それらをセグメント化（テキスト領域）する。一方、階調画像部および下地部は、矩形領域など、分割可能な単位ごとに独立したオブジェクトにセグメント化（ピクチャ領域）する(S1205)。そして、分離されたテキスト領域の位置情報、ピクチャ領域の位置情報に基づき、記憶部111に格納された画像データから、テキスト領域は二値画像を、ピクチャ領域は多値画像を切り出す(S1206)。以下の説明では、切り出した画像領域を「ブロック」と呼ぶ場合がある。
【００３１】
以下の処理はブロックごとに行う。処理するブロックがテキスト領域の場合は文書透かし検出処理により、ピクチャ領域の場合は下地透かし検出処理により、当該ブロックに透かし情報が埋め込まれているか否かを判別する(S1207)。透かし情報が埋め込まれていると判断される場合は当該領域の表示フラグをOFFに設定し(S1210)、埋め込まれていないと判断される場合は当該領域の表示フラグをONに設定する(S1209)。そして、すべてのブロックに対して同様の処理を行ったか否かを判断し(S1211)、すべてのブロックの表示フラグの設定が終了するまで、ステップS1207からS1210の処理を繰り返す。
【００３２】
続いて、処理対象のブロックを選択し(S1212)、選択ブロックに透かし情報が埋め込まれているか否かを表示フラグによって判断し(S1213)、埋め込まれていない場合は後述する「処理A」へ移行する。一方、透かし情報が埋め込まれている場合はパスワードの入力を促す(S1214)。このパスワードは、後述するように、当該ブロックの表示を制御するほか、印刷、送信など他の制御機能の認証を行うために使用される。
【００３３】
パスワードが入力されると、その正当性を判断し(S1215)、不正なパスワードの場合は後述する「処理B」へ移行する。正しいパスワードの場合は、そのパスワードが表示用か否かを判定し(S1216)、表示用であれば、さらに当該ブロックが下地部か否かを判定し(S1217)、下地部以外（テキスト領域または階調画像部）であれば当該ブロックの表示フラグをONにする(S1221)。
【００３４】
ステップS1217で下地部と判定された場合、すなわち透かし情報が埋め込まれた下地部の場合は、画像が存在しないので、下地に埋め込まれた透かし情報（以下「下地透かし」と呼ぶ）から画像のオリジナルデータの格納場所を示すポインタ情報を抽出し(S1218)、文書管理サーバ106などからオリジナルデータを取得する(S1219)。この際注意すべきことは、オリジナルデータに透かし情報が埋め込まれていない場合、透かし情報の継承が必要になる。透かし情報を継承しないと、以後、当該ブロックの各種制御が不能になってしまう。あるいは、透かし情報を継承するのではなく、新たに透かし情報を入力してもよい。当該ブロックのオリジナルデータに透かし情報を継承（つまり、下地透かしの情報を不可視透かしとして画像に埋め込む）または新たな透かし情報を埋め込んで(S1220)、透かし情報を埋め込んだ表示用の画像を準備した後、当該ブロックの表示フラグをONにする(S1221)。
【００３５】
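If, on the other hand, step S1216 determines that the password is not for display, it is determined whether the block is a text area (S1222); if it is not a text area, processing advances to step S1225. If it is a text area, the binary image data of the block is sent to the document management server 106 or the like for storage (S1223), and the watermark information (including pointer information indicating where the image data is stored, the various passwords, and the various control information) is embedded as a background watermark to mask the block (S1224). Then, in step S1225, the display flag of the block is set to OFF.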
【００３６】
次に、透かし情報から当該ブロックの他の制御情報（印刷、複製、送信の可否など）を抽出し(S1226)、制御情報に従い当該ブロックの他の制御フラグをONまたはOFFにする(S1227)。続いて、全てのブロックの処理が終了したか否かを判定し(S1228)、未了であれば処理をステップS1212へ戻し、終了であれば制御フラグに従って各種制御を行う(S1229)。なお、印刷、複製および送信などの制御情報に対応した印刷フラグ、複製フラグおよび送信フラグなどがあるが、それらがオンであれば当該ブロックの画像データは印刷、複製または送信され、それらがオフであれば当該ブロックの画像データは印刷、複製または送信されない。
【００３７】
次に、ステップS1213で透かし情報がないと判定した場合の「処理A」を説明する。
【００３８】
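First, it is determined whether the block is a text area (S1241); if it is not, the block is not subject to control, so processing advances to step S1228. If it is a text area, the watermark embedding mode is entered, and the user selects whether to embed a document watermark that leaves the text readable (display mode) or to embed a background watermark and mask the block (non-display mode) (S1242). When the display mode is selected, the various passwords are set (S1246), and watermark information containing those passwords is embedded as a document watermark (S1247). When the non-display mode is selected, the various passwords are set (S1243), the binary image data of the block is sent to the document management server 106 or the like for storage (S1244), and a background watermark containing the pointer information, the various passwords, the various control information, and so on is embedded to mask the block (S1245).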
【００３９】
そして、当該ブロックの画像（透かし情報を埋め込み後の画像または下地）を再表示し(S1248)、処理をステップS1228へ進める。
【００４０】
Next, "process B", which is performed when step S1215 determines that the password is invalid, is described.
【００４１】
まず、当該ブロックがテキスト領域か否かを判定し(S1251)、テキスト領域でなければ（元々マスクされた領域なので表示に関する保全は問題なし）、制御に関する保全を行うためすべての制御フラグをOFFにし(S1255)、処理をステップS1228へ進める。また、テキスト領域の場合は非表示にするため、当該ブロックの二値画像データを文書管理サーバ106などへ送って保存させ(S1252)、保存先のポインタ情報、各種パスワード、各種制御情報などを含む下地透かしを埋め込み、当該ブロックをマスクし(S1253)、当該ブロックを再表示し(S1254)、すべての制御フラグをOFFにし(S1255)、処理をステップS1228へ進める。
【００４２】
各種制御の一例として印刷制限と送信制限を説明すると、次のようになる。
・印刷指示があった場合
印刷フラグがオフのブロック: 下地透かしを埋め込んだ下地画像を印刷する
印刷フラグがオンのブロック: 文書透かしを埋め込んだ画像、またはオリジナルデータの画像を印刷する
・送信指示があった場合
送信フラグがオフのブロック: 下地透かしを埋め込んだ下地画像を送信する
送信フラグがオンのブロック: 文書透かしを埋め込んだ画像、またはオリジナルデータを送信する
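The print and transmit restrictions listed above amount to a per-block choice driven by a flag. The following is a minimal Python sketch of that choice, not the patent's implementation; the `Block` class, the `select_output` function, and the returned string labels are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One segmented region of the page (hypothetical structure)."""
    name: str
    print_flag: bool   # ON: the block may be printed
    send_flag: bool    # ON: the block may be transmitted


def select_output(block: Block, operation: str) -> str:
    """Return which version of the block is output for 'print' or 'send'."""
    if operation == "print":
        allowed = block.print_flag
    elif operation == "send":
        allowed = block.send_flag
    else:
        raise ValueError(f"unsupported operation: {operation}")
    # Flag ON: image with the document watermark, or the original data.
    # Flag OFF: background image carrying the background watermark.
    return "watermarked_or_original" if allowed else "masked_background"


blocks = [Block("title", True, True), Block("salary table", False, False)]
print([select_output(b, "print") for b in blocks])
```

Keeping one flag per operation and per block is what lets a single page mix freely printable regions with masked ones.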
【００４３】
このような制御を行うことで、文書のオブジェクトごとにセキュリティを管理（例えば閲覧制限、複製制限、送信制限、印刷制限など）することが自由自在に可能になる。また、文書を印刷した場合に、テキスト領域やピクチャ領域にそれぞれ文書透かしや不可視透かしを埋め込むから、印刷された画像から読み取られたオブジェクトのセキュリティ管理が可能になり、文書のセキュリティを大幅に向上することができる。
【００４４】
以下では、主要な処理について、その詳細を説明する。
【００４５】
［ブロックセレクション］
先ず、ステップS1204およびS1205のブロックセレクションを説明する。
【００４６】
ブロックセレクションは、図7に示す一頁の画像をオブジェクトの集合体と認識して、各オブジェクトの属性を文字(TEXT)、図画(PICTURE)、写真(PHOTO)、線(LINE)、表(TABLE)に判別し、異なる属性を持つ領域（ブロック）に分割する処理である。次に、ブロックセレクションの具体例を説明する。
【００４７】
先ず、処理すべき画像を白黒画像に二値化して、輪郭線追跡によって黒画素で囲まれる画素の塊を抽出する。面積が大きい黒画素の塊については、その内部の白画素について輪郭線追跡を行い白画素の塊を抽出する。さらに、所定面積以上の白画素の塊の内部の黒画素の塊を抽出するというように、黒画素および白画素の塊の抽出を再帰的に繰り返す。
【００４８】
このようにして得られた画素塊を、大きさおよび形状で分類し、異なる属性を持つ領域に分類する。例えば、縦横比が1に近く、大きさが所定範囲の画素塊を文字属性の画素塊とし、さらに、近接する文字属性の画素塊が整列していてグループ化が可能な場合はそれらを文字領域とする。また、縦横比が小さい扁平な画素塊を線領域に、所定以上の大きさで、かつ、矩形に近い形状を有し、整列した白画素塊を内包する黒画素塊が占める範囲を表領域に、不定形の画素塊が散在する領域を写真領域、その他の任意形状の画素塊を図画領域に、のようにそれぞれ分類する。
【００４９】
図8はブロックセレクションの結果を示す図で、図8(a)は抽出された各ブロックのブロック情報を示す。また、図8(b)は入力ファイル情報で、ブロックセレクションによって抽出されたブロックの総数を示す。これらの情報は、透かし情報の埋め込み、抽出の際に利用される。
【００５０】
［文書透かしの埋め込み］
次に、文書透かしの埋め込みを説明する。
【００５１】
図9に示す文書画像3001は、ブロックセレクションによってテキスト領域として分離されたブロックである。さらに、テキスト領域に対して、後述する文書画像解析3002によって文字要素ごとの外接矩形3004を抽出する。文字要素とは、射影を用いて抽出された矩形領域を指し、一つの文字である場合と、文字の構成要素（へん、つくり等）の場合がある。
【００５２】
そして、抽出した外接矩形3004の情報から、外接矩形間の空白長を算出し、後述する埋込規則に基づき各外接矩形を左右にシフトすることで、外接矩形間に1ビットの情報を埋め込み（埋込処理3003）、透かし情報3006を埋め込んだ文書画像3005を生成する。
【００５３】
文書画像解析3002は、本来、文字認識の要素技術であり、文書画像をテキスト領域やグラフ等の図形領域などに分割し、射影を用いて、テキスト領域の文字を文字単位に切り出す技術である。例として、特開平6-68301号公報に記載された技術を挙げることができる。
【００５４】
［文書透かしの抽出］
次に、文書透かしの抽出手法を説明する。
【００５５】
まず、文書透かしの埋め込みと同様に、図10に示す画像3005から、ブロックセレクションおよび文書画像解析3002により、文字の外接矩形3103を抽出し、抽出した外接矩形3103の情報を用いて、外接矩形間の空白長を算出する。また、各行において、1ビットの情報を埋め込むための文字を特定し、後述する埋込規則に基づいて、埋め込まれた透かし情報3105を抽出する（抽出処理3104）。
【００５６】
次に、埋込規則を説明する。
【００５７】
1ビットの情報を埋め込んだ文字の前後の空白長を、図11に示すようにP、Sとする。1ビットの情報を埋め込む文字は、行の両端の文字を除いて、一文字おきになる。空白長から(P-S)/(P+S)を算出し、適当な量子化ステップで量子化し、剰余を計算すると1ビットの情報を復元することができる。式(1)は、この関係を示し、埋め込まれた値V（‘0’または‘1’）を抽出することができる。
V = floor[(P - S)/{α(P + S)}] mod 2 …(1)
ここで、αは量子化ステップ (0 <α< 1)
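Equation (1) can be checked numerically. The sketch below is a direct transcription into Python; the function name `extract_bit` and the value alpha = 0.25 are illustrative assumptions, since the text only requires 0 < α < 1.

```python
import math


def extract_bit(p: float, s: float, alpha: float) -> int:
    """Equation (1): V = floor((P - S) / (alpha * (P + S))) mod 2."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if p + s <= 0:
        raise ValueError("P + S must be positive")
    # Python's % returns a non-negative result, so negative quotients
    # (P < S) also map to 0 or 1.
    return math.floor((p - s) / (alpha * (p + s))) % 2


# Gap lengths in pixels; alpha = 0.25 is an arbitrary illustrative value.
print(extract_bit(12, 8, 0.25))   # (12-8)/(0.25*20) = 0.8 -> floor 0 -> V = 0
print(extract_bit(13, 7, 0.25))   # (13-7)/(0.25*20) = 1.2 -> floor 1 -> V = 1
```

Because the quotient is normalized by P + S, the recovered bit does not change when the page is scaled uniformly, which is presumably why the ratio is used rather than the raw difference P - S.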
【００５８】
透かし情報を埋め込む際は、外接矩形を1ピクセルずつ左右にシフトし、式(1)によって埋め込むべき値（‘0’または‘1’）になるまで、左または右へのシフト量（ピクセル数）を増加する。
【００５９】
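FIG. 12 is a flowchart showing the process of searching for the shift amount. In FIG. 12, the variable i is the candidate shift amount, and the variables Flag1 and Flag2 indicate whether the character to be shifted would touch the adjacent character when shifted by distance i to the right or to the left, respectively; each is set to '1' when contact would occur.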
【００６０】
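First, the variables are initialized (S3402). It is determined whether shifting the target character (or character element) to the right by distance i would bring it into contact with the character (or character element) on its right (S3403), and if so, Flag1 is set to '1' (S3404). Next, it is determined whether shifting the target character to the left by distance i would bring it into contact with the character on its left (S3405), and if so, Flag2 is set to '1' (S3406).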
【００６１】
次に、距離iのシフトが可能か否かを判定し(S3407)、両フラグが‘1’ならば不可能と判定して、シフト量を0にする(S3408)。この場合、シフト対象の文字のシフトによる情報の埋め込みは不可能である。
【００６２】
また、Flag1が‘0’ならば(S3409)、シフト対象の文字を、距離i分、右にシフトした場合に、埋め込もうとする値Vが得られるか否かを式(1)によって判定し(S3410)、値Vが得られる場合はシフト量を+iとする(S3411)。なお、シフト量の符号は、正が右へのシフトを、負が左へのシフトを意味する。
【００６３】
また、Flag1が‘1’または右シフトで値Vが得られず、Flag2が‘0’ならば(S3412)、シフト対象の文字を、距離i分、左にシフトした場合に、埋め込もうとする値Vが得られるか否かを式(1)によって判定し(S3413)、値Vが得られる場合はシフト量を-iとする(S3414)。
【００６４】
右および左シフトの何れでも値Vが得られない場合は、変数iをインクリメントし(S3415)、処理をステップS3403へ戻す。
【００６５】
このようにして探索されたシフト量に従い、文字をシフトして1ビットの情報を埋め込む。以上の処理を、各文字に対して行うことで、透かし情報を文書画像に埋め込む。
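The shift search of FIG. 12 can be sketched as a short loop. This is a hedged illustration rather than the patent's code, and it makes several assumptions that the text does not state: the search starts at i = 0, a character "touches" its neighbour when the remaining gap is zero or less, and the impossible case returns `None` (the text sets the shift amount to 0 there) so that it can be told apart from a successful zero shift. The names `find_shift` and `bit_from_gaps` are hypothetical.

```python
import math


def bit_from_gaps(p: float, s: float, alpha: float) -> int:
    """Equation (1) applied to the gap before (P) and after (S) the character."""
    return math.floor((p - s) / (alpha * (p + s))) % 2


def find_shift(p: int, s: int, alpha: float, target: int):
    """Return the signed shift (+ right, - left) that embeds `target`, or None."""
    i = 0                                   # assumed initial value (S3402)
    while True:
        flag1 = s - i <= 0                  # right shift would touch (S3403/S3404)
        flag2 = p - i <= 0                  # left shift would touch (S3405/S3406)
        if flag1 and flag2:                 # no shift is possible (S3407/S3408)
            return None
        # Shifting right by i widens the gap before and narrows the gap after.
        if not flag1 and bit_from_gaps(p + i, s - i, alpha) == target:
            return i                        # S3410/S3411
        # Shifting left by i does the opposite.
        if not flag2 and bit_from_gaps(p - i, s + i, alpha) == target:
            return -i                       # S3413/S3414
        i += 1                              # S3415


print(find_shift(10, 10, 0.25, 0))   # already encodes 0, so no shift is needed
print(find_shift(10, 10, 0.25, 1))   # one pixel to the left gives gaps (9, 11)
```

The loop always terminates because both flags eventually become true once i reaches the larger of the two gaps.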
【００６６】
［電子透かしの埋込処理部］
以下で説明する電子透かし（ディジタルウォータマーク）は「不可視の電子透かし」とも呼ばれ、人間の視覚では殆ど認識できない程度の、オリジナル画像データの変化そのもののことである。そして、その変化の一つまたは変化の組み合わせが何らかの付加情報を表す。
【００６７】
図13は透かし情報を埋め込む埋込処理部（機能部）の構成を示すブロック図である。
【００６８】
埋込処理部は、画像入力部4001、埋込情報入力部4002、鍵情報入力部4003、電子透かし生成部4004、電子透かし埋込部4005および画像出力部4006から構成される。なお、電子透かしの埋込処理は、上記のような構成を有するソフトウェアによって実現されてもよい。
【００６９】
画像入力部4001は、透かし情報を埋め込む画像の画像データIを入力する。以降の説明では、説明を簡単にするため、画像データIがモノクロ多値画像を表すとする。勿論、カラー画像データ等の複数の色成分からなる画像データに透かし情報を埋め込むならば、その複数の色成分である例えばRGB成分、あるいは、輝度、色差成分の夫々をモノクロ多値画像と同様に扱い、各成分に透かし情報を埋め込むことができる。その場合、モノクロ多値画像に比べて、約三倍の情報量の透かし情報を埋め込むことが可能になる。
【００７０】
埋込情報入力部4002は、画像データIに埋め込む透かし情報をバイナリデータ列として入力する。このバイナリデータ列を付加情報Infとするが、付加情報Infは‘0’または‘1’の何れかを表すビットの組み合わせによって構成される。付加情報Infは、画像データIに該当する領域を制御するための認証情報やオリジナルデータへのポインタ情報などを表す。以降では、nビットで表現される付加情報Infを埋め込む例を説明する。
【００７１】
なお、付加情報Infが容易に悪用されないように、付加情報Infは暗号化されていてもよいし、画像データIから付加情報Infが抽出できないように変更（以下「攻撃」と呼ぶ）された場合でも正しく付加情報Infが抽出されるように、誤り訂正符号化が施されていてもよい。なお、故意ではない攻撃もあり得る。例えば、一般的な画像処理の非可逆圧縮、輝度補正、幾何変換、フィルタリングなどの結果、透かし情報が除去される場合である。暗号化および誤り訂正符号化などの処理は公知であるから、その詳細説明は省略する。
【００７２】
鍵情報入力部4003は、付加情報Infの埋め込みおよび抽出に必要な鍵情報kを入力する。鍵情報kはLビットで表され、L=8であれば"01010101"（十進表記では"85"）などである。鍵情報kは、後述する擬似乱数発生部4102が実行する擬似乱数発生処理の初期値として与えられる。埋込処理部および後述する抽出処理部が共通の鍵情報kを使用する場合に限り、埋め込まれた付加情報Infが正しく抽出される。言い換えれば、鍵情報kを所有する利用者だけが付加情報Infを正しく抽出することができる。
【００７３】
電子透かし生成部4004は、埋込情報入力部4002から付加情報Infを、鍵情報入力部4003から鍵情報kを入力し、付加情報Infと鍵情報kに基づいて電子透かしwを生成する。図14は電子透かし生成部4004の詳細を示すブロック図である。
【００７４】
基本行列生成部4101は、基本行列mを生成する。基本行列mは、付加情報Infを構成する各ビットの位置と、各ビットが埋め込まれる画像データIの画素位置を対応付けるために用いられる。基本行列生成部4101は、複数の基本行列を選択的に利用することが可能で、どの基本行列を用いるかは目的/状況に応じて変更する必要があり、基本行列の切り替えにより最適な透かし情報（付加情報Inf）の埋め込みが可能になる。
【００７５】
図15は基本行列mの例を示す図である。行列4201は、16ビットの付加情報Infを埋め込む場合に用いられる基本行列mの一例で、4×4の各要素に1から16の数字が割り当てられている。基本行列mの要素の値と、付加情報Infのビット位置とが対応付けられる。つまり、基本行列mの要素の値が「1」の位置に付加情報Infのビット位置が「1」（最上位ビット）を対応させ、同様に、要素の値が「2」の位置に付加情報Infのビット位置が「2」（最上位ビットの次のビット）を対応させる。
【００７６】
行列4202は、8ビットの付加情報Infを埋め込む場合に用いられる基本行列mの一例である。行列4202によれば、行列4201の要素のうち「1」から「8」までの値を持つ要素に付加情報Infの8ビットが対応付けられ、値を持たない要素に付加情報Infが対応することはない。行列4202に示すように、付加情報Infの各ビットに対応する位置を散らすことで、行列4201を用いる場合よりも、付加情報Infの埋め込みによる画像の変化（画質劣化）を認識し難くすることができる。
【００７７】
行列4203は、行列4202と同様、8ビットの付加情報Infを埋め込む場合に用いられる基本行列mの一例である。行列4202によれば1ビットの情報が一画素に埋め込まれるが、行列4203によれば1ビットの情報は二画素に埋め込まれる。言い換えれば、行列4202が全画素の50%に当たる画素を付加情報Infの埋め込みに用いているのに対して、行列4203は全画素（100%）を付加情報Infの埋め込みに用いる。従って、行列4203を使用すれば、付加情報Infを埋め込む回数が増え、行列4201や4202よりも、付加情報Infを確実に抽出できる（攻撃耐性がある）ことになる。なお、透かし情報の埋め込みに使用する画素の割合を、以降、「充填率」と呼ぶことにする。因みに、行列4201の充填率は100%、行列4202の充填率は50%、行列4203の充填率は100%である。
【００７８】
行列4204は、充填率は100%であるが、4ビットの付加情報Infしか埋め込まない。従って、1ビットの情報は四画素を用いて埋め込まれ、付加情報Infを埋め込む回数がさらに増えて、攻撃耐性がさらに向上するが、その反面、他の行列よりも埋め込み可能な情報量が小さくなる。
【００７９】
このように、基本行列mをどのような構成にするかによって、充填率、1ビットの埋め込みに使用する画素数、埋め込み可能な情報量を選択的に設定することができる。充填率は、主に、透かし情報を埋め込んだ画像の画質に影響し、1ビットの埋め込みに使用する画素数は、主に、攻撃耐性に影響する。従って、充填率を大きくすると画質の劣化が大きくなり、1ビットの埋め込みに使用する画素数を大きくすると攻撃耐性が強くなり、埋め込み可能な情報量が小さくなる。このように、画質、攻撃耐性および情報量はトレードオフの関係にある。
【００８０】
本実施形態においては、複数種類の基本行列mを適応的に選択することで、攻撃耐性、画質、情報量を制御および設定することが可能である。
【００８１】
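The pseudo-random number generation unit 4102 generates a pseudo-random number sequence r from the input key information k. The sequence r is a sequence of real numbers following a uniform distribution over the range [-1, 1], and the key information k is used as the initial value (seed) for generating r. That is, the sequence r(k1) generated with key information k1 differs from the sequence r(k2) generated with key information k2 (≠ k1). Methods of generating a pseudo-random number sequence are well known, so a detailed description is omitted.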
【００８２】
擬似乱数割当部4103は、透かし情報Inf、基本行列mおよび擬似乱数列rを入力して、基本行列mに基づき、透かし情報Infの各ビットを擬似乱数列rの各要素に割り当て電子透かしwを生成する。具体的には、行列4204の各要素をラスタ順にスキャンして、値「1」を持つ要素に最上位ビットを、値「2」をもつ要素に次のビットを、のようにして、付加情報Infの各ビットを基本行列mの各要素に対応させ、付加情報Infのビットが‘1’のときは対応する擬似乱数列rの要素をそのまま、‘0’のときは対応する擬似乱数列rの要素に-1を掛ける。以上の処理を、付加情報Infのnビット分実行すると、図16に一例を示す電子透かしwが得られる。なお、図16に示す電子透かしwは、基本行列mを図15に示す行列4204、擬似乱数列にr={0.7, -0.6, -0.9, 0.8}の実数列、付加情報Inf（4ビット）が“1001”の例である。
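The assignment performed by the pseudo-random number assignment unit 4103 can be sketched as follows. The actual layout of matrix 4204 appears only in FIG. 15, so the 4×4 matrix `M` below is a hypothetical stand-in in which each of the values 1 to 4 occurs four times; the rule that the j-th occurrence of a value (in raster order) receives the j-th element of r is likewise an assumption about how the text's description is to be read. The function name `make_watermark` is illustrative.

```python
# Hypothetical 4x4 layout standing in for matrix 4204 (the real one is in FIG. 15).
M = [[1, 2, 3, 4],
     [3, 4, 1, 2],
     [2, 1, 4, 3],
     [4, 3, 2, 1]]
R = [0.7, -0.6, -0.9, 0.8]   # pseudo-random sequence r from the text's example


def make_watermark(matrix, bits, r):
    """Build watermark w: keep r's element for a '1' bit, negate it for a '0' bit."""
    seen = {}                                # occurrences counted per matrix value
    w = []
    for row in matrix:
        line = []
        for value in row:
            j = seen.get(value, 0)           # j-th occurrence in raster order
            seen[value] = j + 1
            sign = 1 if bits[value - 1] == "1" else -1   # value 1 = most significant bit
            line.append(sign * r[j])
        w.append(line)
    return w


for row in make_watermark(M, "1001", R):
    print(row)
```

With Inf = "1001", the elements tied to bits 1 and 4 keep the sign of r, and those tied to bits 2 and 3 are negated.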
【００８３】
なお、上記では16ビット、8ビットおよび4ビットの付加情報Infを埋め込むために4×4の基本行列mを用いる例を説明したが、これに限らず、1ビットの情報を埋め込むために更に多くの画素を利用し、より大きなサイズの基本行列mを用いることができる。より大きなサイズの基本行列mを用いれば、擬似乱数列rもより長い実数列を用いることになる。実際には、説明に用いたような四要素から構成される乱数列では、後述する抽出処理が正しく機能しない可能性がある。つまり、付加情報Infが埋め込まれているにも関わらず、集積画像cと電子透かしw1、w2、…、wnとの相関係数が小さくなる可能性がある。そこで、例えば64ビットの付加情報Infを埋め込むために、充填率50%においては、256×256の基本行列mを用いるような構成にする。この場合、1ビットの埋め込みに512画素が使用される。
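The figure of 512 pixels per bit follows from the matrix size, the fill rate, and the number of bits. A small check, with the hypothetical helper name `pixels_per_bit`:

```python
def pixels_per_bit(side: int, fill_rate: float, n_bits: int) -> float:
    """Pixels of a side x side basic matrix used to embed each bit."""
    return side * side * fill_rate / n_bits


print(pixels_per_bit(256, 0.5, 64))   # 512.0, the 256x256 example in the text
print(pixels_per_bit(4, 0.5, 8))      # 1.0, matrix 4202
print(pixels_per_bit(4, 1.0, 8))      # 2.0, matrix 4203
print(pixels_per_bit(4, 1.0, 4))      # 4.0, matrix 4204
```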
【００８４】
電子透かし埋込部4005は、画像データIおよび電子透かしwを入力し、電子透かしwを埋め込んだ画像データI'を出力する。電子透かし埋込部4005は、式(2)に従い、電子透かしの埋め込み処理を実行する。
I'_i,j = I_i,j + aw_i,j …(2)
ここで、I'_i,jは電子透かしが埋め込まれた画像データ
I_i,jは電子透かしを埋め込む前の画像データ
w_i,jは電子透かし
iおよびjは画像または電子透かしのx,y座標値
aは電子透かしの強度を設定するパラメータ
【００８５】
aとしては例えば「10」程度の値が選択可能である。aを大きくすると攻撃耐性が大きい電子透かしを埋め込むことが可能であるが、画質劣化が大きくなる。一方で、aを小さくすれば攻撃耐性は小さくなるが、画質劣化も抑えることができる。基本行列mの構成と同様に、aの値を適当に設定することで、攻撃耐性と画質とのバランスを調整することが可能である。
【００８６】
図17は式(2)に示す電子透かしの埋め込み処理を具体的に示す図である。符号4401が電子透かしが埋め込まれた画像データI'に、符号4402が電子透かしを埋め込む前の画像データIに、符号4403が電子透かしwにそれぞれ対応する。図17に示すように、式(2)の演算は行列内の各要素に対して実行される。
【００８７】
式(2)および図17に示す処理が画像データIの全体に繰り返し実行される。画像データIが図18に示す24×24画素から構成される場合、画像データIは、4×4画素からなる互いに重複しないブロック（マクロブロック）に分割され、各マクロブロックに対して式(2)の処理が実行される。
【００８８】
全てのマクロブロックに対して、繰り返し、電子透かしの埋め込み処理を実行することにより、結果的に、画像全体に透かし情報を埋め込むことが可能である。一つのマクロブロックにはnビットから構成される付加情報Infが埋め込まれているから、少なくともマクロブロックが一つあれば埋め込まれた付加情報Infを抽出することができる。言い換えれば、付加情報Infの抽出は、画像全体を必要とせず、画像データIの一部（少なくとも一つのマクロブロック）があれば充分である。画像データIの一部から付加情報Infを完全に抽出可能なことを「切取耐性がある」と呼ぶ。
【００８９】
こうして生成された付加情報Infが電子透かしとして埋め込まれた画像データI'は、画像出力部4006を通じて、埋込処理部の最終的な出力となる。
【００９０】
［電子透かしの抽出処理部］
図19は、画像に埋め込まれた透かし情報を抽出する抽出処理部（機能部）の構成を示すブロック図である。
【００９１】
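The extraction processing unit consists of an image input unit 4601, a key information input unit 4602, an extraction pattern generation unit 4603, a digital watermark extraction unit 4604, and a digital watermark output unit 4605. The watermark extraction process may also be implemented by software having the above configuration.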
【００９２】
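The image input unit 4601 receives image data I" in which watermark information may be embedded. The image data I" input to the image input unit 4601 may be image data I' in which watermark information has been embedded by the embedding processing unit described above, image data I' that has been subjected to an attack, or image data I in which no watermark information is embedded.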
【００９３】
鍵情報入力部4602は、透かし情報を抽出するための鍵情報kを入力する。ここで入力される鍵情報kは、前述した埋込処理部の鍵情報入力部4003に入力されたものと同一でなければならない。異なる鍵情報が入力された場合は付加情報を正しく抽出することはできない。言い換えれば、正しい鍵情報kを有する利用者だけが正しい付加情報Inf'を抽出することが可能である。
【００９４】
抽出パターン生成部4603は、鍵情報kを入力し、鍵情報kに基づいて抽出パターンを生成する。図20は抽出パターン生成部4603の処理の詳細を示す図である。抽出パターン生成部4603は、基本行列生成部4701、擬似乱数発生部4702および擬似乱数割当部4703から構成される。基本行列生成部4701は前述した基本行列生成部4101と、擬似乱数発生部4702は前述した擬似乱数発生部4102と同じ動作を行うので、それらの詳細説明は省略する。ただし、同一の鍵情報kに対して、基本行列生成部4701が生成する基本行列mと、基本行列生成部4101が生成する基本行列mとが同一でなければ、付加情報を正しく抽出することはできない。
【００９５】
擬似乱数割当部4703は、基本行列mと擬似乱数列rを入力して、擬似乱数列rの各要素を基本行列mの所定要素に割り当てる。前述した埋込処理部の擬似乱数割当部4103との違いは、擬似乱数割当部4103が出力する電子透かしwは一つであるのに対し、擬似乱数割当部4703からは付加情報Infのビット数（ここではnビット）分の抽出パターンwnを出力することである。
【００９６】
擬似乱数列rの各要素を基本行列mの所定要素に割り当てる詳細を、図15に示す行列4204を用いた例を示して説明する。行列4204を用いる場合、4ビットの付加情報Infの埋め込みが可能であるから、四つの抽出パターンw1、w2、w3、w4が出力される。具体的には、行列4204の各要素をラスタ順にスキャンして、値「1」を持つ要素に擬似乱数列rの各要素を割り当て、値「1」を持つ全要素に擬似乱数列rの各要素の割り当てが終了すると、擬似乱数列rを割り当てた行列を抽出パターンw1として生成する。図21は抽出パターンの例を示す図で、擬似乱数列rとしてr={0.7, -0.6, -0.9, 0.8}という実数列を用いた場合である。以上の処理を、行列4204の値「2」「3」「4」を持つ要素に実行し、それぞれ抽出パターンw2、w3、w4を生成する。こうして生成された抽出パターンw1、w2、w3およびw4を重ね合わせると、埋込処理部で作成された電子透かしwに等しくなる。
【００９７】
電子透かし抽出部4604は、画像データI"および抽出パターンw1、w2、…、wnを入力して、画像データI"から付加情報Inf'を抽出する。ここで抽出される付加情報Inf'は、埋め込まれた付加情報Infに等しいことが望まれるが、画像データI'が様々な攻撃を受けている場合、必ずしも一致しない。
【００９８】
電子透かし抽出部4604は、画像データI"から生成された集積画像cと抽出パターンw1、w2、…、wnとの相関をそれぞれ計算する。集積画像cとは、画像データI"をマクロブロックに分割し、各マクロブロックの要素の値の平均値を算出した画像である。図22は4×4画素の抽出パターンと、24×24画素の画像データI"が入力された場合の集積画像cを説明する図である。図22に示す画像データI"は36個のマクロブロックに分割され、これら36個のマクロブロックの各要素の値の平均値を求めたものが集積画像cである。
【００９９】
こうして生成された集積画像cと、抽出パターンw1、w2、…、wnとの相関がそれぞれ計算される。相関係数は、集積画像cと抽出パターンwnの類似度を測定する統計量で、式(3)で表される。
ρ= c'^T・w'n/|c'^T||w'n| …(3)
ここで、c'およびw'nは各要素と要素の平均値との差を要素とする行列
c'^Tはc'の転置行列
【０１００】
相関係数ρは-1から+1の値をとる。集積画像cと抽出パターンwnとの正の相関が強い場合、ρは+1に近付き、負の相関が強い場合、ρは-1に近付く。「正の相関が強い」は「集積画像cが大きいほど抽出パターンwnが大きくなる」関係であり、「負の相関が強い」は「集積画像cが大きいほど抽出パターンwnが小さくなる」関係である。また、集積画像cと抽出パターンwnとの相関がない場合、ρは0になる。
【０１０１】
こうして算出した相関によって、画像データI"に付加情報Inf'が埋め込まれているか否か、さらに、埋め込まれている場合は付加情報Inf'を構成する各ビットが‘1’か‘0’かを判定する。つまり、集積画像cと抽出パターンw1、w2、…、wnとの相関係数を算出し、算出された相関係数が0に近い場合は「付加情報は埋め込まれていない」、相関係数が0から離れた正数の場合は‘1’が、相関係数が0から離れた負数の場合は‘0’が埋め込まれていると判断する。
【０１０２】
相関を求めることは、集積画像cと、抽出パターンw1、w2、…、wnそれぞれとの類似度を評価することに等しい。つまり、前述した埋込処理部によって、画像データI"（集積画像c）に抽出パターンw1、w2、…、wnに相当するパターンが埋め込まれている場合、高い類似度を示す相関値が算出される。
【０１０３】
図23は4ビットの付加情報が埋め込まれた画像データI"（集積画像c）からw1、w2、w3、w4を用いて電子透かしを抽出する例を示す図である。
【０１０４】
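The correlation values between the integrated image c and the four extraction patterns w1, w2, w3, and w4 are calculated. When additional information Inf' is embedded in the image data I" (integrated image c), the correlation values come out as, for example, 0.9, -0.8, -0.85, and 0.7. From this result the additional information Inf' is determined to be "1001", so the 4-bit additional information Inf' can finally be extracted.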
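The embedding of equation (2) and the correlation-based extraction of equation (3) can be exercised together in a small round trip. The sketch below is illustrative only: the matrix `M` is a hypothetical stand-in for matrix 4204 (whose layout is given only in FIG. 15), the threshold of 0.1 for "close to zero" is an arbitrary assumption, and all function names are invented for this example. It uses plain Python lists rather than an array library.

```python
import math

# Hypothetical 4x4 layout standing in for matrix 4204 (the real one is in FIG. 15).
M = [[1, 2, 3, 4],
     [3, 4, 1, 2],
     [2, 1, 4, 3],
     [4, 3, 2, 1]]
R = [0.7, -0.6, -0.9, 0.8]   # pseudo-random sequence from the text's example


def extraction_pattern(k):
    """Pattern w_k: elements of R placed, in raster order, where M equals k."""
    j, out = 0, []
    for row in M:
        line = []
        for v in row:
            if v == k:
                line.append(R[j])
                j += 1
            else:
                line.append(0.0)
        out.append(line)
    return out


def embed(image, bits, a):
    """Equation (2) applied to every 4x4 macroblock: I' = I + a * w."""
    pats = [extraction_pattern(k + 1) for k in range(len(bits))]
    w = [[sum((1 if b == "1" else -1) * p[y][x] for b, p in zip(bits, pats))
          for x in range(4)] for y in range(4)]
    return [[px + a * w[y % 4][x % 4] for x, px in enumerate(row)]
            for y, row in enumerate(image)]


def integrated_image(image):
    """Average all 4x4 macroblocks element by element (sides: multiples of 4)."""
    rows, cols = len(image) // 4, len(image[0]) // 4
    return [[sum(image[4 * by + y][4 * bx + x]
                 for by in range(rows) for bx in range(cols)) / (rows * cols)
             for x in range(4)] for y in range(4)]


def correlation(c, w):
    """Equation (3): correlation of the mean-removed c and w."""
    cf = [v for row in c for v in row]
    wf = [v for row in w for v in row]
    cm, wm = sum(cf) / len(cf), sum(wf) / len(wf)
    cd = [v - cm for v in cf]
    wd = [v - wm for v in wf]
    norm = math.sqrt(sum(v * v for v in cd)) * math.sqrt(sum(v * v for v in wd))
    return 0.0 if norm == 0 else sum(u * v for u, v in zip(cd, wd)) / norm


def extract(image, n_bits, threshold=0.1):
    """Return the bit string, or None when every correlation is near zero."""
    c = integrated_image(image)
    rhos = [correlation(c, extraction_pattern(k + 1)) for k in range(n_bits)]
    if all(abs(rho) < threshold for rho in rhos):
        return None
    return "".join("1" if rho > 0 else "0" for rho in rhos)


flat = [[128.0] * 24 for _ in range(24)]          # 24x24 image, 36 macroblocks
print(extract(embed(flat, "1001", a=10), 4))      # 1001
print(extract(flat, 4))                           # None
```

On a real image the picture content acts as noise in the correlation, which is why the text recommends far larger basic matrices than this 4×4 toy case.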
【０１０５】
こうして抽出されたnビットの付加情報Inf'は、電子透かし出力部4605を通じて、抽出処理部の抽出結果として出力される。その際、埋込処理部において、付加情報Infを埋め込む際に誤り訂正符号化処理や暗号化処理が施されている場合は、誤り訂正復号処理や暗号復号処理が実行される。得られた情報は、最終的にバイナリデータ列（付加情報Inf'）として出力される。
【０１０６】
［変形例］
上記では、透かしとして、文書透かしと下地透かしとを使い分ける例を説明したが、これに限ることはなく、各オブジェクトに最適な透かし方式を使い分けてもよい。
【０１０７】
また、認証制御をパスワードを用いて実現する例を説明したが、これに限ることはなく、鍵制御で実現してもよい。
【０１０８】
【第２実施形態】
以下、本発明にかかる第2実施形態の画像処理装置を説明する。なお、第2実施形態において、第1実施形態と略同様の構成については、同一符号を付して、その詳細説明を省略する。
【０１０９】
［構成］
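FIG. 24 is an external view showing a configuration example of a digital copying machine, which consists of a reader unit 51 that digitally reads a document image and applies predetermined image processing to generate digital image data, and a printer unit 52 that produces a copy image from the generated digital image data.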
【０１１０】
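The document feeder 5101 of the reader unit 51 feeds documents one sheet at a time, starting from the last page, onto the platen glass 5102, and ejects the document from the platen glass 5102 after its image has been read. When a document is conveyed onto the platen glass 5102, the lamp 5103 is turned on, the scanner unit 5104 starts moving, and the document is exposed and scanned. The light reflected from the document is focused onto the CCD image sensor (hereinafter "CCD") 5109 by the mirrors 5105, 5106, and 5107 and the lens 5108. The scanned document image is thus read by the CCD 5109, and the image signal output from the CCD 5109 undergoes image processing such as shading correction and sharpness correction in the image processing unit 5110 before being transferred to the printer unit 52.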
【０１１１】
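The laser driver 5221 of the printer unit 52 drives the laser emitting unit 5201 according to the image data input from the reader unit 51. The laser light output from the laser emitting unit 5201 is scanned across the photosensitive drum 5202 by a polygon mirror, forming a latent image on the drum. The developing unit 5203 applies developer (toner) to the latent image formed on the photosensitive drum 5202.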
【０１１２】
カセット5204またはカセット5205から供給される記録紙は、レーザ光の照射開始に同期して転写部5206へ搬送され、感光ドラム5202に付着した現像剤が転写される。現像剤が転写された記録紙は、定着部5207に搬送され、定着部5207の熱と圧力により現像剤が記録紙に定着される。定着部5207を通過した記録紙は、排出ローラ5208によって排出される。ソータ5220は、排出される記録紙をそれぞれのビンに収納して記録紙を仕分ける。なお、ソータ5220は、仕分けが設定されていない場合は最上部のビンに記録紙を収納する。
【０１１３】
両面記録が設定されている場合、記録紙は排出ローラ5208のところまで搬送された後、逆回転される排出ローラ5208およびフラッパ5209によって、再給紙搬送路へ導かれる。また、多重記録が設定されている場合、記録紙は排出ローラ5208まで搬送されず、フラッパ5209によって再給紙搬送路へ導かれる。再給紙搬送路へ導かれた記録紙は、上述したタイミングで転写部5206へ給紙される。
【０１１４】
［処理］
図25はリーダ部51の画像処理部5110によって実行される、エリア指定された画像を隠蔽する処理を示すフローチャートである。
【０１１５】
画像処理部5110は、原稿から読み取られた画像信号が入力されると、微細な画素単位の輝度情報を通常8ビット程度の精度で量子化したディジタル画像データを生成する(S101)。画素の空間分解能は42μm×42μm程度で、これは1インチ（25.4mm）当り約600画素（600dpi）の解像度である。画像処理部5110は、生成した画像データが表す画像を図26に示す操作部の画面に表示する。
【０１１６】
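The operation unit usually consists of a liquid crystal display or similar device whose surface is covered with a touch panel, and the desired operation is performed by pressing buttons shown on the screen. In FIG. 26, button 601 selects the device mode: the "Copy" mode copies the read document image (outputs it from the printer unit 52), the "Send" mode transmits the image data of the read document as an electronic file to a remote location over a network, and the "Store" mode stores the image data of the read document as an electronic file in an auxiliary storage device such as a hard disk built into the apparatus. Here the "Copy" mode is assumed to be selected, and its button frame is drawn with a thick line.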
【０１１７】
表示部602は、選択されたモードに応じた装置の基本的な動作条件を表示し、コピーモード選択時は出力用紙サイズと拡大/縮小の倍率を表示する。プレビュー表示部603は、リーダ部51が読み取った画像の全体を縮小表示する。プレビュー表示部603に表示された枠604は、プレビュー表示された画像に設定されたエリアを示している。枠604によって示されるエリア（以下「エリア」と呼ぶ）は、ボタン605によってその大きさが拡大/縮小され、ボタン606によって上下左右に移動される。言い換えれば、ボタン605および606の操作によって、プレビュー表示部603上のエリア604のサイズおよび位置が変更される。
【０１１８】
ボックス607は、後述する認証情報を入力するテキストボックスで、例えば、図示しないテンキーを使用して四桁程度の文字列が入力され、入力された文字列の数に相当する「*」などの記号を表示する。なお、入力文字列を直接表示せずに「*」を表示するのは、セキュリティを向上するためである。
【０１１９】
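The image processing unit 5110 accepts the user's designation of area 604 and input of authentication information through the operation unit (S102, S103). When the designation and input are complete, it cuts out the image data designated by area 604 from the input image data (S104), determines the type of the cut-out image data (S105), selects an image compression method according to the determination result (S106), and compresses the cut-out image data with the selected method (S107). Next, from the compressed image data, an identification code indicating the compression method used, and the input authentication information, it generates code data combining them (S108), and converts the generated code data into bitmap data by a method described later (S109). It then erases the image data inside area 604 from the input image data (S110), composites image data in which the bitmapped code data obtained in step S109 is fitted into the blank area left by the erasure (S111), and outputs the composited image data (S112).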
【０１２０】
ここでは装置モードとしてコピーモードが選択されているので、出力された画像データはプリンタ部52へ送られて、記録紙に複写画像が形成されることになる。同様に、装置モードとして送信モードが選択されていれば、出力された画像データはネットワーク通信部へ送られ、所定のあて先へ電子的に転送される。また、蓄積モードが選択されていれば、出力された画像データは装置内の補助記憶装置に蓄積される。
【０１２１】
図27はステップS105からS111までの処理を詳細に説明するフローチャートである。
【０１２２】
切り出した画像データの種別の判別を行うが、エリア指定された画像が写真のような連続階調画像か、文字/線画のような二値的な画像かを判別する(S203)。判別手法としては、対象画像の輝度分布を示すヒストグラムを利用する方法、空間周波数成分ごとの発生頻度を利用する方法、あるいは、パターンマッチングにより「線」として認識される確率が高いか否かを利用する方法、など種々のものが提案されているが、そのような公知の手法を用いることが可能である。
【０１２３】
画像が文字/線画と判別された場合、画像の輝度分布を示すヒストグラムを生成し(S204)、このヒストグラムから背景と文字/線画を分離するために最適なしきい値を求め(S205)、このしきい値を用いて画像データを二値化し(S206)、得られた二値画像データを圧縮処理する(S207)。この圧縮処理には、周知の二値画像圧縮方式が適用できる。通常、二値画像の圧縮方式としては、情報の欠落が発生しないロスレス（可逆）圧縮方式、例えばMMR圧縮、MR圧縮、MH圧縮、JBIG圧縮などの何れかを用いる。勿論、圧縮後の符号サイズが最小になるように、上記方式の何れかを適応的に用いる、ということも可能である。
【０１２４】
一方、画像が連続階調画像と判断された場合、解像度変換を行う(S208)。入力画像データは例えば600dpiで読み取られたものであるが、写真のような階調画像に対しては300dpi程度でも画像の劣化がみられないことが通例である。そこで、最終的な符号サイズを削減する目的で、例えば縦、横ともに1/2のサイズに縮小した300dpi相当の画像データに変換する。そして、300dpiの多値画像データを圧縮処理する(S209)。多値画像に好適な圧縮方式としては、周知のJPEG圧縮方式、JPEG2000圧縮方式などを利用することができる。ただし、これらの圧縮方式は、原画像に対して、通常は視覚的に識別困難な程度の劣化を伴う、ロッシー（非可逆）圧縮方式である。
【０１２５】
得られた圧縮画像データに、圧縮方式を識別するコード情報を付加する(S210)。これは、出力画像を元の画像に復元する際の伸長方式を指定するために必要な情報である。例えば、それぞれの圧縮方式に対して、下記のような識別コードを予め割り当てておく。
JPEG圧縮 → BB
JPEG2000 → CC
MMR圧縮 → DD
MH圧縮 → EE
JBIG圧縮 → FF
【０１２６】
次に、認証情報のコードを付加する(S211)。認証情報は出力画像を元の画像に復元する際に、復元しようとする者がその権限を有するか否かを判別するために必要な情報である。復元の際に、ここで付加される認証情報が正しく指定された場合に限り、元の画像への復元処理が行われる。
【０１２７】
このようにして得られる符号データのディジタル信号列を二進数として、二値のビットマップデータに変換して(S212)、エリア604に嵌め込み合成する(S213)。
【０１２８】
以上の動作を模式的に示すのが図28である。入力画像データ301に対してエリア604が指定されると、エリア604内の画像データが消去され、ビットマップ化された符号データに置き換えられる。
【０１２９】
図29は、図27に示すフローを模式的に説明する図である。
【０１３０】
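The image of Sx×Sy pixels in area 604 is cut out and, having been judged to be text/line art, is binarized and losslessly compressed. The identification code of the compression method is then attached to, for example, the head of the compressed code string, and the authentication information is attached in front of that. After conversion to binary digits and bitmapping, bitmap data of Sx×Sy pixels, the same size as area 604, is generated, and the image in area 604 is replaced with this bitmap data. The identification code and authentication information need not be placed at the head of the code string, of course; any predetermined position, such as the tail or a given bit offset, will do. To make extraction of the identification code and authentication information more reliable, they can also be attached repeatedly at multiple positions.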
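The code-data layout described here (authentication information, then the compression-method identification code, then the compressed image data) can be sketched as a simple byte string. The sketch makes assumptions that the text does not confirm: the identification codes BB to FF are read as single hexadecimal bytes, the authentication information is fixed at four ASCII characters, and the compressed payload is supplied by the caller rather than produced by a real MMR or JPEG encoder. All function names are hypothetical.

```python
# Identification codes from the table in the text, read here as hex bytes (an assumption).
METHOD_CODES = {"JPEG": 0xBB, "JPEG2000": 0xCC, "MMR": 0xDD, "MH": 0xEE, "JBIG": 0xFF}
AUTH_LEN = 4  # the text mentions a string of about four digits; fixed at 4 here


def build_code_data(auth: str, method: str, compressed: bytes) -> bytes:
    """Layout: authentication info, then method code, then compressed image data."""
    if len(auth) != AUTH_LEN or not auth.isascii():
        raise ValueError("auth must be 4 ASCII characters in this sketch")
    return auth.encode("ascii") + bytes([METHOD_CODES[method]]) + compressed


def to_bits(data: bytes) -> list:
    """Binary digits of the code data, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def parse_code_data(data: bytes):
    """Split code data back into (auth, method name, compressed payload)."""
    auth = data[:AUTH_LEN].decode("ascii")
    names = {code: name for name, code in METHOD_CODES.items()}
    return auth, names[data[AUTH_LEN]], data[AUTH_LEN + 1:]


code = build_code_data("1234", "MMR", b"\x01\x02")   # payload is a dummy placeholder
print(len(to_bits(code)))                            # 7 bytes -> 56 bits
print(parse_code_data(code))
```

Placing the fixed-length fields first means a reader can recover the authentication information and the decompression method before touching the variable-length payload.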
【０１３１】
［符号データのビットマップ化］
図30から32は符号データをビットマップ化する方法を説明する図で、三通りの異なる方法を示している。それぞれ、小さな矩形が600dpiの一画素を表す。
【０１３２】
図30に示す方法は、600dpiの画素を2×2画素が1ビットの情報をもつようにビットマップ化する。二進数として表現した符号データ（左側）が‘1’の場合は2×2画素の四画素を‘1’(黒）とし、符号データが‘0’の場合は四画素を‘0’(白）とする。結果的に、600dpiの1/2の解像度(300dpi)の二値ビットマップデータが生成される。なお、2×2画素で1ビットの情報を表すのは、実施形態によって記録紙上にプリントされたビットマップ画像をリーダでスキャンして元の画像を復元する際、リーダの読取精度、位置ずれ、倍率誤差、などによる影響を低減し、正確にビットマップ画像から符号データを復元するためである。
【０１３３】
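In the method shown in FIG. 31, the 2×2 pixels are not all given the same value. Instead, when the code data is '1', the upper-left small pixel (600 dpi equivalent) of the four is set to '1' (black), and when it is '0', the lower-right small pixel is set to '1' (black). This arrangement improves the reliability of restoring the original image when the printed bitmap image is scanned.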
【０１３４】
図32に示す方法は、1ビットを表現する画素を4×2画素とし、図示するような白黒画素の配置によって‘1’および‘0’を表す。こうすれば、単位面積当りに記録可能なデータ量は減少するが、元の画像の復元時の読取精度をさらに改善することが可能になる。
【０１３５】
なお、ビットマップ化の方法は、上記の方式に制限されるわけではなく、その他の種々の方式が適用できる。
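The FIG. 30 and FIG. 31 cell layouts can be sketched as follows. This is an illustration under stated assumptions, not the patent's method in full: the bits are laid out as a single row of 2×2 cells instead of filling an Sx×Sy area, and the decoders (a majority vote for FIG. 30, a comparison of the two diagonal pixels for FIG. 31) are plausible choices that the text does not specify. The FIG. 32 layout is omitted because its pixel arrangement is given only in the figure.

```python
def encode_fig30(bits):
    """Each bit becomes a 2x2 cell filled with that bit (1 = black, 0 = white)."""
    top = [b for bit in bits for b in (bit, bit)]
    return [top, list(top)]          # two pixel rows, 2 * len(bits) pixels wide


def decode_fig30(rows):
    """Majority vote over each 2x2 cell; a 2-of-4 tie decodes as 1 in this sketch."""
    n = len(rows[0]) // 2
    return [1 if sum(rows[y][2 * i + x] for y in (0, 1) for x in (0, 1)) >= 2 else 0
            for i in range(n)]


def encode_fig31(bits):
    """Bit 1: only the upper-left pixel is black. Bit 0: only the lower-right one."""
    top, bottom = [], []
    for bit in bits:
        top += [1, 0] if bit else [0, 0]
        bottom += [0, 0] if bit else [0, 1]
    return [top, bottom]


def decode_fig31(rows):
    """Compare the upper-left pixel with the lower-right pixel of each cell."""
    n = len(rows[0]) // 2
    return [1 if rows[0][2 * i] >= rows[1][2 * i + 1] else 0 for i in range(n)]


data = [1, 0, 1, 1]
scanned = encode_fig30(data)
scanned[0][0] = 0                    # simulate one misread pixel in the first cell
print(decode_fig30(scanned))         # the majority vote still recovers the bits
print(decode_fig31(encode_fig31(data)))
```

Spending four pixels on one bit is what gives the decoder room to tolerate a misread pixel, at the cost of a quarter of the raw capacity.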
【０１３６】
次に、生成されるビットマップのサイズと、そこに埋め込み可能な情報量について説明する。
【０１３７】
エリア604の大きさ（Sx×Sy画素）が原稿上で2インチ角（縦横約5cm）であるとすると、原画像データは600dpiであるからSx、Syは何れも1200画素になる。つまり、エリア604の画像データの情報量は一画素当り8ビットとして下記のようになる。
1200×1200×8 = 11,520,000ビット = 11Mビット
【０１３８】
前述した方法で符号データをビットマップ化して、エリア604の画像と置き換える場合、図30および31の方式では、四画素で1ビットの情報を埋め込むから記録可能な情報量は1/4×1/8=1/32になり、2インチ角のエリア604に埋め込めるデータ量は下記のようになる。
11M/32 = 0.34Mビット
【０１３９】
言い換えれば、11Mビットの画像データを1/32の0.34Mビットに圧縮しなければならず、非現実的である。そのため、上述したように、エリア604の画像属性を判別して適応的に二値化、解像度変換、圧縮方法の切り替えを行う必要がある。エリア604の画像が文字/線画の場合は、解像度は600dpiのまま、二値化する。これで画像のデータ量は1/8になり(11/8=)1.38Mビットになり、これをさらに0.34Mビットにするには1/4の圧縮が必要になるが、これはMMRやJBIGの圧縮方法により容易に達成可能な圧縮率である。勿論、圧縮方式の識別符号や認証情報なども埋め込む必要があるため、1/4よりも高い圧縮率が必要だが、それでも比較的容易に達成可能である。
【０１４０】
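For a photograph/gradation image, on the other hand, the gradation depth is kept at 8 bits and the resolution is halved (300 dpi), which reduces the data amount to 1/4, or (11/4 =) 2.75 Mbits. Reducing this further to 0.34 Mbits requires compression to 1/8, a ratio that JPEG or JPEG2000 compression achieves very easily while keeping image degradation low.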
【０１４１】
また図32に示すビットマップ化を採用すると、埋め込み可能な情報量がさらに1/2になり、圧縮率をさらに二倍に高める必要があるが、上記の圧縮方式として非現実的な値になるわけではない。
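The capacity figures in this passage can be reproduced with a few lines of arithmetic. The sketch below uses exact values; the text rounds 11.52 Mbits down to 11 Mbits before dividing, which gives 0.34 Mbits instead of the exact 0.36 Mbits, but the required compression ratios (4 and 8) are the same either way. The variable names are illustrative.

```python
DPI = 600
SIDE_INCH = 2
side_px = DPI * SIDE_INCH                      # 1200 pixels per side

raw_bits = side_px * side_px * 8               # 8-bit original area
capacity_2x2 = side_px * side_px // 4          # FIG. 30 / FIG. 31: 4 pixels per bit
capacity_4x2 = side_px * side_px // 8          # FIG. 32: 8 pixels per bit

text_bits = side_px * side_px * 1              # binarized at 600 dpi
photo_bits = (side_px // 2) ** 2 * 8           # 8-bit at 300 dpi

print(raw_bits)                                # 11520000
print(capacity_2x2)                            # 360000
print(text_bits / capacity_2x2)                # 4.0 -> compress text to 1/4
print(photo_bits / capacity_2x2)               # 8.0 -> compress photo to 1/8
print(text_bits / capacity_4x2)                # 8.0 -> ratio doubles with FIG. 32
```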
【０１４２】
［元の画像の復元］
図33はリーダ部51の画像処理部5110によって実行される、ビットマップから元の画像を復元する方法を説明するフローチャートである。
【０１４３】
画像処理部5110は画像を入力する(S801)。プリント出力であればリーダ部51によりその画像を読み取りディジタル画像とすればよいし、電子的に送信または蓄積されたものであれば、ディジタル画像としてそのまま入力すればよい。
【０１４４】
次に、画像処理部5110は、入力画像から隠蔽された画像領域を検出する(S802)。この検出には、入力画像に含まれる矩形領域を検出し、検出した矩形領域に黒画素と白画素の周期的な切り替わりが存在すれば隠蔽された画像領域であると判定する、といった方式を適用する。
【０１４５】
次に、検出された、隠蔽された画像領域の画像データから画素の配列を読み取り(S803)、その画像データのビットマップ化方法を判定して二進数の符号列を復元し(S804)、符号列から圧縮方式を示す識別コードを抽出し(S805)、認証情報を抽出する(S806)。
【０１４６】
次に、入力画像に隠蔽された画像が存在する旨を、操作部の画面等に表示し、画像を復元するための認証情報の入力を促す(S807)。認証情報が入力されると、入力された認証情報が、抽出した認証情報と一致するか否かを判定し(S808)、不一致であれば入力画像をそのまま出力する(S813)。
【０１４７】
また、一致すれば元の画像を復元するが、符号列から圧縮方式の識別コードおよび認証情報を除いた圧縮画像の符号データを抽出し(S809)、抽出した符号データに、抽出した識別コードに対応する圧縮方式の伸長処理を施し(S810)、伸長した画像を、検出した、隠蔽された画像領域の画像と置き換え(S811)、得られた合成画像を出力する(S812)。ここで出力される画像はエリア指定された画像を隠蔽する前の元の画像が復元されたものになる。
【０１４８】
このように、エリア指定された部分画像を効率良く圧縮して得た符号データをビットマップ化して原画像に合成することで、エリア指定された画像を視覚的に識別不能な状態の画像に置き換えて隠蔽することが可能になる。このような識別不能な画像（隠蔽された画像領域）がある場合、その領域の画像を符号データとして認識（解読）し、その符号データに設定された認証情報を参照して、閲覧するなどの権限を有するユーザに対しては、その符号データに設定された圧縮方式の識別コードに基づき元の画像を復元することができる。
【０１４９】
従って、所定の権限を有するユーザは、元の画像を復元し、原画像の表示、印刷、複製、送信および/または蓄積が可能である。なお、認証情報は、表示、印刷、複製、送信および蓄積の画像操作それぞれに対して個別に設定してもよいし、まとめて、あるいは、表示および印刷や複製および送信など画像操作をグループ化した単位ごとに設定してもよい。
【０１５０】
［変形例］
上記では、図28などに示すように、一つのエリア604を指定して、その領域の画像を隠蔽する例を説明したが、隠蔽する領域は一つに限らず、複数指定することが可能である。その場合、エリア指定ごとにステップS102からS111の処理を繰り返せばよい。また、複数の隠蔽された画像領域を有する画像から元の画像を復元する場合は、検出された、隠蔽された画像領域ごとにステップS803からS811の処理を繰り返せばよい。
【０１５１】
上記では、ディジタル複写機において読み取られる原稿画像を対象に、情報の隠蔽、符号化方法、復元方法を説明したが、これらはPC（パーソナルコンピュータ）上の文書や図画などにも適用することができる。この場合、文書や図画のプリントを指示すると、プリントしようとするプリンタに対応するデバイスドライバが立ち上がり、PC上のアプリケーションが生成する印刷コードに基づいてプリント出力用の画像データが生成される。デバイスドライバは、生成した画像データを、図26に示したように、そのユーザインタフェイス画面上にプレビュー表示して、ユーザが隠蔽したいと希望するエリア604の指定および認証情報の入力を受け付ける。また、隠蔽された画像領域を検出する。以降の処理は、上記と同様であるが、それらの処理はPC上のデバイスドライバ（具体的にはデバイスドライバソフトウェアを実行するCPU）によって実現される。
【０１５２】
上記では、符号データをビットマップ化する例を説明したが、プリント画像の歪みや記録紙の汚れなどによって、正確に元の画像を復元できない場合が想定される。そのような障害を回避するために、符号データに誤り訂正符号を追加した後、ビットマップ化するようにすれば、ビットマップとして記録したデータの信頼性を向上させることができる。誤り訂正符号は種々の公知の手法が提案されているのでそれを利用すればよい。ただし、埋め込みが可能な有効な情報量が減ることになるため、その分の画像の圧縮率は高めに設定する必要がある。勿論、誤り訂正符号だけでなく、情報漏洩に対する堅牢性を高めるために、符号データを暗号化した後、ビットマップ化することも考えられる。
【０１５３】
【他の実施形態】
なお、本発明は、複数の機器（例えばホストコンピュータ、インタフェイス機器、リーダ、プリンタなど）から構成されるシステムに適用しても、一つの機器からなる装置（例えば、複写機、ファクシミリ装置など）に適用してもよい。
【０１５４】
また、本発明の目的は、前述した実施形態の機能を実現するソフトウェアのプログラムコードを記録した記憶媒体（または記録媒体）を、システムあるいは装置に供給し、そのシステムあるいは装置のコンピュータ（またはCPUやMPU）が記憶媒体に格納されたプログラムコードを読み出し実行することによっても、達成されることは言うまでもない。この場合、記憶媒体から読み出されたプログラムコード自体が前述した実施形態の機能を実現することになり、そのプログラムコードを記憶した記憶媒体は本発明を構成することになる。また、コンピュータが読み出したプログラムコードを実行することにより、前述した実施形態の機能が実現されるだけでなく、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているオペレーティングシステム(OS)などが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０１５５】
さらに、記憶媒体から読み出されたプログラムコードが、コンピュータに挿入された機能拡張カードやコンピュータに接続された機能拡張ユニットに備わるメモリに書込まれた後、そのプログラムコードの指示に基づき、その機能拡張カードや機能拡張ユニットに備わるCPUなどが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０１５６】
本発明を上記記憶媒体に適用する場合、その記憶媒体には、先に説明したフローチャートに対応するプログラムコードが格納されることになる。
【０１５７】
【発明の効果】
以上説明したように、本発明によれば、画像情報のテキスト領域についてユーザが選択した表示／非表示に応じて適応的に処理を制御することができる。
【０１５８】
また、テキスト領域について画像を保護することができる。
【図面の簡単な説明】
【図１】実施形態の画像処理システムの構成例を示すブロック図、
【図２】 MFPの構成例を示すブロック図、
【図３】画像処理システムによる処理の概要を説明するフローチャート、
【図４】画像処理システムによる処理の概要を説明するフローチャート、
【図５】画像処理システムによる処理の概要を説明するフローチャート、
【図６】画像処理システムによる処理の概要を説明するフローチャート、
【図７】ブロックセレクションを説明する図、
【図８】ブロックセレクションの結果を示す図、
【図９】文書透かしの埋め込みを説明する図、
【図１０】文書透かしの抽出を説明する図、
【図１１】文書透かしの埋込規則を説明する図、
【図１２】シフト量を探索する処理を示すフローチャート、
【図１３】透かし情報を埋め込む埋込処理部（機能部）の構成を示すブロック図、
【図１４】電子透かし生成部の詳細を示すブロック図、
【図１５】基本行列の例を示す図、
【図１６】電子透かしwの一例を示す図、
【図１７】電子透かしの埋め込み処理を示す図、
【図１８】画像データの構成例を示す図、
【図１９】画像に埋め込まれた透かし情報を抽出する抽出処理部（機能部）の構成を示すブロック図、
【図２０】抽出パターン生成部の処理の詳細を示す図、
【図２１】抽出パターンの例を示す図、
【図２２】集積画像を説明する図、
【図２３】電子透かしを抽出する例を示す図、
【図２４】ディジタル複写機の構成例を示す外観図、
【図２５】リーダ部の画像処理部によって実行される、エリア指定された画像を隠蔽する処理を示すフローチャート、
【図２６】操作部の概要を示す図、
【図２７】図25に示すステップS105からS111までの処理を詳細に説明するフローチャート、
【図２８】図27に示すフローを模式的に示す図、
【図２９】図27に示すフローを模式的に説明する図、
【図３０】符号データをビットマップ化する方法を説明する図、
【図３１】符号データをビットマップ化する方法を説明する図、
【図３２】符号データをビットマップ化する方法を説明する図、
【図３３】リーダ部の画像処理部5110によって実行される、ビットマップから元の画像を復元する方法を説明するフローチャートである。
[0001]
[Technical Field of the Invention]
The present invention relates to an image processing apparatus and method, for example, image processing of a digital document or the like.
[0002]
[Prior art]
In recent years, the view that ISO 15408 is the standard for the security of office documents in general has spread worldwide, and from this standpoint document security is becoming a very important technical field. Against this background, various so-called digital watermark techniques have been devised and put to use as one method of managing the security of document information.
[0003]
Security management serves various purposes, such as preventing unauthorized copying of data, preventing leakage or falsification of important information, protecting the copyright of document information, and charging for the use of image data, and various digital watermarking methods have been proposed for each. For example, as a technique for embedding watermark information into digital image data so that humans cannot perceive it, a method is known, disclosed in Japanese Patent Application No. 10-278629, in which the image data is wavelet-transformed and the watermark information is embedded by exploiting redundancy in the frequency domain.
[0004]
A binary image such as a document image has little redundancy, which makes digital watermarking difficult to realize, but several digital watermarking methods that exploit characteristics unique to document images (hereinafter "document watermarks") are known. Examples include a method of moving the baseline of a line (Japanese Patent No. 3136061), a method of manipulating the space length between words (U.S. Pat. No. 6,086,706, Japanese Patent Laid-Open No. 9-186603), a method of manipulating the space length between characters (King Mongkut University, "Electronic document data hiding technique using inter-character space", The 1998 IEEE Asia-Pacific Conf. On Circuits and Systems, 1998, pp. 419-422), and a method of treating the document as a black-and-white binary bitmap image (Japanese Patent Laid-Open No. 11-234502).
[0005]
The methods above are characterized in that the user cannot tell that watermark information is embedded in the image (hereinafter "invisible watermark"). Conversely, methods that embed watermark information while making its presence explicit to the user (hereinafter "visible watermark") have also been proposed. For example, Japanese Patent Application No. 10-352619 discloses a method in which the pixel positions of the original image are compared with the shape of the watermark image to be embedded, and the result of a reversible operation between the original image and the embedding sequence is embedded in a form in which the watermark information is visible to the user.
[0006]
[Patent Literature]
Japanese Patent Application No. 10-278629
Japanese Patent No. 3136061
U.S. Patent No. 6,086,706
Japanese Patent Laid-Open No. 9-186603
Japanese Patent Laid-Open No. 11-234502
Japanese Patent Application No. 10-352619
[0007]
[Non-patent literature]
King Mongkut University "Electronic document data hiding technique using inter-character space" The 1998 IEEE Asia-Pacific Conf. On Circuits and Systems, 1998, pp. 419-422
[0008]
[Problems to be solved by the invention]
Digital watermarking is basically intended to embed some additional information in the image data itself, and to protect the original image by using the embedded additional information to prevent unauthorized use, protect copyright, prevent data tampering, and so on. In other words, purposes such as prohibiting viewing of the original image itself, or permitting copying only by users having a predetermined authority, are not assumed.
[0009]
In addition, protection of the original image is applied to the entire image. As a result, even a portion of the protected image that does not need protection cannot be viewed or copied.
[0010]
An object of the present invention is to adaptively control processing of a text region of image information according to the display or non-display selected by the user.
[0011]
Another object is to protect the image of a text region.
[0012]
[Means for Solving the Problems]
The present invention has the following configuration as one means for achieving the above object.
[0013]
The present invention is an image processing method performed by an image processing apparatus,
The input means of the image processing apparatus inputs image information,
A recognition unit included in the image processing apparatus recognizes a plurality of text regions included in the input image information;
The generation unit included in the image processing apparatus generates authentication information for controlling at least one of display, printing, copying, and transmission processing for the text area for each text area,
The selection means included in the image processing apparatus selects either display or non-display of the text in the text area after at least one of display, printing, copying, and transmission processing has been performed on the text area,
When display is selected in the selection, the embedding unit included in the image processing apparatus embeds the authentication information as a digital watermark in the text area in a state where the text in the text area is visible, and when non-display is selected in the selection, embeds the authentication information as a digital watermark in the text area in a state where the text in the text area is invisible.
It is characterized by that.
[0014]
An input means for inputting image information;
Recognition means for recognizing a plurality of text regions included in the input image information;
Generating means for generating authentication information for controlling at least one of display, printing, copying, and transmission processing for the text area for each text area;
Selection means for selecting either display or non-display of the text in the text area after at least one of display, printing, copying, and transmission processing has been performed on the text area;
Embedding means for embedding the authentication information as a digital watermark for each text area in a state where the text in the text area is visible when display is selected by the selection means,
and for embedding the authentication information as a digital watermark in a state where the text in the text area is invisible when non-display is selected by the selection means;
It is characterized by having.
[0017]
[Detailed Description of the Invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0018]
[First Embodiment]
[Configuration]
FIG. 1 is a block diagram illustrating a configuration example of an image processing system according to an embodiment.
[0019]
This image processing system is realized in an environment in which offices (such as a plurality of sections) 10 and 20 are connected by a WAN 104 such as the Internet.
[0020]
A multi-function processor (MFP) 100, a management PC 101 that controls the MFP 100, a client PC 102, a document management server 106, a database 105 managed by the document management server 106, and the like are connected to a LAN 107 built in the office 10. The office 20 has substantially the same configuration as the office 10, and at least a document management server 106, a database 105 managed by that document management server, and the like are connected to a LAN 108 built in the office 20. The LAN 107 of the office 10 and the LAN 108 of the office 20 are connected to each other via a proxy server 103 connected to the LAN 107, the WAN 104, and a proxy server 103 connected to the LAN 108.
[0021]
The MFP 100 reads images of paper documents and performs part of the image processing on the read images. An image signal output from the MFP 100 is input to the management PC 101 via the communication line 109. The management PC 101 is an ordinary personal computer (PC) having a memory such as a hard disk for storing images, an image processing unit implemented in hardware or software, a monitor such as a CRT or LCD, and an input unit such as a mouse and keyboard; some of these components are integrated into the MFP 100.
[0022]
FIG. 2 is a block diagram illustrating a configuration example of the MFP 100.
[0023]
The image reading unit 110, which includes an auto document feeder (ADF), illuminates the image of each of one or more originals with a light source and forms the light reflected from the original on a solid-state image sensor through a lens. An image read signal in raster order (for example, 600 dpi) is obtained from the solid-state image sensor. When a document is copied, the data processing unit 115 converts the image read signal into a recording signal. When copies are made on a plurality of recording sheets, the recording signal for one page is temporarily stored in the storage unit 111 and is then output repeatedly to the recording unit 112, so that images are formed on the plurality of recording sheets.
[0024]
On the other hand, print data output from the client PC 102 is input to the network interface (I/F) 114 via the LAN 107, converted by the data processing unit 115 into recordable raster data, and then formed as an image on recording paper by the recording unit 112.
[0025]
An operator's instruction to the MFP 100 is performed by a key operation unit provided in the MFP 100 and an input unit 113 including a keyboard and a mouse of the management PC 101. Display of operation input, display of image processing status, and the like are performed by the display unit 116.
[0026]
The operation of the MFP 100 is controlled by a control unit (not shown) in the data processing unit 115.
[0027]
The storage unit 111 can also be controlled from the management PC 101. Data exchange and control between the MFP 100 and the management PC 101 are performed via a network I/F 117 and a signal line 109 that directly connects the two.
[0028]
[Processing]
FIGS. 3 to 6 are flowcharts for explaining the outline of the processing by the image processing system.
[0029]
The original is scanned by the image reading unit 110 to obtain an image signal of 600 dpi and 8 bits (image information input process, S1201). The data processing unit 115 performs preprocessing such as trimming, skew correction (including direction correction), noise removal, and the like on the image signal (S1202), and generates a binary image by binarization processing (S1203). Then, one page of image data (multi-value and binary image data) is stored in the storage unit 111.
[0030]
The CPU of the management PC 101 executes block selection on the image data stored in the storage unit 111 to identify the character/line drawing portions, the gradation image portions, and the background portions where neither characters/line drawings nor images exist (S1204). The character/line drawing portions are further divided into, for example, paragraph-unit areas and other structures (tables and line drawings having ruled lines), and each of these is segmented as a text area. The gradation image portions and the background portions, on the other hand, are segmented as picture areas, that is, as independent objects for each divisible unit such as a rectangular region (S1205). Then, based on the position information of the separated text areas and picture areas, a binary image is cut out for each text area and a multi-valued image is cut out for each picture area from the image data stored in the storage unit 111 (S1206). In the following description, a clipped image area may be referred to as a “block”.
[0031]
The following processing is performed for each block. Whether watermark information is embedded in the block to be processed is determined by a document watermark detection process if the block is a text area, and by a background watermark detection process if it is a picture area (S1207). If it is determined that watermark information is embedded, the display flag of the block is set to OFF (S1210). If it is determined that watermark information is not embedded, the display flag of the block is set to ON (S1209). Then, it is determined whether the same processing has been performed for all blocks (S1211), and the processing from step S1207 to S1210 is repeated until the display flags of all blocks have been set.
[0032]
Subsequently, a block to be processed is selected (S1212), and whether watermark information is embedded in the selected block is determined from its display flag (S1213). If no watermark information is embedded, the process proceeds to “Process A” described later. If watermark information is embedded, the user is prompted to enter a password (S1214). As will be described later, this password is used to control display of the block and to authenticate other control functions such as printing and transmission.
[0033]
When a password is input, its validity is determined (S1215). If the password is invalid, the process proceeds to “Process B” described later. If the password is valid, it is determined whether the password is for display (S1216). If the password is for display, it is further determined whether the block is a background part (S1217). If it is not a background part (for example, if it is a gradation image portion), the display flag of the block is turned ON (S1221).
[0034]
If it is determined in step S1217 that the block is a background part, that is, a background part in which watermark information is embedded, no image exists in the block. Therefore, pointer information indicating the storage location of the original data is extracted from the watermark information embedded in the background (hereinafter referred to as “background watermark”) (S1218), and the original data is acquired from the document management server 106 or the like (S1219). Note that if no watermark information is embedded in the original data, the watermark information must be inherited; otherwise, the various controls of the block become impossible thereafter. Alternatively, new watermark information may be input instead of inheriting the existing watermark information. The watermark information is inherited by the original data of the block (that is, the information of the background watermark is embedded in the image as an invisible watermark) or new watermark information is embedded (S1220), and once the display image with the embedded watermark information has been prepared, the display flag of the block is turned ON (S1221).
[0035]
On the other hand, if it is determined in step S1216 that the password is not for display, it is determined whether the block is a text area (S1222). If it is not a text area, the process proceeds to step S1225. If it is a text area, the binary image data of the block is sent to the document management server 106 and saved (S1223), and the block is masked by embedding watermark information (pointer information indicating the save destination of the image data, various passwords, various control information, and the like) as a background watermark (S1224). In step S1225, the display flag of the block is turned OFF.
[0036]
Next, the other control information of the block (whether printing, copying, transmission, and the like are permitted) is extracted from the watermark information (S1226), and the other control flags of the block are turned ON or OFF according to that control information (S1227). Subsequently, it is determined whether the processing of all blocks has been completed (S1228). If not, the process returns to step S1212; if so, various controls are performed according to the control flags (S1229). The control flags include a print flag, a copy flag, and a transmission flag corresponding to the control information for printing, copying, and transmission. If a flag is ON, the image data of the block is printed, copied, or transmitted; if it is OFF, the image data of the block is not printed, copied, or transmitted.
[0037]
Next, “Process A”, which is performed when it is determined in step S1213 that there is no watermark information, will be described.
[0038]
First, it is determined whether the block is a text area (S1241). If it is not a text area, the block is not subject to control, and the process proceeds to step S1228. If it is a text area, the watermark embedding mode is set: the user selects whether to embed a document watermark that leaves the text readable (display mode) or to embed a background watermark and mask the block (non-display mode) (S1242). When the display mode is selected, various passwords are set (S1246), and watermark information including these passwords is embedded as a document watermark (S1247). When the non-display mode is selected, various passwords are set (S1243), the binary image data of the block is sent to the document management server 106 or the like and stored (S1244), a background watermark including pointer information, the various passwords, various control information, and the like is embedded, and the block is masked (S1245).
[0039]
Then, the image of the block (the image or background after embedding watermark information) is displayed again (S1248), and the process proceeds to step S1228.
[0040]
Next, “Process B”, which is performed when it is determined in step S1215 that the password is invalid, will be described.
[0041]
First, it is determined whether the block is a text area (S1251). If it is not a text area (it is a masked area, so there is no display-related security problem), all control flags are turned OFF to maintain control-related security (S1255), and the process proceeds to step S1228. If it is a text area, in order to hide the text area, the binary image data of the block is sent to the document management server 106 or the like and stored (S1252), a background watermark including pointer information to the storage destination, various passwords, various control information, and the like is embedded, the block is masked (S1253), the block is redisplayed (S1254), all control flags are turned OFF (S1255), and the process proceeds to step S1228.
[0042]
Printing restrictions and transmission restrictions will be described as an example of various controls as follows.
・ When there is a print instruction
Block with print flag off: Prints the background image with the background watermark embedded
Block with print flag on: Prints an image with embedded document watermark or original data
・ When there is a transmission instruction
Block with send flag off: Sends the background image with the background watermark embedded
Block with send flag turned on: Sends an image with embedded document watermark or original data
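The flag-dependent behavior listed above can be sketched as follows. This is only an illustrative sketch: the `Block` class, its field names, and the `select_output` function are hypothetical names introduced here, and images are represented by plain strings.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical per-block state; the field names are illustrative only."""
    masked_image: str       # background image with the background watermark embedded
    readable_image: str     # image with the document watermark embedded, or original data
    print_flag: bool = False
    transmission_flag: bool = False


def select_output(block: Block, operation: str) -> str:
    """Return the image to use for a 'print' or 'transmit' instruction."""
    if operation == "print":
        allowed = block.print_flag
    elif operation == "transmit":
        allowed = block.transmission_flag
    else:
        raise ValueError(f"unsupported operation: {operation}")
    # Flag ON: output the readable image. Flag OFF: output the masked background.
    return block.readable_image if allowed else block.masked_image
```

For a block whose print flag is ON and whose transmission flag is OFF, `select_output(block, "print")` returns the readable image, while `select_output(block, "transmit")` returns the masked background image.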
[0043]
Such control makes it possible to manage security (for example, viewing restriction, copy restriction, transmission restriction, and printing restriction) freely for each object of a document. In addition, when a document is printed, a document watermark is embedded in each text area and an invisible watermark in each picture area, so that security can also be managed for objects read from the printed image, which greatly improves the security of the document.
[0044]
The main processes are described in detail below.
[0045]
[Block selection]
First, block selection in steps S1204 and S1205 will be described.
[0046]
Block selection is a process that recognizes the one-page image shown in FIG. 7 as a collection of objects, determines the attribute of each object as text (TEXT), drawing (PICTURE), photo (PHOTO), line (LINE), or table (TABLE), and divides the image into areas (blocks) having different attributes. A specific example of block selection is described next.
[0047]
First, the image to be processed is binarized into a black-and-white image, and contour tracing is performed to extract blocks of pixels enclosed by black-pixel contours. For a black pixel block with a large area, contour tracing is also performed on the white pixels inside it to extract white pixel blocks. Extraction of black and white pixel blocks is then repeated recursively, for example by extracting black pixel blocks from the inside of any white pixel block having at least a predetermined area.
[0048]
The pixel blocks obtained in this way are classified by size and shape into regions having different attributes. For example, a pixel block whose aspect ratio is close to 1 and whose size is within a specified range is treated as a pixel block with the character attribute, and adjacent character-attribute pixel blocks that are well aligned and can be grouped are grouped into a character region. A flat pixel block having a small aspect ratio is classified as a line region, and the range occupied by a black pixel block that is larger than a predetermined size, is close to rectangular, and contains well-aligned white pixel blocks is classified as a table region. A region where irregularly shaped pixel blocks are scattered is classified as a photographic region, and a pixel block of any other shape is classified as a graphic region.
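The classification rules above could be sketched roughly as follows. The description gives no concrete thresholds, so every numeric constant here is a hypothetical placeholder, and the two boolean inputs stand in for shape analyses (aligned inner white blocks, irregular scattering) that are not implemented. Grouping of adjacent character blocks into a character region and the rectangularity check for tables are also omitted.

```python
# All thresholds are hypothetical; the description does not specify values.
CHAR_MIN_SIZE = 8       # smallest side length (pixels) treated as a character
CHAR_MAX_SIZE = 64      # largest side length (pixels) treated as a character
SQUARE_RATIO = 0.6      # short/long side ratio regarded as "close to 1"
FLAT_RATIO = 0.1        # short/long side ratio regarded as "flat"
TABLE_MIN_SIZE = 128    # minimum long side (pixels) for a table candidate


def classify_pixel_block(width, height, *,
                         contains_aligned_white_blocks=False,
                         irregular_scattered=False):
    """Return an attribute label for one pixel block from its size and shape."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    long_side, short_side = max(width, height), min(width, height)
    ratio = short_side / long_side  # 1.0 is square, near 0 is flat
    if ratio >= SQUARE_RATIO and CHAR_MIN_SIZE <= long_side <= CHAR_MAX_SIZE:
        return "TEXT"
    if ratio <= FLAT_RATIO:
        return "LINE"
    if long_side >= TABLE_MIN_SIZE and contains_aligned_white_blocks:
        return "TABLE"
    if irregular_scattered:
        return "PHOTO"
    return "PICTURE"
```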
[0049]
FIG. 8 is a diagram showing the result of block selection, and FIG. 8 (a) shows block information of each extracted block. FIG. 8 (b) shows the total number of blocks extracted by block selection as input file information. Such information is used when embedding and extracting watermark information.
[0050]
[Embed document watermark]
Next, embedding of a document watermark will be described.
[0051]
A document image 3001 shown in FIG. 9 is a block separated as a text area by block selection. A circumscribed rectangle 3004 of each character element is extracted from the text area by document image analysis 3002, described later. A character element is a rectangular area extracted using projection, and may be a single character or a component of a character (such as the left-hand (hen) or right-hand (tsukuri) part of a kanji).
[0052]
Then, the blank lengths between circumscribed rectangles are calculated from the information on the extracted circumscribed rectangles 3004, and each circumscribed rectangle is shifted to the left or right based on an embedding rule described later, thereby embedding 1-bit information (embedding process 3003) and generating a document image 3005 in which watermark information 3006 is embedded.
[0053]
Document image analysis 3002 is an elemental technique of character recognition: it divides a document image into text regions and graphic regions such as graphs, and cuts out the characters in a text region one by one using projection. One example is the technique described in Japanese Patent Laid-Open No. 6-68301.
[0054]
[Extract Document Watermark]
Next, a document watermark extraction method will be described.
[0055]
First, as in the embedding of the document watermark, the circumscribed rectangles 3103 of the characters are extracted from the image 3005 shown in FIG. 10 by block selection and document image analysis 3002, and the blank lengths between circumscribed rectangles are calculated from the information on the extracted circumscribed rectangles 3103. Then, in each line, the characters in which 1-bit information is embedded are identified, and the embedded watermark information 3105 is extracted based on the embedding rule described later (extraction process 3104).
[0056]
Next, the embedding rule will be described.
[0057]
Let P and S be the blank lengths before and after a character in which 1-bit information is embedded, as shown in the figure. The characters in which 1-bit information is embedded are every other character, excluding the characters at both ends of a line. The 1-bit information can be restored by calculating (P - S) / (P + S) from the blank lengths, quantizing it with an appropriate quantization step, and taking the remainder modulo 2. Equation (1) expresses this relationship and yields the embedded value V ('0' or '1').
V = floor[(P - S) / {α(P + S)}] mod 2 ... (1)
where α is the quantization step (0 < α < 1)
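Equation (1) can be evaluated directly. The following is a minimal sketch; the function name `extract_bit` and the sample blank lengths are illustrative assumptions, and α = 0.125 is chosen only because it is exactly representable in floating point.

```python
import math


def extract_bit(p: float, s: float, alpha: float) -> int:
    """Evaluate Equation (1): V = floor[(P - S) / (alpha * (P + S))] mod 2."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if p + s <= 0:
        raise ValueError("P + S must be positive")
    # math.floor rounds toward negative infinity, and Python's % with a
    # positive modulus is never negative, so the result is always 0 or 1.
    return math.floor((p - s) / (alpha * (p + s))) % 2


print(extract_bit(6, 4, 0.125))  # (6 - 4) / 1.25 = 1.6, floor 1, so V = 1
print(extract_bit(5, 5, 0.125))  # 0 / 1.25 = 0, so V = 0
```

Note that a negative quotient is still mapped to 0 or 1; for example, P = 4 and S = 6 give floor(-1.6) = -2 and therefore V = 0.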
[0058]
When watermark information is embedded, the circumscribed rectangle is shifted to the left or right one pixel at a time, and the shift amount (number of pixels) to the left or right is increased until Equation (1) yields the value to be embedded ('0' or '1').
[0059]
FIG. 12 is a flowchart showing the process of searching for the shift amount. In FIG. 12, the variable i is a candidate value of the shift amount, and the variables Flag1 and Flag2 become '1' when shifting the character to be shifted by the distance i to the right or to the left, respectively, would bring it into contact with the adjacent character.
[0060]
First, the initial values of the variables are set (S3402), and it is determined whether the character (or character element) to be shifted would touch the character (or character element) on its right if shifted to the right by the distance i (S3403). If it would, Flag1 is set to '1' (S3404). Subsequently, it is determined whether the character to be shifted would touch the character on its left if shifted to the left by the distance i (S3405). If it would, Flag2 is set to '1' (S3406).
[0061]
Next, it is determined whether or not the shift of the distance i is possible (S3407). If both flags are “1”, it is determined that the shift is impossible and the shift amount is set to 0 (S3408). In this case, it is impossible to embed information by shifting the character to be shifted.
[0062]
If Flag1 is '0' (S3409), Equation (1) is used to determine whether the value V to be embedded is obtained when the character to be shifted is shifted to the right by the distance i. If the value V is obtained, the shift amount is set to +i (S3411). The sign of the shift amount is positive for a shift to the right and negative for a shift to the left.
[0063]
If Flag1 is '1', or if the value V is not obtained by the right shift, and Flag2 is '0' (S3412), Equation (1) is used to determine whether the value V to be embedded is obtained when the character to be shifted is shifted to the left by the distance i (S3413). If the value V is obtained, the shift amount is set to -i (S3414).
[0064]
If the value V cannot be obtained by either the right shift or the left shift, the variable i is incremented (S3415), and the process returns to step S3403.
[0065]
In accordance with the shift amount searched in this way, the character is shifted to embed 1-bit information. The watermark information is embedded in the document image by performing the above processing on each character.
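The shift-amount search described for FIG. 12 can be sketched as below. Several points are assumptions made for this sketch and are not stated in the description: the search starts at i = 1, a shift to the right by i changes the blank lengths to P + i and S - i (and the reverse for a left shift), and a character is considered to touch its neighbor when the remaining blank would fall below `min_gap` pixels. The names `find_shift` and `min_gap` are hypothetical.

```python
import math


def extract_bit(p, s, alpha):
    """Equation (1): V = floor[(P - S) / (alpha * (P + S))] mod 2."""
    return math.floor((p - s) / (alpha * (p + s))) % 2


def find_shift(p, s, target_bit, alpha, min_gap=1):
    """Search the smallest shift that makes Equation (1) yield target_bit.

    Returns +i for a right shift, -i for a left shift, or 0 when no shift
    is possible (the bit cannot be embedded in this character).
    """
    i = 1  # assumed initial candidate shift amount
    while True:
        flag1 = (s - i) < min_gap  # right shift would touch the right neighbor
        flag2 = (p - i) < min_gap  # left shift would touch the left neighbor
        if flag1 and flag2:
            return 0
        if not flag1 and extract_bit(p + i, s - i, alpha) == target_bit:
            return +i
        if not flag2 and extract_bit(p - i, s + i, alpha) == target_bit:
            return -i
        i += 1  # neither direction worked; try a larger shift


print(find_shift(5, 5, 1, 0.125))  # right shift by 1 gives P=6, S=4, V=1
print(find_shift(5, 5, 0, 0.125))  # left shift by 1 gives P=4, S=6, V=0
```

The loop always terminates because both contact conditions eventually become true as i grows.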
[0066]
[Digital watermark embedding processing unit]
The digital watermark described below is also called an “invisible digital watermark”: a change made to the original image data that is small enough to be almost imperceptible to human vision. One such change, or a combination of such changes, represents some additional information.
[0067]
FIG. 13 is a block diagram showing a configuration of an embedding processing unit (functional unit) for embedding watermark information.
[0068]
The embedding processing unit includes an image input unit 4001, an embedded information input unit 4002, a key information input unit 4003, a digital watermark generation unit 4004, a digital watermark embedding unit 4005, and an image output unit 4006. The digital watermark embedding process may be realized by software having the above-described configuration.
[0069]
The image input unit 4001 inputs the image data I of the image in which watermark information is to be embedded. To simplify the following description, the image data I is assumed to represent a monochrome multi-valued image. Of course, when watermark information is embedded in image data composed of a plurality of color components, such as color image data, each of the color components (for example, RGB components, or luminance and color-difference components) can be handled in the same way as a monochrome multi-valued image and watermark information can be embedded in each component. In that case, about three times as much watermark information can be embedded as in a monochrome multi-valued image.
[0070]
The embedding information input unit 4002 inputs watermark information to be embedded in the image data I as a binary data string. This binary data string is used as additional information Inf. The additional information Inf is composed of a combination of bits representing either “0” or “1”. The additional information Inf represents authentication information for controlling an area corresponding to the image data I, pointer information to the original data, and the like. Hereinafter, an example in which additional information Inf expressed by n bits is embedded will be described.
[0071]
Note that the additional information Inf may be encrypted so that it cannot easily be abused. It may also be error-correction coded so that it is extracted correctly even if the image data I has been altered in a way that prevents the additional information Inf from being extracted (such alteration is hereinafter referred to as an “attack”). Some attacks are unintentional: for example, watermark information may be removed as a result of general image processing such as lossy compression, luminance correction, geometric transformation, or filtering. Since processes such as encryption and error-correction coding are well known, a detailed description is omitted.
[0072]
The key information input unit 4003 inputs key information k necessary for embedding and extracting the additional information Inf. The key information k is represented by L bits; for L = 8 it is, for example, “01010101” (“85” in decimal notation). The key information k is given as the initial value of the pseudo random number generation process executed by a pseudo random number generation unit 4102 described later. The embedded additional information Inf is extracted correctly only when the embedding processing unit and the extraction processing unit described later use the same key information k. In other words, only a user who possesses the key information k can correctly extract the additional information Inf.
[0073]
The digital watermark generation unit 4004 receives additional information Inf from the embedded information input unit 4002 and key information k from the key information input unit 4003, and generates a digital watermark w based on the additional information Inf and the key information k. FIG. 14 is a block diagram showing details of the digital watermark generation unit 4004.
[0074]
The basic matrix generation unit 4101 generates a basic matrix m. The basic matrix m is used to associate the position of each bit of the additional information Inf with the pixel positions of the image data I in which that bit is embedded. The basic matrix generation unit 4101 can selectively use a plurality of basic matrices. Which basic matrix is used needs to be changed according to the purpose and situation, and switching the basic matrix allows the watermark information (additional information Inf) to be embedded in a way suited to them.
[0075]
FIG. 15 is a diagram illustrating examples of the basic matrix m. The matrix 4201 is an example of a basic matrix m used when embedding 16-bit additional information Inf, and the numbers 1 to 16 are assigned to its 4 × 4 elements. The value of each element of the basic matrix m corresponds to a bit position of the additional information Inf. That is, bit position “1” (the most significant bit) of the additional information Inf corresponds to the position of the element with the value “1” in the basic matrix m, and similarly bit position “2” (the bit next to the most significant bit) corresponds to the position of the element with the value “2”.
[0076]
The matrix 4202 is an example of a basic matrix m used when embedding 8-bit additional information Inf. In the matrix 4202, the 8 bits of the additional information Inf are associated with the elements having values from “1” to “8” among the elements of the matrix 4201, and no bit of the additional information Inf corresponds to the elements having no value. By dispersing the positions corresponding to the bits of the additional information Inf as in the matrix 4202, the change in the image (image quality degradation) caused by embedding the additional information Inf can be made harder to perceive than when the matrix 4201 is used.
[0077]
Similar to the matrix 4202, the matrix 4203 is an example of a basic matrix m used when embedding 8-bit additional information Inf. According to the matrix 4202, 1-bit information is embedded in one pixel, whereas according to the matrix 4203, 1-bit information is embedded in two pixels. In other words, the matrix 4202 uses 50% of all pixels for embedding the additional information Inf, whereas the matrix 4203 uses all pixels (100%) for embedding the additional information Inf. Therefore, if the matrix 4203 is used, the number of times of embedding the additional information Inf increases, and the additional information Inf can be extracted more reliably (has attack resistance) than the matrices 4201 and 4202. The ratio of pixels used for embedding watermark information is hereinafter referred to as “filling rate”. Incidentally, the filling rate of the matrix 4201 is 100%, the filling rate of the matrix 4202 is 50%, and the filling rate of the matrix 4203 is 100%.
[0078]
The matrix 4204 has a filling rate of 100%, but embeds only 4-bit additional information Inf. Each bit is therefore embedded using four pixels, so the number of times the additional information Inf is embedded increases further and attack resistance improves further, but the amount of information that can be embedded is smaller than with the other matrices.
[0079]
Thus, the filling rate, the number of pixels used for 1-bit embedding, and the amount of information that can be embedded can be selectively set depending on the configuration of the basic matrix m. The filling rate mainly affects the image quality of an image in which watermark information is embedded, and the number of pixels used for embedding 1 bit mainly affects attack resistance. Therefore, when the filling rate is increased, the image quality is greatly deteriorated, and when the number of pixels used for embedding 1 bit is increased, the attack resistance is increased and the amount of information that can be embedded is reduced. Thus, image quality, attack resistance, and information amount are in a trade-off relationship.
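The three quantities discussed above can be computed from a basic matrix as in the following sketch. FIG. 15 is not reproduced in this text, so the four matrix layouts below are hypothetical; they are chosen only to match the stated bit counts and filling rates of the matrices 4201 to 4204. The value 0 is used here, as an assumption, to mark an element that has no value.

```python
def matrix_stats(m):
    """Return (filling_rate, n_bits, pixels_per_bit) for a basic matrix.

    Elements hold 1-based bit positions; 0 marks an element with no value.
    """
    values = [v for row in m for v in row]
    used = [v for v in values if v != 0]
    n_bits = len(set(used))
    if n_bits == 0:
        raise ValueError("basic matrix carries no bits")
    return len(used) / len(values), n_bits, len(used) / n_bits


# Hypothetical layouts standing in for the matrices of FIG. 15.
like_4201 = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
like_4202 = [[1, 0, 2, 0], [0, 3, 0, 4], [5, 0, 6, 0], [0, 7, 0, 8]]
like_4203 = [[1, 2, 3, 4], [5, 6, 7, 8], [1, 2, 3, 4], [5, 6, 7, 8]]
like_4204 = [[1, 2, 1, 2], [3, 4, 3, 4], [1, 2, 1, 2], [3, 4, 3, 4]]

for name, m in [("4201", like_4201), ("4202", like_4202),
                ("4203", like_4203), ("4204", like_4204)]:
    rate, bits, per_bit = matrix_stats(m)
    print(name, f"filling rate={rate:.0%}", f"bits={bits}", f"pixels/bit={per_bit:g}")
```

With these layouts, the sketch reports 100%, 50%, 100%, and 100% filling rates with 1, 1, 2, and 4 pixels per bit respectively, which illustrates the trade-off between information amount and the number of pixels available per bit.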
[0080]
In the present embodiment, it is possible to control and set attack resistance, image quality, and information amount by adaptively selecting a plurality of types of basic matrices m.
[0081]
The pseudo random number generation unit 4102 generates a pseudo random number sequence r based on the input key information k. The pseudo random number sequence r is a sequence of real numbers following a uniform distribution over the range [-1, 1], and the key information k is used as the initial value for generating the pseudo random number sequence r. That is, the pseudo random number sequence r (k1) generated using the key information k1 is different from the pseudo random number sequence r (k2) generated using the key information k2 (≠ k1). Since the method for generating the pseudo random number sequence r is known, a detailed description thereof will be omitted.
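Key-seeded generation of such a sequence can be illustrated with Python's standard `random` module. This is only a stand-in, since the description does not name a generator; the Mersenne Twister used by `random.Random` is not cryptographically secure, and the function name `pseudo_random_sequence` is hypothetical.

```python
import random
from typing import List


def pseudo_random_sequence(key: int, length: int) -> List[float]:
    """Generate `length` reals uniformly distributed in [-1, 1], seeded by `key`."""
    rng = random.Random(key)  # the key information k is the initial value (seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


r1 = pseudo_random_sequence(0b01010101, 4)  # k = 85, the 8-bit example key
r2 = pseudo_random_sequence(0b01010101, 4)
r3 = pseudo_random_sequence(86, 4)
print(r1 == r2)  # the same key reproduces the same sequence
print(r1 == r3)  # a different key yields a different sequence
```

Because the sequence is fully determined by the seed, an extractor that shares the key can regenerate exactly the sequence used for embedding.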
[0082]
The pseudo-random number assigning unit 4103 receives the additional information Inf, the basic matrix m, and the pseudo-random number sequence r, assigns each bit of the additional information Inf to elements of the pseudo-random number sequence r based on the basic matrix m, and generates the digital watermark w. Specifically, the elements of the matrix 4204 are scanned in raster order, the most significant bit is assigned to the elements having the value “1”, the next bit is assigned to the elements having the value “2”, and so on, so that each bit of the additional information Inf corresponds to elements of the basic matrix m. When a bit of the additional information Inf is '1', the corresponding elements of the pseudo-random number sequence r are left as they are, and when the bit is '0', the corresponding elements of the pseudo-random number sequence r are multiplied by -1. When this processing is executed for the n bits of the additional information Inf, the digital watermark w shown as an example in FIG. 16 is obtained. The digital watermark w shown in FIG. 16 is an example in which the basic matrix m is the matrix 4204 shown in FIG. 15, the pseudo-random number sequence is the real number sequence r = {0.7, -0.6, -0.9, 0.8}, and the 4-bit additional information Inf is “1001”.
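The assignment can be sketched as follows. This is a minimal illustration, not the actual unit 4103: the 4 × 4 layout used for the matrix 4204 is hypothetical (FIG. 15 is not reproduced here), and the function name `make_watermark` is invented. It also assumes that, for each bit, the elements of r are consumed from the start of the sequence in raster order.

```python
# Hypothetical layout for the matrix 4204: value k marks the pixels of bit k.
M4204 = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]


def make_watermark(m, r, inf_bits):
    """Build the digital watermark w.

    m        : basic matrix (0 = unused pixel, k >= 1 = pixel of bit k)
    r        : pseudo-random real sequence, at least as long as pixels per bit
    inf_bits : additional information as a string, most significant bit first
    """
    seen = {}  # number of elements already assigned for each bit index
    w = []
    for row in m:
        out = []
        for k in row:
            if k == 0:
                out.append(0.0)
                continue
            idx = seen.get(k, 0)
            seen[k] = idx + 1
            # Bit '1' keeps the element of r, bit '0' multiplies it by -1.
            sign = 1.0 if inf_bits[k - 1] == "1" else -1.0
            out.append(sign * r[idx])
        w.append(out)
    return w


w = make_watermark(M4204, [0.7, -0.6, -0.9, 0.8], "1001")
for row in w:
    print(row)
```

With the assumed layout, the columns of bits 1 and 4 carry r unchanged and the columns of bits 2 and 3 carry r with inverted sign.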
[0083]
In the above description, examples in which a 4 × 4 basic matrix m is used to embed 16-bit, 8-bit, and 4-bit additional information Inf have been described. However, the present invention is not limited to this, and a larger basic matrix m that uses more pixels to embed 1 bit of information may be used. If a larger basic matrix m is used, a longer real number sequence is used as the pseudo-random number sequence r. In practice, the extraction process described later may not function correctly with a random number sequence composed of only four elements, as used in the description. That is, even though the additional information Inf is embedded, the correlation coefficients between the integrated image c and the extraction patterns w1, w2, ..., wn may become small. For this reason, to embed 64-bit additional information Inf, for example, a 256 × 256 basic matrix m is used at a filling rate of 50%. In this case, 512 pixels are used to embed 1 bit.
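The figure of 512 pixels per bit follows directly from the matrix size, the filling rate, and the number of bits. A short check, with the helper name `pixels_per_bit` chosen only for illustration:

```python
def pixels_per_bit(side, filling_rate, n_bits):
    """Pixels available to each bit of Inf in a side x side basic matrix."""
    used_pixels = int(side * side * filling_rate)
    return used_pixels // n_bits


# 256 x 256 matrix, 50% filling rate, 64-bit additional information
print(pixels_per_bit(256, 0.5, 64))
# 4 x 4 matrix, 100% filling rate, 4-bit additional information (matrix 4204)
print(pixels_per_bit(4, 1.0, 4))
```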
[0084]
The digital watermark embedding unit 4005 receives the image data I and the digital watermark w, and outputs the image data I ′ in which the digital watermark w is embedded. The digital watermark embedding unit 4005 executes digital watermark embedding processing according to the equation (2).
$$I'_{i,j} = I_{i,j} + a\,w_{i,j} \qquad (2)$$
where
$I'_{i,j}$ is the image data in which the digital watermark is embedded,
$I_{i,j}$ is the image data before the digital watermark is embedded,
$w_{i,j}$ is the digital watermark,
$i$ and $j$ are the x and y coordinate values of the image or the digital watermark, and
$a$ is a parameter that sets the strength of the digital watermark.
[0085]
For example, a value of about “10” can be selected as a. When a is increased, it is possible to embed a digital watermark having a high attack resistance, but the image quality deterioration is increased. On the other hand, if a is reduced, attack resistance is reduced, but image quality deterioration can also be suppressed. Similar to the configuration of the basic matrix m, it is possible to adjust the balance between attack resistance and image quality by appropriately setting the value of a.
[0086]
FIG. 17 is a diagram specifically illustrating the digital watermark embedding process shown in Expression (2). Reference numeral 4401 corresponds to the image data I ′ in which the electronic watermark is embedded, reference numeral 4402 corresponds to the image data I before the electronic watermark is embedded, and reference numeral 4403 corresponds to the electronic watermark w. As shown in FIG. 17, the calculation of Expression (2) is performed on each element in the matrix.
[0087]
The processing shown in Expression (2) and FIG. 17 is repeated over the entire image data I. When the image data I is composed of the 24 × 24 pixels shown in FIG. 18, the image data I is divided into non-overlapping blocks (macroblocks) of 4 × 4 pixels each, and Expression (2) is executed on each macroblock.
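A minimal sketch of applying Expression (2) macroblock by macroblock is shown below. The function name `embed` is illustrative, and the sketch assumes that the image dimensions are multiples of the macroblock size and that no clipping to the valid pixel range is performed; the description does not specify how out-of-range values are handled.

```python
def embed(image, w, a=10):
    """Apply I' = I + a * w to every non-overlapping macroblock of the image.

    image : 2-D list of pixel values
    w     : digital watermark, the size of one macroblock
    a     : strength parameter (the description suggests about 10)
    """
    bh, bw = len(w), len(w[0])
    out = [row[:] for row in image]  # leave the input untouched
    for y in range(0, len(image) - bh + 1, bh):
        for x in range(0, len(image[0]) - bw + 1, bw):
            for j in range(bh):
                for i in range(bw):
                    out[y + j][x + i] = image[y + j][x + i] + a * w[j][i]
    return out


flat = [[128] * 24 for _ in range(24)]   # 24 x 24 image, 36 macroblocks
w = [[1.0, -1.0, 0.5, 0.0]] * 4          # toy watermark for illustration
marked = embed(flat, w, a=10)
print(marked[0][:4])
```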
[0088]
By repeatedly executing digital watermark embedding processing for all macroblocks, it is possible to embed watermark information in the entire image as a result. Since the additional information Inf composed of n bits is embedded in one macroblock, the embedded additional information Inf can be extracted if there is at least one macroblock. In other words, the extraction of the additional information Inf does not require the entire image, and it is sufficient if there is a part of the image data I (at least one macro block). The fact that the additional information Inf can be completely extracted from a part of the image data I is referred to as “cut-off resistance”.
[0089]
The image data I ′ in which the additional information Inf generated in this way is embedded as a digital watermark becomes the final output of the embedding processing unit through the image output unit 4006.
[0090]
[Digital Watermark Extraction Processing Unit]
FIG. 19 is a block diagram illustrating a configuration of an extraction processing unit (functional unit) that extracts watermark information embedded in an image.
[0091]
The extraction processing unit includes an image input unit 4601, a key information input unit 4602, an extraction pattern generation unit 4603, a digital watermark extraction unit 4604, and a digital watermark output unit 4605. The digital watermark extraction process may also be realized by software having the above-described configuration.
[0092]
The image input unit 4601 receives image data I″ in which watermark information may be embedded. The image data I″ input to the image input unit 4601 may be image data I′ in which watermark information has been embedded by the above-described embedding processing unit, image data I′ that has been subjected to an attack, or image data I in which no watermark information is embedded.
[0093]
The key information input unit 4602 inputs key information k for extracting watermark information. The key information k input here must be the same as that input to the key information input unit 4003 of the embedding processing unit described above. If different key information is input, additional information cannot be extracted correctly. In other words, only the user having the correct key information k can extract the correct additional information Inf ′.
[0094]
The extraction pattern generation unit 4603 receives the key information k and generates extraction patterns based on the key information k. FIG. 20 is a diagram illustrating details of the processing of the extraction pattern generation unit 4603. The extraction pattern generation unit 4603 includes a basic matrix generation unit 4701, a pseudo-random number generation unit 4702, and a pseudo-random number assigning unit 4703. The basic matrix generation unit 4701 and the pseudo-random number generation unit 4702 perform the same operations as the basic matrix generation unit 4101 and the pseudorandom number generator 4102 described above, and thus a detailed description is omitted. However, unless the basic matrix m generated by the basic matrix generation unit 4701 and the basic matrix m generated by the basic matrix generation unit 4101 are the same for the same key information k, the additional information cannot be extracted correctly.
[0095]
The pseudo-random number assigning unit 4703 receives the basic matrix m and the pseudo-random number sequence r, and assigns each element of the pseudo-random number sequence r to predetermined elements of the basic matrix m. The difference from the pseudo-random number assigning unit 4103 of the embedding processing unit is that the pseudo-random number assigning unit 4103 outputs one digital watermark w, whereas the pseudo-random number assigning unit 4703 outputs as many extraction patterns w1, w2, ..., wn as there are bits in the additional information Inf (n bits here).
[0096]
Details of assigning each element of the pseudo-random number sequence r to predetermined elements of the basic matrix m will be described using the matrix 4204 shown in FIG. 15 as an example. When the matrix 4204 is used, 4-bit additional information Inf can be embedded, so four extraction patterns w1, w2, w3, and w4 are output. Specifically, the elements of the matrix 4204 are scanned in raster order, and the elements of the pseudo-random number sequence r are assigned in turn to the elements having the value “1”. When the assignment to all elements having the value “1” is completed, the matrix to which the pseudo-random number sequence r has been assigned is generated as the extraction pattern w1. FIG. 21 is a diagram illustrating an example of the extraction patterns, in which the real number sequence r = {0.7, −0.6, −0.9, 0.8} is used as the pseudo-random number sequence r. The same processing is executed for the elements having the values “2”, “3”, and “4” of the matrix 4204 to generate the extraction patterns w2, w3, and w4, respectively. When the extraction patterns w1, w2, w3, and w4 generated in this way are superimposed, the result is equal to the digital watermark w generated by the embedding processing unit, apart from the sign inversion applied there to the elements of bits whose value is '0'.
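The generation of the extraction patterns can be sketched as follows. As before, the 4 × 4 layout standing in for the matrix 4204 is hypothetical, and the function name `make_extraction_patterns` is invented for illustration.

```python
# Hypothetical layout for the matrix 4204: value k marks the pixels of bit k.
M4204 = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]


def make_extraction_patterns(m, r):
    """Return one extraction pattern per bit: [w1, w2, ..., wn].

    For bit k, the elements of m equal to k receive r[0], r[1], ... in
    raster order; every other element of the pattern is 0.
    """
    n_bits = max(v for row in m for v in row)
    patterns = []
    for k in range(1, n_bits + 1):
        idx = 0
        wk = []
        for row in m:
            out = []
            for v in row:
                if v == k:
                    out.append(r[idx])
                    idx += 1
                else:
                    out.append(0.0)
            wk.append(out)
        patterns.append(wk)
    return patterns


patterns = make_extraction_patterns(M4204, [0.7, -0.6, -0.9, 0.8])
for row in patterns[0]:  # extraction pattern w1
    print(row)
```

Summing the patterns element by element gives the watermark that would be embedded for an all-ones Inf; for other values of Inf the elements of the '0' bits differ in sign.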
[0097]
The digital watermark extraction unit 4604 receives the image data I″ and the extraction patterns w1, w2, ..., wn, and extracts the additional information Inf′ from the image data I″. The extracted additional information Inf′ should ideally equal the embedded additional information Inf, but the two do not necessarily match when the image data I′ has been subjected to various attacks.
[0098]
The digital watermark extraction unit 4604 calculates the correlation between the integrated image c generated from the image data I″ and each of the extraction patterns w1, w2, ..., wn. The integrated image c is obtained by dividing the image data I″ into non-overlapping macroblocks and calculating, for each element position, the average of the element values over the macroblocks. FIG. 22 is a diagram for explaining the integrated image c when 4 × 4 pixel extraction patterns and 24 × 24 pixel image data I″ are input. The image data I″ shown in FIG. 22 is divided into 36 macroblocks, and the integrated image c is obtained by averaging the element values of these 36 macroblocks.
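A minimal sketch of computing the integrated image c is shown below. It assumes the reading that c has the size of one macroblock and that each of its elements is the average, over all macroblocks, of the pixel at the same position within the macroblock. The function name `integrated_image` is illustrative.

```python
def integrated_image(image, bh=4, bw=4):
    """Average all non-overlapping bh x bw macroblocks element by element."""
    rows, cols = len(image) // bh, len(image[0]) // bw
    n_blocks = rows * cols
    acc = [[0.0] * bw for _ in range(bh)]
    for by in range(rows):
        for bx in range(cols):
            for j in range(bh):
                for i in range(bw):
                    acc[j][i] += image[by * bh + j][bx * bw + i]
    return [[v / n_blocks for v in row] for row in acc]


# 24 x 24 test image whose value depends only on the position in the block
img = [[(y % 4) * 4 + (x % 4) for x in range(24)] for y in range(24)]
c = integrated_image(img)
for row in c:
    print(row)
```

Averaging over many macroblocks suppresses the image content, which varies from block to block, while the watermark, which repeats identically in every block, is retained.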
[0099]
The correlations between the integrated image c generated in this way and the extraction patterns w1, w2, ..., wn are then calculated. The correlation coefficient is a statistic that measures the degree of similarity between the integrated image c and the extraction pattern wn, and is expressed by Expression (3).
$$\rho = \frac{c'^{T} \cdot w'_n}{|c'^{T}|\,|w'_n|} \qquad (3)$$
where $c'$ and $w'_n$ are the matrices obtained by subtracting from each element of $c$ and $w_n$, respectively, the average value of the elements of that matrix, and
$c'^{T}$ is the transpose of $c'$.
[0100]
The correlation coefficient ρ takes a value from −1 to +1. When the positive correlation between the integrated image c and the extraction pattern wn is strong, ρ approaches +1, and when the negative correlation is strong, ρ approaches −1. A “strong positive correlation” means the relationship “the larger the integrated image c, the larger the extraction pattern wn”, and a “strong negative correlation” means the relationship “the larger the integrated image c, the smaller the extraction pattern wn”. If there is no correlation between the integrated image c and the extraction pattern wn, ρ is 0.
[0101]
Based on the correlation thus calculated, it is determined whether or not the additional information Inf′ is embedded in the image data I″ and, if it is embedded, whether each bit constituting the additional information Inf′ is '1' or '0'. That is, the correlation coefficients between the integrated image c and the extraction patterns w1, w2, ..., wn are calculated. When a calculated correlation coefficient is close to 0, it is determined that “additional information is not embedded”; when the correlation coefficient is a positive number away from 0, it is determined that '1' is embedded; and when the correlation coefficient is a negative number away from 0, it is determined that '0' is embedded.
[0102]
Obtaining the correlation is equivalent to evaluating the similarity between the integrated image c and each of the extraction patterns w1, w2, ..., wn. That is, when the patterns corresponding to the extraction patterns w1, w2, ..., wn have been embedded in the image data I″ (integrated image c) by the embedding processing unit described above, correlation values indicating a high similarity are calculated.
[0103]
FIG. 23 is a diagram illustrating an example of extracting a digital watermark from image data I ″ (integrated image c) embedded with 4-bit additional information using w1, w2, w3, and w4.
[0104]
Correlation values between the integrated image c and the four extraction patterns w1, w2, w3, and w4 are calculated. When the additional information Inf′ is embedded in the image data I″ (integrated image c), the correlation values are calculated as, for example, 0.9, −0.8, −0.85, and 0.7. From this result, the additional information Inf′ is determined to be “1001”, and the 4-bit additional information Inf′ is finally extracted.
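The correlation of Expression (3) and the bit decision can be sketched as follows. The function names and the decision threshold of 0.3 are assumptions made for illustration; the description only says that a coefficient “close to 0” means nothing is embedded and does not give a numeric threshold.

```python
import math


def correlation(c, wn):
    """Correlation coefficient of Expression (3) between two matrices."""
    cf = [v for row in c for v in row]
    wf = [v for row in wn for v in row]
    c_mean = sum(cf) / len(cf)
    w_mean = sum(wf) / len(wf)
    cd = [v - c_mean for v in cf]   # c'
    wd = [v - w_mean for v in wf]   # w'n
    num = sum(a * b for a, b in zip(cd, wd))
    den = math.sqrt(sum(a * a for a in cd)) * math.sqrt(sum(b * b for b in wd))
    return num / den if den else 0.0


def decide_bit(rho, threshold=0.3):
    """Return '1', '0', or None when no embedding is detected.

    The threshold is a hypothetical value, not taken from the description.
    """
    if rho > threshold:
        return "1"
    if rho < -threshold:
        return "0"
    return None


# Correlation values quoted in the description for w1 to w4
rhos = [0.9, -0.8, -0.85, 0.7]
inf_extracted = "".join(decide_bit(r) for r in rhos)
print(inf_extracted)
```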
[0105]
The n-bit additional information Inf′ thus extracted is output as the extraction result of the extraction processing unit through the digital watermark output unit 4605. At that time, if error correction encoding or encryption was performed in the embedding processing unit when the additional information Inf was embedded, error correction decoding or decryption is executed. The information thus obtained is finally output as a binary data string (additional information Inf′).
[0106]
[Modification]
In the above description, an example in which a document watermark and a background watermark are selectively used as a watermark has been described. However, the present invention is not limited to this, and an optimal watermark method may be used for each object.
[0107]
Moreover, although an example in which authentication control is realized using a password has been described, the present invention is not limited to this, and authentication control may be realized by key control.
[0108]
Second Embodiment
The image processing apparatus according to the second embodiment of the present invention will be described below. Note that in the second embodiment, identical symbols are assigned to configurations similar to those in the first embodiment and detailed description thereof is omitted.
[0109]
[Configuration]
FIG. 24 is an external view showing a configuration example of a digital copying machine. The digital copying machine includes a reader unit 51 that digitally reads an original image and performs predetermined image processing to generate digital image data, and a printer unit 52 that generates a copy image using the generated digital image data.
[0110]
The document feeder 5101 of the reader unit 51 feeds the documents one by one onto the platen glass 5102, starting from the last page, and discharges each document from the platen glass 5102 after the document image has been read. When a document is conveyed onto the platen glass 5102, the lamp 5103 is turned on, the scanner unit 5104 starts to move, and the document is exposed and scanned. The light reflected from the document at this time is imaged on a CCD image sensor (hereinafter referred to as “CCD”) 5109 by the mirrors 5105, 5106, and 5107 and the lens 5108. The scanned document image is thus read by the CCD 5109, and the image signal output from the CCD 5109 is subjected to image processing such as shading correction and sharpness correction by the image processing unit 5110 and is then transferred to the printer unit 52.
[0111]
The laser driver 5221 of the printer unit 52 drives the laser light emitting unit 5201 in accordance with the image data input from the reader unit 51. The laser light output from the laser light emitting unit 5201 is scanned over the photosensitive drum 5202 by a polygon mirror to form a latent image on the photosensitive drum 5202. A developer (toner) is attached to the latent image formed on the photosensitive drum 5202 by the developing device 5203.
[0112]
The recording paper supplied from the cassette 5204 or the cassette 5205 is conveyed to the transfer unit 5206 in synchronization with the start of laser beam irradiation, and the developer attached to the photosensitive drum 5202 is transferred. The recording paper to which the developer has been transferred is conveyed to the fixing unit 5207, and the developer is fixed to the recording paper by the heat and pressure of the fixing unit 5207. The recording paper that has passed through the fixing unit 5207 is discharged by a discharge roller 5208. The sorter 5220 sorts the recording paper by storing the discharged recording paper in each bin. The sorter 5220 stores the recording paper in the uppermost bin when the sorting is not set.
[0113]
When double-sided recording is set, the recording sheet is conveyed to the discharge roller 5208 and then guided to the refeed conveyance path by the reversely rotated discharge roller 5208 and flapper 5209. When multiple recording is set, the recording paper is not conveyed to the discharge roller 5208 but is guided to the refeed conveyance path by the flapper 5209. The recording paper guided to the refeed conveyance path is fed to the transfer unit 5206 at the timing described above.
[0114]
[Processing]
FIG. 25 is a flowchart showing processing for concealing an area-designated image, which is executed by the image processing unit 5110 of the reader unit 51.
[0115]
When the image signal read from the document is input, the image processing unit 5110 generates digital image data obtained by quantizing the luminance information of a fine pixel unit with an accuracy of usually about 8 bits (S101). The spatial resolution of the pixels is about 42 μm × 42 μm, which is a resolution of about 600 pixels (600 dpi) per inch (25.4 mm). The image processing unit 5110 displays the image represented by the generated image data on the screen of the operation unit shown in FIG.
[0116]
The operation unit is usually composed of a liquid crystal display or the like that is covered with a touch panel, and can perform a desired operation by operating buttons displayed on the screen. In FIG. 6, a button 601 is a button for selecting a device mode. In the “copy” mode, the read original image is copied (output from the printer unit 52). In the “send” mode, the image data of the read original is converted into an electronic file. In the “accumulation” mode, the image data of the read original is accumulated as an electronic file in an auxiliary storage device such as a hard disk built in the apparatus. Here, assuming that the “copy” mode is selected, the button frame is indicated by a bold line.
[0117]
A display unit 602 displays basic operation conditions of the apparatus according to the selected mode, and displays the output paper size and the enlargement / reduction ratio when the copy mode is selected. The preview display unit 603 reduces and displays the entire image read by the reader unit 51. A frame 604 displayed on the preview display unit 603 indicates an area set for the preview-displayed image. An area indicated by a frame 604 (hereinafter referred to as “area”) is enlarged / reduced by a button 605 and is moved up / down / left / right by a button 606. In other words, the size and position of the area 604 on the preview display unit 603 are changed by operating the buttons 605 and 606.
[0118]
A box 607 is a text box for inputting authentication information, which will be described later. For example, a character string of about four digits is input using a numeric keypad (not shown), and as many symbols such as “*” as there are input characters are displayed. The reason “*” is displayed instead of the input character string itself is to improve security.
[0119]
The image processing unit 5110 accepts the user's designation of the area 604 and input of authentication information via the operation unit (S102, S103). When the designation and input are completed, the image processing unit 5110 cuts out the image data of the designated area 604 from the input image data (S104), discriminates the type of the cut-out image data (S105), selects an image compression method according to the discrimination result (S106), and compresses the cut-out image data by the selected image compression method (S107). Next, code data is generated by combining the compressed image data, the identification code indicating the image compression method used, and the input authentication information (S108), and the generated code data is converted into bitmap data by a method described later (S109). Then, the image data in the area 604 is erased from the input image data (S110), the bitmap data obtained in step S109 is inserted into the blank area left by the erasure to synthesize image data (S111), and the synthesized image data is output (S112).
[0120]
Here, since the copy mode is selected as the apparatus mode, the output image data is sent to the printer unit 52, and a copy image is formed on the recording paper. Similarly, if the transmission mode is selected as the device mode, the output image data is sent to the network communication unit and electronically transferred to a predetermined destination. If the accumulation mode is selected, the output image data is accumulated in the auxiliary storage device in the apparatus.
[0121]
FIG. 27 is a flowchart for explaining in detail the processing from steps S105 to S111.
[0122]
The type of the cut-out image data is discriminated, that is, it is determined whether the area-designated image is a continuous tone image such as a photograph or a binary image such as a character / line image (S203). Various discrimination methods have been proposed, such as a method using a histogram showing the luminance distribution of the target image, a method using the occurrence frequency of each spatial frequency component, and a method using whether the probability of being recognized as a “line” by pattern matching is high, and such known methods can be used.
[0123]
If the image is determined to be a character / line drawing, a histogram indicating the luminance distribution of the image is generated (S204), and an optimal threshold value is obtained from this histogram to separate the background from the character / line drawing (S205). The threshold value is used to binarize the image data (S206), and the obtained binary image data is compressed (S207). A known binary image compression method can be applied to this compression processing. Normally, as a binary image compression method, a lossless (reversible) compression method in which no loss of information occurs, for example, one of MMR compression, MR compression, MH compression, JBIG compression, or the like is used. Of course, any of the above methods can be used adaptively so that the code size after compression is minimized.
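The description does not say how the optimal threshold is derived from the histogram. As one possible illustration, the sketch below uses Otsu's method, which is a substituted, well-known technique and not necessarily the method intended here. The function names and the convention that 1 means black (character or line) are assumptions.

```python
def otsu_threshold(pixels):
    """Pick the threshold that maximizes between-class variance (Otsu).

    pixels : iterable of 8-bit luminance values (0 to 255)
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their luminance values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels, threshold):
    """Map dark pixels (character / line) to 1 and background to 0."""
    return [1 if p <= threshold else 0 for p in pixels]


sample = [10] * 50 + [200] * 50   # dark text pixels and bright background
t = otsu_threshold(sample)
print(t, sum(binarize(sample, t)))
```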
[0124]
On the other hand, if the image is determined to be a continuous tone image, resolution conversion is performed (S208). The input image data is read at, for example, 600 dpi, but a gradation image such as a photograph usually shows no visible degradation even at about 300 dpi. Therefore, to reduce the final code size, the image data is converted, for example, into image data equivalent to 300 dpi by reducing it to half size both vertically and horizontally. Then, the 300-dpi multi-value image data is compressed (S209). As a compression method suitable for a multi-value image, the well-known JPEG compression method, JPEG2000 compression method, or the like can be used. These compression methods are lossy (irreversible), but the resulting degradation relative to the original image is normally difficult to discern visually.
[0125]
Code information for identifying the compression method is added to the obtained compressed image data (S210). This is information necessary for designating a decompression method when restoring the output image to the original image. For example, the following identification codes are assigned in advance to the respective compression methods.
JPEG compression → BB
JPEG2000 → CC
MMR compression → DD
MH compression → EE
JBIG compression → FF
[0126]
Next, an authentication information code is added (S211). The authentication information is information necessary to determine whether or not the person who is going to restore has the authority when restoring the output image to the original image. At the time of restoration, restoration processing to the original image is performed only when the authentication information added here is correctly specified.
[0127]
The digital signal sequence of the code data obtained in this way is converted into binary bitmap data as a binary number (S212), and is fitted into the area 604 and synthesized (S213).
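The assembly of the code data in steps S210 to S212 can be sketched as follows. The byte layout (four ASCII characters of authentication information, then a one-byte identification code, then the compressed data) and the interpretation of the identification codes BB to FF as hexadecimal byte values are assumptions made for illustration; the description leaves both open.

```python
# Identification codes from the description, read here as hexadecimal bytes.
COMPRESSION_CODES = {
    "JPEG": 0xBB,
    "JPEG2000": 0xCC,
    "MMR": 0xDD,
    "MH": 0xEE,
    "JBIG": 0xFF,
}


def build_code_data(compressed, method, auth):
    """Concatenate authentication info, identification code, and payload.

    Hypothetical layout: [auth (ASCII)] [1-byte method code] [compressed data]
    """
    return auth.encode("ascii") + bytes([COMPRESSION_CODES[method]]) + compressed


def to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


code = build_code_data(b"\x01\x02", "MMR", "1234")
bits = to_bits(code)
print(code, len(bits))
```

The resulting bit list is what would then be rendered as black and white pixels and fitted into the area 604.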
[0128]
FIG. 28 schematically shows the above operation. When an area 604 is designated for the input image data 301, the image data in the area 604 is erased and replaced with bit mapped code data.
[0129]
FIG. 29 is a diagram schematically illustrating the flow shown in FIG.
[0130]
An image of Sx × Sy pixels in the area 604 is cut out and, since it is determined to be a character / line image, it is binarized and losslessly compressed. Then, for example, a compression-method identification code is added to the compressed code string, authentication information is further added at the head, and the result is converted into a binary number and then into a bitmap. Bitmap data of Sx × Sy pixels, the same size as the area 604, is thus generated, and the image in the area 604 is replaced with this bitmap data. Of course, the positions at which the identification code and the authentication information are added are arbitrary as long as they are predetermined, such as the end of the code string or a predetermined bit position. Furthermore, to ensure that the identification code and the authentication information can be extracted, they may be added repeatedly at a plurality of positions.
[0131]
[Bitmap of coded data]
FIGS. 30 to 32 are diagrams for explaining a method of converting code data into a bitmap, and show three different methods. Each small rectangle represents one pixel of 600 dpi.
[0132]
In the method shown in FIG. 30, the 600-dpi pixels are bitmapped so that 2 × 2 pixels carry 1 bit of information. If the code data (left side) expressed as a binary number is '1', the four pixels of the 2 × 2 block are set to '1' (black), and if the code data is '0', the four pixels are set to '0' (white). As a result, binary bitmap data having half the resolution of 600 dpi (300 dpi) is generated. The reason 2 × 2 pixels are used to represent 1 bit of information is to reduce the influence of the reader's reading accuracy, misalignment, magnification error, and the like when a bitmap image printed on recording paper according to the embodiment is scanned to restore the original image, so that the code data can be accurately restored from the bitmap image.
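The method of FIG. 30 can be sketched as follows. The encoding follows the description; the decoding by majority vote over the four pixels is an assumption added to illustrate how redundancy tolerates a misread pixel, and the function names are illustrative.

```python
def bits_to_bitmap(bits, width_bits):
    """Render each bit as a 2 x 2 block of identical pixels (1 = black).

    width_bits : number of bits laid out per row of blocks
    """
    rows = []
    for start in range(0, len(bits), width_bits):
        line = []
        for b in bits[start:start + width_bits]:
            line += [b, b]
        line += [0] * (2 * width_bits - len(line))  # pad a short last row
        rows.append(line)
        rows.append(line[:])
    return rows


def bitmap_to_bits(bitmap):
    """Recover bits by majority vote inside each 2 x 2 block."""
    bits = []
    for y in range(0, len(bitmap), 2):
        for x in range(0, len(bitmap[0]), 2):
            black = (bitmap[y][x] + bitmap[y][x + 1]
                     + bitmap[y + 1][x] + bitmap[y + 1][x + 1])
            bits.append(1 if black >= 2 else 0)
    return bits


bitmap = bits_to_bitmap([1, 0, 1, 1], width_bits=2)
for row in bitmap:
    print(row)
```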
[0133]
The method shown in FIG. 31 does not set all 2 × 2 pixels to the same value. When the code data is '1', the upper left small pixel (equivalent to 600 dpi) of the four pixels is set to '1' (black), and when the code data is '0', the lower right small pixel is set to '1' (black). With such a configuration, reliability is improved when the printed bitmap image is scanned to restore the original image.
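A sketch of the FIG. 31 cell is given below. The encoding follows the description; decoding by comparing the upper left and lower right pixels is an assumption, and the function names are illustrative.

```python
def encode_cell(bit):
    """Return the 2 x 2 cell for one bit (1 = black pixel)."""
    if bit == 1:
        return [[1, 0], [0, 0]]   # '1': upper left pixel is black
    return [[0, 0], [0, 1]]       # '0': lower right pixel is black


def decode_cell(cell):
    """Decide the bit from which corner of the cell is darker."""
    return 1 if cell[0][0] > cell[1][1] else 0


print(encode_cell(1), encode_cell(0))
```

Because every cell contains exactly one black pixel regardless of the bit value, the cell grid remains visible even for long runs of identical bits, which may be one reason this layout reads more reliably.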
[0134]
In the method shown in FIG. 32, pixels representing 1 bit are 4 × 2 pixels, and “1” and “0” are represented by the arrangement of monochrome pixels as illustrated. This reduces the amount of data that can be recorded per unit area, but can further improve the reading accuracy when restoring the original image.
[0135]
The bitmapping method is not limited to the above method, and other various methods can be applied.
[0136]
Next, the size of the generated bitmap and the amount of information that can be embedded therein will be described.
[0137]
If the size of the area 604 (Sx × Sy pixel) is 2 inches square (about 5 cm in length and width) on the document, the original image data is 600 dpi, so both Sx and Sy are 1200 pixels. That is, the information amount of the image data in the area 604 is as follows with 8 bits per pixel.
1200 × 1200 × 8 = 11,520,000 bits = 11M bits
[0138]
When the code data is converted into a bitmap by the above-described methods and substituted for the image of the area 604, with the methods of FIGS. 30 and 31 the amount of information that can be recorded is 1/4 × 1/8 = 1/32 of the original image data, because one bit of information is embedded in four pixels and each original pixel has 8 bits. The amount of data that can be embedded in the 2-inch square area 604 is therefore as follows.
11M / 32 = 0.34Mbit
[0139]
In other words, 11 Mbits of image data would have to be compressed to 1/32, that is, 0.34 Mbits, which is unrealistic. Therefore, as described above, it is necessary to discriminate the image attribute of the area 604 and adaptively switch among binarization, resolution conversion, and the compression method. If the image in the area 604 is a character / line drawing, the image is binarized with the resolution kept at 600 dpi. This reduces the image data volume to 1/8, that is, (11/8 =) 1.38 Mbits, and compression to 1/4 is necessary to reduce this to 0.34 Mbits. This compression ratio can be easily achieved by the binary image compression methods described above. Of course, since the identification code of the compression method and the authentication information also need to be embedded, a compression ratio somewhat higher than 1/4 is required, but it can be achieved relatively easily.
[0140]
On the other hand, in the case of a photo / gradation image, the number of gradations remains 8 bits, and halving the resolution (to 300 dpi) reduces the data volume to 1/4, that is, (11/4 =) 2.75 Mbits. To reduce this further to 0.34 Mbits, compression to 1/8 is required, which is a compression ratio that can be achieved very easily with the JPEG and JPEG2000 compression methods while suppressing image quality degradation.
[0141]
In addition, if the bitmap conversion shown in FIG. 32 is adopted, the amount of information that can be embedded is further halved and the compression ratio needs to be doubled, but this is not an unrealistic value for the compression methods described above.
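The figures in this estimate can be reproduced with a few lines. The sketch assumes that “M” denotes 2^20, which is the reading under which 11,520,000 bits rounds to 11 Mbits and 360,000 bits to 0.34 Mbits; the function name `capacity_report` is illustrative.

```python
MBIT = 2 ** 20  # assumption: "M" is read as 2^20 so the rounded figures match


def capacity_report(inches=2, dpi=600, pixels_per_bit=4):
    """Compute raw size, bitmap capacity, and required compression ratios."""
    side = inches * dpi                       # 1200 pixels
    raw_bits = side * side * 8                # 8-bit original image
    capacity_bits = side * side // pixels_per_bit
    text_bits = side * side                   # binarized, still 600 dpi
    photo_bits = (side // 2) ** 2 * 8         # 8-bit, reduced to 300 dpi
    return {
        "raw_bits": raw_bits,
        "capacity_bits": capacity_bits,
        "text_ratio": text_bits / capacity_bits,    # needed compression, 1/x
        "photo_ratio": photo_bits / capacity_bits,  # needed compression, 1/x
    }


fig30 = capacity_report(pixels_per_bit=4)   # FIGS. 30 and 31
fig32 = capacity_report(pixels_per_bit=8)   # FIG. 32
print(fig30["raw_bits"] / MBIT, fig30["capacity_bits"] / MBIT)
print(fig30["text_ratio"], fig30["photo_ratio"])
print(fig32["text_ratio"], fig32["photo_ratio"])
```

With 4 pixels per bit the required ratios are 1/4 for binarized text and 1/8 for a 300-dpi photograph; with the 8 pixels per bit of FIG. 32 they double to 1/8 and 1/16.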
[0142]
[Restore original image]
FIG. 33 is a flowchart for explaining a method of restoring the original image from the bitmap, which is executed by the image processing unit 5110 of the reader unit 51.
[0143]
The image processing unit 5110 inputs an image (S801). In the case of print output, the image may be read as a digital image by the reader unit 51, and may be input as it is as a digital image if it is electronically transmitted or stored.
[0144]
Next, the image processing unit 5110 detects a concealed image area in the input image (S802). For this detection, a method is used in which rectangular areas included in the input image are detected, and a detected rectangular area is determined to be a concealed image area if black pixels and white pixels alternate periodically within it.
[0145]
Next, an array of pixels is read from the detected image data of the concealed image area (S803), a bit map conversion method for the image data is determined, and a binary code string is restored (S804). An identification code indicating a compression method is extracted from the sequence (S805), and authentication information is extracted (S806).
[0146]
Next, the fact that there is an image concealed in the input image is displayed on the screen of the operation unit, etc., and the user is prompted to input authentication information for restoring the image (S807). When authentication information is input, it is determined whether or not the input authentication information matches the extracted authentication information (S808). If the authentication information does not match, the input image is output as it is (S813).
[0147]
If they match, the original image is restored. The code data of the compressed image, excluding the compression method identification code and the authentication information, is extracted from the code string (S809), and the extracted code data is decompressed by the decompression process of the compression method corresponding to the extracted identification code (S810). The detected concealed image area is then replaced with the decompressed image (S811), and the resulting composite image is output (S812). The image output here is the original image as it was before the area-designated image was concealed.
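Steps S804 to S810 can be sketched roughly as follows. The byte layout of the code string (one byte for the identification code, one byte for the length of the authentication information, then the authentication information, then the compressed data) is an assumption made for this illustration only, as are all function names. zlib stands in for the JPEG or JPEG2000 decoder so that the sketch is self-contained.

```python
import hmac
import zlib

# Hypothetical identification codes mapped to decompression functions.
DECOMPRESSORS = {0: lambda data: data, 1: zlib.decompress}


def bits_to_bytes(bits):
    """S804: pack a list of 0/1 values (most significant bit first) into bytes."""
    if len(bits) % 8:
        raise ValueError("bit count must be a multiple of 8")
    return bytes(
        int("".join(str(b) for b in bits[i:i + 8]), 2)
        for i in range(0, len(bits), 8)
    )


def parse_code_string(code):
    """S805, S806, S809: split into identification code, auth info, payload."""
    method_id = code[0]
    auth_len = code[1]
    auth = code[2:2 + auth_len]
    payload = code[2 + auth_len:]
    return method_id, auth, payload


def restore(code, entered_auth):
    """Return the decompressed image bytes, or None if authentication fails."""
    method_id, auth, payload = parse_code_string(code)
    if not hmac.compare_digest(auth, entered_auth):   # S808
        return None        # S813: the caller outputs the input image as it is
    return DECOMPRESSORS[method_id](payload)          # S810
```

In this sketch the caller would paste the returned image over the detected concealed area (S811) and output the composite image (S812).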
[0148]
In this way, the code data obtained by efficiently compressing the area-designated partial image is converted into a bitmap and combined with the original image, so that the area-designated image is replaced with a visually unintelligible image and is thereby concealed. When such an unintelligible image (a concealed image area) is present, the image in the area is recognized (decoded) as code data, and, by referring to the authentication information set in the code data, the original image can be restored for a user authorized to browse it or the like, based on the identification code of the compression method set in the code data.
[0149]
Therefore, a user having a predetermined authority can restore the original image and display, print, copy, transmit and / or store it. The authentication information may be set individually for each of the image operations of display, printing, copying, transmission, and storage, may be set collectively for all of them, or the image operations such as display, printing, copying, and transmission may be grouped and the authentication information set for each group.
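The three ways of setting the authentication information (individually, collectively, or per group) can be illustrated with a simple lookup table. This is a hypothetical sketch: the operation names, the function names, and the representation of authentication information as byte strings are all assumptions made for illustration.

```python
import hmac

OPERATIONS = ("display", "print", "copy", "transmit", "store")


def individual_policy(codes):
    """One code per operation; `codes` maps an operation name to bytes."""
    return dict(codes)


def collective_policy(code):
    """A single code shared by every operation."""
    return {op: code for op in OPERATIONS}


def grouped_policy(groups):
    """`groups` is a list of (operations, code) pairs, one per group."""
    policy = {}
    for ops, code in groups:
        for op in ops:
            policy[op] = code
    return policy


def is_permitted(policy, operation, entered):
    """True only if a code is set for `operation` and `entered` matches it."""
    expected = policy.get(operation)
    return expected is not None and hmac.compare_digest(expected, entered)
```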
[0150]
[Modification]
In the above, as shown in FIG. 28 and the like, an example has been described in which one area 604 is designated and the image of that area is concealed. However, the concealed area is not limited to one, and a plurality of areas can be designated. In that case, the processing of steps S102 to S111 may be repeated for each area designation. Further, when the original image is restored from an image having a plurality of concealed image areas, the processes of steps S803 to S811 may be repeated for each detected concealed image area.
[0151]
In the above description, the information concealment, encoding, and restoration methods have been described for a document image read by a digital copying machine, but they can also be applied to documents and drawings on a PC (personal computer). In this case, when printing of a document or drawing is instructed, the device driver corresponding to the printer to be used is activated, and image data for print output is generated based on the print code generated by an application on the PC. As shown in FIG. 26, the device driver displays a preview of the generated image data on the user interface screen, and accepts designation of an area 604 that the user wishes to conceal and input of authentication information. Further, the concealed image area is detected. The subsequent processes are the same as described above, except that they are realized by the device driver on the PC (specifically, by a CPU that executes the device driver software).
[0152]
In the above description, an example in which the code data is converted into a bitmap has been described. However, the original image may not be restored accurately because of distortion of the printed image, dirt on the recording paper, or the like. To avoid such a failure, an error correction code can be added to the code data before it is converted into a bitmap, which improves the reliability of the data recorded as a bitmap. Various known error correction codes have been proposed, and any of them can be used. However, since the effective amount of information that can be embedded is reduced, the image compression rate must be set higher. Of course, in addition to adding an error correction code, the code data may also be encrypted before being converted into a bitmap in order to improve robustness against information leakage.
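As an illustration of this trade-off, the sketch below uses a simple repetition code with majority voting. This particular code is a stand-in chosen for brevity, not one named in the text, and the function names are hypothetical. Repeating every bit three times corrects one flipped bit per group, but leaves only one third of the bitmap capacity for the compressed image.

```python
def ecc_encode(bits, n=3):
    """Repeat every bit n times before the bitmap conversion."""
    return [b for b in bits for _ in range(n)]


def ecc_decode(coded, n=3):
    """Recover each bit by majority vote over its group of n copies."""
    if len(coded) % n:
        raise ValueError("coded length must be a multiple of n")
    return [
        1 if 2 * sum(coded[i:i + n]) > n else 0
        for i in range(0, len(coded), n)
    ]


data = [1, 0, 1, 1]
coded = ecc_encode(data)          # three times as many bits as `data`
coded[1] ^= 1                     # simulate a speck of dirt on the paper
recovered = ecc_decode(coded)     # majority vote restores the damaged bit
```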
[0153]
[Other Embodiments]
Note that the present invention may be applied to a system including a plurality of devices (for example, a host computer, an interface device, a reader, and a printer), or to an apparatus consisting of a single device (for example, a copying machine or a facsimile machine).
[0154]
Needless to say, the object of the present invention can also be achieved by supplying a system or apparatus with a storage medium (or recording medium) on which the program code of software that realizes the functions of the above-described embodiments is recorded, and by having the computer (or CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium. In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiments, and the storage medium storing the program code constitutes the present invention. Needless to say, the invention covers not only the case where the functions of the above-described embodiments are realized by the computer executing the read program code, but also the case where an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.
[0155]
Furthermore, needless to say, the invention also covers the case where the program code read from the storage medium is written into a memory provided in a function expansion card inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided in the function expansion card or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.
[0156]
When the present invention is applied to the storage medium, the storage medium stores program codes corresponding to the flowcharts described above.
[0157]
[Effect of the Invention]
As described above, according to the present invention, processing of a text area in image information can be controlled adaptively according to the display or non-display selected by the user.
[0158]
In addition, the image of a text area can be protected.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of an image processing system according to an embodiment;
FIG. 2 is a block diagram illustrating a configuration example of an MFP.
FIG. 3 is a flowchart for explaining an overview of processing by the image processing system;
FIG. 4 is a flowchart for explaining an overview of processing by the image processing system;
FIG. 5 is a flowchart for explaining an overview of processing by the image processing system;
FIG. 6 is a flowchart for explaining an overview of processing by the image processing system;
FIG. 7 is a diagram for explaining block selection;
FIG. 8 is a diagram showing the result of block selection;
FIG. 9 is a diagram for explaining embedding of a document watermark;
FIG. 10 is a diagram for explaining extraction of a document watermark;
FIG. 11 is a diagram for explaining a document watermark embedding rule;
FIG. 12 is a flowchart showing a process for searching for a shift amount;
FIG. 13 is a block diagram showing the configuration of an embedding processing unit (functional unit) that embeds watermark information;
FIG. 14 is a block diagram showing details of a digital watermark generation unit;
FIG. 15 is a diagram showing an example of a basic matrix;
FIG. 16 is a diagram showing an example of a digital watermark w;
FIG. 17 is a diagram showing a digital watermark embedding process;
FIG. 18 is a diagram illustrating a configuration example of image data;
FIG. 19 is a block diagram illustrating a configuration of an extraction processing unit (functional unit) that extracts watermark information embedded in an image;
FIG. 20 is a diagram showing details of processing of the extraction pattern generation unit;
FIG. 21 is a diagram showing an example of an extraction pattern;
FIG. 22 is a diagram for explaining an integrated image;
FIG. 23 is a diagram showing an example of extracting a digital watermark;
FIG. 24 is an external view showing a configuration example of a digital copying machine;
FIG. 25 is a flowchart showing processing for concealing an area-designated image, which is executed by the image processing unit of the reader unit;
FIG. 26 is a diagram showing an outline of the operation unit;
FIG. 27 is a flowchart for explaining in detail the processing from steps S105 to S111 shown in FIG.
FIG. 28 is a diagram schematically showing the flow shown in FIG. 27;
FIG. 29 is a diagram schematically explaining the flow shown in FIG. 27;
FIG. 30 is a diagram for explaining a method of converting code data into a bitmap;
FIG. 31 is a diagram for explaining a method of converting code data into a bitmap;
FIG. 32 is a diagram for explaining a method of converting code data into a bitmap;
FIG. 33 is a flowchart for describing a method of restoring an original image from a bitmap, which is executed by the image processing unit 5110 of the reader unit.

Claims

An image processing method performed by an image processing apparatus,
The input means of the image processing apparatus inputs image information,
A recognition unit included in the image processing apparatus recognizes a plurality of text regions included in the input image information;
The generation unit included in the image processing apparatus generates authentication information for controlling at least one of display, printing, copying, and transmission processing for the text area for each text area,
The selection means included in the image processing apparatus selects either display or non-display of the text in the text area after at least one of display, printing, copying, and transmission processing is performed on the text area,
When display is selected in the selection, the embedding unit included in the image processing apparatus embeds the authentication information in the text area as a digital watermark in a state where the text in the text area is visible, and when non-display is selected in the selection, embeds the authentication information in the text area as a digital watermark in a state where the text in the text area is invisible.

When the means of the image processing apparatus makes the text invisible, the text information is stored in an external device,
The image processing method according to claim 1, wherein position information of the stored text information is embedded as a digital watermark.

Further, the means included in the image processing apparatus extracts authentication information embedded in the text area,
3. The image processing method according to claim 1, wherein at least one of display, printing, copying, and transmission processing for the text area is controlled based on the extracted authentication information.

The accepting means included in the image processing device accepts input of authentication information for controlling at least one of display, printing, duplication, and transmission processing for the text area,
A determination unit included in the image processing apparatus determines whether the received authentication information is correct authentication information;
When the authentication information is determined to be correct in the determination, the processing means included in the image processing apparatus performs at least one of display, printing, copying, and transmission processing for the text area,
When the authentication information is determined not to be correct in the determination, the storage means included in the image processing apparatus saves the text information of the text area outside the apparatus,
The image processing method according to claim 3, wherein the means included in the image processing apparatus embeds position information of the stored information as a digital watermark in a state where the text information of the text area is invisible.

An input means for inputting image information;
Recognition means for recognizing a plurality of text regions included in the input image information;
Generation means for generating, for each text area, authentication information for controlling at least one of display, printing, copying, and transmission processing for the text area;
A selection means for selecting either display or non-display of the text in the text area after performing at least one of display, printing, duplication, and transmission processing on the text area;
An embedding unit that, when display is selected by the selection means, embeds the authentication information as a digital watermark in each text area in a state where the text in the text area is visible, and, when non-display is selected by the selection means, embeds the authentication information as a digital watermark in a state where the text in the text area is invisible;
An image processing apparatus comprising the above means.

Furthermore,
Storage means for storing text information in an external device when the text is made invisible;
The image processing apparatus according to claim 5, further comprising means for embedding position information of the stored text information as a digital watermark.

Extraction means for extracting authentication information embedded in the text area;
The image processing apparatus according to claim 5 or 6, further comprising control means for controlling at least one of display, printing, copying, and transmission processing for the text area based on the extracted authentication information.

A program for causing a computer to execute the image processing method according to any one of claims 1 to 4.

A computer-readable storage medium storing the program according to claim 8.