JP3651777B2

JP3651777B2 - Digital watermark system, digital watermark analysis apparatus, digital watermark analysis method, and recording medium

Info

Publication number: JP3651777B2
Application number: JP2000361433A
Authority: JP
Inventors: 博文村谷
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2000-11-28
Filing date: 2000-11-28
Publication date: 2005-05-25
Anticipated expiration: 2020-11-28
Also published as: JP2002165081A

Description

【０００１】
【発明の属する技術分野】
本発明は、結託耐性符号の埋め込まれたデジタルコンテンツの複製物に対する結託攻撃に用いられたデジタルコンテンツの複製物の数を推定可能な電子透かしシステム、電子透かし解析装置及び電子透かし解析方法に関する。
【０００２】
【従来の技術】
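Digital content (for example, still images, moving images, speech, and music) is structured as a large number of digital data items. Within that structure there are portions whose data can be changed while the identity or economic value of the work is preserved. By changing data within this permitted range, various kinds of information can be embedded in the digital content. This technique is called digital watermarking.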
【０００３】
電子透かし技術によって、デジタルコンテンツに、様々な透かし情報（例えば、コンテンツの著作権者やユーザの識別情報、著作権者の権利情報、コンテンツの利用条件、その利用時に必要な秘密情報、コピー制御情報等、あるいはそれらを組み合わせたものなど)を、様々な目的（例えば、利用制御、コピー制御を含む著作権保護、二次利用の促進等）で埋め込み、検出・利用することができる。
【０００４】
ここでは、例えば同一のデジタルコンテンツを多数のユーザを対象として配給するときに適用される技術として、デジタルコンテンツの複製物に、当該複製物を個々に識別するための情報（例えば、ユーザＩＤに一意に対応する透かし情報）を埋め込む場合を考える。
【０００５】
デジタルコンテンツの複製物に固有の識別情報を埋め込む手法は、そのデジタルコンテンツの複製物が更に複製されて海賊版として出回ったときに、該海賊版から識別情報を検出することによって流出元ユーザを特定することができることから、デジタルコンテンツの違法コピーに対する事前の抑制として機能するとともに、著作権侵害が発生したときの事後の救済にも役立つことになる。
【０００６】
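Furthermore, a user who wants to invalidate the identification information embedded in a copy of the digital content does not know which bits constitute that information, and therefore has to modify the copy substantially. Doing so impairs the economic value of the digital content, which removes the incentive for illegal copying.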
【０００７】
What emerged as a method of making illegal copying possible under these circumstances is the "collusion attack".
【０００８】
結託攻撃は、異なる複製物には異なる識別情報が埋めこまれていることを利用するものであり、例えば、複数人で複製物を持ちよって、それらをビット単位で比較することによって、デジタルデータの値が異なる部分を見つけ出し、その部分を改ざん（例えば、多数決、少数決、ランダマイズ等）することによって、識別情報を改ざん、消失させるという方法である（なお、具体的な比較操作は行わず、コンテンツ間で画素値を平均化するなどの操作を行って、同様の結果を得る場合もある）。
【０００９】
例えば簡単な例で示すと、Ａ氏、Ｂ氏、Ｃ氏の複製物にそれぞれ、
００…００…
００…１１…
１１…００…
という識別情報が埋め込まれていた場合に、例えば、
１０…０１…
という、Ａ氏、Ｂ氏、Ｃ氏のいずれとも異なる識別情報が埋め込まれたコンテンツを出現させることができてしまう。
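The example above can be sketched in a few lines of Python. This is only an illustration: the 4-bit strings stand in for the elided fingerprints of A, B, and C, and the function name `is_feasible` is an assumption introduced here. It checks that a forged fingerprint differs from the colluders' copies only at positions where those copies disagree, which is all that colluders comparing their copies can detect.

```python
def is_feasible(forged, copies):
    """Return True if `forged` keeps every bit on which all copies agree."""
    for pos, bit in enumerate(forged):
        seen = {copy[pos] for copy in copies}
        # Colluders cannot locate a position where all their copies agree,
        # so such a position must keep its common value.
        if len(seen) == 1 and bit not in seen:
            return False
    return True


# Hypothetical 4-bit stand-ins for the fingerprints of A, B, and C.
copies = ["0000", "0011", "1100"]
forged = "1001"

print(is_feasible(forged, copies))   # the forged code is reachable
print(forged in copies)              # yet it matches none of the originals
```

With only the first two copies, the same forged string would not be reachable, because both copies agree on the first bit.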
【００１０】
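Accordingly, various methods have been proposed for embedding, as a digital watermark, a code that is resistant to collusion attacks, that is, a code with the property that all or some of the colluders can be identified even after a collusion attack (hereinafter called a collusion-resistant code), together with tracing algorithms based on such codes (a tracing algorithm identifies the identification numbers embedded in the content used in a collusion attack and thereby the user IDs of the colluders). One example is the c-secure code (D. Boneh and J. Shaw, "Collusion-Secure Fingerprinting for Digital Data," CRYPTO '95, 180-189, 1995).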
【００１１】
【発明が解決しようとする課題】
結託攻撃に対する耐性をより強くする、すなわち結託者を特定できなくなる結託数の上限数をより増やすためには、コンテンツに埋め込む符号長をより長くする必要があり、一方、コンテンツに埋め込む符号長にも制限があるので、この種の結託耐性符号およびその結託耐性符号に基づく追跡アルゴリズムでは、符号長を削減するために、結託攻撃に利用された複製物の数に上限を設けている（なお、ｃ−ｓｅｃｕｒｅのｃは、高々ｃ個までの複製物を用いた結託攻撃に対して有効であるという意味である）。もし結託攻撃が許容されている個数を越える数の複製物を用いて行われた場合には、追跡アルゴリズムは、結託攻撃に関与した複製物の識別番号として、誤って、結託攻撃に関与したのではない複製物の識別番号を出力し、結託者でないものが結託者として特定されてしまう、という誤判定が発生することがあり得る。なお、結託攻撃においてそれに関与した複製物以外の複製物の識別番号と同一の識別番号を持つ複製物が生成される可能性と、許容個数以下での誤判定の可能性は、結託耐性符号の設計によって確率的にかなり低く抑えられるので、誤判定は、主に許容個数を越える結託攻撃によって発生する。
【００１２】
しかし、現実に許容個数までの複製物で結託攻撃が行われたか否かは、攻撃者（結託者）のみが知る情報である。攻撃者は、複製物の偽造によって不法に利益を得ることを目的としているため、結託攻撃に用いた複製物の個数を自ら公開することはおよそ考えられない。
【００１３】
したがって、該結託耐性符号の復号を行う追跡アルゴリズムが結託に関与した複製物の識別番号（あるいは、該識別番号に対応する結託者のユーザＩＤ）を出力したとしても、それが、正しく複製物あるいは結託者を特定しているのか、それとも、結託攻撃が許容されている個数を越える数の複製物を用いて行われたために、誤って、結託に使われていなかった複製物あるいはそのユーザを特定してしまったのかを、判別することができない、という問題点がある。
【００１４】
また、この問題を回避するためには、現実的に結託者が準備することができそうな複製物の個数を想像し、その個数以上の許容個数となるように結託耐性符号を設計するより他なく、どうしても、大きな許容個数を設定することになってしまい、その結果、符号長も大きくなってしまう。
【００１５】
本発明は、上記事情を考慮してなされたもので、結託耐性符号の埋め込まれたデジタルコンテンツの複製物に対する結託攻撃に用いられたデジタルコンテンツの複製物の数を推定可能な電子透かしシステム、電子透かし解析装置及び電子透かし解析方法を提供することを目的とする。
【００１６】
【課題を解決するための手段】
本発明は、結託攻撃に用いられたデジタルコンテンツの複製物の数を推定する電子透かしシステムであって、デジタルコンテンツの複製物をユーザへ渡すのに先だって、該複製物に対応するユーザを特定する識別情報に対して、所定の方法に従って、複数の整数を割り当て、割り当てられた前記複数の整数の各々に対応する複数の成分符号を生成し、生成された前記複数の成分符号を連接して埋め込むべき結託耐性符号を生成し、生成された前記結託耐性符号を前記複製物に埋め込む第１のステップと、解析対象となった前記デジタルコンテンツの複製物から、該複製物に前記結託耐性符号として埋め込まれている符号を検出し、検出された前記符号を構成する複数の成分符号の各々について、該成分符号の改ざん部分の位置に関係する位置情報を検出し、前記複数の成分符号の各々について検出された複数の前記位置情報に基づいて、前記改ざん部分の位置に関係する所定の統計量を求め、求められた前記改ざん部分の位置に関係する所定の統計量に基づいて、前記デジタルコンテンツに対する結託攻撃に使用された複製物の数を推定する第２のステップとを有することを特徴とする。
【００１７】
また、本発明は、結託攻撃に用いられたデジタルコンテンツの複製物の数を推定可能な電子透かしシステムであって、デジタルコンテンツの複製物に対応するユーザを特定する識別子を割り当てる際に、予め定められた非負整数の範囲に属する識別子候補の中から、所定の追跡アルゴリズムによって結託攻撃に用いられた複製物に対応する識別子であるとして誤検出される可能性のより高い弱識別子でないと判断されるものを割り当て、デジタルコンテンツの複製物をユーザへ渡すのに先だって、該複製物に対応するユーザを特定する前記識別子に対して、該識別子の値に基づく所定の方法に従って、複数の整数を割り当て、割り当てられた前記複数の整数の各々に対応する複数の成分符号を生成し、生成された前記複数の成分符号を連接して埋め込むべき結託耐性符号を生成し、生成された前記結託耐性符号を前記複製物に埋め込む第１のステップと、解析対象となった前記デジタルコンテンツの複製物から、該複製物に前記結託耐性符号として埋め込まれている符号を検出し、検出された前記符号に前記所定の追跡アルゴリズムを適用して、結託攻撃に用いられた複製物に対応するユーザを特定する前記識別子を求め、求められた前記識別子を、弱識別子とそれ以外の非弱識別子とに分類し、この弱識別子と非弱識別子との分類結果に基づいて、弱識別子と非弱識別子とに関する所定の統計量を求め、求められた前記弱識別子と非弱識別子とに関する所定の統計量に基づいて、前記デジタルコンテンツに対する結託攻撃に使用された複製物の数を推定する第２のステップとを有することを特徴とする。
【００１８】
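The present invention is also a digital watermark analysis apparatus/method for estimating the number of copies of digital content used in a collusion attack, comprising: means/a step for detecting, from a copy of the digital content under analysis, the code embedded in the copy as a collusion-resistant code; means/a step for detecting, for each of the plurality of component codes constituting the detected code, position information relating to the position of the tampered portion of that component code; means/a step for obtaining, based on the plural pieces of position information detected for the respective component codes, a predetermined statistic relating to the position of the tampered portion; and means/a step for estimating, based on the obtained statistic, the number of copies used in the collusion attack on the digital content.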
【００１９】
好ましくは、検出された前記符号を構成する複数の成分符号の各々について、該成分符号の改ざん部分の最上位ビット側位置と最下位ビット側位置の一方または両方を求め、求められた複数の前記最上位ビット側位置をそれぞれ規準化した値に対する第１の平均と、求められた複数の前記最下位ビット側位置をそれぞれ規準化した値に対する第２の平均との一方または両方を求め、求められた前記第１の平均と前記第２の平均の一方または両方を、予め定められた所定の関数に入力することによって、該所定の関数の出力として、結託攻撃に用いられた複製物の数の推定値を求めるようにしてもよい。
【００２０】
また、本発明は、検出された前記符号を構成する複数の成分符号の各々について、該成分符号の改ざん部分の最上位ビット側位置と最下位ビット側位置の一方または両方を求め、求められた複数の前記最上位ビット側位置のうち、予め定められた基準位置より上位ビット側にあるものの個数および該基準位置より下位ビット側にあるものの個数についての第１の比と、求められた複数の前記最下位ビット側位置のうち、予め定められた基準位置より上位ビット側にあるものの個数および該基準位置より下位ビット側にあるものの個数についての第２の比との一方または両方を求め、前記第１の比と前記第２の比の一方または両方を、予め定められた所定の関数に入力することによって、該所定の関数の出力として、結託攻撃に用いられた複製物の数の推定値を求めるようにしてもよい。
【００２１】
また、本発明は、結託攻撃に用いられたデジタルコンテンツの複製物の数を推定する電子透かし解析装置／方法であって、解析対象となった前記デジタルコンテンツの複製物から、該複製物に前記結託耐性符号として埋め込まれている符号を検出する手段／ステップと、検出された前記符号に前記所定の追跡アルゴリズムを適用して、結託攻撃に用いられた複製物に対応するユーザを特定する前記識別子を求める手段／ステップと、求められた前記識別子を、弱識別子とそれ以外の非弱識別子とに分類する手段／ステップと、この弱識別子と非弱識別子との分類結果に基づいて、弱識別子と非弱識別子とに関する所定の統計量を求める手段／ステップと、求められた前記弱識別子と非弱識別子とに関する所定の統計量に基づいて、前記デジタルコンテンツに対する結託攻撃に使用された複製物の数を推定する手段／ステップとを備えたことを特徴とする。
【００２２】
好ましくは、前記弱識別子と前記非弱識別子との分類結果に基づいて、弱識別子に分類された識別子の数と、非弱識別子に分類された識別子の数との比を求め、求められた前記比を、予め定められた所定の関数に入力することによって、該所定の関数の出力として、結託攻撃に用いられた複製物の数の推定値を求めるようにしてもよい。
【００２３】
好ましくは、前記結託攻撃に用いられた複製物の数の推定値を求める代わりに、前記結託攻撃に用いられた複製物の数の大小レベルを示す情報を求めるようにしてもよい。
【００２４】
また、本発明は、結託攻撃に用いられた異なる識別情報を透かしとして埋め込まれた同種の化学物質製品を追跡する化学物質透かしシステムであって、対象となる化学物質製品に埋め込むべき識別情報に対して、所定の方法に従って、複数の整数を割り当て、割り当てられた前記複数の整数の各々に対応する複数の成分符号を生成し、生成された前記複数の成分符号を連接して埋め込むべき結託耐性符号を生成し、生成された前記結託耐性符号を前記化学物質製品に埋め込む第１のステップと、解析対象となった前記化学物質製品から、該化学物質製品に前記結託耐性符号として埋め込まれている符号を検出し、検出された前記符号に前記所定の追跡アルゴリズムを適用して、結託攻撃に用いられた化学物質製品に対応する前記識別情報を求める第２のステップとを有することを特徴とする。
【００２５】
また、本発明は、結託攻撃に用いられた異なる識別情報を透かしとして埋め込まれた同種の化学物質製品を追跡する化学物質透かしシステムであって、対象となる化学物質製品に埋め込むべき識別子を割り当てる際に、予め定められた非負整数の範囲に属する識別子候補の中から、所定の追跡アルゴリズムによって結託攻撃に用いられた化学物質製品に対応する識別子であるとして誤検出される可能性のより高い弱識別子でないと判断されるものを割り当て、前記化学物質製品に埋め込むべき識別子に対して、該識別子の値に基づく所定の方法に従って、複数の整数を割り当て、割り当てられた前記複数の整数の各々に対応する複数の成分符号を生成し、生成された前記複数の成分符号を連接して埋め込むべき結託耐性符号を生成し、生成された前記結託耐性符号を前記化学物質製品に埋め込む第１のステップと、解析対象となった前記化学物質製品から、該化学物質製品に前記結託耐性符号として埋め込まれている符号を検出し、検出された前記符号に前記所定の追跡アルゴリズムを適用して、結託攻撃に用いられた化学物質製品に対応する前記識別情報を求める第２のステップとを有することを特徴とする。
【００２６】
また、本発明は、結託攻撃に用いられた異なる識別情報を透かしとして埋め込まれた同種の化学物質製品の数を推定する化学物質透かしシステムであって、対象となる化学物質製品に埋め込むべき識別情報に対して、所定の方法に従って、複数の整数を割り当て、割り当てられた前記複数の整数の各々に対応する複数の成分符号を生成し、生成された前記複数の成分符号を連接して埋め込むべき結託耐性符号を生成し、生成された前記結託耐性符号を前記化学物質製品に埋め込む第１のステップと、解析対象となった前記化学物質製品から、該化学物質製品に前記結託耐性符号として埋め込まれている符号を検出し、検出された前記符号を構成する複数の成分符号の各々について、該成分符号の改ざん部分の位置に関係する位置情報を検出し、前記複数の成分符号の各々について検出された複数の前記位置情報に基づいて、前記改ざん部分の位置に関係する所定の統計量を求め、求められた前記改ざん部分の位置に関係する所定の統計量に基づいて、前記化学物質製品に対する結託攻撃に使用された化学物質製品の数を推定する第２のステップとを有することを特徴とする。
【００２８】
また、本発明は、結託攻撃に用いられた異なる識別情報を透かしとして埋め込まれた同種の化学物質製品の数を推定する化学物質透かしシステムであって、対象となる化学物質製品に埋め込むべき識別子を割り当てる際に、予め定められた非負整数の範囲に属する識別子候補の中から、所定の追跡アルゴリズムによって結託攻撃に用いられた化学物質製品に対応する識別子であるとして誤検出される可能性のより高い弱識別子でないと判断されるものを割り当て、前記化学物質製品に埋め込むべき識別子に対して、該識別子の値に基づく所定の方法に従って、複数の整数を割り当て、割り当てられた前記複数の整数の各々に対応する複数の成分符号を生成し、生成された前記複数の成分符号を連接して埋め込むべき結託耐性符号を生成し、生成された前記結託耐性符号を前記化学物質製品に埋め込む第１のステップと、解析対象となった前記化学物質製品から、該化学物質製品に前記結託耐性符号として埋め込まれている符号を検出し、検出された前記符号に前記所定の追跡アルゴリズムを適用して、結託攻撃に用いられた化学物質製品に対応する前記識別情報を求め、求められた前記識別子を、弱識別子とそれ以外の非弱識別子とに分類し、この弱識別子と非弱識別子との分類結果に基づいて、弱識別子と非弱識別子とに関する所定の統計量を求め、求められた前記弱識別子と非弱識別子とに関する所定の統計量に基づいて、前記化学物質製品に対する結託攻撃に使用された化学物質製品の数を推定する第２のステップとを有することを特徴とする。
【００２９】
なお、装置に係る本発明は方法に係る発明としても成立し、方法に係る本発明は装置に係る発明としても成立する。
また、装置または方法に係る本発明は、コンピュータに当該発明に相当する手順を実行させるための（あるいはコンピュータを当該発明に相当する手段として機能させるための、あるいはコンピュータに当該発明に相当する機能を実現させるための）プログラムを記録したコンピュータ読取り可能な記録媒体としても成立する。
【００３０】
本発明によれば、結託耐性符号の埋め込まれたデジタルコンテンツの複製物から検出した符号についての統計的な手法に基づく推定（例えば、結託耐性符号における改ざん部分の分布の偏り、あるいは検出された識別子における弱識別子の比率などに関する統計的性質等に基づく推定）を行うことによって、結託攻撃に用いられたデジタルコンテンツの複製物の数を推定することができる。これによって、追跡アルゴリズムの追跡結果の正誤に関する評価や、結託数自体の情報収集などができるようになる。
【００３１】
【発明の実施の形態】
以下、図面を参照しながら発明の実施の形態を説明する。
【００３２】
本発明は、同一のデジタルコンテンツの複製物（例えば、静止画、動画、音声、音楽等）の各々に対して、少なくとも、複製物ごとに異なるユーザ識別符号（後述するように、当該複製物に対応するユーザすなわち当該複製物を利用することになるユーザ（例えば、当該複製物を譲渡するユーザ、あるいは当該複製物を貸し渡すユーザ）のユーザ識別子（ユーザＩＤ）に一意に対応する識別情報であって結託耐性符号に基づくもの）を透かし情報として埋め込み、検出する場合に適用可能である。
【００３３】
もちろん、同一のデジタルコンテンツの複製物の各々に対して、さらに、その他の様々な透かし情報（例えば、コンテンツの著作権者の識別情報、著作権者の権利情報、コンテンツの利用条件、その利用時に必要な秘密情報、コピー制御情報等、あるいはそれらを組み合わせたものなど)を様々な目的（例えば、利用制御、コピー制御を含む著作権保護、二次利用の促進等）で埋め込み、検出するものであってもよいが、以下では、ユーザ識別符号に関係する部分を中心に説明する（その他の透かし情報を利用する場合における当該その他の透かし情報に関係する部分の構成は特に限定されない）。
【００３４】
以下で示す構成図は、装置の機能ブロック図としても成立し、また、ソフトウェア（プログラム）の機能モジュール図あるいは手順図としても成立するものである。
【００３５】
図１に、本発明の実施の形態に係る電子透かし埋込装置と電子透かし解析装置が適用されるシステムの概念図を示す。
【００３６】
電子透かし埋込装置１と電子透かし解析装置２は、コンテンツ提供側に備えられ、管理される。
電子透かし埋込装置１においてデジタルコンテンツに所望の透かしデータを埋め込む方法や、電子透かし解析装置２においてデジタルコンテンツから該透かしデータ自体を取り出す方法は、基本的には任意である（例えば、“松井甲子雄著、「電子透かしの基礎」、森北出版、１９９８年”等参照）。
電子透かし埋込装置１は、ソフトウェア（プログラム）としてもハードウェアとしても実現可能である。同様に、電子透かし解析装置２は、ソフトウェア（プログラム）としてもハードウェアとしても実現可能である。また、電子透かし埋込装置１および電子透かし解析装置２をコンテンツ提供側で用いる場合には、それらを一体化して実現することも可能である。
【００３７】
図２に、電子透かし埋込装置１の構成例を示す。この電子透かし埋込装置１は、ユーザ識別符号として埋め込むべき透かし情報である、ユーザＩＤに対応する結託耐性符号を生成する符号生成部１１と、生成された結託耐性符号（埋め込み符号）を対象コンテンツに埋め込む符号埋込部１２とから構成される。
【００３８】
電子透かし埋込装置１は、対象コンテンツと、これに埋め込むべき対象ユーザのユーザＩＤとが与えられると、該ユーザＩＤに対応する結託耐性符号を生成し、ユーザ識別符号として該結託耐性符号が埋め込まれたコンテンツを、該ユーザＩＤのユーザ向けの複製物として出力する。他の透かし情報を利用する場合には、その際に、必要に応じて他の透かし情報が埋め込まれる。
【００３９】
電子透かし埋込装置１により得られた各ユーザ向けのコンテンツの複製物は、記憶媒体や通信媒体などを媒介とした流通経路３を経てそれぞれ流通する。複数の複製物を用いた結託攻撃は、この流通経路３にて行われる。
【００４０】
図３、図４に、電子透かし解析装置２の構成例を示す。
【００４１】
図３、図４に示されるように、電子透かし解析装置２は、検出対象となるコンテンツからユーザ識別符号（埋め込まれた結託耐性符号または結託攻撃が施されて改ざんされたもの）を抽出する符号抽出部２１と、検出対象となるコンテンツについて、結託攻撃に使用された複製物の個数を推定する結託数推定部２２と、所定の追跡アルゴリズムを実行して、結託攻撃に用いられたであろう複製物の結託耐性符号を特定し、該結託耐性符号に対応するユーザＩＤ（ユーザＩＤが復元できない場合のある追跡アルゴリズムでは、結託攻撃に用いられたであろう結託耐性符号に対応するユーザＩＤ、またはユーザＩＤを復元できなかった旨）を特定する（なお、結託攻撃に用いられたであろう結託耐性符号自体を求めずに、直接、対応するユーザＩＤを求めるようにしてもよい）追跡アルゴリズム処理部２３とを備える。なお、本実施形態では、結託攻撃がなされなかった場合を、結託攻撃に使用された複製物の個数＝１として扱うものとする。
【００４２】
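The collusion number estimation unit 22 of the digital watermark analysis apparatus 2 has two forms: (1) a form in which the unit 22 alone (without using the result of the tracing algorithm processing unit 23) extracts the user identification code (the embedded collusion-resistant code, or a version of it tampered with by a collusion attack) from the content under examination and estimates the number of copies used in the collusion attack by analyzing the extracted code (the first aspect of the collusion number estimation unit); and (2) a form in which the number of copies used in the collusion attack is estimated from the tracing result output by the tracing algorithm processing unit 23 (for example, the user IDs of all or some of the colluders) (the second aspect of the collusion number estimation unit). Fig. 3 is a configuration example for the first aspect, and Fig. 4 is a configuration example for the second aspect.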
【００４３】
結託耐性符号や追跡アルゴリズムは、基本的には、どのようなものでも適用可能であり、特に限定されない。
【００４４】
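In Fig. 3 and Fig. 4, an overall judgment unit may additionally be provided that outputs a judgment based on both the result of the collusion number estimation unit 22 and the result of the tracing algorithm processing unit 23. For example, when the estimated collusion number is no greater than the allowed number and the user ID(s) of colluders have been obtained, the overall judgment unit outputs those user ID(s) (optionally together with the estimated collusion number); when the estimated collusion number exceeds the allowed number and colluder user ID(s) have been obtained, it outputs an indication that no judgment is possible because the estimated collusion number is exceeded (optionally together with the estimated collusion number). Other ways of generating the overall judgment are of course possible.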
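The two cases described for the overall judgment unit can be sketched as a small decision function. This is a minimal sketch, not the patent's implementation: the function name, the dictionary layout, and the handling of the case in which no user ID was traced (which the text does not specify) are assumptions made here for illustration.

```python
def overall_judgement(estimated_count, allowed_count, traced_ids):
    """Combine the estimated collusion number with the tracing result."""
    if not traced_ids:
        # Assumption: this case is not described in the text.
        return {"status": "no_ids", "estimated_count": estimated_count}
    if estimated_count <= allowed_count:
        # Within the allowed number: report the traced colluders.
        return {
            "status": "colluders",
            "user_ids": sorted(traced_ids),
            "estimated_count": estimated_count,
        }
    # Over the allowed number: the traced IDs are not reported as colluders.
    return {"status": "undeterminable", "estimated_count": estimated_count}


print(overall_judgement(2, 4, {3, 2}))
print(overall_judgement(9, 4, {3, 2}))
```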
【００４５】
なお、上記の（１）の場合には、結託数推定部２２による処理と、追跡アルゴリズム処理部２３による処理は、いずれを先に行ってもよいし、並列的に行ってもよい。
【００４６】
また、上記の（１）の場合には、結託数推定部２２を持ち且つ追跡アルゴリズム処理部２３を持たない電子透かし解析装置２もあり得る。図５に、この場合の構成例を示す。
【００４７】
本実施形態によれば、電子透かし解析装置２は結託数推定部２２によって結託攻撃に用いられたであろう複製物の個数を推定することができ、これによって追跡アルゴリズムの追跡結果の正誤に関する評価や、結託数自体の情報収集などができるようになる。
【００４８】
以下では、電子透かし埋込装置１についてより詳細に説明する。
【００４９】
図６に、概略的な手順の一例を示す。
【００５０】
符号生成部１１は、まず、対象複製物に、埋め込むべきユーザＩＤに対応する、Ｍ（Ｍは複数）個の整数Ａ（１）、Ａ（２）、…、Ａ（Ｍ）を求める（ステップＳ１）。該Ｍ個の整数は、予め求めて記憶しておく方法と、必要時に求める方法とがある。
【００５１】
各ｉ（ｉ＝１〜Ｍ）におけるそれぞれの整数Ａ（ｉ）は、０〜Ｎ（ｉ）−１のいずれかの値を取るものとする。ここで、Ｎ（１）、Ｎ（２）、…、Ｎ（Ｍ）は、予め定められた相互に異なる正整数とする。より好ましくは、Ｎ（１）、Ｎ（２）、…、Ｎ（Ｍ）は、互いに素である。
【００５２】
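For each of the M integers A(i) corresponding to a user ID, the value may be assigned either at random in the range 0 to N(i) − 1 or according to a fixed rule in that range. In either case, the tuples may be assigned exclusively, so that any two user IDs differ in at least one of A(1), A(2), …, A(M), or the same tuple of M integers (identical in all of A(1), A(2), …, A(M)) may be allowed to be assigned to more than one user ID.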
【００５３】
排他的に値を割り当てる方法には、例えば、ユーザＩＤの値として、０〜Ｎ（１）×Ｎ（２）×・・・×Ｎ（Ｍ）−１の範囲の整数の全部または一部を使用するものとし、Ｍ個の整数Ａ（ｉ）の各々について、対象ユーザＩＤをＮ（ｉ）で割ったときの余りを、該ユーザＩＤに対応するＡ（ｉ）の値とする方法がある。
【００５４】
When a method is used in which the tuple A(1), A(2), …, A(M) cannot be computed from the user ID, the correspondence between each user ID and its tuple must be stored. When a method is used in which the tuple can be computed from the user ID, the correspondence may either be recomputed as needed without being stored, or be stored and looked up.
【００５５】
次に、符号生成部１１は、対象複製物に埋め込むべきユーザＩＤに対応する、Ｍ個の整数Ａ（１）、Ａ（２）、…、Ａ（Ｍ）から、該ユーザＩＤに対応する結託耐性符号を生成する（ステップＳ２）。各ユーザＩＤに対応する結託耐性符号は、予め生成して記憶しておく方法と、必要時に生成する方法とがある。
【００５６】
各ユーザＩＤに対応する結託耐性符号は、該ユーザＩＤに対応するＭ個の整数Ａ（１）、Ａ（２）、…、Ａ（Ｍ）の各々について、対応する成分符号Ｗ（１）、Ｗ（２）、…、Ｗ（Ｍ）を求め、それらを連結することによって生成する。
【００５７】
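As the component code W(i) corresponding to the integer A(i), a Γ₀(n, d) code can be used, for example. Here a run of d consecutive bits consisting only of 1s or only of 0s is one unit B(j), and the code is the concatenation of B(0) to B(n−2); B(0) to B(n−2) either all consist of 0s, or all consist of 1s, or B(0) to B(m−1) consist of 0s and B(m) to B(n−2) consist of 1s. As a simple example for the method in which A(i) is the remainder of the user ID divided by N(i): when N(1) = 5, n = 5, and with d = 3 the Γ₀(5, 3) code is as follows.
A(1) = 0: W(1) = 111111111111
A(1) = 1: W(1) = 000111111111
A(1) = 2: W(1) = 000000111111
A(1) = 3: W(1) = 000000000111
A(1) = 4: W(1) = 000000000000
The collusion-resistant code is generated by concatenating the component codes W(i) obtained in this way for the respective A(i).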
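The Γ₀(n, d) component code listed above can be generated with a one-line rule: d × A zeros followed by d × (n − 1 − A) ones. The following is a minimal sketch; the function name `gamma0_component` is an assumption introduced for illustration.

```python
def gamma0_component(a, n, d):
    """Return the Gamma_0(n, d) component code for the integer a (0 <= a < n)."""
    if not 0 <= a < n:
        raise ValueError("a must lie in the range 0 .. n-1")
    # a all-zero blocks of d bits, then (n - 1 - a) all-one blocks of d bits.
    return "0" * (d * a) + "1" * (d * (n - 1 - a))


# Reproduce the Gamma_0(5, 3) table for A(1) = 0 .. 4.
for a in range(5):
    print(a, gamma0_component(a, 5, 3))
```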
【００５８】
この符号では１と０はそれぞれｄビットを単位として連続するように配置され、ｄビット未満の数の１や０が孤立して存在することはない（上記の例では、３ビット未満の数の１や０が孤立して存在することはないことがわかる）。したがって、ｄビット未満の数の１や０が孤立して存在する場合には、結託攻撃がなされたことが推定される（ｄビット未満の数の１や０が孤立して存在しない場合には、結託攻撃がなされなかったことが推定される）。
【００５９】
このようにして生成された結託耐性符号は、電子透かし埋込装置１の符号埋め込み部１２によって、対象コンテンツに埋め込まれる（ステップＳ３）。
【００６０】
図７に、符号生成部１１の一構成例を示す。
【００６１】
この符号生成部１１は、それぞれｋ′（＝Ｍ）個の法記憶部１２１−１，１２１−２，…，１２１−ｋ′、剰余計算部１２２−１，１２２−２，…，１２２−ｋ′、成分符号生成部１２４−１，１２４−２，…，１２４−ｋ′と、符号パラメータ記憶部１２３及び符号連接部１２５からなる。
【００６２】
法記憶部１２１−１，１２１−２，…，１２１−ｋ′には、互いに素の関係にある整数、この例では相異なるｋ′個の素数ｐ_i（＝Ｎ（ｉ））(ｉ＝１，２，…，ｋ′)が記憶されており、これらの素数ｐ_iが剰余計算部１２２−１，１２２−２，…，１２２−ｋ′に法として供給される。剰余計算部１２２−１，１２２−２，…，１２２−ｋ′は、入力されるユーザＩＤ＝ｕに対して、素数ｐ_iを法とする剰余ｕ_i＝ｕ mod ｐ_i(ｉ＝１，２，…，ｋ′)をそれぞれ求める。すなわち、入力されたユーザＩＤに対応した複数の整数要素の組として、剰余計算部１２２−１，１２２−２，…，１２２−ｋ′により剰余ｕ_i＝ｕ mod ｐ_i(ｉ＝１，２，…，ｋ′)が計算される。なお、この例では、ｐ_i(ｉ＝１，２，…，ｋ′)は、素数としたが、互いに素な整数であってもよい。
【００６３】
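The component code generators 124-1, 124-2, …, 124-k′ each generate, for the k′ primes p_i (i = 1, 2, …, k′), a component code Γ₀(p_i, t), that is, the Γ₀(n, d) code described above, representing the residue u_i (i = 1, 2, …, k′) obtained by the residue calculators 122-1, 122-2, …, 122-k′, in accordance with the code parameter t stored in the code parameter storage 123. In other words, the component code generators generate, for each residue, a component code Γ₀(p_i, t) that can represent every set of residues u_i computed by the residue calculators for a predetermined number (n) of user IDs, and for which any combination of k of the k′ component codes uniquely represents the user ID.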
【００６４】
符号連接部１２５は、成分符号生成部１２４−１，１２４−２，…，１２４−ｋ′により生成された各成分符号Γ₀(ｐ_i，ｔ)を連接することによって、透かし情報である結託耐性符号を生成する。
【００６５】
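Fig. 8 shows the configuration of one (124-i) of the component code generators 124-1, 124-2, …, 124-k′. With code parameter t, residue u_i, and modulus p_i, the subtractor 131 computes p_i − u_i − 1. The "0"-run generator 132 generates a run of t × u_i consecutive "0" bits from the code parameter t and the residue u_i, and the "1"-run generator 133 generates a run of t × (p_i − u_i − 1) consecutive "1" bits from the code parameter t and the output p_i − u_i − 1 of the subtractor 131. The concatenator 134 joins the "0" run and the "1" run, producing a bit string of t × (p_i − 1) bits as the component code Γ₀(p_i, t), which is a Γ₀(n, d) code.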
【００６６】
図９は、こうして生成される結託耐性符号の成分符号(結託攻撃を受ける前の成分符号)の一例を示している。０からｎ−１までのｎ個のユーザＩＤに対応して、Ｂ(０)，…，Ｂ(ｎ−２)のブロック“０”列からなる成分符号が割り当てられている。
【００６７】
ここで、上記の符号生成方法について数値を小さくとって簡単にした例を用いて説明する。
【００６８】
まず、整数の個数Ｍを３とし、Ｎ（１）＝３、Ｎ（２）＝５、Ｎ（３）＝７とする。この場合、Ａ（１）は０〜２のいずれか、Ａ（２）は０〜４のいずれか、Ａ（３）は０〜６のいずれかとなる。
【００６９】
次に、Ｎ（１）×Ｎ（２）×Ｎ（３）−１＝１０４であるので、０〜１０４の範囲の全部または一部をユーザＩＤとして用いる。ここでは、そのうち０〜１４をユーザＩＤとして用いるものとする。
【００７０】
For example, for user ID = 7,
A(1) = 7 mod N(1) = 7 mod 3 = 1,
A(2) = 7 mod N(2) = 7 mod 5 = 2,
A(3) = 7 mod N(3) = 7 mod 7 = 0.
【００７１】
図１０に、この例においてユーザＩＤ＝０〜１４の各々について求められたＡ（１）、Ａ（２）、Ａ（３）を示す。
【００７２】
次に、Γ₀(ｎ，ｄ)符号においてｄ＝３とした場合におけるＡ（１）＝０、Ａ（１）＝１、Ａ（１）＝２のそれぞれに対応する成分符号Ｗ１は、次のようになる（なお、分かりやすくするために、０または１を、３ビット単位に分けて記述している）。
Ａ（１）＝０：Ｗ１＝１１１１１１
Ａ（１）＝１：Ｗ１＝０００１１１
Ａ（１）＝２：Ｗ１＝００００００
また、同様に、Ａ（２）＝０、Ａ（２）＝１、Ａ（２）＝２、Ａ（２）＝３、Ａ（２）＝４のそれぞれに対応する成分符号Ｗ２は、次のようになる。
Ａ（２）＝０：Ｗ２＝１１１１１１１１１１１１
Ａ（２）＝１：Ｗ２＝０００１１１１１１１１１
Ａ（２）＝２：Ｗ２＝００００００１１１１１１
Ａ（２）＝３：Ｗ２＝０００００００００１１１
Ａ（２）＝４：Ｗ２＝００００００００００００
Similarly, the component codes W3 corresponding to A(3) = 0, A(3) = 1, A(3) = 2, A(3) = 3, A(3) = 4, A(3) = 5, and A(3) = 6 are as follows.
Ａ（３）＝０：Ｗ３＝１１１１１１１１１１１１１１１１１１
Ａ（３）＝１：Ｗ３＝０００１１１１１１１１１１１１１１１
Ａ（３）＝２：Ｗ３＝００００００１１１１１１１１１１１１
Ａ（３）＝３：Ｗ３＝０００００００００１１１１１１１１１
Ａ（３）＝４：Ｗ３＝００００００００００００１１１１１１
Ａ（３）＝５：Ｗ３＝０００００００００００００００１１１
Ａ（３）＝６：Ｗ３＝００００００００００００００００００
したがって、例えば、ユーザＩＤ＝７の場合、Ａ（１）＝１、Ａ（２）＝２、Ａ（３）＝０であるから、
Ｗ１＝０００１１１
Ｗ２＝００００００１１１１１１
Ｗ３＝１１１１１１１１１１１１１１１１１１
となり、ユーザＩＤ＝７に対応する結託耐性符号は、それらを連結して、
000111 000000111111 111111111111111111
となる（なお、分かりやすくするために、Ｗ１〜Ｗ３に対応する部分の境界で分けて記述している）。
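The whole encoding of this small example (residues modulo 3, 5, and 7, then one Γ₀ component code with d = 3 per residue, then concatenation) can be sketched as follows. The names `MODULI`, `residues`, and `encode` are assumptions introduced for illustration; the spaces between component codes are kept only for readability, as in the text.

```python
MODULI = (3, 5, 7)   # N(1), N(2), N(3) from the example
D = 3                # block size d


def residues(user_id, moduli=MODULI):
    """A(i) = user_id mod N(i) for each modulus."""
    return tuple(user_id % n for n in moduli)


def encode(user_id, moduli=MODULI, d=D, sep=" "):
    """Concatenate the component codes W(i) for the given user ID."""
    parts = []
    for a, n in zip(residues(user_id, moduli), moduli):
        parts.append("0" * (d * a) + "1" * (d * (n - 1 - a)))
    return sep.join(parts)


print(residues(7))   # (1, 2, 0)
print(encode(7))     # 000111 000000111111 111111111111111111
```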
【００７３】
図１１に、この例において各ユーザＩＤ＝０〜１４について求められた結託耐性符号を示す。
【００７４】
次に、電子透かし解析装置２についてより詳細に説明する。
【００７５】
ここで、上記の例を利用して、結託攻撃について説明する。
【００７６】
例えば、上記のユーザＩＤ＝２のユーザが入手したコンテンツには、ユーザ識別符号として、
000000 000000111111 000000111111111111
が埋め込まれている（図１０、図１１参照）。また、例えば、上記のユーザＩＤ＝３のユーザが入手したコンテンツには、ユーザ識別符号として、
111111 000000000111 000000000111111111
が埋め込まれている（図１０、図１１参照）。
【００７７】
この場合に、ユーザＩＤ＝２のユーザとユーザＩＤ＝３の２人のユーザが持ちよったコンテンツを比較すると、上記３６ビットのうち、左から１〜６番目、１３〜１５番目、２５〜２７番目が相違していることがわかる。そこで、それらが識別情報の一部と分かるため、１〜６番目、１３〜１５番目、２５〜２７番目のうちの一部に改ざんが施され、例えば、次のような改変が施される。
010101 000000010111 000000101111111111
同様に、ユーザＩＤ＝７のユーザとユーザＩＤ＝８との２人のユーザで、例えば、次のような改ざんが施される。
000010 000000101111 010111111111111111
また、同様に、ユーザＩＤ＝３、４、５、６の４人のユーザによって、例えば、次のような改ざんが施される。
010101 010101010101 000000000010101010
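The comparison carried out by the colluders with user ID = 2 and user ID = 3 can be sketched as below. The function name `differing_positions` is an assumption introduced here; positions are counted from 1 at the left, as in the text.

```python
def differing_positions(code_a, code_b):
    """1-based positions at which two equal-length codes differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(code_a, code_b)) if x != y]


# Codes of user ID = 2 and user ID = 3 from the example, without separators.
code_id2 = "000000" + "000000111111" + "000000111111111111"
code_id3 = "111111" + "000000000111" + "000000000111111111"
tampered = "010101" + "000000010111" + "000000101111111111"

diff = differing_positions(code_id2, code_id3)
print(diff)   # positions 1-6, 13-15, 25-27

# The tampered code agrees with both originals everywhere else.
untouched_ok = all(
    tampered[i] == code_id2[i] == code_id3[i]
    for i in range(len(tampered)) if (i + 1) not in diff
)
print(untouched_ok)
```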
次に、追跡アルゴリズムの概要を説明する。
【００７８】
符号抽出部２１によって、検出対象となるコンテンツからユーザ識別符号（埋め込まれた結託耐性符号または結託攻撃が施されて改ざんされたもの）が抽出されると、追跡アルゴリズム処理部２３は、抽出された符号を解析することによって、結託攻撃に用いられたであろう複製物の結託耐性符号を推定し、該結託耐性符号に対応するユーザＩＤを推定する。
【００７９】
ここで、図１２（ａ）に示すように、符号（生成された結託耐性符号、結託攻撃を受けた結託耐性符号）の成分符号（上記の例では３つの成分符号）の各々ごとについて、当該成分符号の両端の位置と、隣接する要素Ｂ（ｄ−１）とＢ（ｄ）との境界の位置を、数値化して表すものとする。すなわち、第ｉ番目の成分符号Ｗ（ｉ）の要素Ｂ（ｊ）の数をＮ（ｉ）−１個とし、図１２（ａ）でＮ（ｉ）をＮで表すものとすると、要素Ｂ（０）の左端の位置が０、要素Ｂ（ｄ−１）とＢ（ｄ）との境界の位置がｄ、要素Ｂ（Ｎ−２）の右端の位置がＮ−１で表される。
【００８０】
そして、コンテンツから検出された符号の第ｉ番目の成分符号Ｗ（ｉ）を、左端のビットからみていったときにはじめて出現する、０のみからなる要素Ｂ（ｓ−１）と１を含む要素Ｂ（ｓ）との境界を求め、該境界に対応する上記の位置を示す値ｓを、Ａｍｉｎ（ｉ）で表すものとする。一方、右端のビットからみていったときにはじめて出現する、１のみからなる要素Ｂ（ｔ）と０を含む要素Ｂ（ｔ−１）との境界を求め、該境界に対応する上記の位置を示す値ｔを、Ａｍａｘ（ｉ）で表すものとする。
【００８１】
例えば、図１２（ｂ）の符号の例では、Ａｍｉｎ（ｉ）＝２、Ａｍａｘ（ｉ）＝４となる。
また、例えば、図１２（ｃ）の符号の例では、Ａｍｉｎ（ｉ）＝２、Ａｍａｘ（ｉ）＝２となる。
また、例えば、図１２（ｄ）の符号の例では、Ａｍｉｎ（ｉ）＝４、Ａｍａｘ（ｉ）＝４となる。
なお、第ｉ番目の成分符号Ｗ（ｉ）が０のみからなる場合には、Ａｍｉｎ（ｉ）＝Ａｍａｘ（ｉ）＝Ｎ（ｉ）−１となる。また、１のみからなる場合には、Ａｍｉｎ（ｉ）＝Ａｍａｘ（ｉ）＝０となる。
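The definitions of Amin(i) and Amax(i) above can be sketched as a scan over d-bit blocks. This is a minimal sketch under the stated definitions; the function name `boundaries` is an assumption introduced for illustration.

```python
def boundaries(component, d):
    """Return (Amin, Amax) for one component code made of d-bit blocks."""
    blocks = [component[k:k + d] for k in range(0, len(component), d)]
    # Amin: index of the first block (from the left) that contains a 1;
    # if the code is all zeros, this is the right end, N(i) - 1.
    amin = next((j for j, b in enumerate(blocks) if "1" in b), len(blocks))
    # Amax: right edge of the last block (from the right) that contains a 0;
    # if the code is all ones, this is the left end, 0.
    amax = next(
        (j + 1 for j in range(len(blocks) - 1, -1, -1) if "0" in blocks[j]), 0
    )
    return amin, amax


print(boundaries("000111", 3))               # untampered: Amin == Amax
print(boundaries("000000010111", 3))         # tampered in the third block
print(boundaries("000000000000", 3))         # all zeros
print(boundaries("111111111111", 3))         # all ones
```

For an untampered component code both values coincide with the embedded integer A(i).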
【００８２】
さて、追跡アルゴリズムでは、検出された符号の各々の成分符号を調べ、予め定められたｄ（上記の例の場合、３）ビット未満の数の１や０が孤立して存在する成分符号が検出された場合に、結託攻撃がなされたものと判断することができる。また、この場合に、結託数が予め規定された許容数以下であったものと仮定して、上記の各々の成分符号Ｗ（ｉ）の第１の境界あるいは境界の最小値Ａｍｉｎ（ｉ）や第２の境界あるいは境界の最大値Ａｍａｘ（ｉ）に基づいて、結託攻撃に使用された複製物に埋め込まれていたであろう結託耐性符号を推定することができる。そして、結託耐性符号から、対応するユーザＩＤを求め、これを結託攻撃を行った結託者のユーザＩＤとして特定することができる。
【００８３】
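As can be seen by comparing the component codes W(i) corresponding to the respective A(i), for example those illustrated earlier, namely
A(3) = 0: W3 = 111111111111111111
A(3) = 1: W3 = 000111111111111111
A(3) = 2: W3 = 000000111111111111
A(3) = 3: W3 = 000000000111111111
A(3) = 4: W3 = 000000000000111111
A(3) = 5: W3 = 000000000000000111
A(3) = 6: W3 = 000000000000000000
the portion in which several copies differ is by nature always obtained as a run of consecutive elements B (so tampering in a collusion attack mixes 0s and 1s within this consecutive portion).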
【００８４】
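In a component code W(i), the left and right ends of the consecutive portion in which fewer than d isolated 1s or 0s appear, that is, Amin(i) and Amax(i), always coincide with the position of the 0/1 boundary (the left end of the component code if it is all 1s, the right end if it is all 0s) of the corresponding component code of the collusion-resistant code embedded in one of the copies used in the collusion attack, that is, with the value Amin(i) = Amax(i) of that untampered component code. Such information is obtained for each component code W(i). By analyzing Amin(1), Amin(2), …, Amin(M), Amax(1), Amax(2), …, Amax(M), the user identification codes (collusion-resistant codes) that were presumably embedded in the copies used in the attack can be identified, and the user IDs corresponding to them can be identified as the user IDs of the colluders.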
【００８５】
簡単な例としては、２人のユーザによる結託攻撃では、各成分符号ごとにおいて、Ａｍｉｎ（ｉ）は、２人のユーザの一方の持つ複製物に埋め込まれた符号のＡｍｉｎ（ｉ）＝Ａｍａｘ（ｉ）に一致し、Ａｍａｘ（ｉ）は、他方のユーザの持つ複製物に埋め込まれた符号のＡｍｉｎ（ｉ）＝Ａｍａｘ（ｉ）に一致する（各成分符号ごとに一方と他方の対応は異なりうる）。Ａｍｉｎ（１）とＡｍａｘ（１）からいずれか１つ、Ａｍｉｎ（２）とＡｍａｘ（２）からいずれか１つ、…Ａｍｉｎ（Ｍ）とＡｍａｘ（Ｍ）からいずれか１つを選択したものを、それぞれ、各成分符号の０と１の区切り目の位置として持つような結託耐性符号が存在すれば、それが求める解であり、該結託耐性符号に対応するユーザＩＤが結託者を示すことになる。
【００８６】
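For example, as shown in the example of Fig. 10 and Fig. 11 above, suppose that
the content obtained by the user with user ID = 2 has the user identification code
000000 000000111111 000000111111111111
embedded in it, that the content obtained by the user with user ID = 3 has the user identification code
111111 000000000111 000000000111111111
embedded in it, and that these two users carry out a collusion attack that alters the code to
010101 000000010111 000000101111111111
In this tampered code,
Amin(1) = 0, Amax(1) = 2
Amin(2) = 2, Amax(2) = 3
Amin(3) = 2, Amax(3) = 3.
Referring to Fig. 10 or Fig. 11,
user ID = 2 satisfies Amax(1) = 2, Amin(2) = 2, and Amin(3) = 2, and
user ID = 3 satisfies Amin(1) = 0, Amax(2) = 3, and Amax(3) = 3.
It can therefore be determined that this collusion attack was carried out by user ID = 2 and user ID = 3, and that the codes used in the attack were
000000 000000111111 000000111111111111
and
111111 000000000111 000000000111111111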
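For the two-colluder case, the search described above can be sketched as a filter over the assigned user IDs: a candidate is kept when, for every component code, its residue equals either the detected Amin(i) or the detected Amax(i). This is a simplified illustration, not the tracing algorithm of any particular code; the function name `trace_candidates` is an assumption, and the sketch presumes that user IDs map to residues by `user_id mod N(i)` as in the example.

```python
def trace_candidates(amin, amax, moduli, user_ids):
    """User IDs whose residue matches Amin(i) or Amax(i) in every component."""
    return [
        u for u in user_ids
        if all(u % n in (lo, hi) for n, lo, hi in zip(moduli, amin, amax))
    ]


# Boundaries detected from 010101 000000010111 000000101111111111.
amin = (0, 2, 2)
amax = (2, 3, 3)

print(trace_candidates(amin, amax, (3, 5, 7), range(15)))   # [2, 3]
```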
【００８７】
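In a collusion attack by three or more users, not all of the values Amin(i) = Amax(i) originally held by all component codes of the collusion-resistant codes (user identification codes) embedded in all the copies used may be obtained. However, for some copy, Amin(i) = Amax(i) can be expected to be obtained for all or many of the component codes of its embedded code. Therefore, by suitably combining and checking Amin(1), Amin(2), …, Amin(M), Amax(1), Amax(2), …, Amax(M) obtained from the code detected in the target content, the user IDs of some colluders can be identified even if not all of them can.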
【００８８】
さて、符号抽出部２１によって、検出対象となるコンテンツからユーザ識別符号（埋め込まれた結託耐性符号または結託攻撃が施されて改ざんされたもの）が抽出されると、電子透かし解析装置２の結託数推定部２２は、抽出された符号を解析することによって、結託攻撃に用いられたであろう複製物の結託数を推定する。
【００８９】
以下では、電子透かし解析装置２の結託数推定部２２について説明する。
【００９０】
（第１の構成例）
まず、図３や図５に例示されるような、結託数推定部２２単独で（追跡アルゴリズム処理部２３の結果を利用することなしに）、結託攻撃に使用された複製物の個数を推定する場合（結託数推定部の第１の態様の場合）について説明する。なお、前述したように、結託耐性符号や追跡アルゴリズムは、基本的には、どのようなものでも適用可能である。また、結託攻撃がなされなかった場合を、結託攻撃に使用された複製物の個数＝１として扱うものとする。
【００９１】
図１３に、この場合の結託数推定部２２の構成例を示す。図１３に示されるように、この結託数推定部２２は、抽出された符号（ユーザ識別符号）の各々の成分符号Ｗ（１）、Ｗ（２）、…、Ｗ（Ｍ）から、先に説明した、Ａｍｉｎ（１）、Ａｍｉｎ（２）、…、Ａｍｉｎ（Ｍ）のＭ個の第１グループのデータと、Ａｍａｘ（１）、Ａｍａｘ（２）、…、Ａｍａｘ（Ｍ）のＭ個の第２グループのデータの一方または両方を求める境界検出部２２１、求められた第１グループのデータと第２グループのデータの一方または両方から統計的な量を求める統計処理部２２２、求められた統計的な量から結託数の推定値Ｃ0を求める推定結託数算出部２２３を含む。
【００９２】
図１４に、概略的な手順の一例を示す。
【００９３】
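Fig. 15 shows, for different numbers of copies used in the collusion attack (denoted c in Fig. 15), the probability with which the first boundary Amin(i) detected from each component code W(i) of the tampered code takes each value. In Fig. 15, Amin(i) is normalized by dividing by N(i), the size of its range from the minimum (= 0) to the maximum (= N(i) − 1). For the second boundary Amax(i), the normalized value relates to that of Amin(i) as Amin(i) = 1 − Amax(i) (Fig. 15 with 0 and 1.0 on the horizontal axis interchanged).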
【００９４】
これより分かるのは、第１の境界Ａｍｉｎについては、結託数ｃが大きくなると、Ａｍｉｎが小さな値をとる確率がより大きくなるように、バイアスされてくるということである。同様に、第２の境界Ａｍａｘについては、ｃが大きくなると、Ａｍａｘが大きな値をとる確率がより大きくなるように、バイアスされてくるということである。
【００９５】
したがって、複数のＷ（ｉ）に対して、Ｍ個のＡｍｉｎ（ｉ）（またはＡｍａｘ（ｉ））の値をもとめ、それらＭ個のＡｍｉｎ（ｉ）（またはＡｍａｘ（ｉ））の分布を解析することによって、結託数ｃの値を統計的に推定することができることになる。
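The bias described above can be illustrated with a small Monte Carlo sketch. It rests on assumptions that go beyond the text: each colluder's integer A(i) is drawn uniformly from 0 to N(i) − 1, the detected Amin(i) is taken to be the smallest A(i) among the colluders, and the 256 moduli are arbitrary stand-ins rather than a real code design. The function name and seed are likewise illustrative.

```python
import random


def simulate_mean_amin(c, moduli, rng):
    """Mean of Amin(i)/N(i) when Amin(i) is the smallest of c uniform residues."""
    total = 0.0
    for n in moduli:
        amin = min(rng.randrange(n) for _ in range(c))
        total += amin / n
    return total / len(moduli)


rng = random.Random(0)            # fixed seed so the run is repeatable
moduli = list(range(512, 768))    # 256 hypothetical values of N(i)

for c in (2, 16, 48):
    # The mean shrinks roughly like 1 / (c + 1) as c grows.
    print(c, simulate_mean_amin(c, moduli, rng))
```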
【００９６】
統計処理部２２２および推定結託数算出部２２３による統計的な処理の方法には種々のバリエーションが考えられる。以下では、２つのバリエーションを説明する。
【００９７】
（バリエーション１）
まず、境界検出部２２１は、抽出された符号の各々の成分符号Ｗ（１）、Ｗ（２）、…、Ｗ（Ｍ）から、先に説明した、Ａｍｉｎ（１）、Ａｍｉｎ（２）、…、Ａｍｉｎ（Ｍ）のＭ個の第１グループのデータと、Ａｍａｘ（１）、Ａｍａｘ（２）、…、Ａｍａｘ（Ｍ）のＭ個の第２グループのデータの一方または両方を求める（ステップＳ１１）。
【００９８】
次に、統計処理部２２２は、第１グループのデータＡｍｉｎ（ｉ）を利用する場合には、各Ｗ（ｉ）について、Ａｍｉｎ（１）、Ａｍｉｎ（２）、…、Ａｍｉｎ（Ｍ）の平均＜Ａｍｉｎ＞を求める（ステップＳ１２）。ただし、それらの値が取りうる最大値と最小値との差（＝対応するＮ（ｉ）の値）で除して正規化する。すなわち、Ｗ（ｉ）、Ｎ（ｉ）、Ａ（ｉ）についての第１グループのデータの平均（第１の平均）＜Ａｍｉｎ＞は、
＜Ａｍｉｎ＞＝｛Ａｍｉｎ（１）／Ｎ（１）＋Ａｍｉｎ（２）／Ｎ（２）＋…＋Ａｍｉｎ（Ｍ）／Ｎ（Ｍ）｝／Ｍ
である。
【００９９】
また、第２グループのデータＡｍａｘ（ｉ）を利用する場合には、統計処理部２２２は、各Ｗ（ｉ）、Ｎ（ｉ）、Ａ（ｉ）についての第２グループのデータの平均（第２の平均）＜Ａｍａｘ＞を求める（ステップＳ１２）。＜Ａｍａｘ＞は、
＜Ａｍａｘ＞＝｛Ａｍａｘ（１）／Ｎ（１）＋Ａｍａｘ（２）／Ｎ（２）＋…＋Ａｍａｘ（Ｍ）／Ｎ（Ｍ）｝／Ｍ
である。
【０１００】
第１グループのデータＡｍｉｎ（ｉ）および第２グループのデータＡｍａｘ（ｉ）を利用する場合には、＜Ａｍｉｎ＞および＜Ａｍａｘ＞を求める（ステップＳ１２）。
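Step S12 amounts to averaging the boundary values after dividing each by its own N(i). A minimal sketch, using the boundaries of the small three-component example (Amin = 0, 2, 2 and Amax = 2, 3, 3 with N = 3, 5, 7) purely as sample input; the function name `normalized_mean` is an assumption.

```python
def normalized_mean(boundaries, moduli):
    """Mean of boundary(i) / N(i) over all component codes."""
    if not moduli or len(boundaries) != len(moduli):
        raise ValueError("need one boundary per modulus")
    return sum(b / n for b, n in zip(boundaries, moduli)) / len(moduli)


moduli = (3, 5, 7)
mean_amin = normalized_mean((0, 2, 2), moduli)   # <Amin>
mean_amax = normalized_mean((2, 3, 3), moduli)   # <Amax>
print(mean_amin, mean_amax)
```

With only three component codes the means are far too noisy to estimate a collusion number; the example in the text uses M = 256.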
【０１０１】
次に、推定結託数算出部２２３は、後述するような方法によって、＜Ａｍｉｎ＞や＜Ａｍａｘ＞から、結託数の推定値Ｃ0を求める（ステップＳ１３）。
【０１０２】
以下、結託数の推定値Ｃ0を求める方法について説明する。
【０１０３】
ここで、Ｃ0人の結託者による結託攻撃が行われたとする。
【０１０４】
前述したように、このＣ0が、結託耐性符号の想定している結託数の上限ｃを越えているか否かを知ることによって、例えば、追跡アルゴリズムが出力した結託者のユーザＩＤが正しいものであるか、それとも無実のユーザのものであるかに関する判断の材料となる。
【０１０５】
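As described above, decoding the code word after collusion yields Amin and Amax for each component code W(i) (each is the value Amin = Amax of the corresponding component code W of some colluder's code word before collusion), and C0 can be estimated by statistical processing of them. There are various statistical methods; here, a method of estimating C0 from the mean <Amin> of Amin(i) over the component codes W(i) is described.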
【０１０６】
ある成分符号Ｗ（ｉ）について、各結託者に割り当てられている整数Ａ（ｉ）が０からＮ（ｉ）−１までの整数のいずれかをとる確率は、０からＮ（ｉ）−１までの整数のいずれについても等しく、１／Ｎ（ｉ）で与えられるとすると、Ａｍｉｎ（ｉ）がある値ｘをとる確率Ｐｒ［Ａｍｉｎ（ｉ）＝ｘ］は、結託者の数がＣ0のとき、次式で与えられる。
【０１０７】
【数１】

ここで、ｘは、０からＮ（ｉ）−１までの整数に値をとる。
【０１０８】
そこで、実際に復号によって得られた各々の成分符号Ｗ（ｉ）についてのＡｍｉｎ（ｉ）を、それぞれ、それらの値が取りうる最大値と最小値との差すなわちＮ（ｉ）の値で除して正規化し（Ａｍｉｎ（ｉ）／Ｎ（ｉ））、Ｍ個のＡｍｉｎ（ｉ）／Ｎ（ｉ）の平均＜Ａｍｉｎ＞、すなわち、
＜Ａｍｉｎ＞＝｛Ａｍｉｎ（１）／Ｎ（１）＋Ａｍｉｎ（２）／Ｎ（２）＋…＋Ａｍｉｎ（Ｍ）／Ｎ（Ｍ）｝／Ｍ
を求める。
【０１０９】
＜Ａｍｉｎ＞は、ｙ＝ｘ／Ｎ（ｉ）を０から１の間の実数として連続近似することによって、次のような期待値＜ｙ＞で近似できる。
【０１１０】
【数２】

【０１１１】
ここで、Ｐ［ｙ］は、Ｐｒ［Ａｍｉｎ（ｉ）＝ｘ］に対するＮ（ｉ）→∞の連続極限によって与えられ、次式で表される。
【０１１２】
【数３】

【０１１３】
よって、＜Ａｍｉｎ＞＝＜ｙ＞の近似より、
Ｃ0＝＜Ａｍｉｎ＞^-1−１となり、
結託数Ｃ0が推定できる。
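The equations labeled 数１ to 数３ are not reproduced in this text. The following is a reconstruction, not the original figures: it assumes, as stated above, that each colluder's A(i) is uniform on 0 to N(i) − 1, and additionally that Amin(i) equals the smallest A(i) among the C0 colluders. Under those assumptions the derivation is consistent with the result C0 = <Amin>^-1 − 1.

```latex
% Probability that the smallest of C_0 uniform residues equals x
\Pr[A_{\min}(i) = x]
  = \left(1 - \frac{x}{N(i)}\right)^{C_0}
  - \left(1 - \frac{x+1}{N(i)}\right)^{C_0},
  \qquad x = 0, 1, \dots, N(i)-1 .

% Continuous limit N(i) \to \infty with y = x / N(i)
P[y] = C_0 \, (1 - y)^{C_0 - 1}, \qquad 0 \le y \le 1 .

% Expected value of the normalized boundary
\langle y \rangle = \int_0^1 y \, P[y] \, dy
  = C_0 \int_0^1 y \, (1-y)^{C_0-1} \, dy
  = \frac{1}{C_0 + 1}
\quad\Longrightarrow\quad
C_0 = \langle A_{\min} \rangle^{-1} - 1 .
```

By the symmetry Amin(i) = 1 − Amax(i) in normalized form, the same argument gives <1 − Amax> = 1/(C0 + 1).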
【０１１４】
＜Ａｍａｘ＞を用いた場合にも、同様にして、
Ｃ0＝（１−＜Ａｍａｘ＞）^-1−１となり、
結託数Ｃ0が推定できる。
【０１１５】
また、＜Ａｍｉｎ＞および＜Ａｍａｘ＞を用いて、
Ｃ0＝（１／２＋＜Ａｍｉｎ＞／２−＜Ａｍａｘ＞／２）^-1−１として、
結託数Ｃ0を推定することもできる。
【０１１６】
したがって、第１グループのデータＡｍｉｎ（ｉ）のみを利用する場合には、推定結託数算出部２２３は、上記のような方法によって、＜Ａｍｉｎ＞から、結託数の推定値Ｃ0を求めることができる。
【０１１７】
また、第２グループのデータＡｍａｘ（ｉ）のみを利用する場合には、推定結託数算出部２２３は、上記のような方法によって、＜Ａｍａｘ＞から、結託数の推定値Ｃ0を求めることができる。
【０１１８】
また、第１グループのデータＡｍｉｎ（ｉ）および第２のグループのデータＡｍａｘ（ｉ）を利用する場合には、推定結託数算出部２２３は、上記のような方法によって、＜Ａｍｉｎ＞および＜Ａｍａｘ＞から、結託数の推定値Ｃ0を求める。
【０１１９】
なお、第１グループのデータＡｍｉｎ（ｉ）および第２のグループのデータＡｍａｘ（ｉ）を利用する場合には、推定結託数算出部２２３は、＜Ａｍｉｎ＞から、結託数の推定値Ｃ0（Ｃminとする）を求めるとともに、＜Ａｍａｘ＞から、結託数の推定値Ｃ0（Ｃmaxとする）を求め、ＣminとＣmaxを列記して出力するか、またはＣminとＣmaxのうちの最大値を出力するか、またはＣminとＣmaxの平均を出力することも可能である（その他のバリエーションも可能である）。
【０１２０】
ここで、具体例を示す。ここでは、Ｍ＝２５６、Ｎ（１）＝５１２、Ｎ（２５６）＝２２９７、Ｎ（２）〜Ｎ（２５５）は５１３から２２９３の間の値とし、ｄ＝３０のΓ₀(ｎ，ｄ)符号を用いることとして、ある１６人で結託攻撃を行った場合に、結託攻撃後のコンテンツから検出した符号（ユーザ識別符号）をもとに、２５６個の成分符号Ｗ（１）〜Ｗ（２５６）について、Ａｍｉｎ（１）、Ａｍａｘ（１）、Ａｍｉｎ（２）、Ａｍａｘ（２）、…、Ａｍｉｎ（２５６）／Ａｍａｘ（２５６）を求めた一例において、その一部を抜粋して示すと、次のようになった。

また、２５６個のＡｍｉｎから＜Ａｍｉｎ＞を計算すると、
＜Ａｍｉｎ＞＝０．０６１８７１
が得られ、これを、Ｃ0＝＜Ａｍｉｎ＞^-1−１に代入することによって、
Ｃ0＝１５．１６３（人）
となり、真の結託者の数に近い値が得られた。
【０１２１】
また、＜Ａｍａｘ＞＝０．９３５３８が得られ、これを、Ｃ0＝（１−＜Ａｍａｘ＞）^-1−１に代入することによって、
Ｃ0＝１４．４７５（人）
となり、真の結託者の数に近い値が得られていることがわかる。
【０１２２】
また、Ｃ0＝（１／２＋＜Ａｍｉｎ＞／２−＜Ａｍａｘ＞／２）^-1−１を用いると、
Ｃ0＝１４．８１１（人）
となり、真の結託者の数に近い値が得られていることがわかる。
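The three estimators can be checked against the figures quoted for the 16-colluder example. The function names below are assumptions introduced for illustration; the formulas and the input values <Amin> = 0.061871 and <Amax> = 0.93538 are those given in the text.

```python
def estimate_from_amin(mean_amin):
    """C0 = <Amin>^-1 - 1."""
    return 1.0 / mean_amin - 1.0


def estimate_from_amax(mean_amax):
    """C0 = (1 - <Amax>)^-1 - 1."""
    return 1.0 / (1.0 - mean_amax) - 1.0


def estimate_combined(mean_amin, mean_amax):
    """C0 = (1/2 + <Amin>/2 - <Amax>/2)^-1 - 1."""
    return 1.0 / (0.5 + mean_amin / 2.0 - mean_amax / 2.0) - 1.0


mean_amin, mean_amax = 0.061871, 0.93538
print(round(estimate_from_amin(mean_amin), 3))
print(round(estimate_from_amax(mean_amax), 3))
print(round(estimate_combined(mean_amin, mean_amax), 3))
```

The combined form is simply the first estimator applied to the average of <Amin> and 1 − <Amax>.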
【０１２３】
同様の条件で、ある３２人の結託攻撃を行った場合における、結託数の推定結果の一例は、次のようになった。
＜Ａｍｉｎ＞＝０．０２９０６５
＜Ａｍａｘ＞＝０．９６６８４３
Ｃ0＝＜Ａｍｉｎ＞^-1−１＝３３．４０６
Ｃ0＝（１−＜Ａｍａｘ＞）^-1−１＝２９．１６０
Ｃ0＝(1/2+<Amin>/2-<Amax>/2)^-1−１＝３１．１４３
同様の条件で、ある４８人の結託攻撃を行った場合における、結託数の推定結果の一例は、次のようになった。
＜Ａｍｉｎ＞＝０．０１９８８４
＜Ａｍａｘ＞＝０．９７７３８２
Ｃ0＝＜Ａｍｉｎ＞^-1−１＝４９．２９２
Ｃ0＝（１−＜Ａｍａｘ＞）^-1−１＝４３．２１３
Ｃ0＝(1/2+<Amin>/2-<Amax>/2)^-1−１＝４６．０５７
（バリエーション２）
次に、他のバリエーションを説明する。
【０１２４】
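Here, the differences from Variation 1 are described. In Variation 1, the statistical processing unit 222 obtains the mean of Amin(i) and/or the mean of Amax(i), and the estimated collusion number calculation unit 223 estimates the number of colluders C0 from those means.
【０１２５】
In Variation 2, the statistical processing unit 222 obtains another statistic from Amin(i) and/or Amax(i), and the estimated collusion number calculation unit 223 estimates the number of colluders C0 from that statistic.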
【０１２６】
例えば、図１５をみると、Ａｍｉｎ（ｉ）については、結託数ｃが大きくなるほど、横軸におけるある値Ａｔｈを基準値として、基準値Ａｔｈ以下の値を持つＡｍｉｎ（ｉ）の数を、基準値Ａｔｈを越える値を持つＡｍｉｎ（ｉ）の数で割った比αが大きくなることがわかる。一方、Ａｍａｘ（ｉ）については、結託数ｃが大きくなるほど、比αが小さくなる。
【０１２７】
そこで、例えば先の例のように、ある成分符号Ｗ（ｉ）について、各結託者に割り当てられている整数Ａ（ｉ）が０からＮ（ｉ）−１までの整数のいずれかをとる確率は、０からＮ（ｉ）−１までの整数のいずれについても等しく、１／Ｎ（ｉ）で与えられるとして、予め、結託数ｃのときの、基準値Ａｔｈ以下の値を持つＡｍｉｎ（ｉ）の数を、基準値Ａｔｈを越える値を持つＡｍｉｎ（ｉ）の数で割った比αを与える関数ｆ（ｃ）＝αの逆関数ｃ＝ｆ^-1（α）を予め求めておく。
【０１２８】
そして、統計処理部２２２は、Ａｍｉｎ（ｉ）から、比αを求め（ステップＳ１２）、推定結託数算出部２２３は、比αを、上記のｃ＝ｆ^-1（α）に代入して、結託数ｃを推定することができる（ステップＳ１３）。Ａｍａｘ（ｉ）についても同様である。もちろん、Ａｍｉｎ（ｉ）とＡｍａｘ（ｉ）の一方を用いてもよいし、両方を用いてもよい。
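The text does not give the function f. One possible choice, offered only as a sketch, follows from the continuous approximation under the same uniform assumption: the probability that a normalized Amin(i) exceeds Ath is (1 − Ath)^c, so the expected ratio is α = (1 − (1 − Ath)^c) / (1 − Ath)^c, which inverts to c = ln(1 + α) / (−ln(1 − Ath)). The function names and the sample threshold 0.05 are assumptions.

```python
import math


def ratio_alpha(normalized_amin, a_th):
    """Count of values <= a_th divided by count of values > a_th."""
    below = sum(1 for v in normalized_amin if v <= a_th)
    above = len(normalized_amin) - below
    if above == 0:
        raise ValueError("no value exceeds the threshold; ratio undefined")
    return below / above


def expected_alpha(c, a_th):
    """f(c) under the continuous uniform approximation (an assumption)."""
    q = (1.0 - a_th) ** c          # probability that Amin/N exceeds a_th
    return (1.0 - q) / q


def invert_alpha(alpha, a_th):
    """c = f^-1(alpha) for the f defined in expected_alpha."""
    return math.log1p(alpha) / -math.log1p(-a_th)


print(ratio_alpha([0.01, 0.02, 0.2, 0.5], 0.05))
print(invert_alpha(expected_alpha(16, 0.05), 0.05))
```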
【０１２９】
なお、上記では、結託数の値を推定するようにしたが、結託数を何段階かのレベルで求めるようにしてもよい。例えば、Ａｍｉｎ（ｉ）について求めた上記の比αが予め定められた基準値以下の場合には、結託数が少ない（あるいは許容数以下）を示す情報を出力し、予め定められた基準値を越える場合には、結託数が多い（あるいは許容数を超過）を示す情報を出力する関数を用いるようにしてもよい。
【０１３０】
また、これまで説明した以外のバリエーションも可能である。
【０１３１】
（第２の構成例）
次に、図４に例示されるような、追跡アルゴリズム処理部２３の結果を利用して、結託攻撃に使用された複製物の個数を推定する場合（結託数推定部の第２の態様の場合）について説明する。なお、前述したように、結託耐性符号や追跡アルゴリズムは、基本的には、どのようなものでも適用可能である。また、結託攻撃がなされなかった場合を、結託攻撃に使用された複製物の個数＝１として扱うものとする。
【０１３２】
図１６に、この場合の結託数推定部２２の構成例を示す。図１６に示されるように、この結託数推定部２２は、追跡アルゴリズム処理部２３から結託者の全部または一部のユーザＩＤが出力された場合に、該ユーザＩＤを後述する弱ＩＤ（弱識別情報）と非弱ＩＤ（非弱識別情報）とに分類する弱ＩＤ・非弱ＩＤ分類部２４１、この分類結果を基に、弱ＩＤの数と非弱ＩＤの数とに基づく統計的な量を求める統計処理部２４２、求められた統計的な量から結託数の推定値Ｃ0を求める推定結託数算出部２４３を含む。なお、統計処理部２４２および推定結託数算出部２４３による統計的な処理の方法には種々のバリエーションが考えられる。
【０１３３】
図１７に、概略的な手順の一例を示す。
【０１３４】
なお、この場合には、電子透かし埋込装置１（の符号生成部１１）は、弱ＩＤをユーザＩＤとして用いないものとする。
【０１３５】
以下では、第１の構成例との相違点を中心に説明する。
【０１３６】
ここで、弱ＩＤと非弱ＩＤについて説明する。
【０１３７】
弱ＩＤとは、ユーザＩＤとして用いた場合に、結託攻撃を行っていないユーザのユーザＩＤであるにもかかわらず、結託者のユーザＩＤとして誤検出される可能性のより高いユーザＩＤである（誤検出に弱いＩＤという意味から、このように呼ぶ）。非弱ＩＤとは、ユーザＩＤ候補のうちから、弱ＩＤを除いたユーザＩＤであり、非弱ＩＤのみがユーザＩＤとして使用される。
【０１３８】
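Weak IDs can be determined either by a predetermined judgment algorithm, or by fixing in advance, according to some guideline, the user IDs that are more likely to be falsely detected (for example, user IDs for which the normalized Amin = Amax of all or many component codes of the corresponding collusion-resistant code is close to 0 or 1).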
【０１３９】
ここで、図１８に示すフローチャートを用いて、図７の符号生成部１１の場合に、与えられたユーザＩＤ（の候補）が弱ＩＤか非弱ＩＤかを判定する処理手順の一例について説明する。
【０１４０】
まず、対象となったユーザＩＤを一つずつシーケンシャルに入力し(ステップＳ３１)、このユーザＩＤが結託者ＩＤとして誤検出される確率(誤検出確率)を推定する(ステップＳ３２)。この誤検出確率の推定は、例えば前述したｐ_i（＝Ｎ（ｉ））、ｋ、ｋ′（＝Ｍ），結託者総数の最大値ｃ、ユーザ総数ｎ、ｚといったパラメータを用いて次のようにして行われる。なお、ｋについては、図７の法記憶部１２１−１，１２１−２，…，１２１−ｋ′で用意されているｋ′個の素数ｐ１（＝Ｎ（１）），ｐ２（＝Ｎ（２）），…，ｐｋ′（＝Ｎ（ｋ′））から任意のｋ個の素数を選んだとき、それらのｋ個の素数の積をｎ以上とするものである（例えば、この積はｎ≦Ｎ（１）×Ｎ（２）×…×Ｎ（ｋ）である）。また、ｚは、１以上の正整数であり、例えば、ｋ′＝ｃ(ｋ＋ｚ)／２を満足する正整数である。
【０１４１】
まず、次式を定義する。
【０１４２】
【数４】

次に、次式を定義する。
【０１４３】
【数５】

【０１４４】
あるユーザＩＤ(＝ｕとする)が結託者ＩＤとして誤検出される確率を概ね表す量として、次式で表される評価値ＥＥＰを計算する。
【０１４５】
【数６】

【０１４６】
ここで、ｕ_p＝ｕ mod ｐとする。これ以外にも、ある利用者ＩＤについて誤検出確率を近似する評価値が存在するならば、それを該評価値ＥＥＰの代わりに用いることが可能である。例えば、次式で表される評価値ＥＥＰを用いてもよい。
【０１４７】
【数７】

【０１４８】
次に、ステップＳ３２で推定された誤検出確率(例えば、該ＥＥＰ)が所定の閾値を超えたか否かを調べ(ステップＳ３３)、閾値を超える場合は、ユーザＩＤ（の候補）が弱ＩＤであると判定し(ステップＳ３４)、また誤検出確率が閾値以下の場合は、ユーザＩＤ（の候補）が非弱ＩＤであると判定する(ステップＳ３５)。
【０１４９】
さて、結託攻撃を行った結託数ｃが大きくなると、追跡アルゴリズムが当該結託攻撃を受けたコンテンツから結託者のユーザＩＤを推定した場合に、得られる結果として弱ＩＤが増加してくる。したがって、弱ＩＤの数と非弱ＩＤの数との比βを評価することで、その比βを生み出す結託数ｃの値が推定できる。
【０１５０】
すなわち、結託数ｃが大きくなるほど、弱ＩＤの数を、非弱ＩＤの数で割った比βが大きくなるので、予め、結託数ｃのときの比βを与える関数ｈ（ｃ）＝βの逆関数ｃ＝ｈ^-1（β）を予め求めておくことで、比βから結託数ｃを推定することができる。
【０１５１】
まず、弱ＩＤ・非弱ＩＤ分類部２４１は、追跡アルゴリズム処理部２３から結託者の全部または一部のユーザＩＤが出力された場合に、該ユーザＩＤを、弱ＩＤと非弱ＩＤとに分類する（ステップＳ２１）。なお、弱ＩＤか非弱ＩＤかの判断は、例えば、弱ＩＤのリストを記憶しておき、与えられたユーザＩＤが該リストに登録されているものと一致するか否かを調べることによって、一致すれば弱ＩＤと判断し、一致しなければ非弱ＩＤと判断するようにしてもよいし、ユーザＩＤが弱ＩＤか非弱ＩＤかを判定する手順が作成可能であれば、該判断手順によって弱ＩＤか非弱ＩＤかを判断するようにしてもよい。
【０１５２】
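Next, the statistical processing unit 242 obtains, from the IDs classified as weak and non-weak, the ratio β of the number of weak IDs to the number of non-weak IDs (step S22).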
【０１５３】
The estimated collusion number calculation unit 243 then substitutes the ratio β into c = h^-1(β) above to estimate the collusion number c (step S23).
【０１５４】
なお、上記では、結託数の値を推定するようにしたが、結託数を何段階かのレベルで求めるようにしてもよい。例えば、求めた上記の比βが予め定められた基準値以下の場合には、結託数が少ない（あるいは許容数以下）を示す情報を出力し、予め定められた基準値を越える場合には、結託数が多い（あるいは許容数を超過）を示す情報を出力する関数を用いるようにしてもよい。
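Steps S21 and S22, together with the two-level output just described, can be sketched as follows. The function h is not given in the text, so this sketch stops at the ratio β and a threshold decision. The weak-ID list, the threshold value, and all names are hypothetical values chosen only to make the example run.

```python
import math


def weak_ratio(traced_ids, weak_ids):
    """beta = (number of weak IDs) / (number of non-weak IDs) among traced IDs."""
    weak = sum(1 for u in traced_ids if u in weak_ids)
    non_weak = len(traced_ids) - weak
    if non_weak == 0:
        return math.inf            # every traced ID is weak
    return weak / non_weak


def collusion_level(beta, threshold):
    """Two-level output: within or over the allowed collusion number."""
    return "over" if beta > threshold else "within"


weak_ids = {8, 20, 41}             # hypothetical stored list of weak IDs
traced = [3, 8, 11, 20]            # hypothetical tracing output
beta = weak_ratio(traced, weak_ids)
print(beta, collusion_level(beta, threshold=0.5))
```

Because weak IDs are never assigned to users in this configuration, every weak ID in the tracing output is by construction a false detection, which is why a rising β signals a large collusion number.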
【０１５５】
なお、以上では、コンテンツの複製物に、ユーザＩＤに対応する符号を埋め込むようにしたが、その代わりに、複製物の複製物ＩＤとそのユーザを特定するための情報（例えば、ユーザ名、あるいはユーザＩＤ等）との対応を保存または復元可能にしておき、コンテンツの複製物に、複製物ＩＤに対応する符号を埋め込むようにしてもよい。
【０１５６】
以下では、本実施形態のハードウェア構成、ソフトウェア構成について説明する。
【０１５７】
本実施形態の電子透かし解析装置は、ハードウェアとしても、ソフトウェア（（コンピュータに所定の手段を実行させるための、あるいはコンピュータを所定の手段として機能させるための、あるいはコンピュータに所定の機能を実現させるための）プログラム）としても、実現可能である。また、電子透かし解析装置をソフトウェアで実現する場合には、記録媒体によってプログラムを受け渡しすることも、通信媒体によってプログラムを受け渡しすることもできる。もちろん、それらは、電子透かし埋込装置についても同様である。
また、電子透かし埋込装置や電子透かし解析装置をハードウェアとして構成する場合、半導体装置として形成することができる。
また、本発明を適用した電子透かし解析装置を構成する場合、あるいは電子透かし解析プログラムを作成する場合に、同一構成を有するブロックもしくはモジュールがあっても、それらをすべて個別に作成することも可能であるが、同一構成を有するブロックもしくはモジュールについては１または適当数のみ用意しておいて、それをアルゴリズムの各部分で共有する（使い回す）ことも可能である。電子透かし埋込装置を構成する場合、あるいは電子透かし埋め込みプログラムを作成する場合も、同様である。また、電子透かし埋込装置および電子透かし解析装置を含むシステムを構成する場合、あるいは電子透かし埋め込みプログラムおよび電子透かし検出プログラムを含むシステムを作成する場合には、電子透かし埋込装置（あるいはプログラム）と電子透かし解析装置（あるいはプログラム）に渡って、同一構成を有するブロックもしくはモジュールについては１または適当数のみ用意しておいて、それをアルゴリズムの各部分で共有する（使い回す）ことも可能である。
【０１５８】
また、電子透かし埋込装置や電子透かし解析装置をソフトウェアで構成する場合には、マルチプロセッサを利用し、並列処理を行って、処理を高速化することも可能である。
【０１５９】
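Watermarking techniques for digital content are applicable not only to digital data but also to any information or substance for which part of the content can be changed without altering its identity, homogeneity, or economic value; the present invention is likewise applicable to such information and substances.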
【０１６０】
例えば、本発明において、結託攻撃への耐性を持つ電子透かし埋込装置／電子透かし解析装置において用いられる、埋め込まれる符号の生成手段、検出手段は、化学的に合成される、あるいは、工業的に管理された環境下で生物的に生成される化合物あるいは化学物質の出所の追跡にも応用できる。化合物としては、ＤＮＡ、ＲＮＡ、タンパク質、その他の高分子の化合物が、符号を埋め込むことができる冗長性を多く持つ。
【０１６１】
以下では、本発明を、化合物の複製物に対して個別の識別情報（ユーザＩＤ、製造者ＩＤ、販売者ＩＤ、取引ＩＤ、それらを組み合わせた情報など）を埋め込み、その出所を特定する手段を与える透かし技術として適用する場合について説明する。
【０１６２】
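A compound is made up of constituents such as atoms, molecules, and groups. DNA and RNA, for example, have a sequence structure of nucleotides (bases), and information can be regarded as being expressed by whether or not a given base is replaced with another. Just as data in digital content can sometimes be changed without altering the identity or economic value of the work, the composition of a compound can sometimes be changed without altering, for the purpose at hand, its properties and functions such as action, side effects, and efficacy (from another viewpoint, its economic value).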
【０１６３】
そのような許容された範囲内の変更によって、その複製物を個々に識別する情報を埋め込むことができる。
【０１６４】
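When the digital watermark of the present invention is applied to a compound, the watermark embedding apparatus for compounds is obtained by replacing the component that changes the bits of a predetermined portion of digital content with a device that changes the composition of a predetermined portion of the compound. Likewise, the watermark analysis apparatus for compounds is obtained by replacing the component that reads the bit values of a predetermined portion of digital content to detect watermark information with a device that analyzes the composition of a predetermined portion of the compound. That is, only the device that interfaces with the compound differs; in principle the technique is the same as watermarking for digital content.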
【０１６５】
図１９に、化合物に対する透かし埋込装置の構成例を示す。
【０１６６】
符号生成部１００１は、その化合物に埋め込むべき識別情報を入力とし、結託耐性符号を生成する。
【０１６７】
特定部位の構造変換部１００２〜１００４は、それぞれ、結託耐性符号の各ビット、あるいは、ビットの各集合に対して、その値に応じて化合物の構造を変換するものである。特定部位の構造変換部１００２は、原化合物の特定部位１を処理し、特定部位の構造変換部１００３は、特定部位１を処理済みの化合物の特定部位２を処理し、特定部位の構造変換部１００４は、特定部位１，２を処理済みの化合物の特定部位３を処理して、所望する埋込済み化合物を生成する。もちろん、図１９では、３つの構造変換部が示されているが、その数は３に限定されるものではない。
【０１６８】
ここで、化合物の構造の変換とは、その化合物の利用の目的に適した性質あるいは機能等を損なわず且つ新たな弊害あるいは副作用等をもたらさないままで、異なる構造を持つ化合物に変換する手段のことである。あるいは、その化合物が純粋な化合物ではなく、混合物である場合には、その組成を変更する手段であってもよい。
【０１６９】
図２０に、化合物に対する透かし埋込装置の他の構成例を示す。
【０１７０】
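The configuration example of Fig. 19 converts the structure of an already synthesized compound afterwards, whereas the configuration example of Fig. 20 embeds the code when the compound is synthesized.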
【０１７１】
符号生成部１０１１は、その化合物に埋め込むべき識別情報を入力とし、結託耐性符号を生成する。
【０１７２】
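In this case, for each bit or each set of bits of the collusion-resistant code, synthesis materials corresponding to its possible values are prepared, and the synthesis material selection units 1012 to 1014 each select the synthesis material of the compound according to the value of the corresponding bit or set of bits. Although Fig. 20 shows three synthesis material selection units, the number is of course not limited to three.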
【０１７３】
合成部１０１５は、各合成材料部１０１２〜１０１４により選択された合成材料を合成して、所望する埋込済み化合物を生成する。
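[0174]
A collusion attack on a compound is basically the same as a collusion attack on digital content: for example, it is carried out by comparing the structures of compounds in which different pieces of identification information (for example, a user ID, a manufacturer ID, or both) are embedded, and altering the structure of the portions that differ.
[0175]
FIG. 21 shows a configuration example of a watermark analysis apparatus for a compound.
[0176]
The specific-site structure reading units 1201 to 1201 correspond to the specific-site structure conversion units 1002 to 1004 of FIG. 19 or the synthesis material units 1012 to 1014 of FIG. 20; each reads the structure of a specific site in the compound and outputs it as information in the form of a bit or a set of bits.
[0177]
The code decoding unit 1204 reconstructs from those bits the codeword to be traced and estimates the collusion number; this is the same as the function of the digital watermark analysis apparatus 2 for digital content that reconstructs the codeword to be traced from bits and estimates the collusion number.
[0178]
Of course, the watermark analysis apparatus for a compound has the tracking algorithm function as necessary.
[0179]
A configuration is also possible in which the watermark analysis apparatus for a compound has the tracking algorithm function but not the function of estimating the collusion number.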
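[0180]
Here, techniques that can be used as the structure conversion means and structure reading means for compounds in the present invention are given as examples. The following description takes DNA as an example.
[0181]
For DNA, determining its base sequence is called sequencing. Known sequencing methods include the shotgun method, the primer walking method, and the nested deletion method. All of these are methods based on gene cloning. Various reagents, instruments, and apparatuses for use in sequencing have been proposed. They are disclosed, for example, in "Cloning and Sequencing" (「クローニングとシークエンス」), supervised by Itaru Watanabe and edited by Masahiro Sugiura, Noson Bunka-sha (1989), and in "Genome Science" (「ゲノムサイエンス」), edited by Yoshiyuki Sakaki et al., Kyoritsu Shuppan (1999).
[0182]
Similarly, in the DNA example, structure conversion is possible using the gene transfer methods used to introduce a new gene. Known gene transfer methods include chemical methods such as the calcium phosphate precipitation method, the dextran method, and the lipofection method, as well as electroporation and microinjection. They are disclosed, for example, in Nobuyuki Haga, "Molecular Cell Engineering" (「分子細胞工学」), Corona Publishing (2000).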
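[0183]
The configurations illustrated in the embodiments of the present invention are examples and are not intended to exclude other configurations; other configurations obtained by replacing part of an illustrated configuration with something else, omitting part of an illustrated configuration, adding another function to an illustrated configuration, or combining these are also possible. Other configurations that are logically equivalent to an illustrated configuration, that include a part logically equivalent to an illustrated configuration, or that are logically equivalent to the essential part of an illustrated configuration are also possible. Other configurations that achieve the same or a similar object as an illustrated configuration, or that provide the same or a similar effect, are also possible.
The variations of the various components can be implemented in appropriate combinations.
The embodiments of the present invention encompass and contain inventions relating to various viewpoints, stages, concepts, or categories, such as an invention as an individual apparatus, an invention as an entire system, an invention relating to components inside an individual apparatus, and the corresponding method inventions.
Accordingly, inventions can be extracted from the content disclosed in the embodiments of the present invention without being limited to the illustrated configurations.
[0184]
The present invention is not limited to the embodiments described above and can be implemented with various modifications within its technical scope.
[0185]
[Effects of the invention]
According to the present invention, the number of copies of digital content used in a collusion attack can be estimated by performing estimation, based on a statistical technique, on the code detected from a copy of digital content in which a collusion-resistant code is embedded.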
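[Brief description of the drawings]
[FIG. 1] A diagram showing the schematic configuration of a content distribution system including a digital watermark embedding apparatus and a digital watermark analysis apparatus according to an embodiment of the present invention
[FIG. 2] A diagram showing a configuration example of the digital watermark embedding apparatus according to the embodiment
[FIG. 3] A diagram showing a configuration example of the digital watermark analysis apparatus according to the embodiment
[FIG. 4] A diagram showing another configuration example of the digital watermark analysis apparatus according to the embodiment
[FIG. 5] A diagram showing yet another configuration example of the digital watermark analysis apparatus according to the embodiment
[FIG. 6] A flowchart showing an example of the schematic procedure of the digital watermark embedding apparatus according to the embodiment
[FIG. 7] A diagram showing a configuration example of the code generation unit of the digital watermark embedding apparatus according to the embodiment
[FIG. 8] A diagram showing a configuration example of the component code generation unit of the code generation unit of FIG. 7
[FIG. 9] A diagram for explaining an example of the component codes generated by the digital watermark embedding apparatus according to the embodiment
[FIG. 10] A diagram for explaining an example of the sets of integers corresponding to each user ID in the embodiment
[FIG. 11] A diagram for explaining an example of the collusion-resistant codes corresponding to each user ID in the embodiment
[FIG. 12] A diagram for explaining the positions of the boundaries in the bit pattern of each component code in the embodiment
[FIG. 13] A diagram showing a configuration example of the collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment
[FIG. 14] A flowchart showing an example of the schematic procedure of the collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment
[FIG. 15] A diagram for explaining the relationship between the number of copies used in a collusion attack and the positions of the boundaries in the bit pattern detected in each component code of the tampered collusion-resistant code
[FIG. 16] A diagram showing another configuration example of the collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment
[FIG. 17] A flowchart showing another example of the schematic procedure of the collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment
[FIG. 18] A flowchart showing an example of the procedure for determining whether a user ID is a weak ID or a non-weak ID in the embodiment
[FIG. 19] A diagram showing a configuration example of the watermark embedding apparatus for a compound according to the embodiment
[FIG. 20] A diagram showing another configuration example of the watermark embedding apparatus for a compound according to the embodiment
[FIG. 21] A diagram showing another configuration example of the watermark analysis apparatus for a compound according to the embodiment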
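[Explanation of reference numerals]
1 ... digital watermark embedding apparatus
2 ... digital watermark analysis apparatus
3 ... distribution path
11 ... code generation unit
12 ... code embedding unit
21 ... code extraction unit
22 ... collusion number estimation unit
23 ... tracking algorithm processing unit
121-1 to 121-k′ ... modulus storage units
122-1 to 122-k′ ... residue calculation units
123 ... code parameter storage unit
124-1 to 124-k′ ... component code generation units
125 ... code concatenation unit
131 ... subtraction unit
132 ... "0" string generation unit
133 ... "1" string generation unit
34 ... concatenation unit
221 ... boundary detection unit
222 ... statistical processing unit
223 ... estimated collusion number calculation unit
241 ... weak ID / non-weak ID classification unit
242 ... statistical processing unit
243 ... estimated collusion number calculation unit
1001, 1011 ... code generation unit
1002 to 1004 ... specific-site structure conversion units
1012 to 1014 ... synthesis material units
1015 ... synthesis unit
1201 to 1201 ... specific-site structure reading units
1204 ... code decoding unit
[0001]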
[Technical field of the invention]
The present invention relates to a digital watermark system, a digital watermark analysis apparatus, and a digital watermark analysis method capable of estimating the number of digital content copies used in a collusion attack on a copy of digital content in which a collusion-resistant code is embedded.
[0002]
[Prior art]
Digital content (for example, still images, moving images, audio, music, etc.) has a structure composed of a large number of digital data. Within that structure there are portions whose data can be changed while the identity or economic value of the work is preserved. By changing data within that permitted range, various kinds of information can be embedded in the digital content. Such a technique is called digital watermarking.
[0003]
Digital watermarking allows various kinds of watermark information (for example, identification information of the copyright owner or user of the content, rights information of the copyright owner, usage conditions of the content, secret information required for its use, copy control information, or a combination of these) to be embedded in digital content, and detected and used, for various purposes (for example, usage control, copyright protection including copy control, and promotion of secondary use).
[0004]
Here, as a technique applied when, for example, the same digital content is distributed to a large number of users, consider the case where information for individually identifying each copy (for example, watermark information uniquely corresponding to a user ID) is embedded in the copies of the digital content.
[0005]
Embedding unique identification information in each copy of digital content makes it possible, when a copy is further copied and circulated as a pirated copy, to identify the user from whom it leaked by detecting the identification information in the pirated copy. It therefore acts as an advance deterrent against illegal copying of digital content and also helps provide a remedy after copyright infringement has occurred.
[0006]
Moreover, because a user does not know which bits make up the identification information, invalidating the identification information embedded in a copy of digital content requires making substantial alterations to that copy. Doing so impairs the economic value of the digital content, which removes the incentive for illegal copying.
[0007]
In this situation, the "collusion attack" emerged as a method that makes illegal copying possible.
[0008]
A collusion attack exploits the fact that different identification information is embedded in different copies. For example, several people bring their copies together and compare them bit by bit to find the portions where the digital data values differ, and then tamper with those portions (for example, by majority vote, minority vote, or randomization) so that the identification information is altered or erased. (In some cases no explicit comparison is performed, and a similar result is obtained by an operation such as averaging pixel values across the copies.)
[0009]
To give a simple example, suppose the following identification information is embedded in the copies of Mr. A, Mr. B, and Mr. C, respectively:
00 ... 00 ...
00 ... 11 ...
11 ... 00 ...
Then content can be produced in which identification information that differs from that of Mr. A, Mr. B, and Mr. C alike, for example
10 ... 01 ...
is embedded.
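The tampering step described above can be sketched in a few lines. This is a minimal illustration rather than a method taken from the text: the function name `collude`, the strategy names, and the four-bit identification strings are all assumptions made for the example. Only positions where the copies differ are altered, since those are the only positions the colluders can detect.

```python
import random


def collude(copies, strategy="majority", rng=None):
    """Forge an identification string from several colluders' strings.

    copies:   equal-length strings of '0'/'1' (hypothetical embedded IDs)
    strategy: 'majority', 'minority', or 'random' (names are assumptions)
    """
    rng = rng or random.Random(0)
    forged = []
    for column in zip(*copies):
        if len(set(column)) == 1:
            # All copies agree here, so the colluders cannot see this bit.
            forged.append(column[0])
            continue
        ones = column.count("1")
        zeros = len(column) - ones
        if strategy == "majority":
            forged.append("1" if ones > zeros else "0")
        elif strategy == "minority":
            forged.append("1" if ones < zeros else "0")
        else:
            forged.append(rng.choice("01"))
    return "".join(forged)


copies = ["0000", "0011", "1100"]      # stand-ins for Mr. A, Mr. B, Mr. C
print(collude(copies, "minority"))     # 1111
```

With these three strings the minority strategy yields a string that matches none of the colluders, which is the situation the example in the text describes.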
[0010]
Therefore, various methods have been proposed for embedding, as a digital watermark, a code that is resistant to collusion attacks, that is, a code with the property that all or some of the colluders can be identified even after a collusion attack (hereinafter called a collusion-resistant code), together with tracking algorithms based on such a code (tracing algorithms; algorithms that identify the identification numbers embedded in the content used in the collusion attack and thereby identify the user IDs of the colluders). One example is the c-secure code (D. Boneh and J. Shaw, “Collusion-Secure Fingerprinting for Digital Data,” CRYPTO'95, 180-189, 1995.).
[0011]
[Problems to be solved by the invention]
To increase resistance to collusion attacks, that is, to raise the upper limit on the number of colluders for which colluders can still be identified, the code length embedded in the content must be increased; on the other hand, the code length that can be embedded in content is limited. For this reason, this kind of collusion-resistant code and the tracking algorithm based on it impose an upper limit on the number of copies used in a collusion attack in order to keep the code length down (c-secure means that the code is effective against collusion attacks using up to c copies). If a collusion attack is carried out using more copies than this allowable number, the tracking algorithm may erroneously output, as the identification number of a copy involved in the collusion attack, the identification number of a copy that was not involved, so that a user who is not a colluder is identified as a colluder. The probability of such a misjudgment when the number of copies is at or below the allowable number can be kept low by the design of the collusion-resistant code, so misjudgments are caused mainly by collusion attacks that exceed the allowable number.
[0012]
However, whether a collusion attack was actually carried out with no more than the allowable number of copies is known only to the attackers (colluders). Since attackers aim to profit illegally by forging copies, they can hardly be expected to disclose the number of copies they used in the collusion attack.
[0013]
Therefore, even when the tracking algorithm that decodes the collusion-resistant code outputs the identification number of a copy involved in the collusion (or the user ID of the colluder corresponding to that identification number), there has been the problem that it is impossible to determine whether a colluder was correctly identified or whether, because the collusion attack used more than the allowable number of copies, a copy that was not used in the collusion (or its user) was mistakenly identified.
[0014]
The only way to avoid this problem has been to guess the number of copies that colluders could realistically obtain and to design the collusion-resistant code so that the allowable number is at least that large. However, this inevitably leads to a large allowable number and, as a result, a longer code.
[0015]
The present invention has been made in consideration of the above circumstances, and an object thereof is to provide a digital watermark system, a digital watermark analysis apparatus, and a digital watermark analysis method capable of estimating the number of copies of digital content used in a collusion attack on copies of digital content in which a collusion-resistant code is embedded.
[0016]
[Means for Solving the Problems]
The present invention is a digital watermark system for estimating the number of copies of digital content used in a collusion attack, characterized by comprising: a first step of, before a copy of the digital content is handed to a user, assigning a plurality of integers to the identification information that identifies the user corresponding to the copy in accordance with a predetermined method, generating a plurality of component codes corresponding to the respective assigned integers, generating the collusion-resistant code to be embedded by concatenating the generated component codes, and embedding the generated collusion-resistant code in the copy; and a second step of detecting, from a copy of the digital content to be analyzed, the code embedded in the copy as the collusion-resistant code, detecting, for each of the plurality of component codes constituting the detected code, position information on the position of the tampered portion of that component code, obtaining a predetermined statistic on the position of the tampered portion based on the plurality of pieces of position information detected for the respective component codes, and estimating the number of copies used in the collusion attack on the digital content based on the obtained statistic.
[0017]
The present invention is also a digital watermark system capable of estimating the number of copies of digital content used in a collusion attack, characterized by comprising: a first step of, when assigning an identifier that identifies the user corresponding to a copy of the digital content, assigning, from among identifier candidates belonging to a predetermined range of non-negative integers, one that is judged not to be a weak identifier (a weak identifier being one that is more likely to be erroneously detected by a predetermined tracking algorithm as an identifier corresponding to a copy used in a collusion attack), and, before the copy of the digital content is handed to the user, assigning a plurality of integers to the identifier that identifies the user corresponding to the copy in accordance with a predetermined method based on the value of the identifier, generating a plurality of component codes corresponding to the respective assigned integers, generating the collusion-resistant code to be embedded by concatenating the generated component codes, and embedding the generated collusion-resistant code in the copy; and a second step of detecting, from a copy of the digital content to be analyzed, the code embedded in the copy as the collusion-resistant code, applying the predetermined tracking algorithm to the detected code to obtain the identifiers that identify the users corresponding to the copies used in the collusion attack, classifying the obtained identifiers into weak identifiers and other, non-weak identifiers, obtaining a predetermined statistic on weak and non-weak identifiers based on the classification result, and estimating the number of copies used in the collusion attack on the digital content based on the obtained statistic.
[0018]
The present invention is also a digital watermark analysis apparatus/method for estimating the number of copies of digital content used in a collusion attack, characterized by comprising: means/step for detecting, from a copy of the digital content to be analyzed, the code embedded in the copy as a collusion-resistant code; means/step for detecting, for each of the plurality of component codes constituting the detected code, position information on the position of the tampered portion of that component code; means/step for obtaining a predetermined statistic on the position of the tampered portion based on the plurality of pieces of position information detected for the respective component codes; and means/step for estimating the number of copies used in the collusion attack on the digital content based on the obtained statistic.
[0019]
Preferably, for each of the plurality of component codes constituting the detected code, one or both of the most-significant-bit-side position and the least-significant-bit-side position of the tampered portion of that component code is obtained; one or both of a first average, taken over the values obtained by normalizing the obtained most-significant-bit-side positions, and a second average, taken over the values obtained by normalizing the obtained least-significant-bit-side positions, is obtained; and one or both of the first average and the second average is input to a predetermined function, whose output gives an estimate of the number of copies used in the collusion attack.
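The averaging step can be illustrated with a small sketch. It rests on assumptions that are not stated in the text: the tampered portion of each component code is taken to be delimited by a lower and an upper boundary position, the colluders' boundaries are assumed to be independent and uniformly distributed over the code length, and the "predetermined function" is replaced by the order-statistics relation E[upper − lower] = (c − 1)/(c + 1) for c uniform samples on [0, 1]. The function and parameter names are hypothetical.

```python
from statistics import mean


def estimate_collusion_number(lower, upper, lengths):
    """Estimate the number of colluding copies from tampered-region boundaries.

    lower[i], upper[i]: boundary positions of the tampered portion of
                        component code i (lower <= upper)
    lengths[i]:         length of component code i, used for normalisation
    """
    # First and second averages over the normalised boundary positions.
    first_avg = mean(p / n for p, n in zip(lower, lengths))
    second_avg = mean(p / n for p, n in zip(upper, lengths))
    width = second_avg - first_avg
    if not 0 <= width < 1:
        raise ValueError("normalised width must lie in [0, 1)")
    # Assumed stand-in for the "predetermined function":
    # solving width = (c - 1) / (c + 1) for c.
    return (1 + width) / (1 - width)


# Two component codes of length 100 and 200 (illustrative numbers only).
print(estimate_collusion_number([25, 50], [75, 150], [100, 200]))  # 3.0
```

A width of zero, where no tampered portion is visible, gives an estimate of 1, which agrees with the convention of treating the absence of a collusion attack as one copy.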
[0020]
Alternatively, for each of the plurality of component codes constituting the detected code, one or both of the most-significant-bit-side position and the least-significant-bit-side position of the tampered portion of that component code may be obtained; one or both of a first ratio, between the number of obtained most-significant-bit-side positions lying on the upper-bit side of a predetermined reference position and the number lying on the lower-bit side of it, and a second ratio, between the number of obtained least-significant-bit-side positions lying on the upper-bit side of a predetermined reference position and the number lying on the lower-bit side of it, may be obtained; and one or both of the first ratio and the second ratio may be input to a predetermined function, whose output gives an estimate of the number of copies used in the collusion attack.
[0021]
The present invention is also a digital watermark analysis apparatus/method for estimating the number of copies of digital content used in a collusion attack, characterized by comprising: means/step for detecting, from a copy of the digital content to be analyzed, the code embedded in the copy as a collusion-resistant code; means/step for applying a predetermined tracking algorithm to the detected code to obtain the identifiers that identify the users corresponding to the copies used in the collusion attack; means/step for classifying the obtained identifiers into weak identifiers and other, non-weak identifiers; means/step for obtaining a predetermined statistic on weak and non-weak identifiers based on the classification result; and means/step for estimating the number of copies used in the collusion attack on the digital content based on the obtained statistic.
[0022]
Preferably, based on the classification result of the weak identifier and the non-weak identifier, a ratio between the number of identifiers classified as weak identifiers and the number of identifiers classified as non-weak identifiers is obtained, and the obtained By inputting the ratio into a predetermined function, an estimated value of the number of duplicates used in the collusion attack may be obtained as an output of the predetermined function.
[0023]
Preferably, instead of obtaining the estimated value of the number of replicas used in the collusion attack, information indicating the level of the number of replicas used in the collusion attack may be obtained.
[0024]
The present invention is also a chemical watermark system for tracking chemical products of the same kind, in which different identification information is embedded as watermarks, that were used in a collusion attack, characterized by comprising: a first step of assigning a plurality of integers to the identification information to be embedded in the target chemical product in accordance with a predetermined method, generating a plurality of component codes corresponding to the respective assigned integers, generating the collusion-resistant code to be embedded by concatenating the generated component codes, and embedding the generated collusion-resistant code in the chemical product; and a second step of detecting, from a chemical product to be analyzed, the code embedded in the chemical product as the collusion-resistant code, and applying a predetermined tracking algorithm to the detected code to obtain the identification information corresponding to the chemical products used in the collusion attack.
[0025]
The present invention is also a chemical watermark system for tracking chemical products of the same kind, in which different identification information is embedded as watermarks, that were used in a collusion attack, characterized by comprising: a first step of, when assigning an identifier to be embedded in the target chemical product, assigning, from among identifier candidates belonging to a predetermined range of non-negative integers, one that is judged not to be a weak identifier (a weak identifier being one that is more likely to be erroneously detected by a predetermined tracking algorithm as an identifier corresponding to a chemical product used in a collusion attack), assigning a plurality of integers to the identifier to be embedded in the chemical product in accordance with a predetermined method based on the value of the identifier, generating a plurality of component codes corresponding to the respective assigned integers, generating the collusion-resistant code to be embedded by concatenating the generated component codes, and embedding the generated collusion-resistant code in the chemical product; and a second step of detecting, from a chemical product to be analyzed, the code embedded in the chemical product as the collusion-resistant code, and applying the predetermined tracking algorithm to the detected code to obtain the identification information corresponding to the chemical products used in the collusion attack.
[0026]
The present invention is also a chemical watermark system for estimating the number of chemical products of the same kind, in which different identification information is embedded as watermarks, that were used in a collusion attack, characterized by comprising: a first step of assigning a plurality of integers to the identification information to be embedded in the target chemical product in accordance with a predetermined method, generating a plurality of component codes corresponding to the respective assigned integers, generating the collusion-resistant code to be embedded by concatenating the generated component codes, and embedding the generated collusion-resistant code in the chemical product; and a second step of detecting, from a chemical product to be analyzed, the code embedded in the chemical product as the collusion-resistant code, detecting, for each of the plurality of component codes constituting the detected code, position information on the position of the tampered portion of that component code, obtaining a predetermined statistic on the position of the tampered portion based on the plurality of pieces of position information detected for the respective component codes, and estimating the number of chemical products used in the collusion attack on the chemical product based on the obtained statistic.
[0027]
Chemical watermark system.
[0028]
The present invention is also a chemical watermark system for estimating the number of chemical products of the same kind, in which different identification information is embedded as watermarks, that were used in a collusion attack, characterized by comprising: a first step of, when assigning an identifier to be embedded in the target chemical product, assigning, from among identifier candidates belonging to a predetermined range of non-negative integers, one that is judged not to be a weak identifier (a weak identifier being one that is more likely to be erroneously detected by a predetermined tracking algorithm as an identifier corresponding to a chemical product used in a collusion attack), assigning a plurality of integers to the identifier to be embedded in the chemical product in accordance with a predetermined method based on the value of the identifier, generating a plurality of component codes corresponding to the respective assigned integers, generating the collusion-resistant code to be embedded by concatenating the generated component codes, and embedding the generated collusion-resistant code in the chemical product; and a second step of detecting, from a chemical product to be analyzed, the code embedded in the chemical product as the collusion-resistant code, applying the predetermined tracking algorithm to the detected code to obtain the identifiers corresponding to the chemical products used in the collusion attack, classifying the obtained identifiers into weak identifiers and other, non-weak identifiers, obtaining a predetermined statistic on weak and non-weak identifiers based on the classification result, and estimating the number of chemical products used in the collusion attack on the chemical product based on the obtained statistic.
[0029]
The present invention relating to an apparatus also holds as an invention relating to a method, and the present invention relating to a method also holds as an invention relating to an apparatus.
Further, the present invention relating to an apparatus or a method can also be realized as a computer-readable recording medium on which is recorded a program for causing a computer to execute the procedure corresponding to the invention (or for causing a computer to function as the means corresponding to the invention, or for causing a computer to realize the functions corresponding to the invention).
[0030]
According to the present invention, the number of copies of digital content used in a collusion attack can be estimated by performing estimation based on a statistical technique on the code detected from a copy of digital content in which a collusion-resistant code is embedded (for example, estimation based on statistical properties such as the bias in the distribution of tampered portions in the collusion-resistant code, or the proportion of weak identifiers among the detected identifiers). This makes it possible to evaluate whether the tracking result of the tracking algorithm is correct and to collect information on the number of colluders.
[0031]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the invention will be described with reference to the drawings.
[0032]
The present invention is applicable when, for each copy of the same digital content (for example, still images, moving images, sound, music, etc.), at least identification information uniquely corresponding to the user identifier (user ID) of the user associated with that copy, that is, the user who uses the copy (for example, a user to whom the copy is transferred or lent), is embedded and detected as watermark information.
[0033]
Of course, various other watermark information (for example, identification information of the copyright owner of the content, rights information of the copyright owner, usage conditions of the content, secret information required for its use, copy control information, or a combination of these) may also be embedded in each copy of the same digital content and detected for various purposes (for example, usage control, copyright protection including copy control, and promotion of secondary use). The following description, however, focuses on the part related to the user identification code (when other watermark information is used, the configuration of the part related to it is not particularly limited).
[0034]
The configuration diagrams shown below hold both as functional block diagrams of the apparatus and as functional module diagrams or procedure diagrams of software (a program).
[0035]
FIG. 1 shows a conceptual diagram of a system to which a digital watermark embedding apparatus and a digital watermark analysis apparatus according to an embodiment of the present invention are applied.
[0036]
The digital watermark embedding device 1 and the digital watermark analysis device 2 are provided and managed on the content providing side.
The method by which the digital watermark embedding device 1 embeds the desired watermark data in digital content and the method by which the digital watermark analysis device 2 extracts the watermark data itself from digital content are basically arbitrary (see, for example, Kineo Matsui, "Basics of Digital Watermarking", Morikita Publishing, 1998).
The digital watermark embedding device 1 can be realized as software (program) or hardware. Similarly, the digital watermark analysis apparatus 2 can be realized as software (program) or hardware. Further, when the digital watermark embedding device 1 and the digital watermark analysis device 2 are used on the content providing side, they can be realized by integrating them.
[0037]
FIG. 2 shows a configuration example of the digital watermark embedding apparatus 1. The digital watermark embedding device 1 includes a code generation unit 11 that generates a collusion-resistant code corresponding to a user ID, which is the watermark information to be embedded as a user identification code, and a code embedding unit 12 that embeds the generated collusion-resistant code (embedded code) in the target content.
[0038]
When given the target content and the user ID of the user for whom it is intended, the digital watermark embedding device 1 generates the collusion-resistant code corresponding to that user ID and outputs the content, with the collusion-resistant code embedded as the user identification code, as the copy for the user with that user ID. When other watermark information is used, it is embedded at that time as necessary.
[0039]
A copy of the content for each user obtained by the digital watermark embedding apparatus 1 is distributed through a distribution path 3 through a storage medium or a communication medium. A collusion attack using a plurality of replicas is performed in the distribution channel 3.
[0040]
FIGS. 3 and 4 show configuration examples of the digital watermark analysis apparatus 2.
[0041]
As shown in FIG. 3 and FIG. 4, the digital watermark analysis apparatus 2 comprises: a code extraction unit 21 that extracts the user identification code (the embedded collusion-resistant code, or a version of it tampered with by a collusion attack) from the content to be examined; a collusion number estimation unit 22 that estimates the number of copies used in a collusion attack on the content to be examined; and a tracking algorithm processing unit 23 that executes a predetermined tracking algorithm to identify the collusion-resistant code of a copy presumed to have been used in the collusion attack and the user ID corresponding to that code (the tracking algorithm may obtain the user ID corresponding to a collusion-resistant code presumed to have been used in the collusion attack, or may obtain the user ID directly without obtaining that collusion-resistant code itself). In the present embodiment, the case where no collusion attack has been made is treated as the number of copies used in the collusion attack being 1.
[0042]
Here, the collusion number estimation unit 22 of the digital watermark analysis apparatus 2 can take two forms: (1) a form in which the collusion number estimation unit 22 by itself (without using the result of the tracking algorithm processing unit 23) estimates the number of copies used in the collusion attack by analyzing the user identification code (the embedded collusion-resistant code, or a version of it tampered with by a collusion attack) extracted from the content to be examined (first mode of the collusion number estimation unit); and (2) a form in which it estimates the number of copies used in the collusion attack based on the tracking result output by the tracking algorithm processing unit 23 (for example, the user IDs of all or some of the colluders) (second mode of the collusion number estimation unit). FIG. 3 is a configuration example for the first mode of the collusion number estimation unit, and FIG. 4 is a configuration example for the second mode.
[0043]
Basically, any collusion-resistant code or tracking algorithm is applicable and is not particularly limited.
[0044]
In FIG. 3 and FIG. 4, an overall determination unit may additionally be provided that outputs a determination result obtained by comprehensively judging the result of the collusion number estimation unit 22 and the result of the tracking algorithm processing unit 23. For example, when the estimated collusion number does not exceed the allowable number and the user ID (or set of user IDs) of the colluders has been obtained, this unit outputs the colluders' user ID(s) (or outputs the estimated collusion number in addition); when the estimated collusion number exceeds the allowable number and the user ID (or set of user IDs) of the colluders has been obtained, it outputs an indication that no determination can be made because the estimated collusion number is too large (or outputs the estimated collusion number in addition). Of course, other ways of generating the overall determination result are also possible.
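The decision rule in the example above can be written as a short sketch. The names `Verdict` and `overall_determination`, the field layout, and the treatment of an estimate exactly equal to the allowable number as acceptable are assumptions made for illustration; the text leaves other ways of forming the result open.

```python
from typing import NamedTuple, Sequence, Tuple


class Verdict(NamedTuple):
    decided: bool                 # False means "cannot be determined"
    colluders: Tuple[str, ...]    # user IDs reported as colluders
    estimated_collusion: float    # reported alongside either outcome


def overall_determination(estimated_collusion: float,
                          allowable_number: int,
                          traced_ids: Sequence[str]) -> Verdict:
    """Combine the collusion number estimate with the tracking result."""
    if traced_ids and estimated_collusion <= allowable_number:
        # The estimate is within the design limit: trust the tracking result.
        return Verdict(True, tuple(traced_ids), estimated_collusion)
    # Either nothing was traced or the estimate exceeds the design limit,
    # so the traced IDs are withheld.
    return Verdict(False, (), estimated_collusion)


print(overall_determination(3.0, 5, ["u17", "u42"]).decided)   # True
print(overall_determination(8.0, 5, ["u17"]).decided)          # False
```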
[0045]
In the case of (1) above, the processing by the collusion number estimation unit 22 and the processing by the tracking algorithm processing unit 23 may be performed in either order or in parallel.
[0046]
In the case of (1) above, there may be a digital watermark analysis apparatus 2 that has the collusion number estimation unit 22 and does not have the tracking algorithm processing unit 23. FIG. 5 shows a configuration example in this case.
[0047]
According to the present embodiment, the collusion number estimation unit 22 of the digital watermark analysis apparatus 2 estimates the number of duplicates that were probably used in the collusion attack, which makes it possible to evaluate the accuracy of the tracking result produced by the tracking algorithm. Information on collusion numbers can also be collected.
[0048]
Hereinafter, the digital watermark embedding apparatus 1 will be described in more detail.
[0049]
FIG. 6 shows an example of a schematic procedure.
[0050]
First, the code generation unit 11 obtains M integers A(1), A(2), ..., A(M) (M being two or more) corresponding to the user ID to be embedded in the target copy (step S1). The M integers may either be obtained and stored in advance or be obtained when needed.
[0051]
Each integer A(i) (i = 1 to M) takes one of the values 0 to N(i)−1. Here, N(1), N(2), ..., N(M) are predetermined positive integers that differ from one another. More preferably, N(1), N(2), ..., N(M) are pairwise relatively prime.
[0052]
The value of each of the M integers A(i) corresponding to a user ID may be assigned randomly within the range 0 to N(i)−1, or may be assigned within that range according to a fixed rule. In either case, the assignment may be exclusive, so that any two user IDs differ in at least one of A(1), A(2), ..., A(M), or it may allow the same set of M integers A(1), A(2), ..., A(M) to be assigned to more than one user ID.
[0053]
One way to assign values exclusively is to use all or part of the integers in the range 0 to N(1) × N(2) × ... × N(M) − 1 as user ID values and, for each of the M integers A(i), to take the remainder of the user ID divided by N(i) as the value of A(i) for that user ID.
[0054]
When the set of integers A(1), A(2), ..., A(M) corresponding to a user ID cannot be computed from the user ID itself, correspondence information between each user ID and its set of integers A(1), A(2), ..., A(M) must be stored. When the set can be computed from the user ID, the correspondence information may either be recomputed as needed without being stored, or be stored and looked up.
[0055]
Next, the code generation unit 11 generates the collusion-resistant code corresponding to the user ID from the M integers A(1), A(2), ..., A(M) corresponding to the user ID to be embedded in the target copy (step S2). The collusion-resistant code for each user ID may likewise be generated and stored in advance or generated when needed.
[0056]
The collusion-resistant code for a user ID is generated by obtaining the component codes W(1), W(2), ..., W(M) corresponding to the M integers A(1), A(2), ..., A(M) for that user ID and concatenating them.
[0057]
As the component code W(i) corresponding to the integer A(i), a Γ₀(n, d) code can be used, for example. In this code, d consecutive bits consisting only of 1s or only of 0s form one unit B(j), and the units B(0) to B(n−2) are concatenated, subject to the condition that B(0) to B(n−2) all consist of 0s, all consist of 1s, or B(0) to B(m−1) consist only of 0s while B(m) to B(n−2) consist only of 1s. As a simple example, take the method in which the remainder of the user ID divided by N(i) is used as the integer A(i) for that user ID. When N(1) = 5, setting n = 5 and d = 3 gives the following Γ₀(5, 3) code.
When A (1) = 0: W (1) = 111 111 111 111
When A (1) = 1: W (1) = 000 111 111 111
When A (1) = 2: W (1) = 000 000 111 111
When A (1) = 3: W (1) = 000 000 000 111
When A (1) = 4: W (1) = 000 000 000 000
A collusion-resistant code can be generated by concatenating the component codes W (i) corresponding to the respective A (i) thus obtained.
[0058]
In this code, 1s and 0s are arranged in runs made up of whole d-bit units, and no run of fewer than d 1s or 0s occurs in isolation (in the example above, no isolated run of fewer than 3 bits occurs). Therefore, when an isolated run of fewer than d 1s or 0s is present, it is presumed that a collusion attack has been made; when no such run is present, it is presumed that no collusion attack has been made.
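One way to operationalize this check is sketched below. It uses a block-aligned reading of the criterion, which is an assumption: the component code is cut into its d-bit units and any unit that mixes 0s and 1s is flagged. The function name is hypothetical.

```python
def has_mixed_unit(component_code: str, d: int) -> bool:
    """Return True if some d-bit unit B(j) contains both 0s and 1s.

    An untampered component code of the kind described above never
    contains such a unit, so a hit suggests a collusion attack.
    """
    if d <= 0 or len(component_code) % d != 0:
        raise ValueError("length must be a positive multiple of d")
    units = [component_code[j:j + d] for j in range(0, len(component_code), d)]
    return any(len(set(unit)) > 1 for unit in units)


print(has_mixed_unit("000000111111", 3))  # untampered: False
print(has_mixed_unit("000000010111", 3))  # unit "010" is mixed: True
```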
[0059]
The collusion-resistant code generated in this way is embedded in the target content by the code embedding unit 12 of the digital watermark embedding apparatus 1 (step S3).
[0060]
FIG. 7 shows a configuration example of the code generation unit 11.
[0061]
The code generation unit 11 includes k′ (= M) modulus storage units 121-1, 121-2, ..., 121-k′, remainder calculation units 122-1, 122-2, ..., 122-k′, component code generation units 124-1, 124-2, ..., 124-k′, a code parameter storage unit 123, and a code concatenation unit 125.
[0062]
The modulus storage units 121-1, 121-2, ..., 121-k′ store integers that are relatively prime to one another, in this example k′ mutually different primes p_i (= N(i)) (i = 1, 2, ..., k′), and supply these primes p_i as moduli to the remainder calculation units 122-1, 122-2, ..., 122-k′. For the input user ID u, the remainder calculation units 122-1, 122-2, ..., 122-k′ compute the remainders u_i = u mod p_i (i = 1, 2, ..., k′) modulo the primes p_i. In other words, the remainders u_i = u mod p_i (i = 1, 2, ..., k′) computed by the remainder calculation units serve as the set of integers corresponding to the input user ID. Although the p_i (i = 1, 2, ..., k′) are primes in this example, they need only be relatively prime integers.
[0063]
For the k′ primes p_i (i = 1, 2, ..., k′), the component code generation units 124-1, 124-2, ..., 124-k′ generate component codes Γ₀(p_i, t), each a Γ₀(n, d) code representing the remainder u_i (i = 1, 2, ..., k′) obtained by the corresponding remainder calculation unit 122-1, 122-2, ..., 122-k′, in accordance with the code parameter t stored in the code parameter storage unit 123. That is, for a predetermined number (n) of user IDs, the component code generation units generate, for each remainder, component codes Γ₀(p_i, t) such that a combination of k of the k′ component codes representing the set of all remainders u_i (i = 1, 2, ..., k′) can uniquely represent the user ID.
[0064]
The code concatenation unit 125 concatenates the component codes Γ₀(p_i, t) generated by the component code generation units 124-1, 124-2, ..., 124-k′ to generate the collusion-resistant code that serves as the watermark information.
[0065]
FIG. 8 shows the configuration of one of the component code generation units 124-1, 124-2, ..., 124-k′, namely 124-i. With the code parameter t, the remainder u_i, and the modulus p_i, the subtraction unit 131 computes p_i − u_i − 1. The “0” sequence generation unit 132 generates a run of t × u_i consecutive “0” bits from the code parameter t and the remainder u_i, and the “1” sequence generation unit 133 generates a run of t × (p_i − u_i − 1) consecutive “1” bits from the code parameter t and the output p_i − u_i − 1 of the subtraction unit 131. The concatenation unit 134 joins the “0” run and the “1” run, producing a bit string of t × (p_i − 1) bits as the component code Γ₀(p_i, t), which is a Γ₀(n, d) code.
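The data flow of FIG. 8 can be sketched in a few lines. The function name is hypothetical, and representing the bit string as a Python `str` of "0" and "1" characters is a choice made here for readability only.

```python
def component_code(u_i: int, p_i: int, t: int) -> str:
    """Build one component code from remainder u_i, modulus p_i, and code parameter t."""
    if not 0 <= u_i < p_i:
        raise ValueError("remainder must satisfy 0 <= u_i < p_i")
    ones_units = p_i - u_i - 1           # role of the subtraction unit
    zero_run = "0" * (t * u_i)           # role of the "0" sequence generation unit
    one_run = "1" * (t * ones_units)     # role of the "1" sequence generation unit
    return zero_run + one_run            # concatenation: t * (p_i - 1) bits in total


for u in range(5):
    print(u, component_code(u, p_i=5, t=3))
```

With p_i = 5 and t = 3 this reproduces the five 12-bit codewords listed earlier for N(1) = 5.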
[0066]
FIG. 9 shows an example of the component codes of the collusion-resistant code generated in this way (component codes before any collusion attack). For the n user IDs 0 to n−1, component codes consisting of the blocks B(0), ..., B(n−2), each a run of “0”s or “1”s, are assigned.
[0067]
The code generation method described above is now illustrated with a simplified example that uses small numerical values.
[0068]
First, the number M of integers is set to 3, and N (1) = 3, N (2) = 5, and N (3) = 7. In this case, A (1) is any one of 0-2, A (2) is any one of 0-4, and A (3) is any one of 0-6.
[0069]
Next, since N (1) × N (2) × N (3) −1 = 104, all or part of the range from 0 to 104 is used as the user ID. Here, 0 to 14 are used as user IDs.
[0070]
For example, when user ID = 7,
A(1) = 7 mod N(1) = 7 mod 3 = 1,
A(2) = 7 mod N(2) = 7 mod 5 = 2,
A(3) = 7 mod N(3) = 7 mod 7 = 0.
[0071]
FIG. 10 shows A (1), A (2), and A (3) obtained for each of user IDs = 0 to 14 in this example.
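The residue assignment of this example can be reproduced with the short sketch below. The helper names are hypothetical. The check for pairwise relatively prime moduli reflects the Chinese remainder theorem: with such moduli, every user ID in the range 0 to N(1) × N(2) × N(3) − 1 receives a distinct set of residues.

```python
from itertools import combinations
from math import gcd


def pairwise_coprime(moduli):
    """True if every pair of moduli has greatest common divisor 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))


def residues(user_id, moduli):
    """A(i) = user_id mod N(i) for each modulus N(i)."""
    return [user_id % n for n in moduli]


moduli = [3, 5, 7]
print(pairwise_coprime(moduli))
for user_id in range(15):
    print(user_id, residues(user_id, moduli))
```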
[0072]
Next, with d = 3 in the Γ₀(n, d) code, the component codes W1 corresponding to A(1) = 0, A(1) = 1, and A(1) = 2 are as follows (for readability, the 0s and 1s are written in groups of 3 bits).
A (1) = 0: W1 = 111 111
A (1) = 1: W1 = 000 111
A (1) = 2: W1 = 000 000
Similarly, the component codes W2 corresponding to A(2) = 0, A(2) = 1, A(2) = 2, A(2) = 3, and A(2) = 4 are as follows.
A (2) = 0: W2 = 111 111 111 111
A (2) = 1: W2 = 000 111 111 111
A (2) = 2: W2 = 000 000 111 111
A (2) = 3: W2 = 000 000 000 111
A (2) = 4: W2 = 000 000 000 000
Similarly, the component codes W3 corresponding to A(3) = 0, A(3) = 1, A(3) = 2, A(3) = 3, A(3) = 4, A(3) = 5, and A(3) = 6 are as follows.
A (3) = 0: W3 = 111 111 111 111 111 111
A (3) = 1: W3 = 000 111 111 111 111 111
A (3) = 2: W3 = 000 000 111 111 111 111
A (3) = 3: W3 = 000 000 000 111 111 111
A (3) = 4: W3 = 000 000 000 000 111 111
A (3) = 5: W3 = 000 000 000 000 000 111
A (3) = 6: W3 = 000 000 000 000 000 000
Therefore, for example, when user ID = 7, we have A(1) = 1, A(2) = 2, and A(3) = 0, so that
W1 = 000 111
W2 = 000 000 111 111
W3 = 111 111 111 111 111 111
and the collusion-resistant code corresponding to user ID = 7 is their concatenation,
000111 000000111111 111111111111111111
(for readability, the code is split at the boundaries between the portions corresponding to W1 to W3).
[0073]
FIG. 11 shows the collusion tolerance code obtained for each user ID = 0 to 14 in this example.
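Putting the two steps together, the whole encoding for this small example can be sketched as follows. The function name and default arguments are hypothetical; the moduli (3, 5, 7) and d = 3 are the values of the example above.

```python
def encode_user_id(user_id: int, moduli=(3, 5, 7), d: int = 3) -> str:
    """Concatenate the component codes W(1)..W(M) for one user ID."""
    parts = []
    for n in moduli:
        a = user_id % n                                   # A(i) = user ID mod N(i)
        parts.append("0" * (d * a) + "1" * (d * (n - 1 - a)))
    return "".join(parts)


for user_id in (2, 3, 7):
    print(user_id, encode_user_id(user_id))
```

For user ID = 7 this yields the 36-bit code given above (6 + 12 + 18 bits).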
[0074]
Next, the digital watermark analysis apparatus 2 will be described in more detail.
[0075]
Here, the collusion attack will be described using the above example.
[0076]
For example, in the content obtained by the user with user ID = 2, the user identification code
000000 000000111111 000000111111111111
is embedded (see FIGS. 10 and 11). Likewise, in the content obtained by the user with user ID = 3, the user identification code
111111 000000000111 000000000111111111
is embedded (see FIGS. 10 and 11).
[0077]
In this case, if the two users with user ID = 2 and user ID = 3 compare their copies of the content, they find that bits 1 to 6, 13 to 15, and 25 to 27 of the 36 bits differ. Recognizing these bits as part of the identification information, they tamper with some of bits 1 to 6, 13 to 15, and 25 to 27, producing, for example, the following altered code.
010101 000000010111 000000101111111111
Similarly, for example, the following tampering is performed by two users of user ID = 7 and user ID = 8.
000010 000000101111 010111111111111111
Similarly, for example, the following tampering is performed by four users with user IDs = 3, 4, 5, and 6.
010101 010101010101 000000000010101010
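The comparison step used by the colluders can be sketched as follows. Both helper names are hypothetical. The second helper encodes the assumption that colluders can only change bits at positions where their copies differ.

```python
def differing_positions(codes):
    """1-based bit positions at which the given codes are not all equal."""
    return [i + 1 for i, bits in enumerate(zip(*codes)) if len(set(bits)) > 1]


def is_feasible_forgery(forged, codes):
    """True if the forged code keeps every bit on which all colluders agree."""
    if any(len(code) != len(forged) for code in codes):
        return False
    return all(f == bits[0]
               for f, bits in zip(forged, zip(*codes))
               if len(set(bits)) == 1)


user2 = "000000 000000111111 000000111111111111".replace(" ", "")
user3 = "111111 000000000111 000000000111111111".replace(" ", "")
forged = "010101 000000010111 000000101111111111".replace(" ", "")

print(differing_positions([user2, user3]))
print(is_feasible_forgery(forged, [user2, user3]))
```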
Next, an outline of the tracking algorithm will be described.
[0078]
When the code extraction unit 21 has extracted the user identification code (the embedded collusion-resistant code, or that code as tampered with by a collusion attack) from the content to be examined, the tracking algorithm processing unit 23 analyzes the extracted code, estimates the collusion-resistant codes of the duplicates that were probably used in the collusion attack, and estimates the user IDs corresponding to those codes.
[0079]
Here, as shown in FIG. 12(a), for each component code (three in the example above) of a code (a generated collusion-resistant code or one that has undergone a collusion attack), the positions of both ends of the component code and of the boundaries between adjacent elements B(j−1) and B(j) are expressed numerically. That is, if the i-th component code W(i) has N(i)−1 elements B(j), and N(i) is written as N in the figure, the left end of B(0) is position 0, the boundary between elements B(j−1) and B(j) is position j, and the right end of element B(N−2) is position N−1.
[0080]
Looking at the i-th component code W(i) of the code detected from the content from its leftmost bit, the first boundary between an element B(s−1) consisting only of 0s and an element B(s) containing a 1 is found, and the value s indicating the position of that boundary is denoted Amin(i). Conversely, looking from the rightmost bit, the first boundary between an element B(t) consisting only of 1s and an element B(t−1) containing a 0 is found, and the value t indicating the position of that boundary is denoted Amax(i).
[0081]
For example, for the code in FIG. 12(b), Amin(i) = 2 and Amax(i) = 4.
For the code in FIG. 12(c), Amin(i) = 2 and Amax(i) = 2.
For the code in FIG. 12(d), Amin(i) = 4 and Amax(i) = 4.
When the i-th component code W(i) consists only of 0s, Amin(i) = Amax(i) = N(i)−1; when it consists only of 1s, Amin(i) = Amax(i) = 0.
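A minimal sketch of the boundary detection follows, under the assumption that the extracted bits are still aligned to d-bit elements. The function name is hypothetical. Positions follow the numbering above: 0 is the left end and the number of elements, N(i)−1, is the right end.

```python
def boundaries(component_code: str, d: int):
    """Return (Amin, Amax) for one component code split into d-bit elements."""
    blocks = [component_code[j:j + d] for j in range(0, len(component_code), d)]
    right_end = len(blocks)  # position N(i) - 1
    # Amin: index of the first element (from the left) that contains a 1.
    a_min = next((s for s, b in enumerate(blocks) if "1" in b), right_end)
    # Amax: position just right of the last element that contains a 0.
    a_max = next((t + 1 for t in reversed(range(len(blocks))) if "0" in blocks[t]), 0)
    return a_min, a_max


print(boundaries("000000010111", 3))  # (2, 3)
print(boundaries("000000000000", 3))  # (4, 4): only 0s, right end
print(boundaries("111111111111", 3))  # (0, 0): only 1s, left end
```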
[0082]
In the tracking algorithm, each component code of the detected code is examined, and if a component code is found in which a run of fewer than d bits (3 in the example above) of 1s or 0s occurs in isolation, it can be determined that a collusion attack has been made. In that case, on the assumption that the collusion number is no greater than a predetermined allowable number, the collusion-resistant codes that were probably embedded in the duplicates used in the collusion attack can be estimated from the first boundary (the minimum boundary value) Amin(i) and the second boundary (the maximum boundary value) Amax(i) of each component code W(i). The corresponding user IDs can then be obtained from those collusion-resistant codes and identified as the user IDs of the colluders who carried out the collusion attack.
[0083]
The component codes W(i) corresponding to each A(i) are, for example, those illustrated above, that is,
A (3) = 0: W3 = 111 111 111 111 111 111
A (3) = 1: W3 = 000 111 111 111 111 111
A (3) = 2: W3 = 000 000 111 111 111 111
A (3) = 3: W3 = 000 000 000 111 111 111
A (3) = 4: W3 = 000 000 000 000 111 111
A (3) = 5: W3 = 000 000 000 000 000 111
A (3) = 6: W3 = 000 000 000 000 000 000
As these show, the portion in which multiple duplicates differ is always obtained as a run of consecutive elements B (and the tampering of a collusion attack therefore mixes 0s and 1s within this run).
[0084]
Consequently, in a given component code W(i), the left-end and right-end positions of the run in which fewer than d 1s or 0s occur in isolation, namely Amin(i) and Amax(i), each coincide with the position of the break between 0s and 1s in the corresponding component code of the collusion-resistant code embedded in one of the duplicates used in the collusion attack, that is, with the value Amin(i) = Amax(i) of that untampered component code (for a component code of all 1s, this position is the left end of the component code; for all 0s, it is the right end). Such information is obtained for every component code W(i). By analyzing Amin(1), Amin(2), ..., Amin(M), Amax(1), Amax(2), ..., Amax(M), the user identification codes, that is, the collusion-resistant codes, that were probably embedded in the duplicates used in the collusion attack can be identified, and the user IDs corresponding to them can be identified as the user IDs of the colluders who carried out the collusion attack.
[0085]
As a simple example, in a collusion attack by two users, for each component code, Amin(i) matches the value Amin(i) = Amax(i) of the code embedded in one user's duplicate, and Amax(i) matches the value Amin(i) = Amax(i) of the code embedded in the other user's duplicate (which user is which may differ from one component code to another). If there exists a collusion-resistant code whose 0/1 break positions in the respective component codes are given by one value chosen from Amin(1) and Amax(1), one chosen from Amin(2) and Amax(2), ..., and one chosen from Amin(M) and Amax(M), that code is the solution sought, and the user ID corresponding to it indicates a colluder.
[0086]
For example, as in the example shown in the figures, suppose that the user identification code
000000 000000111111 000000111111111111
is embedded in the content obtained by the user with user ID = 2, that the user identification code
111111 000000000111 000000000111111111
is embedded in the content obtained by the user with user ID = 3, and that a collusion attack by these two users has altered the code to
010101 000000010111 000000101111111111
In this tampered code,
Amin(1) = 0, Amax(1) = 2
Amin(2) = 2, Amax(2) = 3
Amin(3) = 2, Amax(3) = 3
Referring to FIG. 10 and the accompanying figures, user ID = 2 satisfies Amax(1) = 2, Amin(2) = 2, and Amin(3) = 2, and user ID = 3 satisfies Amin(1) = 0, Amax(2) = 3, and Amax(3) = 3. It can therefore be determined that this collusion attack was made by user ID = 2 and user ID = 3, and that the codes used in the attack were
000000 000000111111 000000111111111111
and
111111 000000000111 000000000111111111
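The two-colluder case just worked through can be sketched as a search over candidate user IDs. This is a simplified illustration, not the tracking algorithm itself: the helper names are hypothetical, bits are assumed to stay aligned to d-bit elements, and a user ID is accepted when each of its residues equals either Amin(i) or Amax(i) of the corresponding component code.

```python
def boundaries(component_code, d):
    """Return (Amin, Amax) for one component code split into d-bit elements."""
    blocks = [component_code[j:j + d] for j in range(0, len(component_code), d)]
    a_min = next((s for s, b in enumerate(blocks) if "1" in b), len(blocks))
    a_max = next((t + 1 for t in reversed(range(len(blocks))) if "0" in blocks[t]), 0)
    return a_min, a_max


def split_components(code, moduli, d):
    """Cut the concatenated code into W(1)..W(M); W(i) has d * (N(i) - 1) bits."""
    parts, start = [], 0
    for n in moduli:
        parts.append(code[start:start + d * (n - 1)])
        start += d * (n - 1)
    return parts


def trace_candidates(code, moduli, d, user_ids):
    """User IDs whose residue A(i) matches Amin(i) or Amax(i) in every component."""
    bounds = [boundaries(w, d) for w in split_components(code, moduli, d)]
    return [u for u in user_ids
            if all(u % n in pair for n, pair in zip(moduli, bounds))]


tampered = "010101 000000010111 000000101111111111".replace(" ", "")
print(trace_candidates(tampered, (3, 5, 7), 3, range(15)))  # [2, 3]
```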
[0087]
In a collusion attack by three or more users, it may not be possible to recover the value Amin(i) = Amax(i) for every component code of every collusion-resistant code (user identification code) embedded in the duplicates used. However, for the collusion-resistant code embedded in some duplicate, Amin(i) = Amax(i) can be expected to be recovered for all or many of its component codes. By checking combinations of Amin(1), Amin(2), ..., Amin(M), Amax(1), Amax(2), ..., Amax(M), the user IDs of at least some colluders can be identified even when those of all colluders cannot.
[0088]
When the code extraction unit 21 has extracted the user identification code (the embedded collusion-resistant code, or that code as tampered with by a collusion attack) from the content to be examined, the collusion number estimation unit 22 of the digital watermark analysis apparatus 2 analyzes the extracted code and estimates the number of duplicates that were probably used in the collusion attack.
[0089]
Hereinafter, the collusion number estimation unit 22 of the digital watermark analysis apparatus 2 will be described.
[0090]
(First configuration example)
First, the case is described in which the collusion number estimation unit 22 alone (without using the result of the tracking algorithm processing unit 23) estimates the number of duplicates used in the collusion attack, as illustrated in FIG. 3 and FIG. 5 (the first aspect of the collusion number estimation unit). As noted above, basically any collusion-resistant code and tracking algorithm can be applied. The case in which no collusion attack has been made is treated as the number of duplicates used in the collusion attack being 1.
[0091]
FIG. 13 shows a configuration example of the collusion number estimation unit 22 in this case. As shown in FIG. 13, the collusion number estimation unit 22 includes: a boundary detection unit 221 that obtains, from the component codes W(1), W(2), ..., W(M) of the extracted code (user identification code), one or both of the M first-group data Amin(1), Amin(2), ..., Amin(M) and the M second-group data Amax(1), Amax(2), ..., Amax(M); a statistical processing unit 222 that obtains a statistic from one or both of the first-group and second-group data so obtained; and an estimated collusion number calculation unit 223 that obtains an estimated value C0 of the collusion number from the statistic.
[0092]
FIG. 14 shows an example of a schematic procedure.
[0093]
FIG. 15 shows, for different numbers of duplicates used in the collusion attack (denoted “c” in FIG. 15), the probability with which the first boundary Amin(i) detected from each component code W(i) of the tampered code takes each value. In FIG. 15, the values are normalized by dividing by N(i), taken as the span between the minimum value (0) and the maximum value (N(i)−1) that Amin(i) can take. For the second boundary Amax(i), the normalized values are related by Amin(i) = 1 − Amax(i) (that is, its distribution is obtained by exchanging 0 and 1.0 on the horizontal axis of FIG. 15).
[0094]
It can be seen from this that the first boundary Amin is biased such that when the collusion number c increases, the probability that Amin takes a small value increases. Similarly, the second boundary Amax is biased such that as c increases, the probability that Amax takes a large value increases.
[0095]
Therefore, by obtaining the M values Amin(i) (or Amax(i)) for the component codes W(i) and analyzing the distribution of those M values, the collusion number c can be estimated statistically.
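The bias described here can be illustrated with a small simulation. It rests on two assumptions that are stated for illustration only: each colluder's A(i) is drawn independently and uniformly from 0 to N(i)−1, and Amin(i) of the tampered code equals the smallest A(i) among the colluders. The function name, the seed, and the 256 modulus sizes are hypothetical and are not the values used in the patent.

```python
import random


def mean_normalized_amin(c, moduli, rng):
    """Average of Amin(i) / N(i) over all components for c simulated colluders."""
    total = 0.0
    for n in moduli:
        a_min = min(rng.randrange(n) for _ in range(c))  # smallest A(i) among colluders
        total += a_min / n
    return total / len(moduli)


rng = random.Random(2000)        # fixed seed so the run is repeatable
moduli = list(range(512, 768))   # 256 illustrative sizes
for c in (2, 16, 48):
    print(c, round(mean_normalized_amin(c, moduli, rng), 4))
```

The printed averages shrink as c grows, which is the bias that the statistical estimate exploits.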
[0096]
Various variations of the statistical processing method by the statistical processing unit 222 and the estimated collusion number calculation unit 223 can be considered. In the following, two variations will be described.
[0097]
(Variation 1)
First, as described above, the boundary detection unit 221 obtains, from the component codes W(1), W(2), ..., W(M) of the extracted code, one or both of the M first-group data Amin(1), Amin(2), ..., Amin(M) and the M second-group data Amax(1), Amax(2), ..., Amax(M) (step S11).
[0098]
Next, when the first-group data Amin(i) are used, the statistical processing unit 222 obtains the average <Amin> of Amin(1), Amin(2), ..., Amin(M) over the component codes W(i) (step S12). Each value is first normalized by dividing it by the difference between the maximum and minimum values it can take (taken as the corresponding N(i)). That is, the average (first average) <Amin> of the first-group data is
<Amin> = {Amin(1) / N(1) + Amin(2) / N(2) + ... + Amin(M) / N(M)} / M
[0099]
When the second-group data Amax(i) are used, the statistical processing unit 222 likewise obtains the average (second average) <Amax> of the second-group data over the component codes W(i) (step S12). <Amax> is
<Amax> = {Amax(1) / N(1) + Amax(2) / N(2) + ... + Amax(M) / N(M)} / M
[0100]
When the data Amin (i) of the first group and the data Amax (i) of the second group are used, <Amin> and <Amax> are obtained (step S12).
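Step S12 amounts to a normalized mean. The sketch below uses a hypothetical function name and, as input, the boundary values of the small two-colluder example given earlier (Amin = 0, 2, 2 and Amax = 2, 3, 3 with N = 3, 5, 7); with only three component codes the result has no statistical meaning and merely shows the arithmetic.

```python
def normalized_mean(values, moduli):
    """{values(1)/N(1) + ... + values(M)/N(M)} / M."""
    if not values or len(values) != len(moduli):
        raise ValueError("need one boundary value per component code")
    return sum(a / n for a, n in zip(values, moduli)) / len(values)


moduli = [3, 5, 7]
mean_amin = normalized_mean([0, 2, 2], moduli)
mean_amax = normalized_mean([2, 3, 3], moduli)
print(mean_amin, mean_amax)
```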
[0101]
Next, the estimated collusion number calculation unit 223 obtains an estimated value C0 of the collusion number from <Amin> and <Amax> by a method as described later (step S13).
[0102]
Hereinafter, a method for obtaining the estimated value C0 of the collusion number will be described.
[0103]
Here, it is assumed that a collusion attack is performed by C0 colluders.
[0104]
As noted above, knowing whether C0 exceeds the upper limit c on the collusion number assumed by the collusion-resistant code provides, for example, a basis for judging whether a colluder user ID output by the tracking algorithm is correct or actually belongs to an innocent user.
[0105]
As described above, Amin and Amax can be detected in each component code W(i) by decoding the code after collusion (each corresponds to the value in the pre-collusion codeword of one of the colluders, whose component code W has Amin = Amax), and C0 can be estimated by processing them statistically. Various statistical processing methods are possible; here, a method of estimating C0 from the average <Amin> of the Amin(i) over the component codes W(i) is described.
[0106]
For a given component code W(i), assume that the integer A(i) assigned to each colluder takes each integer from 0 to N(i)−1 with equal probability 1 / N(i). Then, when the number of colluders is C0, the probability Pr[Amin(i) = x] that Amin(i) takes a given value x is given by the following expression.
[0107]
[Expression 1]
Pr[Amin(i) = x] = {(N(i) − x) / N(i)}^C0 − {(N(i) − x − 1) / N(i)}^C0
Here, x takes a value of an integer from 0 to N (i) -1.
[0108]
Therefore, each Amin(i) actually obtained by decoding for the component code W(i) is normalized by dividing it by the difference between the maximum and minimum values it can take, taken as N(i), giving Amin(i) / N(i), and the average <Amin> of the M values Amin(i) / N(i) is computed, that is,
<Amin> = {Amin(1) / N(1) + Amin(2) / N(2) + ... + Amin(M) / N(M)} / M
[0109]
Treating y = x / N(i) as a continuous real variable between 0 and 1, <Amin> can be approximated by the following expected value <y>.
[0110]
[Expression 2]
<y> = ∫_0^1 y P[y] dy
[0111]
Here, P [y] is given by the continuous limit of N (i) → ∞ with respect to Pr [Amin (i) = x] and is expressed by the following equation.
[0112]
[Equation 3]
P[y] = C0 × (1 − y)^(C0 − 1)
[0113]
Therefore, using the approximation <Amin> = <y> = 1 / (C0 + 1), the collusion number C0 can be estimated as
C0 = <Amin>^-1 − 1.
[0114]
Similarly, when <Amax> is used, the collusion number C0 can be estimated as
C0 = (1 − <Amax>)^-1 − 1.
[0115]
Using both <Amin> and <Amax>, the collusion number C0 can also be estimated as
C0 = (1/2 + <Amin>/2 − <Amax>/2)^-1 − 1.
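The three estimators are one-line formulas; a minimal sketch with hypothetical function names is shown below. The combined form simply averages <Amin> and 1 − <Amax>, since (<Amin> + (1 − <Amax>)) / 2 = 1/2 + <Amin>/2 − <Amax>/2. The sample inputs are the averages reported in the description for the 16-colluder experiment.

```python
def estimate_from_amin(mean_amin: float) -> float:
    """C0 = <Amin>^-1 - 1."""
    return 1.0 / mean_amin - 1.0


def estimate_from_amax(mean_amax: float) -> float:
    """C0 = (1 - <Amax>)^-1 - 1."""
    return 1.0 / (1.0 - mean_amax) - 1.0


def estimate_from_both(mean_amin: float, mean_amax: float) -> float:
    """C0 = (1/2 + <Amin>/2 - <Amax>/2)^-1 - 1."""
    return 1.0 / (0.5 + mean_amin / 2.0 - mean_amax / 2.0) - 1.0


print(round(estimate_from_amin(0.061871), 3))
print(round(estimate_from_amax(0.93538), 3))
print(round(estimate_from_both(0.061871, 0.93538), 3))
```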
[0116]
Accordingly, when only the first-group data Amin(i) are used, the estimated collusion number calculation unit 223 can obtain the estimated value C0 of the collusion number from <Amin> by the method described above.
[0117]
When only the second-group data Amax(i) are used, the estimated collusion number calculation unit 223 can obtain the estimated value C0 of the collusion number from <Amax> by the method described above.
[0118]
When both the first-group data Amin(i) and the second-group data Amax(i) are used, the estimated collusion number calculation unit 223 can obtain the estimated value C0 of the collusion number from <Amin> and <Amax> by the method described above.
[0119]
When both the first-group data Amin(i) and the second-group data Amax(i) are used, the estimated collusion number calculation unit 223 may alternatively obtain one estimate of the collusion number (call it Cmin) from <Amin> and another (call it Cmax) from <Amax>, and then output both Cmin and Cmax, output the larger of Cmin and Cmax, or output the average of Cmin and Cmax (other variations are also possible).
[0120]
A specific example follows. Let M = 256, N(1) = 512, and N(256) = 2297, with N(2) to N(255) taking values between 513 and 2293, and use a Γ₀(n, d) code with d = 30. When a collusion attack was made by a certain group of 16 colluders, Amin(1), Amax(1), Amin(2), Amax(2), ..., Amin(256), Amax(256) were obtained for the 256 component codes W(1) to W(256) from the code (user identification code) detected from the content after the attack; an excerpt of the values was as follows.

Calculating <Amin> from the 256 values of Amin gave
<Amin> = 0.061871
and substituting this into C0 = <Amin>^-1 − 1 gave
C0 = 15.163 (persons),
a value close to the true number of colluders.
[0121]
Likewise, <Amax> = 0.93538 was obtained, and substituting this into C0 = (1 − <Amax>)^-1 − 1 gave
C0 = 14.475 (persons),
again a value close to the true number of colluders.
[0122]
Using C0 = (1/2 + <Amin>/2 − <Amax>/2)^-1 − 1 gave
C0 = 14.811 (persons),
also a value close to the true number of colluders.
[0123]
An example of the estimated collusion numbers when a collusion attack was made by 32 colluders under the same conditions is as follows.
<Amin> = 0.029065
<Amax> = 0.966843
C0 = <Amin> ^-1 -1 = 33.406
C0 = (1- <Amax>) ^-1 −1 = 29.160
C0 = (1/2 + <Amin> / 2- <Amax> / 2) ^-1 -1 = 31.143
An example of the estimated collusion numbers when a collusion attack was made by 48 colluders under the same conditions is as follows.
<Amin> = 0.019884
<Amax> = 0.977382
C0 = <Amin> ^-1 -1 = 49.292
C0 = (1- <Amax>) ^-1 -1 = 43.213
C0 = (1/2 + <Amin> / 2- <Amax> / 2) ^-1 -1 = 46.057
(Variation 2)
Next, other variations will be described.
[0124]
Here, the differences from Variation 1 are described. In Variation 1, the statistical processing unit 222 obtained the average of Amin(i) and the average of Amax(i), and the estimated collusion number calculation unit 223 estimated the number of colluders C0 from those averages.
[0125]
In Variation 2, the statistical processing unit 222 obtains a different statistic from Amin(i) or Amax(i), and the estimated collusion number calculation unit 223 estimates the number of colluders C0 from that statistic.
[0126]
For example, referring to FIG. 15, take a certain value Ath on the horizontal axis as a reference value, and let the ratio α be the number of Amin(i) with values less than or equal to Ath divided by the number of Amin(i) with values exceeding Ath. For Amin(i), α increases as the collusion number c increases. For Amax(i), conversely, α decreases as the collusion number c increases.
[0127]
Thus, as in the previous example, assume that for a given component code W(i) the integer A(i) assigned to each colluder takes each integer from 0 to N(i)−1 with equal probability 1 / N(i). Under this assumption, the ratio α of the number of Amin(i) with values less than or equal to the reference value Ath to the number of Amin(i) with values exceeding Ath is expressed as a function α = f(c) of the collusion number c, and its inverse function c = f^-1(α) is obtained in advance.
[0128]
The statistical processing unit 222 then obtains the ratio α from the Amin(i) (step S12), and the estimated collusion number calculation unit 223 substitutes α into c = f^-1(α) to estimate the collusion number c (step S13). The same applies to Amax(i). Either one of Amin(i) and Amax(i) may be used, or both.
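A possible closed form for f and its inverse is sketched below. It is an assumption made for illustration and is not given in the description: if each colluder's normalized value is independent and uniform on [0, 1), the smallest of c values is at most Ath with probability 1 − (1 − Ath)^c, so the expected ratio is α = f(c) = (1 − Ath)^(−c) − 1 and the inverse is c = ln(1 + α) / (−ln(1 − Ath)). All function names are hypothetical.

```python
import math


def ratio_alpha(amin_values, moduli, ath):
    """(# of normalized Amin(i) <= Ath) / (# of normalized Amin(i) > Ath)."""
    below = sum(1 for a, n in zip(amin_values, moduli) if a / n <= ath)
    above = len(amin_values) - below
    if above == 0:
        raise ValueError("no value exceeds Ath; choose a smaller reference value")
    return below / above


def f(c, ath):
    """Expected ratio alpha for collusion number c (continuous approximation)."""
    return (1.0 - ath) ** (-c) - 1.0


def f_inverse(alpha, ath):
    """Collusion number c recovered from the ratio alpha."""
    return math.log(1.0 + alpha) / -math.log(1.0 - ath)


alpha = f(16, ath=0.05)
print(alpha, f_inverse(alpha, ath=0.05))
```

In practice Ath would be chosen so that both counts are reasonably large for the collusion numbers of interest.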
[0129]
Although the description above estimates the value of the collusion number, the collusion number may instead be determined at a few coarse levels. For example, a configuration is possible that outputs information indicating that the collusion number is small (or within the allowable number) when the ratio α obtained for Amin(i) is less than or equal to a predetermined reference value, and information indicating that the collusion number is large (or exceeds the allowable number) when α exceeds that reference value.
[0130]
Variations other than those described so far are possible.
[0131]
(Second configuration example)
Next, the case of estimating the number of duplicates used in the collusion attack by using the result of the tracking algorithm processing unit 23, as exemplified in FIG. 4, is described (the second mode of the collusion number estimation unit). As described above, basically any collusion-resistant code and tracking algorithm can be applied. The case where no collusion attack has been made is treated as the number of duplicates used in the collusion attack being 1.
[0132]
FIG. 16 shows a configuration example of the collusion number estimation unit 22 in this case. As shown in FIG. 16, the collusion number estimation unit 22 includes: a weak ID / non-weak ID classification unit 241 that, when all or some of the colluders' user IDs are output from the tracking algorithm processing unit 23, classifies them into weak IDs (weak identification information) and non-weak IDs (non-weak identification information), both described later; a statistical processing unit 242 that obtains, from the classification result, a statistic based on the number of weak IDs and the number of non-weak IDs; and an estimated collusion number calculating unit 243 that obtains an estimated value c0 of the collusion number from the obtained statistic. Various variations are conceivable for the statistical processing performed by the statistical processing unit 242 and the estimated collusion number calculating unit 243.
[0133]
FIG. 17 shows an example of a schematic procedure.
[0134]
In this case, it is assumed that the digital watermark embedding apparatus 1 (the code generation unit 11 thereof) does not use the weak ID as the user ID.
[0135]
The following description focuses on the differences from the first configuration example.
[0136]
Here, the weak ID and the non-weak ID will be described.
[0137]
A weak ID is a user ID that, if used as a user ID, is more likely than others to be erroneously detected as a colluder's user ID even though it belongs to a user who has not taken part in a collusion attack (the name means an ID that is weak against false detection). A non-weak ID is a user ID that remains after the weak IDs are removed from the user ID candidates; only non-weak IDs are used as user IDs.
[0138]
The non-weak IDs are determined by using a predetermined determination algorithm to identify, according to some guideline, the user IDs that are more likely to be erroneously detected (for example, user IDs for which Amin = Amax is close to 0 or 1 in all or many of the component codes of the corresponding collusion-resistant code).
[0139]
Here, an example of a processing procedure for determining whether a given user ID (candidate) is a weak ID or a non-weak ID, for the case of the code generation unit 11 of FIG. 7, will be described with reference to the flowchart shown in FIG. 18.
[0140]
First, the target user IDs are input sequentially, one at a time (step S31), and the probability that the user ID is erroneously detected as a colluder ID (false detection probability) is estimated (step S32). The false detection probability is estimated, for example, by using the above-described p_i (= N(i)), k, k′ (= M), the maximum total number of colluders c, the total number of users n, and z, as follows. Here, k and k′ are such that, when any k primes are selected from the k′ primes p_1 (= N(1)), p_2 (= N(2)), ..., p_k′ (= N(k′)) prepared in the modulus storage units 121-1, 121-2, ..., 121-k′ in FIG. 2, the product of those k primes is n or more (for example, n ≦ N(1) × N(2) × ... × N(k)). z is a positive integer, for example one that satisfies k′ = c(k + z)/2.
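The two conditions on the parameters can be checked mechanically. The following is a minimal sketch with hypothetical helper names and made-up prime values: it tests that the product of any k of the k′ primes is at least n (checking the k smallest primes is sufficient, since they give the smallest product) and that k′ = c(k + z)/2 holds for a given z.

```python
from math import prod


def any_k_primes_cover_users(primes, k, n):
    """True if the product of ANY k of the given primes is at least n.

    The k smallest primes give the smallest possible product, so it is
    enough to check them.
    """
    smallest = sorted(primes)[:k]
    return len(smallest) == k and prod(smallest) >= n


def satisfies_k_prime_relation(k_prime, c, k, z):
    """True if z >= 1 and k' = c (k + z) / 2, tested in integer arithmetic."""
    return z >= 1 and 2 * k_prime == c * (k + z)


# Illustrative values only (not taken from the embodiment).
primes = [11, 13, 17, 19, 23, 29]  # k' = 6 moduli N(1)..N(6)
```

With these illustrative values, k = 2 supports up to n = 143 users (11 × 13), and c = 3 with z = 2 satisfies k′ = 3 × (2 + 2) / 2 = 6.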
[0141]
First, the following equation is defined.
[0142]
[Equation 4]

Next, the following equation is defined.
[0143]
[Equation 5]

[0144]
An evaluation value EEP represented by the following equation is calculated as an amount that generally represents the probability that a certain user ID (= u) is erroneously detected as a colluder ID.
[0145]
[Equation 6]

[0146]
Here, u_p = u mod p. Any other evaluation value that approximates the false detection probability for a given user ID can be used in place of the evaluation value EEP. For example, an evaluation value EEP represented by the following equation may be used.
[0147]
[Equation 7]

[0148]
Next, it is checked whether the false detection probability (for example, the EEP) estimated in step S32 exceeds a predetermined threshold (step S33). If it exceeds the threshold, the user ID (candidate) is determined to be a weak ID (step S34); if it is equal to or less than the threshold, the user ID (candidate) is determined to be a non-weak ID (step S35).
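The determination procedure of steps S31 to S35 can be sketched as follows. Because the formula for the evaluation value EEP is not reproduced in this text, the estimator is passed in as a hypothetical function `false_detection_estimate`; the function name, its signature, and the toy estimator in the usage note are assumptions made only for illustration.

```python
def classify_user_ids(candidates, false_detection_estimate, threshold):
    """Split candidate user IDs into weak and non-weak IDs (steps S31 to S35)."""
    weak, non_weak = [], []
    for u in candidates:                  # step S31: take candidates one by one
        p = false_detection_estimate(u)   # step S32: estimate false detection
        if p > threshold:                 # step S33: compare with the threshold
            weak.append(u)                # step S34: weak ID
        else:
            non_weak.append(u)            # step S35: non-weak ID
    return weak, non_weak
```

For example, with a toy estimator that returns 0.9 for multiples of 3 and 0.1 otherwise, `classify_user_ids(range(6), est, 0.5)` puts 0 and 3 in the weak list and the remaining candidates in the non-weak list. Only the non-weak list would then be handed to the code generation unit as assignable user IDs.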
[0149]
Now, as the collusion number c of a collusion attack increases, the number of weak IDs among the user IDs that the tracking algorithm estimates as colluders from the attacked content increases. Therefore, by evaluating the ratio β between the number of weak IDs and the number of non-weak IDs, the value of the collusion number c that produces that ratio β can be estimated.
[0150]
That is, as the collusion number c increases, the ratio β obtained by dividing the number of weak IDs by the number of non-weak IDs increases. Therefore, by obtaining in advance the function h(c) = β, which gives the ratio β for a given collusion number c, and its inverse function c = h^-1(β), the collusion number c can be estimated from the ratio β.
[0151]
First, when all or some of the colluders' user IDs are output from the tracking algorithm processing unit 23, the weak ID / non-weak ID classification unit 241 classifies each user ID as a weak ID or a non-weak ID (step S21). This determination may be made, for example, by storing a list of weak IDs and checking whether a given user ID matches an entry in the list: if it matches, it is determined to be a weak ID, and if not, a non-weak ID. Alternatively, if a procedure for determining whether a user ID is a weak ID or a non-weak ID is available, the determination may be made by that procedure.
[0152]
Next, the statistical processing unit 242 obtains the ratio β by dividing the number of weak IDs by the number of non-weak IDs, based on the classification result (step S22).
[0153]
Then, the estimated collusion number calculating unit 243 substitutes the ratio β into c = h^-1(β) above to estimate the collusion number c (step S23).
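Steps S21 to S23 can be sketched as follows, assuming the list-lookup form of classification and assuming that h has been tabulated in advance for a few collusion numbers. The table values below are invented purely for illustration; the text does not give h, and a real table would come from analysis or simulation of the specific code and tracking algorithm. All names are hypothetical.

```python
def ratio_beta(detected_ids, weak_ids):
    """Steps S21 and S22: classify by list lookup, then weak / non-weak."""
    weak = sum(1 for u in detected_ids if u in weak_ids)
    non_weak = len(detected_ids) - weak
    if non_weak == 0:
        raise ValueError("no non-weak ID was output; beta is undefined")
    return weak / non_weak


def estimate_c_from_beta(beta, h_table):
    """Step S23: invert h by picking the tabulated c whose h(c) is closest."""
    return min(h_table, key=lambda c: abs(h_table[c] - beta))


# Made-up table of h(c) = beta, increasing in c (illustration only).
H_TABLE = {1: 0.0, 2: 0.05, 4: 0.2, 8: 0.6}
```

For example, if the tracking algorithm outputs the IDs 3, 7, 8, 12, and 20 and the stored weak ID list contains 7 and 20, then β = 2/3, and the nearest entry of this illustrative table gives c = 8. Interpolating between table entries would be a natural refinement.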
[0154]
In the above description, the value of the collusion number is estimated, but the collusion number may instead be obtained as one of several levels. For example, a function may be used that outputs information indicating that the collusion number is small (or at or below an allowable number) when the obtained ratio β is less than or equal to a predetermined reference value, and information indicating that the collusion number is large (or exceeds the allowable number) when the ratio exceeds the reference value.
[0155]
In the above, the code corresponding to the user ID is embedded in the copy of the content. Instead, the correspondence between the copy ID of each copy and information specifying the user (for example, the user name or user ID) may be stored, and the code corresponding to the copy ID may be embedded in the copy of the content.
[0156]
Below, the hardware configuration and software configuration of this embodiment will be described.
[0157]
The digital watermark analysis apparatus according to the present embodiment can be realized not only as hardware but also as software (a program for causing a computer to execute predetermined means, for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions). When the digital watermark analysis apparatus is realized by software, the program can be delivered on a recording medium or via a communication medium. Of course, the same applies to the digital watermark embedding apparatus.
Further, when the digital watermark embedding device or the digital watermark analysis device is configured as hardware, it can be formed as a semiconductor device.
Also, when configuring a digital watermark analysis apparatus to which the present invention is applied, or when creating a digital watermark analysis program, blocks or modules having the same configuration may all be created individually; alternatively, one or an appropriate number of such blocks or modules may be prepared and shared (reused) by each part of the algorithm. The same applies when configuring a digital watermark embedding apparatus or creating a digital watermark embedding program. When configuring a system that includes a digital watermark embedding apparatus and a digital watermark analysis apparatus, or creating a system that includes a digital watermark embedding program and a digital watermark detection program, one or an appropriate number of blocks or modules having the same configuration may be prepared and shared (reused) across the digital watermark embedding apparatus (or program) and the digital watermark analysis apparatus (or program).
[0158]
Further, when the digital watermark embedding device or the digital watermark analysis device is configured by software, it is possible to speed up the processing by using a multiprocessor and performing parallel processing.
[0159]
Incidentally, watermarking technology is applicable not only to digital data but also to any information or substance whose identity, homogeneity, economic value, and so on do not change even when part of its content is altered. The present invention can likewise be applied to such information or substances.
[0160]
For example, the embedded code generation means and detection means used in the collusion-attack-resistant digital watermark embedding apparatus and digital watermark analysis apparatus of the present invention can also be applied to tracing the origin of compounds or chemical substances that are synthesized chemically or industrially, or produced biologically in a controlled environment. Among compounds, DNA, RNA, proteins, and other high molecular compounds have a great deal of redundancy in which a code can be embedded.
[0161]
In the following, a case is described in which the present invention is applied as a watermark technique that provides a means for embedding individual identification information (user ID, manufacturer ID, seller ID, transaction ID, a combination of these, and so on) in each replica of a compound and for identifying its origin.
[0162]
A compound is composed of a plurality of constituents such as atoms, molecules, and groups. For example, DNA or RNA has a predetermined amino acid sequence structure, and information can be regarded as being expressed by whether or not a given position is replaced with another amino acid. Within such a structure, just as changing some of the data of a digital content may leave the identity or economic value of the work unchanged, changing part of the composition of a compound may leave unchanged the properties and functions relevant to its intended purpose, such as its action, side effects, and utility (its economic value, from another viewpoint).
[0163]
Through such permitted changes, information that uniquely identifies each replica can be embedded.
[0164]
When the digital watermark of the present invention is applied to a compound, the watermark embedding apparatus for the compound is obtained by replacing the component that changes the bits of a predetermined portion of the digital content, in the watermark embedding apparatus for digital content, with a device that changes the composition of a predetermined portion of the compound. Likewise, the watermark analysis apparatus for the compound is obtained by replacing the component that reads the bit values of a predetermined portion of the digital content in order to detect the watermark information, in the watermark analysis apparatus for digital content, with a device that analyzes the composition of a predetermined portion of the compound in order to detect the watermark information. That is, the principle is the same as that of the watermarking technique for digital content; only the device that interfaces with the compound differs.
[0165]
FIG. 19 shows a configuration example of a watermark embedding apparatus for a compound.
[0166]
The code generation unit 1001 receives the identification information to be embedded in the compound and generates a collusion-resistant code.
[0167]
The specific site structure conversion units 1002 to 1004 convert the structure of the compound according to the value of each bit or each set of bits of the collusion-resistant code. The specific site structure conversion unit 1002 processes specific site 1 of the original compound, the specific site structure conversion unit 1003 processes specific site 2 of the compound whose specific site 1 has been processed, and the specific site structure conversion unit 1004 processes specific site 3 of the compound whose specific sites 1 and 2 have been processed, thereby producing the desired code-embedded compound. Although three structure conversion units are shown in FIG. 19, the number is of course not limited to three.
[0168]
Here, conversion of the structure of a compound means converting it into a compound having a different structure without impairing the properties or functions suited to the intended use of the compound and without causing new adverse effects or side effects. Alternatively, when the compound is not a pure compound but a mixture, it may mean changing its composition.
[0169]
FIG. 20 shows another configuration example of the watermark embedding apparatus for a compound.
[0170]
In the configuration example of FIG. 19, the structure of an already synthesized compound is converted afterward, whereas in the configuration example of FIG. 20, the code is embedded when the compound is synthesized.
[0171]
The code generation unit 1011 receives the identification information to be embedded in the compound and generates a collusion-resistant code.
[0172]
In this case, for each synthetic material, variants corresponding to the values of each bit or each set of bits of the collusion-resistant code are prepared, and each of the synthetic material units 1012 to 1014 selects, for each bit or each set of bits of the collusion-resistant code, the synthetic material of the compound corresponding to its value. Although three synthetic material units are shown in FIG. 20, the number is of course not limited to three.
[0173]
The synthesizing unit 1015 synthesizes the synthetic materials selected by the synthetic material units 1012 to 1014 to generate the desired code-embedded compound.
[0174]
A collusion attack on a compound is basically the same as a collusion attack on digital content. For example, it is carried out by comparing the structures of a plurality of compounds in which different identification information (for example, user ID, manufacturer ID, and so on) is embedded, and modifying the structure of the portions that differ.
[0175]
FIG. 21 shows a configuration example of a watermark analysis apparatus for compounds.
[0176]
The specific site structure reading units 1201 to 1203 correspond to the specific site structure conversion units 1002 to 1004 in FIG. 19 or the synthetic material units 1012 to 1014 in FIG. 20; each reads the structure of its specific site in the compound and outputs it as information in the form of a bit or a set of bits.
[0177]
The code decoding unit 1204 reproduces the code word to be tracked from these bits and estimates the collusion number. This is similar to the function of the digital watermark analysis apparatus 2 for digital content, which reproduces the code word to be tracked from the bits and estimates the collusion number.
[0178]
Of course, the watermark analysis apparatus for a compound may also have a tracking algorithm function as necessary.
[0179]
In addition, the watermark analysis apparatus for a compound may be configured without the collusion number estimation function and with only a tracking algorithm function.
[0180]
Here, examples of usable techniques for the structure conversion means and the structure reading means of the compound used in the present invention will be described. Hereinafter, the case of DNA will be described as an example.
[0181]
Obtaining the base sequence of DNA is called sequencing. Known sequencing methods include the shotgun method, the primer walking method, and the nested deletion method. These are all gene cloning methods. Various proposals have been made for the reagents, instruments, and apparatuses used in sequencing. They are disclosed, for example, in "Cloning and Sequence", supervised by Satoshi Watanabe and edited by Masahiro Sugiura, Rural Bunkasha (1989), and in "Genome Science", edited by Yoshiyuki Tsuji, Kyoritsu Publishing (1999).
[0182]
Similarly, in the case of DNA, the structure can be converted by the gene transfer methods used for introducing a new gene. Known gene transfer methods include chemical methods such as the calcium phosphate precipitation method, the dextran method, and the lipofection method, as well as electroporation and microinjection. They are disclosed, for example, in Nobuyuki Haga, "Molecular Cell Engineering", Corona (2000).
[0183]
Note that the configurations illustrated in the embodiment of the present invention are examples and are not intended to exclude other configurations. Other configurations are also possible, obtained by replacing part of an illustrated configuration with something else, omitting part of it, adding another function to it, or combining these. Also possible are configurations that are logically equivalent to an illustrated configuration, configurations that include a portion logically equivalent to an illustrated configuration, and configurations that are logically equivalent to the main part of an illustrated configuration. Further possible are configurations that achieve the same or a similar purpose as an illustrated configuration, and configurations that achieve the same or a similar effect.
The variations of the various components can be implemented in appropriate combinations.
Further, the embodiment of the present invention includes, or inherently contains, inventions from various viewpoints, stages, concepts, and categories, such as an invention as an individual apparatus, an invention as a whole system, an invention concerning a component within an individual apparatus, and an invention of a method corresponding to these.
Therefore, inventions can be extracted from the contents disclosed in the embodiment of the present invention without being limited to the illustrated configurations.
[0184]
The present invention is not limited to the above-described embodiment, and can be implemented with various modifications within the technical scope thereof.
[0185]
[Effect of the invention]
According to the present invention, the number of copies of digital content used in a collusion attack can be estimated by applying a statistical estimation technique to a code detected from a copy of digital content in which a collusion-resistant code is embedded.
[Brief description of the drawings]
FIG. 1 is a diagram showing a schematic configuration of a content distribution system including a digital watermark embedding device and a digital watermark analysis device according to an embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of a digital watermark embedding device according to the embodiment;
FIG. 3 is a view showing a configuration example of a digital watermark analysis apparatus according to the embodiment;
FIG. 4 is a view showing another configuration example of the digital watermark analysis apparatus according to the embodiment;
FIG. 5 is a view showing still another configuration example of the digital watermark analysis apparatus according to the embodiment;
FIG. 6 is a flowchart showing an example of a schematic procedure of the digital watermark embedding apparatus according to the embodiment;
FIG. 7 is a diagram showing a configuration example of a code generation unit of the digital watermark embedding apparatus according to the embodiment;
FIG. 8 is a diagram illustrating a configuration example of a component code generation unit of the code generation unit in FIG. 7;
FIG. 9 is a diagram for explaining an example of component codes generated by the digital watermark embedding apparatus according to the embodiment;
FIG. 10 is a view for explaining an example of a plurality of integer sets corresponding to each user ID in the embodiment;
FIG. 11 is a diagram for explaining an example of a collusion-resistant code corresponding to each user ID in the embodiment;
FIG. 12 is a view for explaining a boundary position regarding a bit pattern in each component code in the embodiment;
FIG. 13 is a diagram showing a configuration example of a collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment;
FIG. 14 is a flowchart showing an example of a schematic procedure of a collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment;
FIG. 15 is a diagram for explaining the relationship between the number of duplicates used in a collusion attack and the position of a boundary relating to a bit pattern detected in each component code of a falsified collusion-resistant code;
FIG. 16 is a diagram showing another configuration example of the collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment;
FIG. 17 is a flowchart showing another example of the schematic procedure of the collusion number estimation unit of the digital watermark analysis apparatus according to the embodiment;
FIG. 18 is a flowchart showing an example of a procedure for determining whether a user ID is a weak ID or a non-weak ID in the embodiment;
FIG. 19 is a diagram showing a configuration example of a watermark embedding device for a compound according to the embodiment;
FIG. 20 is a view showing another configuration example of the watermark embedding apparatus for the compound according to the embodiment;
FIG. 21 is a view showing another configuration example of the watermark analysis apparatus for the compound according to the embodiment;
[Explanation of symbols]
1 ... Digital watermark embedding device
2 ... Digital watermark analysis device
3 ... Distribution channel
11 ... Code generation unit
12 ... Code embedding unit
21 ... Code extraction unit
22 ... Collusion number estimation unit
23 ... Tracking algorithm processing unit
121-1 to 121-k′ ... Modulus storage unit
122-1 to 122-k′ ... Remainder calculation unit
123 ... Code parameter storage unit
124-1 to 124-k′ ... Component code generation unit
125 ... Code concatenation unit
131 ... Subtraction unit
132 ... "0" string generation unit
133 ... "1" string generation unit
134 ... Concatenation unit
221 ... Boundary detection unit
222 ... Statistical processing unit
223 ... Estimated collusion number calculating unit
241 ... Weak ID / non-weak ID classification unit
242 ... Statistical processing unit
243 ... Estimated collusion number calculating unit
1001, 1011 ... Code generation unit
1002 to 1004 ... Specific site structure conversion unit
1012 to 1014 ... Synthetic material unit
1015 ... Synthesizing unit
1201 to 1203 ... Specific site structure reading unit
1204 ... Code decoding unit

Claims

A digital watermarking system that estimates the number of copies of digital content used in a collusion attack,
Prior to passing a copy of the digital content to the user, a plurality of integers are assigned to the identification information for identifying the user corresponding to the copy according to a predetermined method,
Generating a plurality of component codes corresponding to each of the assigned integers;
Generating a collusion-resistant code to be embedded by concatenating the generated component codes;
A first step of embedding the generated collusion-resistant code in the replica;
From the copy of the digital content to be analyzed, a code embedded in the copy as the collusion-resistant code is detected,
For each of a plurality of component codes constituting the detected code, position information related to the position of the tampered portion of the component code is detected,
Based on the plurality of pieces of position information detected for the respective component codes, a predetermined statistic related to the position of the tampered portion is obtained,
And a second step of estimating the number of duplicates used in a collusion attack on the digital content based on the obtained predetermined statistic related to the position of the tampered portion.

A digital watermark system capable of estimating the number of copies of digital content used in a collusion attack,
When assigning an identifier that identifies a user corresponding to a copy of a digital content, an identifier is assigned, from among identifier candidates belonging to a predetermined range of non-negative integers, that has been determined not to be a weak identifier, a weak identifier being one that is more likely to be erroneously detected by a predetermined tracking algorithm as an identifier corresponding to a copy used in a collusion attack, and, prior to passing the copy of the digital content to the user, a plurality of integers are assigned to the identifier that identifies the user corresponding to the copy, according to a predetermined method based on the value of the identifier,
Generating a plurality of component codes corresponding to each of the assigned integers;
Generating a collusion-resistant code to be embedded by concatenating the generated component codes;
A first step of embedding the generated collusion-resistant code in the replica;
From the copy of the digital content to be analyzed, a code embedded in the copy as the collusion-resistant code is detected,
Applying the predetermined tracking algorithm to the detected code to determine the identifier identifying the user corresponding to the duplicate used in the collusion attack;
Classifying the determined identifiers into weak identifiers and other non-weak identifiers;
Based on the classification result of the weak identifiers and the non-weak identifiers, a predetermined statistic regarding the weak identifiers and the non-weak identifiers is obtained,
And a second step of estimating the number of duplicates used in a collusion attack on the digital content based on the obtained predetermined statistic regarding the weak identifiers and the non-weak identifiers.

A digital watermark analysis device that estimates the number of copies of digital content used in a collusion attack,
Means for detecting a code embedded in the copy as a collusion-resistant code from a copy of the digital content to be analyzed;
Means for detecting position information related to the position of the tampered part of the component code for each of the plurality of component codes constituting the detected code;
Means for obtaining a predetermined statistic related to the position of the tampered portion based on a plurality of the position information detected for each of the plurality of component codes;
An electronic watermark analysis apparatus comprising: means for estimating the number of duplicates used in a collusion attack on the digital content based on a predetermined statistic related to the obtained position of the falsified portion.

4. The digital watermark analysis apparatus according to claim 3, wherein, before the copy of the digital content is handed to the user, the following processes have been performed on the copy:
A process of assigning a plurality of integers, according to a predetermined method, to identification information for identifying the user corresponding to the copy,
A process of generating a plurality of component codes corresponding to each of the assigned integers,
A process of generating a collusion-resistant code to be embedded by concatenating the generated component codes,
And a process of embedding the generated collusion-resistant code.

5. The digital watermark analysis apparatus according to claim 3, wherein each of the plurality of component codes constituting the collusion-resistant code embedded in the duplicate is a code in which bit strings, each consisting of only 1s or only 0s with a predetermined number of consecutive bits as one unit, are concatenated in a number obtained by subtracting 1 from the integer corresponding to the component code, and which consists of only 0s, consists of only 1s, or has 0 and 1 adjacent at only one position in the bit string of the component code, that position corresponding to the integer value corresponding to the component code.

6. The digital watermark analysis apparatus according to claim 5, wherein, for each of the plurality of component codes constituting the code detected from the duplicate, when a bit string in which 0s and 1s are mixed is detected among the bit strings, each having the predetermined number of bits as one unit, that constitute the component code, that bit string is determined to have been altered by a collusion attack,
And information capable of specifying both end portions of the range of the altered bit strings is detected as position information related to the position of the falsified portion.

The digital watermark analysis apparatus according to claim 6, wherein, for each of the plurality of component codes constituting the code detected from the duplicate, with respect to a component code determined to have no falsified portion, the following is detected as position information related to the position of the falsified portion: if the component code consists of only 0s, a predetermined one of the two end portions of the entire bit string of the component code; if the component code consists of only 1s, the predetermined other of the two end portions of the entire bit string of the component code; and if the component code is a concatenation of a plurality of consecutive 0s and a plurality of consecutive 1s, the boundary portion between the consecutive 0s and the consecutive 1s.

The digital watermark analysis apparatus according to claim 3, wherein:
For each of a plurality of component codes constituting the detected code, one or both of the most significant bit side position and the least significant bit side position of the falsified portion of the component code are obtained,
One or both of a first average of values obtained by normalizing each of the obtained most significant bit side positions and a second average of values obtained by normalizing each of the obtained least significant bit side positions are obtained,
And one or both of the obtained first average and second average are input to a predetermined function, whereby an estimated value of the number of replicas used in the collusion attack is obtained as the output of the predetermined function.

The digital watermark analysis apparatus according to claim 8, wherein the predetermined function is based on the relationship between each number of replicas used in the collusion attack, as that number is varied, and the value of the first average and/or the value of the second average that is probabilistically expected for that number, under the assumption that the value of each of the plurality of integers, assigned according to a predetermined method to the identification information that identifies the user corresponding to the replica, is randomly distributed over all the identification information.

The digital watermark analysis apparatus according to claim 3, wherein, for each of a plurality of component codes constituting the detected code, one or both of the most significant bit side position and the least significant bit side position of the falsified portion of the component code are obtained,
One or both of a first ratio, between the number of the obtained most significant bit side positions that lie on the upper-bit side of a predetermined reference position and the number that lie on the lower-bit side of the reference position, and a second ratio, between the number of the obtained least significant bit side positions that lie on the upper-bit side of the reference position and the number that lie on the lower-bit side of the reference position, are obtained, and
An estimated value of the number of replicas used in the collusion attack is obtained as the output of a predetermined function by inputting one or both of the first ratio and the second ratio into the predetermined function.
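
The ratio-based variant can be sketched in the same spirit. The example assumes, as a simplification not stated in the claims, that the normalized integers of c colluders are independent and uniform on [0, 1], so that the edge nearer 0 (the smallest colluder value) lies at or above a normalized reference r with probability (1 - r)^c. The function name, the use of two counts instead of a precomputed ratio, and the default reference of 0.5 are all hypothetical.

```python
import math


def estimate_from_first_ratio(n_upper: int, n_lower: int, reference: float = 0.5) -> float:
    """Estimate the number of colluders from counts of low-edge positions.

    n_upper: component codes whose low edge is at or above the reference.
    n_lower: component codes whose low edge is below the reference.
    Inverts P(min >= r) = (1 - r) ** c under the uniform approximation.
    """
    if n_upper <= 0 or n_lower <= 0:
        raise ValueError("both counts must be positive for this estimate")
    if not 0.0 < reference < 1.0:
        raise ValueError("reference must lie strictly between 0 and 1")
    fraction_upper = n_upper / (n_upper + n_lower)
    return math.log(fraction_upper) / math.log(1.0 - reference)
```

With the reference at the midpoint, observing one position above it for every seven below corresponds to an estimate of about three colluders under this model.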

The digital watermark analysis apparatus according to claim 8, wherein the predetermined function is based on the relationship between the number of replicas used in the collusion attack and the value of the first ratio and/or the value of the second ratio stochastically expected for that number, under the assumption that the value of each of the plurality of integers, assigned according to a predetermined method to the identification information that identifies the user corresponding to the replica, is randomly distributed over all the identification information.

A digital watermark analysis apparatus that estimates the number of copies of digital content used in a collusion attack, comprising:
Means for detecting, from a copy of the digital content to be analyzed, a code embedded in the copy as a collusion-resistant code;
Means for applying a predetermined tracking algorithm to the detected code to determine the identifiers identifying the users corresponding to the duplicates used in the collusion attack;
Means for classifying the determined identifiers into weak identifiers and other, non-weak identifiers;
Means for obtaining a predetermined statistic relating to the weak identifiers and the non-weak identifiers based on the result of the classification; and
Means for estimating the number of duplicates used in the collusion attack on the digital content based on the obtained predetermined statistic relating to the weak identifiers and the non-weak identifiers.

The digital watermark analysis apparatus according to claim 12, wherein the copy of the digital content is one on which, before the copy was delivered to the user, the following processes were performed:
A process of assigning, as the identifier identifying the user corresponding to the copy, an identifier that is selected from among identifier candidates belonging to a predetermined non-negative integer range and is determined not to be a weak identifier, a weak identifier being one that is more likely to be falsely detected by the predetermined tracking algorithm as an identifier corresponding to a duplicate used in the collusion attack;
A process of assigning to the identifier a plurality of integers according to a predetermined method based on the value of the identifier;
A process of generating a plurality of component codes corresponding to each of the assigned integers;
A process of generating the collusion-resistant code to be embedded by concatenating the generated component codes; and
A process of embedding the generated collusion-resistant code in the copy.
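
The code generation steps recited above (assign several integers to an identifier, generate one component code per integer, concatenate) can be sketched as follows. The concrete choices are assumptions made for illustration and are not fixed by the claims: the integers are taken to be residues of the identifier modulo pairwise coprime moduli, and each component code follows the block layout of Boneh-Shaw style codes, with `block_len * residue` zeros followed by ones. All names are hypothetical.

```python
from itertools import combinations
from math import gcd


def component_code(residue: int, modulus: int, block_len: int) -> str:
    """Component code for one integer: (modulus - 1) blocks of block_len bits.

    The first `residue` blocks are all 0 and the remaining blocks are all 1
    (assumed layout).
    """
    if not 0 <= residue < modulus:
        raise ValueError("residue must satisfy 0 <= residue < modulus")
    return "0" * (block_len * residue) + "1" * (block_len * (modulus - 1 - residue))


def collusion_resistant_code(identifier: int, moduli, block_len: int) -> str:
    """Concatenate the component codes of identifier % p for each modulus p."""
    moduli = list(moduli)
    # Pairwise coprime moduli let the identifier be recovered from its residues
    # as long as it is smaller than the product of the moduli.
    if any(gcd(a, b) != 1 for a, b in combinations(moduli, 2)):
        raise ValueError("moduli must be pairwise coprime")
    return "".join(component_code(identifier % p, p, block_len) for p in moduli)
```

With moduli 3 and 5 and two-bit blocks, identifier 7 has residues 1 and 2, so the two component codes are `0011` and `00001111`.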

Based on the classification result of the weak identifiers and the non-weak identifiers, a ratio between the number of identifiers classified as weak identifiers and the number of identifiers classified as non-weak identifiers is obtained,
And the estimated value of the number of duplicates used in the collusion attack is obtained as the output of a predetermined function by inputting the obtained ratio into the predetermined function. The digital watermark analysis apparatus according to any one of 7 above.

The digital watermark analysis apparatus according to claim 14, wherein the predetermined function is based on the relationship, obtained by varying the number of replicas used in the collusion attack, between each such number and the value of the ratio stochastically expected for that number.
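
One way to realize a function of this kind is a lookup against a table of expected ratios, sketched below. The table itself must be supplied by the caller, for example from simulation of the tracking algorithm; the claims do not give its values, and every name here (`weak_ratio`, `estimate_from_ratio`, `is_weak`) is hypothetical.

```python
def weak_ratio(detected_ids, is_weak) -> float:
    """Ratio of detected weak identifiers to detected non-weak identifiers.

    `is_weak` is a caller-supplied predicate that decides whether an
    identifier is a weak identifier.
    """
    detected_ids = list(detected_ids)
    weak = sum(1 for ident in detected_ids if is_weak(ident))
    non_weak = len(detected_ids) - weak
    if non_weak == 0:
        raise ValueError("no non-weak identifiers detected; ratio undefined")
    return weak / non_weak


def estimate_from_ratio(ratio: float, expected_ratio_by_count: dict) -> int:
    """Return the colluder count whose expected ratio is closest to `ratio`.

    `expected_ratio_by_count` maps a candidate number of colluders to the
    ratio stochastically expected for it (caller-supplied calibration data).
    """
    if not expected_ratio_by_count:
        raise ValueError("calibration table must not be empty")
    return min(expected_ratio_by_count,
               key=lambda c: abs(expected_ratio_by_count[c] - ratio))
```

Returning the nearest table entry also covers the case where only a coarse magnitude level is wanted: the table can simply hold one entry per level.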

The digital watermark analysis apparatus according to claim 14, wherein, instead of an estimated value of the number of duplicates used in the collusion attack, information indicating the level of magnitude of the number of duplicates used in the collusion attack is obtained.

A digital watermark analysis method for estimating the number of copies of digital content used in a collusion attack,
From the copy of the digital content subject to analysis, the code embedded in the copy as a collusion-resistant code is detected,
For each of a plurality of component codes constituting the detected code, position information related to the position of the tampered portion of the component code is detected,
Based on a plurality of the position information detected for each of the plurality of component codes, a predetermined statistic related to the position of the tampered portion is obtained,
A digital watermark analysis method, wherein the number of duplicates used in a collusion attack on the digital content is estimated based on a predetermined statistic related to the obtained position of the falsified portion.

A digital watermark analysis method for estimating the number of copies of digital content used in a collusion attack,
From the copy of the digital content to be analyzed, a code embedded in the copy as a collusion-resistant code is detected,
Applying a predetermined tracking algorithm to the detected code to determine the identifiers identifying the users corresponding to the duplicates used in the collusion attack;
Classifying the determined identifiers into weak identifiers and other non-weak identifiers;
Based on the classification result of the weak identifier and the non-weak identifier, a predetermined statistic regarding the weak identifier and the non-weak identifier is obtained,
A digital watermark analysis method characterized in that the number of duplicates used in a collusion attack on the digital content is estimated based on a predetermined statistic regarding the obtained weak identifier and non-weak identifier.

A computer-readable recording medium recording a program for causing a computer to function as a digital watermark analysis apparatus that estimates the number of copies of digital content used in a collusion attack,
A function for detecting a code embedded as a collusion-resistant code in a copy of the digital content to be analyzed;
For each of a plurality of component codes constituting the detected code, a function for detecting position information related to the position of the falsified portion of the component code;
A function for obtaining a predetermined statistic related to the position of the tampered portion based on a plurality of the position information detected for each of the plurality of component codes;
A function for estimating the number of duplicates used in a collusion attack on the digital content based on the obtained predetermined statistic related to the position of the tampered portion; the recording medium being a computer-readable recording medium on which a program for realizing these functions is recorded.

A computer-readable recording medium recording a program for causing a computer to function as a digital watermark analysis apparatus that estimates the number of copies of digital content used in a collusion attack,
A function for detecting, from a copy of the digital content to be analyzed, a code embedded in the copy as a collusion-resistant code;
A function for determining the identifiers identifying the users corresponding to the duplicates used in the collusion attack by applying a predetermined tracking algorithm to the detected code, and a function for classifying the determined identifiers into weak identifiers and other, non-weak identifiers;
A function for obtaining, based on the classification result of the weak identifiers and the non-weak identifiers, a predetermined statistic regarding the weak identifiers and the non-weak identifiers;
A function for estimating the number of duplicates used in a collusion attack on the digital content based on the obtained predetermined statistic regarding the weak identifiers and the non-weak identifiers; the recording medium being a computer-readable recording medium on which a program for realizing these functions is recorded.

A program for causing a computer to function as a digital watermark analysis apparatus that estimates the number of copies of digital content used in a collusion attack,
A function for detecting a code embedded as a collusion-resistant code in a copy of the digital content to be analyzed;
For each of a plurality of component codes constituting the detected code, a function for detecting position information related to the position of the falsified portion of the component code;
A function for obtaining a predetermined statistic related to the position of the tampered portion based on a plurality of the position information detected for each of the plurality of component codes;
A program for realizing a function for estimating the number of duplicates used in a collusion attack on the digital content, based on a predetermined statistic related to the obtained position of the falsified portion.

A program for causing a computer to function as a digital watermark analysis apparatus that estimates the number of copies of digital content used in a collusion attack,
A function for detecting, from a copy of the digital content to be analyzed, a code embedded in the copy as a collusion-resistant code;
A function for determining the identifiers identifying the users corresponding to the duplicates used in the collusion attack by applying a predetermined tracking algorithm to the detected code, and a function for classifying the determined identifiers into weak identifiers and other, non-weak identifiers;
Based on the classification result of the weak identifier and the non-weak identifier, a function for obtaining a predetermined statistic regarding the weak identifier and the non-weak identifier;
A program for realizing a function for estimating the number of duplicates used in a collusion attack on the digital content based on a predetermined statistic regarding the obtained weak identifier and non-weak identifier.

A chemical watermarking system that tracks, among chemical products of the same kind in which mutually different identification information is embedded, the chemical products used in a collusion attack, the system performing:
A first step of assigning a plurality of integers, according to a predetermined method, to the identification information to be embedded in a target chemical product,
Generating a plurality of component codes corresponding to each of the assigned integers,
Generating a collusion-resistant code to be embedded by concatenating the generated component codes, and
Embedding the generated collusion-resistant code in the chemical product; and
A second step of detecting, from a chemical product to be analyzed, the code embedded in that chemical product as the collusion-resistant code, and
Applying a predetermined tracking algorithm to the detected code to obtain the identification information corresponding to the chemical products used in the collusion attack.

A chemical watermarking system that tracks, among chemical products of the same kind in which mutually different identification information is embedded, the chemical products used in a collusion attack, the system performing:
A first step of assigning, as the identifier to be embedded in a target chemical product, an identifier that is selected from among identifier candidates belonging to a predetermined non-negative integer range and is determined not to be a weak identifier, a weak identifier being one that is more likely to be falsely detected by a predetermined tracking algorithm as an identifier corresponding to a chemical product used in the collusion attack,
Assigning to the identifier to be embedded in the chemical product a plurality of integers according to a predetermined method based on the value of the identifier,
Generating a plurality of component codes corresponding to each of the assigned integers,
Generating a collusion-resistant code to be embedded by concatenating the generated component codes, and
Embedding the generated collusion-resistant code in the chemical product; and
A second step of detecting, from a chemical product to be analyzed, the code embedded in that chemical product as the collusion-resistant code, and
Applying the predetermined tracking algorithm to the detected code to obtain the identifiers corresponding to the chemical products used in the collusion attack.

A chemical watermarking system that estimates the number of chemical products used in a collusion attack among chemical products of the same kind in which mutually different identification information is embedded, the system performing:
A first step of assigning a plurality of integers, according to a predetermined method, to the identification information to be embedded in a target chemical product,
Generating a plurality of component codes corresponding to each of the assigned integers,
Generating a collusion-resistant code to be embedded by concatenating the generated component codes, and
Embedding the generated collusion-resistant code in the chemical product; and
A second step of detecting, from a chemical product to be analyzed, the code embedded in that chemical product as the collusion-resistant code,
Detecting, for each of a plurality of component codes constituting the detected code, position information related to the position of the tampered portion of the component code,
Obtaining, based on the plurality of pieces of position information detected for the respective component codes, a predetermined statistic related to the position of the tampered portion, and
Estimating the number of chemical products used in the collusion attack based on the obtained predetermined statistic related to the position of the tampered portion.

A chemical watermarking system that estimates the number of chemical products used in a collusion attack among chemical products of the same kind in which mutually different identification information is embedded, the system performing:
A first step of assigning, as the identifier to be embedded in a target chemical product, an identifier that is selected from among identifier candidates belonging to a predetermined non-negative integer range and is determined not to be a weak identifier, a weak identifier being one that is more likely to be falsely detected by a predetermined tracking algorithm as an identifier corresponding to a chemical product used in the collusion attack,
Assigning to the identifier to be embedded in the chemical product a plurality of integers according to a predetermined method based on the value of the identifier,
Generating a plurality of component codes corresponding to each of the assigned integers,
Generating a collusion-resistant code to be embedded by concatenating the generated component codes, and
Embedding the generated collusion-resistant code in the chemical product; and
A second step of detecting, from a chemical product to be analyzed, the code embedded in that chemical product as the collusion-resistant code,
Applying the predetermined tracking algorithm to the detected code to determine the identifiers corresponding to the chemical products used in the collusion attack,
Classifying the determined identifiers into weak identifiers and other, non-weak identifiers,
Obtaining, based on the classification result of the weak identifiers and the non-weak identifiers, a predetermined statistic regarding the weak identifiers and the non-weak identifiers, and
Estimating the number of chemical products used in the collusion attack based on the obtained predetermined statistic regarding the weak identifiers and the non-weak identifiers.

The chemical watermarking system according to any one of claims 23 to 26, wherein the collusion-resistant code is embedded in the chemical product by converting, based on the value of the collusion-resistant code, the structure of a specific part that is used for embedding a watermark among the structures of the chemical product.

The chemical watermarking system according to any one of claims 23 to 26, wherein, when the chemical product is formed by synthesizing a plurality of synthetic materials, the collusion-resistant code is embedded in the chemical product by preparing in advance, for each synthetic material, individual synthetic materials in which the structure of the corresponding specific part has been converted for each value of the collusion-resistant code, selecting the corresponding individual synthetic materials based on the value of the collusion-resistant code, and synthesizing the selected synthetic materials.

The detection of the collusion-resistant code from the chemical product is performed by analyzing the structure of a specific part that is used for embedding a watermark among the structures of the chemical product. The chemical watermark system according to any one of 26.

The chemical watermarking system according to any one of claims 27 to 29, wherein the content of the collusion-resistant code is expressed by the type of amino acid, after substitution, at a specific site in the amino acid sequence structure of the chemical product.
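
The amino acid substitution scheme can be pictured with a small sketch in which each specific site carries one bit of the collusion-resistant code. Everything concrete here is a hypothetical illustration and not taken from the claims: the one-letter sequence, the chosen sites, the pair of alternative residues per site, and the function names.

```python
def embed_bits(sequence: str, sites, variants, bits: str) -> str:
    """Write one code bit per site by choosing between two alternative residues.

    sites:    zero-based positions in the amino acid sequence (hypothetical).
    variants: per site, a pair (residue_for_0, residue_for_1) (hypothetical).
    bits:     the portion of the collusion-resistant code to embed.
    """
    if not (len(sites) == len(variants) == len(bits)):
        raise ValueError("sites, variants and bits must have equal length")
    residues = list(sequence)
    for site, (for_zero, for_one), bit in zip(sites, variants, bits):
        residues[site] = for_one if bit == "1" else for_zero
    return "".join(residues)


def detect_bits(sequence: str, sites, variants) -> str:
    """Read the code back by inspecting the residue found at each site.

    A residue matching neither alternative is reported as '?'.
    """
    out = []
    for site, (for_zero, for_one) in zip(sites, variants):
        residue = sequence[site]
        out.append("0" if residue == for_zero else "1" if residue == for_one else "?")
    return "".join(out)
```

Detection in this sketch only inspects the designated sites, which mirrors the idea of analyzing the structure of the specific part used for embedding.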