JP2013057825A

JP2013057825A - Electronic watermark detection device and electronic watermark detection method

Info

Publication number: JP2013057825A
Application number: JP2011196449A
Authority: JP
Inventors: Masashi Unoki (鵜木祐史); Ryota Miyauchi (宮内良太)
Original assignee: Japan Advanced Institute of Science and Technology
Current assignee: Japan Advanced Institute of Science and Technology
Priority date: 2011-09-08
Filing date: 2011-09-08
Publication date: 2013-03-28
Anticipated expiration: 2031-09-08
Also published as: JP5879075B2

Abstract

PROBLEM TO BE SOLVED: To provide an electronic watermark detection device and an electronic watermark detection method capable of detecting electronic watermark data without referring to the original signal.

SOLUTION: The electronic watermark detection device comprises a first chirp z-transform unit 202a and a second chirp z-transform unit 202b for estimating the cochlear delay characteristic simulated by the cochlear delay filter that was used when the electronic watermark data was embedded in an acoustic signal. The electronic watermark data embedded in the acoustic signal is detected based on the cochlear delay characteristic estimated from the results of the chirp z-transforms performed by the first and second chirp z-transform units 202a and 202b.

Description

The present invention relates to a digital watermark detection device and a digital watermark detection method for detecting digital watermark data embedded in an acoustic signal (speech, music, etc.) that is digital data.

In recent years, with the spread of communication networks such as the Internet, services such as the distribution of digital music content have become available. However, digital music content can be copied with almost no loss of sound quality, so illegal copying is rampant and has become a social problem. As a technology for protecting the copyright of digital music content, audio watermarking has therefore attracted attention: by embedding additional information (digital watermark data) such as copyright information or a serial number in the acoustic signal, illegal copying can be deterred and traced.

Audio watermarking techniques include, for example, (1) methods that embed the watermark at the encoding/quantization level, such as the LSB (Least Significant Bit replacement) method (see Non-Patent Document 1), and (2) methods that embed information across a wide spectrum of the original signal, such as the DSS (Direct Spread Spectrum) method (see Non-Patent Document 2). In addition, as methods based on perceptual characteristics related to phase, (3) the echo hiding method (hereinafter "ECHO method"; see Non-Patent Document 3) and (4) the periodical phase modulation (PPM) method (see Non-Patent Document 4 and Patent Document 1) have been proposed.

One of the characteristics of human hearing is known as the cochlear delay (CD) characteristic. When a sound signal propagates within the cochlea (through the incompressible lymph fluid in the scala vestibuli and the scala tympani), the vibration (propagation) of the basilar membrane of the cochlea, which is caused by the pressure difference between these two scalae, exhibits a small time difference that depends on the frequency of the signal. This phenomenon is the cochlear delay, and the delay is known to be longer for lower signal frequencies.

Non-Patent Document 5 examines how the cochlear delay relates to the judgment of sound simultaneity. Specifically, auditory psychophysical experiments were performed using three harmonic complex tones: (a) a normal harmonic complex tone (no cochlear delay manipulation), (b) a harmonic complex tone given a group delay that cancels the cochlear delay on the basilar membrane, and (c) a harmonic complex tone given a group delay that enhances the cochlear delay. Based on the experimental results, the influence of the cochlear delay on simultaneity judgments was examined. Non-Patent Document 5 shows that complex tone (c), rather than complex tone (b), yields simultaneity judgments equivalent to those for complex tone (a).

Focusing on this cochlear delay characteristic, Non-Patent Documents 6 and 7 propose a method (hereinafter "CD method") that realizes audio watermarking by applying to the original signal one of two different delay patterns, each resembling the cochlear delay, corresponding to the binary data of the information to be embedded as the watermark.

Patent Document 1: Japanese Patent No. 3627022

Non-Patent Document 1: N. Cvejic and T. Seppanen, "Digital audio watermarking techniques and technologies," IGI Global, 2007
Non-Patent Document 2: Boney, L., Tewfik, H. H., and Hamdy, K. N., "Digital watermarks for audio signals," Proc. ICMCS, 473-480, 1996
Non-Patent Document 3: Daniel Gruhl, Anthony Lu, and Walter Bender, "Echo Hiding," Proc. Information Hiding 1st Workshop, pp. 295-315, Cambridge Univ., 1996
Non-Patent Document 4: Ryuichi Nishimura and Yoichi Suzuki, "Acoustic Watermarking Based on Periodic Phase Modulation," Journal of the Acoustical Society of Japan, vol. 60, no. 5, pp. 269-272, 2004
Non-Patent Document 5: E. Aiba, S. Tanaka, M. Tsuzaki, and M. Unoki, "Judgment of perceptual synchrony between two pulses and its relation to the cochlear delays," Proc. Fechner Day 2007, 211-214, 2007
Non-Patent Document 6: Unoki, M. and Hamada, D., "Audio watermarking method based on the cochlear delay characteristics," Proc. IIHMSP2008, 616-619, 2008
Non-Patent Document 7: Unoki, M. and Hamada, D., "Method of digital-audio watermarking based on cochlear delay characteristics," Int. J. Innv. Comp., Inf. Cont., 6(3(B)), 1325-1346, 2010

In general, audio watermarking is required to provide inaudibility (the embedded information is not perceived by the user, and embedding causes no perceptible distortion of the original signal), robustness (the watermark is unaffected by ordinary signal conversion processing and by malicious attacks such as attempts to delete the embedded information), and confidentiality (the presence of embedded information goes unnoticed and, even if noticed, the information cannot easily be detected).

The LSB method (1) embeds information in the lower bits, which do not significantly affect the amplitude information, and therefore satisfies inaudibility; however, it is sensitive to bit changes and so has a problem with robustness. The DSS method (2) embeds information across the entire spectrum and is therefore robust against signal modification; however, the embedded information is easily perceived, so it has a problem with inaudibility.

The ECHO method (3) can achieve distortion-free, imperceptible embedding by adjusting the echo time and the amplitude of the first reflection. However, the watermark information can easily be detected and removed using the autocorrelation method or cepstrum processing, so it is the least robust and least confidential of the conventional methods above. The PPM method (4) is based on the auditory characteristic that periodic phase modulation is relatively hard to perceive; however, the phase modulation randomly distorts the phase spectrum of high-frequency components, so it has a problem with inaudibility.

The CD method, on the other hand, sufficiently satisfies inaudibility, confidentiality, and robustness, but detecting the embedded information requires reference to the original signal, which limits its range of application.

The present invention has been made in view of these circumstances, and its main object is to provide a digital watermark detection device and a digital watermark detection method capable of detecting information embedded by the CD method without referring to the original signal.

To solve the above problem, a digital watermark detection device according to one aspect of the present invention is used where digital watermark data has been embedded in an acoustic signal, which is digital data, by a digital watermark data embedding device that applies phase modulation to the acoustic signal using a cochlear delay filter simulating a cochlear delay characteristic and embeds the digital watermark data in the phase-modulated acoustic signal. The detection device comprises cochlear delay characteristic estimation means for estimating the cochlear delay characteristic simulated by the cochlear delay filter, and digital watermark detection means for detecting the digital watermark data embedded in the acoustic signal based on the cochlear delay characteristic estimated by the cochlear delay characteristic estimation means.

In this aspect, the digital watermark data embedding device may be configured to generate a plurality of differently phase-modulated acoustic signals by applying phase modulation to the acoustic signal using a plurality of different cochlear delay filters, to select one of these phase-modulated acoustic signals according to the digital watermark data, and to embed the digital watermark data by joining the selected acoustic signals together. In that case, the cochlear delay characteristic estimation means may be configured to estimate the plurality of different cochlear delay characteristics simulated by the respective cochlear delay filters, and the digital watermark detection means may be configured to detect the digital watermark data by determining, based on the estimated cochlear delay characteristics, which of the plurality of different cochlear delay filters was applied to phase-modulate the acoustic signal in which the digital watermark data is embedded.

In the above aspect, the cochlear delay characteristic estimation means may be configured to estimate the cochlear delay characteristic by estimating the zero of the cochlear delay filter.

In the above aspect, the cochlear delay characteristic estimation means may be configured to estimate the zero of the cochlear delay filter using the chirp z-transform.
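The idea of detecting a bit from the filter's zero can be illustrated with a minimal sketch. It is not the patented procedure: it assumes the cochlear delay filter is the conventional first-order all-pass H(z) = (-b + z^-1)/(1 - b z^-1), whose zero lies at z = 1/b, and it evaluates the z-transform of one frame at a single point off the unit circle (a one-bin special case of what a chirp z-transform computes along a whole contour). The function names are hypothetical, the coefficients 0.795 and 0.865 are the values the embodiment gives for its two filters, and the frame is assumed to start from a zero filter state, which ignores frame-boundary effects in a real signal.

```python
def allpass(x, b):
    """First-order IIR all-pass (assumed form): y(n) = -b*x(n) + x(n-1) + b*y(n-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -b * xn + x_prev + b * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def z_transform_at(frame, z):
    """Evaluate Y(z) = sum_n frame[n] * z**(-n) at one point via Horner's rule."""
    acc = 0.0
    for sample in reversed(frame):
        acc = sample + acc / z
    return acc


def detect_bit(frame, b0=0.795, b1=0.865):
    """Return 0 or 1 depending on which candidate zero (1/b0 or 1/b1)
    makes |Y(z)| smaller; the applied filter forces Y(1/b) close to zero."""
    m0 = abs(z_transform_at(frame, 1.0 / b0))
    m1 = abs(z_transform_at(frame, 1.0 / b1))
    return 0 if m0 < m1 else 1
```

Because Y(z) = H(z)X(z), the magnitude at the zero of the applied filter collapses regardless of the unknown original X(z), which is why no reference signal is needed in this idealised setting.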

In the above aspect, the device may further comprise original signal acquisition means for acquiring the acoustic signal as it was before the digital watermark data was embedded, by applying to the acoustic signal in which the digital watermark data is embedded a filter having the inverse of the cochlear delay characteristic estimated by the cochlear delay characteristic estimation means.

In the above aspect, the device may further comprise original signal acquisition means for acquiring the acoustic signal as it was before the digital watermark data was embedded, by applying to that acoustic signal the inverse filter of the cochlear delay filter that the digital watermark detection means has determined to have been applied in its phase modulation.
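One way an inverse of an all-pass filter can be realised is sketched below; this is an illustration under stated assumptions, not the procedure the patent specifies. It assumes the first-order all-pass form H(z) = (-b + z^-1)/(1 - b z^-1). The causal inverse 1/H(z) has its pole at 1/b, outside the unit circle, so it is unstable; however, for a real-coefficient all-pass H(z)H(1/z) = 1, so running the same filter backwards in time undoes it. The function names are hypothetical.

```python
def allpass(x, b):
    """First-order IIR all-pass (assumed form): y(n) = -b*x(n) + x(n-1) + b*y(n-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -b * xn + x_prev + b * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def inverse_allpass(y, b):
    """Anti-causal inverse: time-reverse, apply the same all-pass, time-reverse.

    Exact only if y contains the full decaying tail of the filter output;
    with a truncated tail the last samples carry an error that shrinks
    roughly as b**(distance from the end).
    """
    return allpass(y[::-1], b)[::-1]


if __name__ == "__main__":
    original = [float((n * 37) % 11) - 5.0 for n in range(300)]
    padded = original + [0.0] * 200          # keep the filter tail
    marked = allpass(padded, 0.865)
    restored = inverse_allpass(marked, 0.865)[:300]
    worst = max(abs(a - r) for a, r in zip(original, restored))
    print("largest reconstruction error:", worst)
```

In a watermarked signal the applied coefficient changes from frame to frame, so a practical implementation would have to handle the filter state at frame boundaries; that is outside the scope of this sketch.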

A digital watermark detection method according to one aspect of the present invention is used where digital watermark data has been embedded in an acoustic signal, which is digital data, by a digital watermark data embedding device that applies phase modulation to the acoustic signal using a cochlear delay filter simulating a cochlear delay characteristic and embeds the digital watermark data in the phase-modulated acoustic signal. The method comprises a step (a) of estimating the cochlear delay characteristic simulated by the cochlear delay filter, and a step (b) of detecting the digital watermark data embedded in the acoustic signal based on the estimated cochlear delay characteristic.

In this aspect, the digital watermark data embedding device may be configured to generate a plurality of differently phase-modulated acoustic signals by applying phase modulation to the acoustic signal using a plurality of different cochlear delay filters, to select one of these phase-modulated acoustic signals according to the digital watermark data, and to embed the digital watermark data by joining the selected acoustic signals together. In that case, step (a) may estimate the plurality of different cochlear delay characteristics simulated by the respective cochlear delay filters, and step (b) may detect the digital watermark data by determining, based on the cochlear delay characteristics estimated in step (a), which of the plurality of different cochlear delay filters was applied to phase-modulate the acoustic signal in which the digital watermark data is embedded.

In the above aspect, step (a) may estimate the cochlear delay characteristic by estimating the zero of the cochlear delay filter.

In the above aspect, step (a) may estimate the zero of the cochlear delay filter using the chirp z-transform.

According to the digital watermark detection device and the digital watermark detection method of the present invention, digital watermark data embedded by the CD method can be detected without referring to the original signal.

FIG. 1 is a block diagram showing the configuration of a digital watermark embedding device according to an embodiment of the present invention.
FIG. 2 is a functional block diagram showing the configuration of the digital watermark embedding device according to the embodiment of the present invention.
FIG. 3 is a graph showing the characteristics of the cochlear delay filters of the digital watermark embedding device in the embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of a digital watermark detection device according to the embodiment of the present invention.
FIG. 5 is a functional block diagram showing the configuration of the digital watermark detection device according to the embodiment of the present invention.
FIG. 6 is a graph for explaining the pole and zero of a cochlear delay filter.
FIG. 7 is a graph showing the result of frequency analysis by the chirp z-transform.
FIG. 8 is a flowchart showing the procedure of the digital watermark embedding process executed by the digital watermark embedding device in the embodiment of the present invention.
FIG. 9 is a flowchart showing the procedure of the digital watermark detection process executed by the digital watermark detection device in the embodiment of the present invention.
FIG. 10 is a graph showing the results of an objective evaluation experiment.
FIG. 11 is a flowchart showing the procedure of the original signal acquisition process executed by the digital watermark detection device in the embodiment of the present invention.
FIG. 12 is a graph showing the results of an objective evaluation experiment on watermarked acoustic signals.
FIG. 13 is a graph showing the results of an objective evaluation experiment before and after the digital watermark data is deleted by the original signal acquisition process of the embodiment of the present invention.

Preferred embodiments of the present invention are described below with reference to the drawings. Each embodiment shown below illustrates a method and a device for embodying the technical idea of the present invention, and the technical idea of the present invention is not limited to what follows. Various modifications can be made to the technical idea of the present invention within the technical scope described in the claims.

The digital watermark detection device according to the present embodiment can detect digital watermark data embedded in an original signal without referring to that original signal. Detecting digital watermark data without reference to the original signal in this way is called "blind detection" in this specification. This digital watermark detection device, and the digital watermark embedding device that embeds the digital watermark data, are described below.

[Configuration of digital watermark embedding device]
FIG. 1 is a block diagram showing the configuration of the digital watermark embedding device according to the embodiment of the present invention. As shown in FIG. 1, the digital watermark embedding device 1 includes a CPU 11, a ROM 12, a RAM 13, a signal input unit 14, a signal output unit 15, and a hard disk 16, which are connected to one another by a bus 17.

The CPU 11 executes computer programs stored in the ROM 12 and the hard disk 16. The digital watermark embedding device 1 thereby performs the operations described later and embeds digital watermark data in an acoustic signal.

The ROM 12 is a mask ROM, PROM, EPROM, EEPROM, or the like, and stores computer programs executed by the CPU 11 and the data they use.

The RAM 13 is an SRAM, DRAM, or the like, and is used for reading out programs stored in the hard disk 16. The RAM 13 also serves as the work area of the CPU 11 when the CPU 11 executes a computer program.

The signal input unit 14 receives, from an external device, the acoustic signal that is the original signal to be processed and the digital watermark data to be embedded in that acoustic signal. The signal output unit 15 outputs the acoustic signal in which the digital watermark data has been embedded (hereinafter "watermarked acoustic signal") to an external device.

In the present embodiment, the acoustic signal that is the original signal is digital data. The acoustic signal may instead be analog, in which case a signal input unit 14 with an A/D conversion function converts the input acoustic signal into digital data before the subsequent processing is performed.

An operating system, application programs, and various other computer programs to be executed by the CPU 11, together with the data used in executing them, are installed on the hard disk 16. These computer programs include a digital watermark embedding program 16A for embedding digital watermark data.

The digital watermark embedding program 16A installed on the hard disk 16 is read from a portable recording medium via an external storage device (not shown) such as a flexible disk drive, a CD-ROM drive, or a DVD-ROM drive.

Besides being provided on a portable recording medium, the digital watermark embedding program 16A can also be provided from an external device communicably connected to the digital watermark embedding device 1 via a telecommunication line (wired or wireless). For example, when the digital watermark embedding program 16A is stored on the hard disk of a server computer on the Internet, the digital watermark embedding device 1 can access the server computer, download the program, and install it on the hard disk 16.

A multitasking operating system, such as Windows (registered trademark) manufactured and sold by Microsoft Corporation of the United States, is installed on the hard disk 16. In the following description, the digital watermark embedding program 16A according to the present embodiment is assumed to run on that operating system.

Next, the configuration of the digital watermark embedding device 1 is described with reference to the functional block diagram shown in FIG. 2. In the following, n denotes the sample index and k denotes the frame number of the acoustic signal.
As shown in FIG. 2, the digital watermark embedding device 1 includes a frame processing unit 101 that divides the acoustic signal x(n) into frames, two cochlear delay filters 102a and 102b, and a filter selection unit 103 that selects either the first cochlear delay filter 102a or the second cochlear delay filter 102b according to the value of the digital watermark data s(k).

The filter selection unit 103 selects the first cochlear delay filter 102a when the bit value of the digital watermark data is "0" and the second cochlear delay filter 102b when it is "1". The first cochlear delay filter 102a and the second cochlear delay filter 102b apply a group delay to the acoustic signal as described later. The acoustic signals to which the group delay has been applied in this way are joined to generate the watermarked acoustic signal y(n), that is, the acoustic signal in which the digital watermark data is embedded.

In the present embodiment, the frame processing unit 101, the first cochlear delay filter 102a, the second cochlear delay filter 102b, and the filter selection unit 103 are realized by the CPU 11 executing the digital watermark embedding program 16A.
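The functional blocks above can be sketched in a few lines. This is an illustration under assumptions, not the program 16A itself: each cochlear delay filter is taken to be a first-order all-pass with the difference equation shown in the code, the default coefficients are the values the embodiment gives for filters 102a and 102b, frames are indexed from zero, and the names `allpass` and `embed` are hypothetical.

```python
def allpass(x, b):
    """First-order IIR all-pass (assumed form): y(n) = -b*x(n) + x(n-1) + b*y(n-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -b * xn + x_prev + b * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def embed(x, bits, frame_len, b0=0.795, b1=0.865):
    """Embed one bit per frame by choosing between two filtered copies of x.

    The output covers only the len(bits) frames that carry a bit.
    """
    if len(bits) * frame_len > len(x):
        raise ValueError("signal too short for the requested number of bits")
    w0 = allpass(x, b0)  # whole signal through filter 102a (bit 0)
    w1 = allpass(x, b1)  # whole signal through filter 102b (bit 1)
    y = []
    for k, bit in enumerate(bits):
        start, stop = k * frame_len, (k + 1) * frame_len
        source = w1 if bit else w0   # role of the filter selection unit 103
        y.extend(source[start:stop])
    return y
```

Filtering the whole signal once per coefficient and then splicing frames keeps each filter's internal state continuous across frame boundaries, at the cost of computing both filtered copies for every sample.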

[Cochlear delay filter]
The first cochlear delay filter 102a and the second cochlear delay filter 102b are described in detail below. They are digital filters that simulate the cochlear delay characteristic of human hearing. Specifically, each is an all-pass filter, which leaves the amplitude components entirely unaffected and changes only the phase characteristic.

In the present embodiment, the cochlear delay filters 102a and 102b are each a first-order infinite impulse response (IIR) all-pass filter defined by the transfer function H_m(z) of equation (1).

Here, b_m is the filter coefficient of H_m(z).
Configuring the first cochlear delay filter 102a and the second cochlear delay filter 102b as first-order IIR all-pass filters in this way enables high-speed processing.
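The image of equation (1) is not reproduced in this text. As a reading aid, a first-order IIR all-pass filter with a real coefficient b_m is conventionally written as below; this is the standard textbook form and is offered as an assumption about equation (1), not as a copy of it. The restriction 0 < b_m < 1 is likewise an assumption, consistent with the coefficient values 0.795 and 0.865 given for the embodiment.

```latex
H_m(z) = \frac{-b_m + z^{-1}}{1 - b_m z^{-1}},
\qquad 0 < b_m < 1,\quad m = 0, 1
\tag{1}
```

In this form the filter has a single pole at z = b_m, inside the unit circle, and a single zero at z = 1/b_m, outside it, and |H_m(e^{jω})| = 1 at every frequency.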

Provided that the group delay characteristic of the IIR all-pass filter represents the cochlear delay characteristic more accurately, the filter order may be one or higher, and the number of cascaded filter stages may be one or more.

The group delay γ_m(ω) imparted by the first cochlear delay filter 102a and the second cochlear delay filter 102b is calculated by equation (2).
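The image of equation (2) is not reproduced in this text. Group delay is by definition the negative derivative of the phase response with respect to angular frequency, so equation (2) presumably has the form of the first equality below. The closed form in the second equality holds only under the assumption that H_m(z) is the conventional first-order all-pass (-b_m + z^{-1})/(1 - b_m z^{-1}); ω here is the normalized angular frequency in radians per sample, and the result is in samples.

```latex
\gamma_m(\omega)
  = -\frac{d}{d\omega}\,\arg H_m\!\left(e^{j\omega}\right)
  = \frac{1 - b_m^{2}}{1 - 2 b_m \cos\omega + b_m^{2}}
\tag{2}
```

Under that assumption the delay is largest at ω = 0, where it equals (1 + b_m)/(1 - b_m), and decreases monotonically toward ω = π, which matches the stated property that lower frequencies are delayed more.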

FIG. 3 is a graph showing the characteristics of the first cochlear delay filter 102a and the second cochlear delay filter 102b of the digital watermark embedding device 1 in Embodiment 1 of the present invention. In FIG. 3, the vertical axis represents the group delay and the horizontal axis represents the frequency of the acoustic signal.

In FIG. 3, the thin solid line shows the cochlear delay characteristic of human hearing scaled down to 1/10. The thick solid line shows the characteristic of the first cochlear delay filter 102a defined by equation (1) with filter coefficient b = 0.795, and the broken line shows the characteristic of the second cochlear delay filter 102b, defined in the same way, with filter coefficient b = 0.865.
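The two curves can be explored numerically with a short sketch. It assumes the conventional first-order all-pass form for equation (1) and its closed-form group delay; the function names are hypothetical, and any sampling frequency passed to `delay_ms` is the caller's assumption, since none is stated at this point in the text.

```python
import cmath
import math


def allpass_response(b, omega):
    """Frequency response of H(z) = (-b + z^-1) / (1 - b z^-1) at z = e^{j*omega}."""
    z_inv = cmath.exp(-1j * omega)
    return (-b + z_inv) / (1 - b * z_inv)


def group_delay(b, omega):
    """Closed-form group delay of the first-order all-pass, in samples."""
    return (1 - b * b) / (1 - 2 * b * math.cos(omega) + b * b)


def delay_ms(b, freq_hz, fs_hz):
    """Group delay in milliseconds at freq_hz for a sampling rate fs_hz."""
    omega = 2 * math.pi * freq_hz / fs_hz
    return 1000.0 * group_delay(b, omega) / fs_hz


if __name__ == "__main__":
    for b in (0.795, 0.865):  # coefficients of filters 102a and 102b
        samples = [round(group_delay(b, w), 3) for w in (0.0, 0.5, 1.0, 3.0)]
        print(b, samples)
```

The larger coefficient gives the longer low-frequency delay, so the two filters differ mainly at low frequencies and converge toward small delays at high frequencies.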

The cochlear delay characteristic shown by the thin solid line in FIG. 3 was determined with reference to T. Dau, O. Wegner, V. Mellert, and B. Kollmeier, "Auditory brainstem responses (ABR) with optimized chirp signals compensating basilar membrane dispersion," J. Acoust. Soc. Am., 107, 1530-1540, 2000.

From the above, applying the first cochlear delay filter 102a or the second cochlear delay filter 102b to an acoustic signal imparts to it a delay of 1/10 of the actual cochlear delay. To approximate the actual human cochlear delay characteristic, ten such cochlear delay filters would therefore have to be connected in cascade. However, if a delay equal to the actual cochlear delay were imparted to the acoustic signal, the group delay when that signal is perceived would be twice the actual cochlear delay, because the listener's own cochlea adds its delay to the one imposed by the filter, and this is considered too large. In the present embodiment, therefore, a delay of 1/10 of the actual cochlear delay is imparted to the acoustic signal as described above.

In the present embodiment, the first cochlear delay filter 102a and the second cochlear delay filter 102b apply cochlear delay patterns to the acoustic signal x(n), which is the original signal, according to the following equations (3) and (4), respectively, to obtain intermediate signals w_0(n) and w_1(n). The filter selection unit 103 then selects and merges the intermediate signals w_0(n) and w_1(n) frame by frame according to the bit values of the digital watermark data, thereby obtaining the watermarked acoustic signal y(n) given by the following equation (5).

Here, (k−1)ΔW < n ≤ kΔW holds. ΔW (= f_s / N_bit) is the frame length, f_s is the sampling frequency of the original signal, and N_bit is the embedding bit rate in bits per second.
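Equations (3) to (5) are not reproduced in this text. As a hedged reconstruction, assume that equation (1) is the standard first-order all-pass transfer function $H_m(z) = (-b_m + z^{-1})/(1 - b_m z^{-1})$, which has its pole at $b_m$ and its zero at $1/b_m$. Writing $s(k)$ for the k-th watermark bit (a symbol introduced here only for illustration), the equations would take the following form:

```latex
\begin{align}
w_0(n) &= -b_0\,x(n) + x(n-1) + b_0\,w_0(n-1) \tag{3}\\
w_1(n) &= -b_1\,x(n) + x(n-1) + b_1\,w_1(n-1) \tag{4}\\
y(n) &=
  \begin{cases}
    w_0(n), & s(k) = 0\\
    w_1(n), & s(k) = 1
  \end{cases}
  \qquad (k-1)\,\Delta W < n \le k\,\Delta W \tag{5}
\end{align}
```

For example, with $f_s = 44.1$ kHz and $N_{bit} = 4$ bps, the frame length is $\Delta W = 44100 / 4 = 11025$ samples, or 250 ms.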

[Configuration of the digital watermark detection apparatus]
FIG. 4 is a block diagram showing the configuration of the digital watermark detection apparatus according to the embodiment of the present invention. As shown in FIG. 4, the digital watermark detection apparatus 2, like the digital watermark embedding apparatus 1 described above, includes a CPU 21, a ROM 22, a RAM 23, a signal input unit 24, and a hard disk 25, which are connected to one another by a bus 26.

The CPU 21, the ROM 22, and the RAM 23 are the same as the CPU 11, the ROM 12, and the RAM 13 of the digital watermark embedding apparatus 1, so their description is omitted.

The signal input unit 24 accepts a watermarked acoustic signal from an external device. The watermarked acoustic signal may be input to the signal input unit 24 directly from the digital watermark embedding apparatus 1, or via another device and/or a communication network.

As with the digital watermark embedding apparatus 1, an operating system and various computer programs to be executed by the CPU 21 are installed on the hard disk 25. These computer programs include a digital watermark detection program 25A for detecting digital watermark data.

As with the digital watermark embedding program 16A, the digital watermark detection program 25A installed on the hard disk 25 may be provided on a portable recording medium or via a telecommunication line. Like the digital watermark embedding program 16A, the digital watermark detection program 25A runs on the operating system installed on the hard disk 25.

Next, the configuration of the digital watermark detection apparatus 2 is described with reference to the functional block diagram shown in FIG. 5.
As shown in FIG. 5, the digital watermark detection apparatus 2 includes a frame processing unit 201 that divides into frames the watermarked acoustic signal y(n) generated by the digital watermark embedding apparatus 1, two chirp z-transform units 202a and 202b that apply the chirp z-transform to the framed watermarked acoustic signal y(n), and a bit value detection unit 203 that detects the bit values of the digital watermark data based on the results of the chirp z-transforms by the first chirp z-transform unit 202a and the second chirp z-transform unit 202b. In the present embodiment, the frame processing unit 201, the first chirp z-transform unit 202a, the second chirp z-transform unit 202b, and the bit value detection unit 203 are realized by the CPU 21 executing the digital watermark detection program 25A.

[Chirp z-transform]
The chirp z-transform (CZT) performed by the first chirp z-transform unit 202a and the second chirp z-transform unit 202b is known as a technique that enables flexible analysis of the frequency spectrum (see, for example, Wang, T. T., "The segmented chirp z-transform and its application in spectrum analysis," IEEE Trans. Instrumentation and Measurement, 39(2), 318-323, 1990), and is also used in implementations of the fast Fourier transform (FFT). Compared with the discrete Fourier transform (DFT), the chirp z-transform allows the frequency resolution and the dynamic range of the frequency response to be varied freely. It can also efficiently compute the z-transform at an arbitrary number M of points on the z-plane.

In general, the chirp z-transform is related to the N-point DFT through z = r exp(jω_n): when the magnitude is r = 1 and the normalized frequency is ω_n = 2πn/N, it is equivalent to the DFT on the unit circle. The chirp z-transform is expressed by the following equation (6).

Here, A = A_0 exp(j2πθ_0) and W = W_0 exp(j2πφ_0), where θ_0 and φ_0 are initial phases. As noted above, when A = 1, M = N, and W = exp(−j2π/N), the CZT coincides with the DFT.
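Equation (6) is not reproduced in this text. The sketch below assumes the standard CZT definition X(k) = Σ_{n=0}^{N−1} x(n) A^(−n) W^(nk) for k = 0, ..., M−1, which is consistent with the DFT condition stated above. The function names `czt` and `dft` are illustrative, and the direct O(N·M) evaluation is chosen for readability rather than speed.

```python
import cmath


def czt(x, M, W, A):
    """Directly evaluate X(k) = sum_n x[n] * A**(-n) * W**(n*k) for k = 0..M-1.

    This is a slow reference form of the assumed definition; a fast CZT
    would normally use Bluestein's algorithm instead.
    """
    return [
        sum(xn * A ** (-n) * W ** (n * k) for n, xn in enumerate(x))
        for k in range(M)
    ]


def dft(x):
    """Plain N-point DFT, used only to check the special case below."""
    N = len(x)
    return [
        sum(xn * cmath.exp(-2j * cmath.pi * n * k / N) for n, xn in enumerate(x))
        for k in range(N)
    ]


# Special case from the text: A = 1, M = N, W = exp(-j*2*pi/N) gives the DFT.
x = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]
N = len(x)
X_czt = czt(x, M=N, W=cmath.exp(-2j * cmath.pi / N), A=1.0)
X_dft = dft(x)
max_err = max(abs(a - b) for a, b in zip(X_czt, X_dft))
```

Setting A = r with the same W moves the evaluation points from the unit circle to the circle |z| = r, which is the configuration used for detection later in the description.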

[Principle of blind detection]
In the present embodiment, the chirp z-transform described above is used to realize blind detection of digital watermark data that has been embedded in an acoustic signal using the first cochlear delay filter 102a and the second cochlear delay filter 102b. The principle of this blind detection is described below.

The poles and zeros of the first cochlear delay filter 102a and the second cochlear delay filter 102b are arranged as shown in FIG. 6. As described above, these cochlear delay filters 102a and 102b are first-order IIR all-pass filters. Characteristically, the pole ("×" in FIG. 6) and the zero ("○" in FIG. 6) lie on the same ray from the origin toward the unit circle, at radii that are reciprocals of each other (b_m and 1/b_m). In general, as the value of b_m decreases, the pole approaches the origin and the zero moves outward, away from the unit circle. Conversely, as the value of b_m increases, the pole and the zero both approach the unit circle. As shown in FIG. 3, the group delay in this case increases as b_m increases. In FIG. 6, the bold "○" and "×" indicate the zero and pole of the first cochlear delay filter 102a, and the thin "○" and "×" indicate the zero and pole of the second cochlear delay filter 102b.
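The dependence of the group delay on b_m can be made explicit. The derivation below assumes that equation (1), which lies outside this passage, is the standard first-order all-pass form $H_m(z) = (-b_m + z^{-1})/(1 - b_m z^{-1})$ with $0 < b_m < 1$. Under that assumption the group delay in samples is:

```latex
\begin{align}
\tau_m(\omega) &= -\frac{d}{d\omega}\arg H_m\!\left(e^{j\omega}\right)
  = \frac{1 - b_m^2}{1 - 2 b_m \cos\omega + b_m^2},\\
\tau_m(0) &= \frac{1 + b_m}{1 - b_m},
\qquad
\frac{d\,\tau_m(0)}{d b_m} = \frac{2}{(1 - b_m)^2} > 0 .
\end{align}
```

With the coefficients given for FIG. 3, this yields $\tau_0(0) = 1.795/0.205 \approx 8.76$ samples for $b_0 = 0.795$ and $\tau_1(0) = 1.865/0.135 \approx 13.81$ samples for $b_1 = 0.865$, that is, roughly 0.20 ms and 0.31 ms at a 44.1 kHz sampling frequency. The delay at DC therefore grows with b_m, as stated above.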

The watermarked acoustic signal y(n) is observed as a signal in which the delay information described above is embedded. Blind detection can therefore be realized by estimating from y(n) the delay information, that is, the positions of the pole and zero of the cochlear delay filter that was used to impart the delay.

Note that the original signal x(n) itself also has poles and zeros as properties of the sequence (for example, poles related to the decay of the signal, given that the sound source is bounded). Therefore, even if the positions of poles and zeros can be estimated from the observed signal y(n), it must still be determined whether they were imparted by the IIR all-pass filter (cochlear delay filter) or belong to the original signal itself.

To show that the positions of the pole and zero of the cochlear delay filter can be estimated using the chirp z-transform, r is chosen so that the contour passes through the zero r = 1/b_m of the cochlear delay filter of equation (1), and frequency analysis is performed by applying the chirp z-transform (A = r, M = N, W = exp(−j2π/N)) to the original signal x(n) and to the signal y(n) in which the delay information is embedded.

In the following, x(n) denotes an instrument sound serving as the original signal, and y(n) denotes the signal in which the digital watermark data "AIS-Lab." has been embedded using the first cochlear delay filter 102a and the second cochlear delay filter 102b. Here, both the first cochlear delay filter 102a and the second cochlear delay filter 102b have their pole and zero placed at the DC component, and the chirp z-transform frequency analysis is performed with r = 1/b_0 or r = 1/b_1. The sampling frequency is 44.1 kHz and the bit rate is N_bit = 4 bps, so that delay information corresponding to one bit is embedded in each frame (250 ms).
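The frame arithmetic in this example can be checked with a short sketch. The conversion of "AIS-Lab." to bits assumes 8-bit ASCII with the most significant bit first; the description does not state the encoding, so `text_to_bits` is purely illustrative.

```python
FS_HZ = 44_100   # sampling frequency stated in the text
N_BIT_BPS = 4    # embedding bit rate stated in the text


def frame_length(fs_hz, n_bit):
    """Frame length ΔW = f_s / N_bit, in samples."""
    return fs_hz // n_bit


def text_to_bits(text):
    """Assumed encoding: 8-bit ASCII, most significant bit first."""
    return [
        (byte >> shift) & 1
        for byte in text.encode("ascii")
        for shift in range(7, -1, -1)
    ]


delta_w = frame_length(FS_HZ, N_BIT_BPS)   # samples per frame
frame_ms = 1000 * delta_w / FS_HZ          # frame duration in milliseconds
bits = text_to_bits("AIS-Lab.")            # one bit per frame
```

Under these assumptions each frame holds 11025 samples (250 ms), and the eight-character watermark occupies 64 frames.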

FIG. 7 is a graph showing the results of this analysis. FIGS. 7(a) to 7(i) show, from left to right, the frequency spectra of x(n) in frame #1, y(n) in frame #1, and y(n) in frame #2, analyzed, from top to bottom, by the chirp z-transform with r = 1, r = 1/b_0, and r = 1/b_1. As shown in FIG. 7(g), the analysis of x(n) shows no particular change in the spectrum near the frequency at which the pole and zero are placed. In contrast, for y(n) in frame #1 with the chirp z-transform at r = 1/b_1 (FIG. 7(h)), and for y(n) in frame #2 with the chirp z-transform at r = 1/b_0 (FIG. 7(f)), the spectral component drops dramatically in the lowest frequency region (the range from the DC component to low frequencies, for example the frequency band in which delay is seen in FIG. 3), at the locations indicated by arrows in the figure. This corresponds to a dip caused by the zero, so in principle its magnitude is −∞ dB. In the other analyses (r = 1, r = 1/b_0 for frame #1, and r = 1/b_1 for frame #2), almost no change in the spectral component is seen at the lowest frequency (that is, it does not approach −∞ dB, or 0 on a linear scale). It has been confirmed that the same behavior occurs in other frames and with other target signals.

From the above, it can be seen that, regardless of the target signal, the position of the zero of the cochlear delay filter can be estimated from y(n) by performing the chirp z-transform along a contour on the z-plane that passes through that zero. In principle, the chirp z-transform could also be performed with r set to the pole rather than the zero (a pole yields a spectral peak of ∞ dB), but this would require detecting overflow of the dynamic range on the computer, so using the zero is preferable. When the zero is used, it suffices to search for 0 within the dynamic range, so simpler processing is sufficient.

In the present embodiment, the first chirp z-transform unit 202a performs the chirp z-transform along the contour r = 1/b_0 on the z-plane, and the second chirp z-transform unit 202b performs the chirp z-transform along the contour r = 1/b_1 on the z-plane. Using the results of these chirp z-transforms makes it possible to estimate whether the group delay of the target signal was imparted by the first cochlear delay filter 102a (filter coefficient b_0) or by the second cochlear delay filter 102b (filter coefficient b_1).
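The following minimal sketch illustrates this principle on a single frame. It makes several assumptions that are not taken from the description: equation (1) is taken to be the first-order all-pass y(n) = −b·x(n) + x(n−1) + b·y(n−1), the filter starts each frame from a zero state, only the lowest CZT bin (k = 0) is inspected, and the decision rule is simply "whichever contour gives the smaller magnitude". All function names and the test signal are hypothetical.

```python
import math

B0, B1 = 0.795, 0.865  # coefficients given for filters 102a and 102b


def allpass(x, b):
    """Assumed form of equation (1), run from a zero initial state."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -b * xn + x_prev + b * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def czt_lowest_bin(y, r):
    """Magnitude of CZT bin k = 0 with A = r, i.e. |sum_n y[n] * r**(-n)|."""
    return abs(sum(yn * r ** (-n) for n, yn in enumerate(y)))


def detect_bit(y):
    """Return 0 if the dip is deeper on r = 1/b0, otherwise 1."""
    dip0 = czt_lowest_bin(y, 1.0 / B0)
    dip1 = czt_lowest_bin(y, 1.0 / B1)
    return 0 if dip0 < dip1 else 1


# Hypothetical 512-sample frame made of two sinusoids.
x = [math.sin(0.05 * n) + 0.5 * math.cos(0.3 * n) for n in range(512)]
bit_from_b0 = detect_bit(allpass(x, B0))  # frame filtered with b0
bit_from_b1 = detect_bit(allpass(x, B1))  # frame filtered with b1
```

Because the evaluation point z = 1/b_m coincides with the zero of the matching filter, the matching bin collapses to rounding-error level while the other stays well above it. A practical detector would have to cope with frames that do not start from a zero filter state and with signal modifications, which this sketch does not attempt.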

[Operation of the digital watermark embedding apparatus 1 and the digital watermark detection apparatus 2]
Next, the operation of the digital watermark embedding apparatus 1 and the digital watermark detection apparatus 2 of the present embodiment, configured as described above, is described with reference to the flowcharts shown in FIGS. 8 and 9 together with FIGS. 2 and 5.

[Digital watermark embedding process]
FIG. 8 is a flowchart showing the procedure of the digital watermark embedding process executed by the digital watermark embedding apparatus 1 in the embodiment of the present invention.
In the digital watermark embedding apparatus 1, the frame processing unit 101 divides the externally input acoustic signal (original signal) into frames (S101). Next, the filter selection unit 103 selects the cochlear delay filter to apply according to the bit value of the digital watermark data. Specifically, it determines whether the bit value of the digital watermark data, which has been input from outside and converted into binary data, is "0" or "1" (S102), and selects either the first cochlear delay filter 102a or the second cochlear delay filter 102b according to the result. Examples of the digital watermark data include copyright information, such as the name of the copyright holder, and serial numbers.

If the bit value of the digital watermark data is determined to be "0" in step S102 ("0" in S102), the digital watermark embedding apparatus 1 applies phase modulation to the acoustic signal (original signal) using the first cochlear delay filter 102a (S103). If the bit value is determined to be "1" ("1" in S102), it applies phase modulation to the acoustic signal (original signal) using the second cochlear delay filter 102b (S104). Steps S103 and S104 embed the digital watermark data in the acoustic signal.

Next, the digital watermark embedding apparatus 1 determines whether all bits of the digital watermark data to be embedded in the frame have been processed (S105). If it determines that unprocessed bits remain (NO in S105), it returns to step S102 and repeats the subsequent processing. If it determines that all bits have been processed (YES in S105), it generates a watermarked acoustic signal by joining the acoustic signals in which the individual bits of the digital watermark data were embedded in steps S103 and S104 (S106).

The watermarked acoustic signal y(n) is generated by performing the above embedding process on all frames and concatenating them. To prevent discontinuities at the frame boundaries, which also cause spectral spreading, from affecting imperceptibility, it is desirable to smooth the last several samples (about 1 ms) of the frame preceding each boundary by spline interpolation.
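The embedding flow of steps S101 to S106 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the all-pass difference equation is the assumed form of equation (1), frames are indexed from 0 rather than by the (k−1)ΔW < n ≤ kΔW convention, samples beyond the last bit are left unmodified, and the spline smoothing at frame boundaries is omitted. The names `allpass` and `embed` are hypothetical.

```python
import math


def allpass(x, b):
    """Assumed form of equation (1): y[n] = -b*x[n] + x[n-1] + b*y[n-1]."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -b * xn + x_prev + b * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def embed(x, bits, frame_len, b0=0.795, b1=0.865):
    """Select w0 or w1 frame by frame according to the watermark bits."""
    w0 = allpass(x, b0)  # intermediate signal used for bit 0
    w1 = allpass(x, b1)  # intermediate signal used for bit 1
    y = list(x)          # assumption: samples past the last bit stay as-is
    for k, bit in enumerate(bits):
        start = k * frame_len
        stop = min(start + frame_len, len(x))
        source = w1 if bit else w0
        y[start:stop] = source[start:stop]
    return y


# Hypothetical toy input: 40 samples, 4 bits, 10 samples per frame.
x = [math.sin(0.1 * n) for n in range(40)]
y = embed(x, bits=[0, 1, 1, 0], frame_len=10)
```

Filtering the whole signal twice and then selecting per frame mirrors the description of intermediate signals w_0(n) and w_1(n); resetting the filter state at each frame would be an alternative design with different boundary behavior.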

[Digital watermark detection process]
Next, the digital watermark detection process, which detects the digital watermark data from a watermarked acoustic signal in which it has been embedded as described above, is described. As noted above, the present embodiment performs blind detection, which does not refer to the original signal. The digital watermark detection apparatus 2 stores information indicating the bit rate at which the digital watermark data was embedded by the digital watermark embedding apparatus 1, and sets the segments described below based on that information.

FIG. 9 is a flowchart showing the procedure of the digital watermark detection process executed by the digital watermark detection apparatus 2 in the embodiment of the present invention.
In the digital watermark detection apparatus 2, the frame processing unit 201 divides the externally input watermarked acoustic signal into frames (S201). Next, the apparatus sets the segment to be processed (S202), and the first chirp z-transform unit 202a applies the chirp z-transform to the acoustic signal of that segment (S203). The second chirp z-transform unit 202b then applies the chirp z-transform to the same acoustic signal (S204).

Next, the digital watermark detection apparatus 2 determines which of the two frequency spectra obtained in steps S203 and S204 shows a sharp drop in the spectral value at the lowest frequency and, based on the result, estimates the zero of the cochlear delay filter that applied phase modulation to the acoustic signal (S205). In the present embodiment, if the spectrum showing the sharp drop is the one obtained by the first chirp z-transform unit 202a, the zero is estimated to be 1/b_0; if it is the one obtained by the second chirp z-transform unit 202b, the zero is estimated to be 1/b_1.

Next, in the bit value detection unit 203, the digital watermark detection apparatus 2 determines whether the zero of the cochlear delay filter estimated in step S205 is 1/b_0 or 1/b_1 (S206). If it is determined to be 1/b_0 (1/b_0 in S206), the bit value "0" is detected (S207). If it is determined to be 1/b_1 (1/b_1 in S206), the bit value "1" is detected (S208).

The digital watermark detection apparatus 2 then determines whether all segments of the frame being processed have been processed (S209). If it determines that unprocessed segments remain (NO in S209), it returns to step S202 and repeats the subsequent processing. If it determines that all segments have been processed (YES in S209), it restores the digital watermark data by concatenating the bit values detected by the bit value detection unit 203 in steps S207 and S208 (S210).

In this way, digital watermark data embedded in an acoustic signal using cochlear delay filters can be detected blindly.

[Comparative evaluation against other methods]
Next, the imperceptibility of the digital watermark data embedded by the embedding process of the present embodiment, and the accuracy of bit detection by the detection process, are evaluated in comparison with other methods.

The present inventors conducted an objective evaluation experiment using all 102 pieces of the RWC music database (Goto, Hashiguchi, Nishimura, and Oka, "RWC music database for research: music genre database and musical instrument sound database," IPSJ SIG Technical Report, 2002-MUS-45-4, 19-26, 2002) as original signals for evaluation (sampling frequency 44.1 kHz, 16-bit quantization). The first 10 seconds of each piece was used as the source, and eight characters of information ("AIS-Lab.") were embedded in each original signal as the watermark. With N_bit = 4 bps as the baseline, the digital watermark data was embedded in both channels of the original signal under 12 conditions of N_bit (N_bit = 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, and 8192 bps), and its characteristics were evaluated. Sound quality was evaluated, following Y. Lin and W. H. Abdulla, "Perceptual evaluation of audio watermarking using objective quality measure," Proc. ICASSP2008, 1745-1748, 2008, using the perceptual evaluation of audio quality (PEAQ) measure (P. Kabal, "An examination and interpretation of ITU-R BS.1387: Perceptual evaluation of audio quality," TSP Lab. Technical Report, Dept. Electrical & Computer Engineering, McGill Univ., 2002) and the log spectral distortion (LSD) measure.

As methods for comparison, the LSB, DSS, ECHO, and PPM methods, which are representative audio watermarking methods, were used. All of these except the PPM method are blind detection methods. The CD method proposed by the inventors in Non-Patent Documents 6 and 7 was also included in the comparison. Hereinafter, this comparison CD method is referred to as the CD (Non-Blind) method, and the digital watermark detection method of the present embodiment as the CD (Blind) method.

FIG. 10 shows graphs of the results of the objective evaluation experiment; (a) to (c) show the results for PEAQ, LSD, and bit detection rate, respectively. FIG. 10 shows average values over the 102 pieces.

First, consider the results shown in FIG. 10(a). The PEAQ ODG (Objective Difference Grade) ranges from 0 (imperceptible) to −4 (very annoying), so −1 (possibly perceptible but not annoying) was set here as the imperceptibility threshold. As shown in FIG. 10(a), the DSS method is the worst, and the ECHO method also degrades sharply at bit rates from 8 bps upward. The PPM method has an ODG of about −2 throughout. The LSB method, by contrast, gives good results at all bit rates tested. The CD (Non-Blind) method has no problem at all at 4 bps, but its ODG begins to decrease at around 128 bps and falls below the threshold of −1 from about 1024 bps. In comparison, the CD (Blind) method of the present embodiment is already near −1.0 at 64 bps and falls to around −3.0 as the bit rate increases.

Next, consider the results shown in FIG. 10(b). Sound quality is generally said to be good if the LSD is within 1 dB, so the LSD threshold was set to 1 dB here. As shown in FIG. 10(b), the LSB method is unaffected by embedding distortion even when the bit rate changes, and gives good results. The DSS method, on the other hand, is above the evaluation threshold regardless of bit rate, indicating a problem in terms of sound quality. The ECHO and PPM methods are both within the evaluation threshold and cannot be said to have a particular problem with sound quality. The CD (Non-Blind) method is within the threshold at all bit rates and stays within 0.5 dB up to 256 bps, a good result. The CD (Blind) method, in comparison, increases monotonically with bit rate; it remains below the threshold (1 dB) for N_bit < 1024 bps, but its values are somewhat larger than those of the CD (Non-Blind) method. In the range of about 4 to 64 bps, however, the LSD of the CD (Blind) method is slightly smaller than that of the CD (Non-Blind) method. The difference in LSD between the CD (Blind) and CD (Non-Blind) methods is not as large as in the case of PEAQ shown in FIG. 10(a). This is thought to be because a measure based on auditory impression brings out the difference between the two more than a simple spectral distortion measure does.

Finally, consider the results shown in FIG. 10(c). The threshold for the bit detection rate was set to 75% here. As shown in FIG. 10(c), all methods except the LSB method show a decrease in bit detection rate as the bit rate increases. The CD (Non-Blind) method falls below the threshold at about N_bit = 1024 bps, while the other conventional methods fall below it at lower bit rates. With the CD (Blind) method of the present embodiment, by contrast, the bit detection rate hardly decreases, and the results are better than those of the CD (Non-Blind) method. Specifically, the rate is nearly 100% for N_bit < 512 bps and is 98% at 1024 bps.

In the objective evaluation experiment above, the LSB method gave the best results. However, as pointed out in Non-Patent Documents 6 and 7 and elsewhere, the LSB method has a serious robustness problem, because the watermark cannot be detected if the watermarked signal is modified even slightly. The CD (Non-Blind) method, in contrast, is sufficiently robust, as shown in, for example, Unoki, M., Imabeppu, K., Hamada, D., Haniu, A., and Miyauchi, R., "Embedding limitations with digital-audio watermarking method based on cochlear delay characteristics," J. Information Hiding and Multimedia Signal Processing, 2(1), 1-23, 2011. The CD (Non-Blind) method, however, cannot perform blind detection. The CD (Blind) method of the present embodiment resolves this problem while achieving excellent imperceptibility and robustness.

[Original signal acquisition processing]
Many conventional digital audio watermarking techniques consider only embedding digital watermark data in an original signal and then detecting it; removing the digital watermark data after detection is not considered. Consequently, no provision is made for removing the embedded digital watermark data, and the data is embedded in a form that is difficult to remove. In this sense, many conventional techniques are irreversible digital audio watermarking techniques. In the present embodiment, by contrast, digital watermark data is embedded by the relatively simple process of applying phase modulation to the original signal with a cochlear delay filter, so the detected digital watermark data can subsequently be used to remove the watermark and obtain the original signal by a simple method. The present embodiment thus realizes a reversible digital audio watermarking technique. The processing for acquiring the original signal is described below.

FIG. 11 is a flowchart showing the procedure of the original signal acquisition processing executed by the digital watermark detection apparatus 2 in the embodiment of the present invention. In the following, the digital watermark detection apparatus 2 is assumed to include inverse filters of the first cochlear delay filter 102a and the second cochlear delay filter 102b of the digital watermark embedding apparatus 1, that is, filters whose characteristics are the inverse of the cochlear delay characteristics simulated by the first cochlear delay filter 102a and the second cochlear delay filter 102b.

In the digital watermark detection apparatus 2, the frame processing unit 201 divides the externally input watermarked acoustic signal into frames (S301). The digital watermark detection apparatus 2 then refers to the digital watermark data detected by the digital watermark detection processing described above (S302) and determines whether the bit value of that digital watermark data is "0" or "1" (S303).

If the bit value of the digital watermark data is determined to be "0" in step S303 ("0" in S303), the digital watermark detection apparatus 2 applies phase modulation to the watermarked acoustic signal using the inverse filter of the first cochlear delay filter 102a (S304). If the bit value is determined to be "1" ("1" in S303), the digital watermark detection apparatus 2 applies phase modulation to the watermarked acoustic signal using the inverse filter of the second cochlear delay filter 102b (S305).

Next, the digital watermark detection apparatus 2 determines whether all bits of the digital watermark data embedded in the frame have been processed (S306). If it determines that unprocessed bits remain (NO in S306), the digital watermark detection apparatus 2 returns to step S303 and repeats the subsequent processing. If it determines that all bits have been processed (YES in S306), the digital watermark detection apparatus 2 restores the original signal by joining the acoustic signals that were phase-modulated in steps S304 and S305 (S307).
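The flow of steps S301 to S307 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: each cochlear delay filter is assumed to be a first-order all-pass IIR filter, the coefficient values B0 and B1 and all function names are hypothetical, and the inverse filter is assumed to be realized by running the same all-pass filter over the time-reversed frame. One bit per frame is assumed for simplicity.

```python
import numpy as np

# Assumed all-pass coefficients for bit "0" and bit "1" (illustrative values).
B0, B1 = 0.795, 0.865


def allpass(x, b):
    """Assumed filter form: H(z) = (-b + z^-1) / (1 - b z^-1)."""
    y = np.zeros(len(x))
    prev_x = 0.0
    prev_y = 0.0
    for n, xn in enumerate(x):
        y[n] = -b * xn + prev_x + b * prev_y
        prev_x, prev_y = xn, y[n]
    return y


def inverse_allpass(y, b):
    """Anti-causal inverse: filter the time-reversed frame, then reverse again.

    For an all-pass filter H(z) * H(1/z) = 1, so this undoes allpass()
    except near the end of the frame, where the filter tail was cut off.
    """
    return allpass(y[::-1], b)[::-1]


def embed(x, bits, frame_len):
    """Filter the whole signal with both filters, then pick one per frame."""
    y0, y1 = allpass(x, B0), allpass(x, B1)
    out = np.empty(len(bits) * frame_len)
    for i, bit in enumerate(bits):
        seg = slice(i * frame_len, (i + 1) * frame_len)
        out[seg] = (y1 if bit else y0)[seg]
    return out


def restore(y, bits, frame_len):
    """S301: split into frames; S303-S305: inverse-filter; S307: join."""
    frames = []
    for i, bit in enumerate(bits):
        frame = y[i * frame_len:(i + 1) * frame_len]
        frames.append(inverse_allpass(frame, B1 if bit else B0))
    return np.concatenate(frames)


rng = np.random.default_rng(0)
frame_len = 2000
bits = [0, 1, 1, 0]  # stands in for the detected digital watermark data (S302)
x = rng.standard_normal(frame_len * len(bits))
y = embed(x, bits, frame_len)
x_rec = restore(y, bits, frame_len)
```

In this sketch the inverse runs backwards in time within each frame, so the samples at the very end of each frame are restored only approximately, while the rest of the frame matches the original closely.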

The original signal is obtained by performing the above original signal acquisition processing on all frames and connecting the results. As in the digital watermark embedding processing, it is desirable to smooth the last few samples (about 1 ms) of the frame preceding each junction by spline interpolation, so that discontinuities at the frame junctions do not affect imperceptibility.

[Evaluation of original signal acquisition processing]
To check, among other things, whether the signal obtained by the original signal acquisition processing described above matches the actual original signal, experiments similar to the objective evaluation experiments above were carried out. The results are examined below.

FIG. 12 is a set of graphs showing the results of the above objective evaluation experiments for watermarked acoustic signals generated by the digital watermark embedding processing of the CD (Non-Blind) method and the CD (Blind) method; (a) to (c) show the results for PEAQ, LSD, and bit detection rate, respectively. FIG. 12 shows average values over the 102 songs mentioned above.

In FIG. 12, the results of the CD (Blind) method are shown separately for the case where the spline interpolation described above is performed (Blind (with Spline)) and the case where it is not (Blind (without Spline)). FIG. 12 shows that spline interpolation gives better results for all of PEAQ, LSD, and bit detection rate, although there is almost no difference in the bit detection rate.

FIG. 13, on the other hand, is a set of graphs showing the results of the above objective evaluation experiments before and after the digital watermark data is removed by the original signal acquisition processing of the present embodiment; (a) to (c) show the results for PEAQ, LSD, and SNR (Signal-Noise Ratio), respectively. In this SNR, S denotes the original signal and N denotes the difference between the original signal and the recovered signal (the signal obtained by the original signal acquisition processing above). Here too, average values over the 102 songs are shown.

FIG. 13 shows that the results after removal of the digital watermark data are better overall than those before removal. This is especially pronounced for the SNR shown in FIG. 13(c). Because the SNR increases as the recovered signal approaches the original sound, the result in FIG. 13(c) indicates that the signal obtained by the original signal acquisition processing of the present embodiment is close to the original signal; in other words, the embedded digital watermark data was effectively removed from the watermarked acoustic signal.
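The SNR used here, with S as the original signal and N as the difference between the original signal and the recovered signal, can be sketched in Python as follows. Expressing the ratio as a power ratio in decibels is an assumption, and the function name and test signal are hypothetical.

```python
import numpy as np


def snr_db(original, recovered):
    """SNR in dB: S is the original signal, N is original minus recovered."""
    original = np.asarray(original, dtype=float)
    noise = original - np.asarray(recovered, dtype=float)
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))


# Hypothetical example: a 440 Hz tone and two imperfect recoveries.
t = np.arange(4410) / 44100.0
original = np.sin(2.0 * np.pi * 440.0 * t)
coarse = 1.1 * original    # 10% amplitude error
fine = 1.001 * original    # 0.1% amplitude error

snr_coarse = snr_db(original, coarse)
snr_fine = snr_db(original, fine)
```

A recovered signal that is closer to the original gives a higher value, so `snr_fine` is larger than `snr_coarse`.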

Thus, in the present embodiment, the original signal can be obtained by removing the digital watermark data from the watermarked acoustic signal through the simple process of phase modulation with the inverse filters of the cochlear delay filters. Because the original signal can be obtained, new digital watermark data can be embedded in it and the result distributed. This makes it possible to realize a digital audio watermarking technique in which the embedded information (for example, copyright information or a serial number) can be updated.

(Other embodiments)
In each of the embodiments above, the embedding and detection of digital watermark data are implemented in software, but the present invention is not limited to this. For example, all or part of this processing may be implemented by dedicated hardware circuitry such as a DSP (Digital Signal Processor).

In each of the embodiments above, digital watermark data is embedded in a monaural music signal serving as the original signal, but the present invention is not limited to this; digital watermark data can also be embedded in both channels of a stereo music signal.

The digital watermark detection apparatus and the digital watermark detection method of the present invention are useful as an apparatus and a method for detecting digital watermark data embedded in acoustic signals of various music genres.

DESCRIPTION OF SYMBOLS
1 Digital watermark embedding apparatus
11 CPU
12 ROM
13 RAM
14 Signal input unit
15 Signal output unit
16 Hard disk
16A Digital watermark embedding program
17 Bus
101 Frame processing unit
102a First cochlear delay filter
102b Second cochlear delay filter
103 Filter selection unit
2 Digital watermark detection apparatus
21 CPU
22 ROM
23 RAM
24 Signal input unit
25 Hard disk
25A Digital watermark detection program
26 Bus
201 Frame processing unit
202a, 202b Conversion units
202a First chirp z-transform unit
202b Second chirp z-transform unit
203 Bit value detection unit

Claims

Cochlear delay characteristic estimation means for estimating the cochlear delay characteristic simulated by a cochlear delay filter when digital watermark data is embedded in an acoustic signal by a digital watermark data embedding device that applies phase modulation to an acoustic signal, which is digital data, using the cochlear delay filter that simulates a cochlear delay characteristic, and embeds the digital watermark data in the phase-modulated acoustic signal;
A digital watermark detection apparatus comprising: digital watermark detection means for detecting the digital watermark data embedded in the acoustic signal based on the cochlear delay characteristic estimated by the cochlear delay characteristic estimation means.

The digital watermark data embedding device is configured to generate a plurality of different phase-modulated acoustic signals by applying phase modulation to the acoustic signal using a plurality of different cochlear delay filters, and to embed the digital watermark data by selecting, according to the digital watermark data, one acoustic signal from among the plurality of different phase-modulated acoustic signals and joining the selected acoustic signals together;
The cochlear delay characteristic estimating means is configured to estimate a plurality of different cochlear delay characteristics respectively simulated by the plurality of different cochlear delay filters;
The digital watermark detection means is configured to detect the digital watermark data by determining, based on the plurality of different cochlear delay characteristics estimated by the cochlear delay characteristic estimation means, which of the plurality of different cochlear delay filters was applied to phase-modulate the acoustic signal in which the digital watermark data is embedded.
The digital watermark detection apparatus according to claim 1.

The cochlear delay characteristic estimating means is configured to estimate a cochlear delay characteristic by estimating a zero point of the cochlear delay filter,
The digital watermark detection apparatus according to claim 1.

The cochlear delay characteristic estimating means is configured to estimate a zero point of the cochlear delay filter using a chirp z-transform.
The digital watermark detection apparatus according to claim 3.

Further comprising original signal acquisition means for acquiring the acoustic signal as it was before the digital watermark data was embedded, by applying a filter having the inverse of the cochlear delay characteristic estimated by the cochlear delay characteristic estimation means to the acoustic signal in which the digital watermark data is embedded,
The digital watermark detection apparatus according to claim 4.

Further comprising original signal acquisition means for acquiring the acoustic signal as it was before the digital watermark data was embedded, by applying, to the acoustic signal in which the digital watermark data is embedded, an inverse filter of the cochlear delay filter that the digital watermark detection means determined to have been applied for the phase modulation,
The digital watermark detection apparatus according to claim 2.

A digital watermark detection method comprising: (a) a step of estimating the cochlear delay characteristic simulated by a cochlear delay filter when digital watermark data is embedded in an acoustic signal by a digital watermark data embedding device that applies phase modulation to an acoustic signal, which is digital data, using the cochlear delay filter that simulates a cochlear delay characteristic, and embeds the digital watermark data in the phase-modulated acoustic signal; and
(b) a step of detecting the digital watermark data embedded in the acoustic signal based on the estimated cochlear delay characteristic.

The digital watermark data embedding device is configured to generate a plurality of different phase-modulated acoustic signals by applying phase modulation to the acoustic signal using a plurality of different cochlear delay filters, and to embed the digital watermark data by selecting, according to the digital watermark data, one acoustic signal from among the plurality of different phase-modulated acoustic signals and joining the selected acoustic signals together;
In the step (a), estimating a plurality of different cochlear delay characteristics respectively simulated by the plurality of different cochlear delay filters,
In the step (b), the digital watermark data is detected by determining, based on the plurality of different cochlear delay characteristics estimated in the step (a), which of the plurality of different cochlear delay filters was applied to phase-modulate the acoustic signal in which the digital watermark data is embedded;
The digital watermark detection method according to claim 7.

In the step (a), estimating a cochlear delay characteristic by estimating a zero point of the cochlear delay filter,
The digital watermark detection method according to claim 8 or 9.

Estimating the zero point of the cochlear delay filter using chirp z-transform in step (a);
The digital watermark detection method according to claim 9.