JP6353402B2

JP6353402B2 - Acoustic digital watermark system, digital watermark embedding apparatus, digital watermark reading apparatus, method and program thereof

Info

Publication number: JP6353402B2
Application number: JP2015097178A
Authority: JP
Inventors: 仲大室
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-05-12
Filing date: 2015-05-12
Publication date: 2018-07-04
Anticipated expiration: 2035-05-12
Also published as: JP2016212315A

Description

The present invention relates to an acoustic digital watermarking technique in which an acoustic signal such as speech or music (hereinafter also simply referred to as an "acoustic signal") is transmitted with parallel information such as text added to it, and in which the receiving side reproduces the acoustic signal with good quality and reads out the added parallel information.

FIG. 1 shows a prior-art example of an information communication system using an acoustic digital watermark (hereinafter also referred to as an "acoustic digital watermark system"). In the transmission unit 91, the digital watermark embedding unit 92 embeds parallel information as a digital watermark in the input acoustic signal and outputs a reproduction signal. The reproduction signal is played from the speaker 93. In the receiving unit 94, the microphone 95 picks up the reproduced sound through the acoustic space, and the digital watermark reading unit 96 reads the parallel information from the collected sound signal and outputs an acoustic signal (hereinafter also referred to as an "output acoustic signal"). The speaker 93 of the transmission unit 91 may be, for example, an announcement system in a store or on a train, or a loudspeaker for outdoor advertising or events. The microphone 95 of the receiving unit 94 may be a vocal or announcement microphone, or a microphone built into a mobile phone, smartphone, or similar device.

The acoustic digital watermark system of FIG. 1 is expected to be used, for example, to transmit the title of a piece of music as text while the music is played from a speaker, to transmit match ticket information or the URL of a ticket sales site while sports video and audio are played from a speaker, or to transmit a foreign-language translation as text while a Japanese announcement is played from a speaker, in each case as parallel information.

The methods of Non-Patent Document 1 and Non-Patent Document 2 are known as prior art for this kind of digital watermarking.

[Non-Patent Document 1] Matsuoka, Nakajima, Yoshimura, "Sound wave information transmission method to acquire information with microphone of mobile terminal", NTT DoCoMo Technical Journal, 2006, Vol. 14, No. 2, pp. 6-13
[Non-Patent Document 2] Modegi, "Development of Digital Watermark Embedding Techniques for Non-contact Extractable Music", Information Processing Society of Japan Research Report, 2005, 2005-MUS-60, pp. 1-6

The important requirements in acoustic digital watermarking are:
(1) The required communication speed (bit rate) of the parallel information is secured (securing the communication speed of parallel information).
(2) The embedded parallel information can be read reliably on the receiving side (ensuring communication reliability).
(3) When a person listens to the output acoustic signal, no degradation relative to the input acoustic signal before embedding is perceived, or any degradation is not bothersome (maintaining sound quality).

In Non-Patent Document 1, the high band of the input acoustic signal, for example 5 kHz to 10 kHz, is replaced with an OFDM signal. This method satisfies conditions (1) and (2) but cannot be said to satisfy condition (3), because only the spectral envelope of the high band is preserved and the original acoustic information in that band is lost. In addition, because the parallel information is embedded only in the high band as an OFDM signal, a microphone that can pick up the high band with good sensitivity is required.

According to Non-Patent Document 2, sufficient performance for (1), (2), and (3) can be secured when stereo speakers and a stereo microphone are used. When a monaural speaker and a monaural microphone are used, however, the method is equivalent to setting one of two predetermined low bands, for example 0 to 100 Hz and 100 to 200 Hz, to zero and playing the result (as a sound with that band missing). Given that ambient noise is present in the acoustic space between the speaker and the microphone, the performance for (1) and (2) is insufficient when the playback volume is low, and (3) cannot be satisfied when the playback volume is high.

Both Non-Patent Document 1 and Non-Patent Document 2 assume that the input acoustic signal is a mixture of multiple sounds, such as music. When the input acoustic signal is a single voice, such as an announcement, the degradation in sound quality caused by embedding the watermark becomes conspicuous, and condition (3) becomes even harder to satisfy.

A similar system is the configuration shown in FIG. 2. It differs from the configuration of FIG. 1 in that the input acoustic signal is separately sent to the digital watermark reading unit 96 as the "original sound" through the digital communication means 97. Satisfying (1), (2), and (3) is easier with the configuration of FIG. 2 than with that of FIG. 1, because the original sound is known and the difference between the collected sound signal and the original sound (the input acoustic signal) is therefore easy to detect. The configuration of FIG. 2 is used, for example, for copyright management (embedding copyright information to detect unauthorized copies). To use it for the applications and services described in the background above, however, the transmission unit 91 and the receiving unit 94 need a means of digital communication. If such a means exists, the parallel information can simply be sent over the digital channel as well, so the configuration of FIG. 2 does not achieve the object of the present invention.

An object of the present invention is to provide an acoustic digital watermark system that secures the communication speed of parallel information, ensures communication reliability, and maintains sound quality without necessarily using a digital communication channel.

To solve the above problem, according to one aspect of the present invention, an acoustic digital watermark system includes a digital watermark embedding unit and a digital watermark reading unit. The digital watermark embedding unit includes: a first inverse filter that filters an input acoustic signal using an FIR filter whose filter coefficients are first linear prediction coefficients, which are the linear prediction coefficients of the input acoustic signal, to obtain a first residual signal; a first carrier generation unit that, with N being an integer of 2 or more, generates as a first carrier a sine wave whose frequency is N times a first fundamental frequency, which is the fundamental frequency of the first residual signal; a phase modulation unit that, with M being an integer from 1 to N, phase-modulates the first carrier with parallel information so that the symbol rate is M times the pitch frequency, to obtain a modulated wave; an addition unit that adds the modulated wave and the first residual signal to obtain a first signal; and a synthesis filter that filters the first signal using an IIR filter whose filter coefficients are the first linear prediction coefficients, to obtain a reproduction signal. The digital watermark reading unit includes: a second inverse filter that filters a collected sound signal, obtained by picking up the reproduced sound of the reproduction signal, using an FIR filter whose filter coefficients are second linear prediction coefficients, which are the linear prediction coefficients of the collected sound signal, to obtain a second residual signal; a second carrier generation unit that generates as a second carrier a sine wave whose frequency is N times a second fundamental frequency, which is the fundamental frequency of the second residual signal; and a detection unit that detects information corresponding to the parallel information using the second residual signal and the second carrier.

To solve the above problem, according to another aspect of the present invention, a digital watermark embedding apparatus includes: a first inverse filter that filters an input acoustic signal using an FIR filter whose filter coefficients are first linear prediction coefficients, which are the linear prediction coefficients of the input acoustic signal, to obtain a first residual signal; a first carrier generation unit that, with N being an integer of 2 or more, generates as a first carrier a sine wave whose frequency is N times a first fundamental frequency, which is the fundamental frequency of the first residual signal; a first phase modulation unit that, with M being an integer from 1 to N, phase-modulates the first carrier with parallel information so that the symbol rate is M times the pitch frequency, to obtain a modulated wave; an addition unit that adds the modulated wave and the first residual signal to obtain a first signal; and a synthesis filter that filters the first signal using an IIR filter whose filter coefficients are the first linear prediction coefficients, to obtain a reproduction signal.

To solve the above problem, according to another aspect of the present invention, a digital watermark reading apparatus obtains parallel information from a collected sound signal. The digital watermark reading apparatus includes: a second inverse filter that filters the collected sound signal using an FIR filter whose filter coefficients are second linear prediction coefficients, which are the linear prediction coefficients of the collected sound signal, to obtain a second residual signal; a second carrier generation unit that, with N being an integer of 2 or more, generates as a second carrier a sine wave whose frequency is N times a second fundamental frequency, which is the fundamental frequency of the second residual signal; and a detection unit that detects information corresponding to the parallel information using the second residual signal and the second carrier.

To solve the above problem, according to another aspect of the present invention, an acoustic digital watermark method includes: a first inverse filtering step of filtering an input acoustic signal using an FIR filter whose filter coefficients are first linear prediction coefficients, which are the linear prediction coefficients of the input acoustic signal, to obtain a first residual signal; a first carrier generation step of, with N being an integer of 2 or more, generating as a first carrier a sine wave whose frequency is N times a first fundamental frequency, which is the fundamental frequency of the first residual signal; a phase modulation step of, with M being an integer from 1 to N, phase-modulating the first carrier with parallel information so that the symbol rate is M times the pitch frequency, to obtain a modulated wave; an addition step of adding the modulated wave and the first residual signal to obtain a first signal; a synthesis filtering step of filtering the first signal using an IIR filter whose filter coefficients are the first linear prediction coefficients, to obtain a reproduction signal; a second inverse filtering step of filtering a collected sound signal, obtained by picking up the reproduced sound of the reproduction signal, using an FIR filter whose filter coefficients are second linear prediction coefficients, which are the linear prediction coefficients of the collected sound signal, to obtain a second residual signal; a second carrier generation step of generating as a second carrier a sine wave whose frequency is N times a second fundamental frequency, which is the fundamental frequency of the second residual signal; and a detection step of detecting information corresponding to the parallel information using the second residual signal and the second carrier.

To solve the above problem, according to another aspect of the present invention, a digital watermark embedding method includes: a first inverse filtering step of filtering an input acoustic signal using an FIR filter whose filter coefficients are first linear prediction coefficients, which are the linear prediction coefficients of the input acoustic signal, to obtain a first residual signal; a first carrier generation step of, with N being an integer of 2 or more, generating as a first carrier a sine wave whose frequency is N times a first fundamental frequency, which is the fundamental frequency of the first residual signal; a first phase modulation step of, with M being an integer from 1 to N, phase-modulating the first carrier with parallel information so that the symbol rate is M times the pitch frequency, to obtain a modulated wave; an addition step of adding the modulated wave and the first residual signal to obtain a first signal; and a synthesis filtering step of filtering the first signal using an IIR filter whose filter coefficients are the first linear prediction coefficients, to obtain a reproduction signal.

To solve the above problem, according to another aspect of the present invention, a digital watermark reading method obtains parallel information from a collected sound signal. The digital watermark reading method includes: a second inverse filtering step of filtering the collected sound signal using an FIR filter whose filter coefficients are second linear prediction coefficients, which are the linear prediction coefficients of the collected sound signal, to obtain a second residual signal; a second carrier generation step of, with N being an integer of 2 or more, generating as a second carrier a sine wave whose frequency is N times a second fundamental frequency, which is the fundamental frequency of the second residual signal; and a detection step of detecting information corresponding to the parallel information using the second residual signal and the second carrier.

According to the present invention, the communication speed of parallel information can be secured, communication reliability can be ensured, and sound quality can be maintained without necessarily using a digital communication channel.

FIG. 1 is a diagram showing a configuration example of an acoustic digital watermark system.
FIG. 2 is a diagram showing a configuration example of an acoustic digital watermark system.
FIG. 3 is a functional block diagram of the digital watermark embedding unit according to the first embodiment.
FIG. 4 is a diagram showing an example of the processing flow of the digital watermark embedding unit according to the first embodiment.
FIG. 5 is a functional block diagram of the digital watermark reading unit according to the first embodiment.
FIG. 6 is a diagram showing an example of the processing flow of the digital watermark reading unit according to the first embodiment.
FIG. 7(A) is a diagram showing a signal in which the residual signal is modeled as a simple pulse train, and FIG. 7(B) is a diagram showing the signal of FIG. 7(A) after passing through the synthesis filter.
FIG. 8 is a diagram showing the frequency spectrum obtained by Fourier-transforming the signal of FIG. 7(B).
FIG. 9(A) is a diagram showing the signal of FIG. 7(A) with a carrier added, and FIG. 9(B) is a diagram showing the signal of FIG. 9(A) after passing through the synthesis filter.
FIG. 10 is a diagram showing the frequency spectrum obtained by Fourier-transforming the signal of FIG. 9(B).
FIG. 11(A) is a diagram showing the signal of FIG. 7(A) with a carrier added whose phase differs from that of FIG. 9(A), and FIG. 11(B) is a diagram showing the signal of FIG. 11(A) after passing through the synthesis filter.
FIG. 12 is a diagram showing the frequency spectrum obtained by Fourier-transforming the signal of FIG. 11(B).
FIG. 13(A) is a diagram showing a signal in which the carrier of FIG. 9(A) is modulated with BPSK, and FIG. 13(B) is a diagram showing the signal of FIG. 13(A) after passing through the synthesis filter.
FIG. 14 is a diagram showing the frequency spectrum obtained by Fourier-transforming the signal of FIG. 13(B).
FIG. 15(A) is a diagram showing a signal to which a carrier has been added with the carrier frequency and symbol rate set to values that are not integer multiples of the pitch frequency, and FIG. 15(B) is a diagram showing the signal of FIG. 15(A) after passing through the synthesis filter.
FIG. 16 is a diagram showing the frequency spectrum obtained by Fourier-transforming the signal of FIG. 15(B).
FIG. 17 is a functional block diagram of the digital watermark embedding unit according to the second embodiment.
FIG. 18 is a diagram showing an example of the processing flow of the digital watermark embedding unit according to the second embodiment.

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and duplicate description is omitted.
<Acoustic Digital Watermark System According to First Embodiment>
The acoustic digital watermark system 10 according to the first embodiment includes a transmission unit 11 and a receiving unit 14. The transmission unit 11 includes a digital watermark embedding unit 100, and the receiving unit 14 includes a digital watermark reading unit 200 (see FIG. 1).

The digital watermark embedding unit 100 of the transmission unit 11 receives the input acoustic signal and the parallel information and outputs a reproduction signal. The reproduction signal is played by the speaker 93 of the transmission unit 11, which emits the reproduced sound. The speaker 93 and the microphone 95 are placed in a common sound field.

The microphone 95 of the receiving unit 14 picks up the reproduced sound and outputs a collected sound signal. The digital watermark reading unit 200 of the receiving unit 14 receives the collected sound signal and outputs an output acoustic signal and information corresponding to the parallel information. For example, the parallel information may be output to and shown on a display unit such as a display or touch panel. When the listener is in a place where the sound reproduced by the speaker 93 can be heard well enough, the digital watermark reading unit 200 may be configured to output only the parallel information without outputting the output acoustic signal.

The digital watermark embedding unit 100 and the digital watermark reading unit 200 are described in detail below.
<Digital Watermark Embedding Unit According to First Embodiment>
FIG. 3 is a functional block diagram of the digital watermark embedding unit 100 according to the first embodiment, and FIG. 4 shows its processing flow.

The digital watermark embedding unit 100 includes an input buffer 110, an inverse filter 130, a linear prediction analysis unit 120, a pitch analysis unit 140, a carrier generation unit 150, a phase modulation unit 160, an addition unit 170, and a synthesis filter 180.

<Input buffer 110>
The input buffer 110 receives the input acoustic signal, accumulates at least a fixed duration of it (S110), and outputs the input acoustic signal one fixed interval (frame) at a time. In other words, the input acoustic signal is stored in the input buffer 110, divided into fixed intervals called frames, and sent to the linear prediction analysis unit 120 and the inverse filter 130. The frame length is typically about 10 to 20 milliseconds, but other lengths may be used. Hereinafter, the input acoustic signal divided into these fixed intervals is also referred to as a frame signal.
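The framing performed by the input buffer can be sketched as follows. This is only an illustration: the function name `split_into_frames`, the 16 kHz sample rate, and the choice to drop an incomplete final frame are assumptions, not details given in this description.

```python
def split_into_frames(signal, sample_rate, frame_ms=20):
    """Cut a sample sequence into consecutive fixed-length frames.

    frame_ms is the frame length in milliseconds (10 to 20 ms is typical).
    """
    frame_len = sample_rate * frame_ms // 1000
    # Assumption: an incomplete tail is dropped; a real buffer would
    # instead wait until enough samples have accumulated.
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# Example: 1000 samples at 16 kHz with 20 ms frames (320 samples each).
frames = split_into_frames(list(range(1000)), sample_rate=16000, frame_ms=20)
```

Each element of `frames` plays the role of one frame signal passed on to the linear prediction analysis unit 120 and the inverse filter 130.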

<Linear prediction analysis unit 120>
The linear prediction analysis unit 120 receives the frame signal, performs linear prediction analysis on the input acoustic signal to obtain linear prediction coefficients (S120), and sends the linear prediction coefficients to the inverse filter 130 and the synthesis filter 180. Possible linear prediction analysis methods include the covariance method, the autocorrelation method, PARCOR analysis, and LSP analysis. Here, "linear prediction coefficients" covers not only the linear prediction coefficients themselves but also values equivalent to them (for example, PARCOR coefficients and LSP (line spectrum pair) parameters).
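Of the methods listed, the autocorrelation method is sketched below using the Levinson-Durbin recursion. This is a minimal illustration rather than the analysis the embodiment necessarily uses; the function names and the sign convention (prediction error e[n] = x[n] + a[1]x[n-1] + ... + a[p]x[n-p]) are assumptions made for the example.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing applied)."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err), where a[0] = 1 and a[1..order] are LPC coefficients.

    Assumes r[0] > 0; a silent frame would need special handling.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                # remaining prediction error energy
    return a, err


# Example: a decaying exponential x[n] = 0.9**n is predicted by x[n] = 0.9 * x[n-1],
# so the analysis should give a[1] close to -0.9 and a[2] close to 0.
x = [0.9 ** n for n in range(200)]
a, err = levinson_durbin(autocorrelation(x, 2), 2)
```

The reflection coefficients `k` computed inside the loop are, up to sign convention, the PARCOR coefficients mentioned above.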

<Inverse filter 130>
The inverse filter 130 receives the frame signal and the linear prediction coefficients, filters the input acoustic signal using an FIR filter whose filter coefficients are the linear prediction coefficients (S130), and obtains and outputs a residual signal. The residual signal is sent to the pitch analysis unit 140 and the addition unit 170.
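A minimal sketch of such an inverse (FIR) filter follows. It assumes the coefficient convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p and, for simplicity, treats samples before the start of the frame as zero; a practical implementation would carry filter state across frames. The name `inverse_filter` is illustrative.

```python
def inverse_filter(frame, a):
    """FIR filtering with A(z): residual[n] = sum_k a[k] * frame[n - k]."""
    residual = []
    for n in range(len(frame)):
        acc = 0.0
        for k in range(len(a)):
            if n - k >= 0:          # samples before the frame are taken as zero
                acc += a[k] * frame[n - k]
        residual.append(acc)
    return residual


# Example: x[n] = 0.9**n filtered with A(z) = 1 - 0.9 z^-1 leaves a single pulse.
x = [0.9 ** n for n in range(50)]
residual = inverse_filter(x, [1.0, -0.9])
```

For voiced speech the residual resembles a pulse train at the pitch period, which is why the pitch analysis in the next unit operates on it.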

<Pitch analysis unit 140>
The pitch analysis unit 140 receives the residual signal, analyzes its pitch length, that is, the length of one period of the fundamental frequency of the speech (S140), and sends the pitch frequency or pitch length to the carrier generation unit 150. The pitch frequency and the pitch length are reciprocals of each other and are expressed in Hz and seconds, respectively. Unless otherwise noted, pitch frequency and pitch length are treated as synonymous, denoting the fundamental frequency in the frequency domain and the duration of one pitch period in the time domain. Hereinafter they are collectively referred to as the "pitch".
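The description does not fix a pitch analysis algorithm. One common choice, shown here purely as an assumption, is to pick the lag that maximizes the autocorrelation of the residual within a plausible pitch range; the search limits of 60 Hz to 400 Hz and the 50 ms analysis length in the example are illustrative values.

```python
def estimate_pitch(residual, sample_rate, f_min=60.0, f_max=400.0):
    """Return (pitch_frequency_hz, pitch_length_s) from the autocorrelation peak."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(residual) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(residual[n] * residual[n - lag]
                  for n in range(lag, len(residual)))
        if val > best_val:
            best_lag, best_val = lag, val
    # Pitch frequency and pitch length are reciprocals of each other.
    return sample_rate / best_lag, best_lag / sample_rate


# Example: a pulse train with one pulse every 80 samples at 8 kHz (100 Hz pitch).
pulses = [1.0 if n % 80 == 0 else 0.0 for n in range(400)]
pitch_frequency, pitch_length = estimate_pitch(pulses, sample_rate=8000)
```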

<Carrier generation unit 150>
The carrier generation unit 150 receives the pitch, generates as a carrier a sine wave whose frequency is N times the pitch frequency (one period is 1/N of the pitch length) (S150), and outputs it. N is an integer of 2 or more. For example, when the pitch is 100 Hz, a sine wave of 500 Hz, 800 Hz, 1000 Hz, or the like is output. There is no particular upper limit on N, but the effect of this embodiment is large when N is chosen so that N times the pitch frequency is 4 kHz or less. Acoustic signals such as speech and instrument sounds are known to have a "harmonic structure" composed of a pitch frequency component and its harmonic components (also called overtone components). For speech in particular, the harmonic structure is pronounced at frequencies of 4 kHz and below, and the share of frequency components at or below 4 kHz is larger than that above 4 kHz, so the audible distortion that the superimposed carrier causes for the listener can be kept small. The phase of the carrier is made either to coincide with the phase of the pitch frequency component of the residual or to keep a constant relative phase difference from it.
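The carrier generation described above can be sketched as follows. The function names and the `phase_offset` parameter (standing in for the fixed phase relationship to the pitch component of the residual) are assumptions for illustration; how that phase is measured is not shown.

```python
import math


def generate_carrier(pitch_frequency, n_multiple, sample_rate, num_samples,
                     phase_offset=0.0):
    """Sine wave at N times the pitch frequency, N being an integer >= 2."""
    if n_multiple < 2:
        raise ValueError("N must be an integer of 2 or more")
    freq = n_multiple * pitch_frequency
    return [math.sin(2 * math.pi * freq * n / sample_rate + phase_offset)
            for n in range(num_samples)]


def max_multiple_below(pitch_frequency, limit_hz=4000.0):
    """Largest N for which N times the pitch frequency stays at or below limit_hz."""
    return int(limit_hz // pitch_frequency)


# Example: 100 Hz pitch, N = 5, giving a 500 Hz carrier sampled at 8 kHz.
carrier = generate_carrier(100.0, 5, sample_rate=8000, num_samples=160)
n_max = max_multiple_below(100.0)   # 40, i.e. carriers up to 4 kHz
```

Because the carrier frequency is an integer multiple of the pitch frequency, the carrier completes exactly N cycles per pitch period and so falls on a harmonic of the signal.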

<Phase modulation unit 160>
The phase modulation unit 160 receives the carrier and the parallel information, phase-modulates the carrier with the parallel information so that the symbol rate is M times the pitch frequency, and obtains and outputs a modulated wave (S160). M is an integer from 1 to N. The parallel information is a digital signal expressed as a time series of 1s and 0s. Known phase modulation schemes include BPSK (Binary Phase Shift Keying) and QPSK (Quadrature Phase Shift Keying). BPSK is also called 2PSK, and QPSK is also called 4PSK. BPSK carries 1 bit of parallel information per symbol, and QPSK carries 2 bits per symbol.
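A minimal BPSK sketch of this step follows. The mapping of bit 1 to phase 0 and bit 0 to phase pi, the function names, and the assumption that the sample rate divides evenly into symbols are all illustrative choices, not details stated in the description.

```python
import math


def bpsk_modulate(bits, pitch_frequency, n_multiple, m_multiple, sample_rate):
    """BPSK on a carrier at N x pitch, with a symbol rate of M x pitch."""
    if not 1 <= m_multiple <= n_multiple:
        raise ValueError("M must be an integer from 1 to N")
    carrier_freq = n_multiple * pitch_frequency
    samples_per_symbol = round(sample_rate / (m_multiple * pitch_frequency))
    wave = []
    for i, bit in enumerate(bits):
        phase = 0.0 if bit == 1 else math.pi   # assumed bit-to-phase mapping
        for k in range(samples_per_symbol):
            n = i * samples_per_symbol + k
            wave.append(math.sin(2 * math.pi * carrier_freq * n / sample_rate + phase))
    return wave


def bit_rate(pitch_frequency, m_multiple, bits_per_symbol):
    """Bits per second: symbol rate (M x pitch) times bits per symbol."""
    return pitch_frequency * m_multiple * bits_per_symbol


# Example: bits 1, 0 on a 500 Hz carrier (pitch 100 Hz, N = 5, M = 1) at 8 kHz.
wave = bpsk_modulate([1, 0], 100.0, 5, 1, sample_rate=8000)
bpsk_rate = bit_rate(100.0, 1, 1)   # 100 bit/s with BPSK
qpsk_rate = bit_rate(100.0, 1, 2)   # 200 bit/s with QPSK
```

With M = 1, one symbol spans exactly one pitch period, so the achievable bit rate follows the pitch of the input signal.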

<Addition unit 170>
The addition unit 170 receives the modulated wave and the residual signal, adds them (S170), and sends the sum (hereinafter also referred to as the "first signal") to the synthesis filter 180.

<Synthesis filter 180>
The synthesis filter 180 receives the linear prediction coefficients and the first signal, filters the first signal using an IIR filter whose filter coefficients are the linear prediction coefficients (S180), obtains a reproduction signal, and outputs it.
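The relationship between the FIR inverse filter and the IIR synthesis filter can be sketched as below. The sign convention (residual e[n] = x[n] − Σ a_k x[n−k]), the zero initial filter state, and the function names are assumptions for illustration; the specification refers to Reference 1 for the actual details.

```python
def inverse_filter(signal, lpc):
    """FIR analysis filter: e[n] = x[n] - sum_k a_k * x[n-k]."""
    out = []
    for n in range(len(signal)):
        pred = sum(a * signal[n - k]
                   for k, a in enumerate(lpc, start=1) if n - k >= 0)
        out.append(signal[n] - pred)
    return out

def synthesis_filter(first_signal, lpc):
    """All-pole IIR filter: y[n] = e[n] + sum_k a_k * y[n-k]."""
    out = []
    for n in range(len(first_signal)):
        pred = sum(a * out[n - k]
                   for k, a in enumerate(lpc, start=1) if n - k >= 0)
        out.append(first_signal[n] + pred)
    return out

# Hypothetical second-order coefficients and a short test signal.
lpc = [0.5, -0.2]
x = [1.0, 0.3, -0.7, 0.2, 0.9, -0.4]
residual = inverse_filter(x, lpc)
restored = synthesis_filter(residual, lpc)  # equals x when nothing is added
```

When no modulated wave is added to the residual, the synthesis filter undoes the inverse filter exactly, so any audible change in the reproduction signal comes only from the added watermark.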

Details regarding the linear prediction analysis, the inverse filter, the synthesis filter 180, and the pitch analysis are described in Reference 1, for example. Details of the carrier wave and the phase modulation are described in Reference 2, for example.
[Reference 1] Sadaoki Furui, "Digital Speech Processing", Tokai University Press, 1985, pp.60-98
[Reference 2] Yukihiro Kamiya, "Digital wireless communication technology using C language", Corona Publishing, 2010, pp.53-83

<Digital Watermark Reading Unit According to First Embodiment>
FIG. 5 is a functional block diagram of the digital watermark reading unit 200 according to the first embodiment, and FIG. 2 shows a processing flow thereof.

The digital watermark reading unit 200 includes an input buffer 210, an inverse filter 230, a linear prediction analysis unit 220, a pitch analysis unit 240, a carrier wave generation unit 250, and a detection unit 260.

The collected sound signal is sent to the input buffer 210 and is also output, unprocessed, as the output acoustic signal.

The input buffer 210, the linear prediction analysis unit 220, the inverse filter 230, the pitch analysis unit 240, and the carrier wave generation unit 250 operate in the same way as the corresponding units of the digital watermark embedding unit 100 in FIG. 3 (the linear prediction analysis unit 120, the inverse filter 130, the pitch analysis unit 140, and the carrier wave generation unit 150), and the part corresponding to the adding unit 170 in FIG. 3 is the detection unit 260. An outline is given below.

<Input buffer 210>
The input buffer 210 receives the collected sound signal, accumulates at least a fixed duration of it (S210), and outputs the collected sound signal in units of that fixed duration (frames).

<Linear prediction analysis unit 220>
The linear prediction analysis unit 220 receives the collected sound signal for each frame, performs linear prediction analysis on it to obtain linear prediction coefficients (S220), and sends the linear prediction coefficients to the inverse filter 230.

<Inverse filter 230>
The inverse filter 230 receives the collected sound signal for each frame and the linear prediction coefficients, filters the collected sound signal using an FIR filter whose filter coefficients are the linear prediction coefficients (S230), obtains a residual signal, and outputs it. The residual signal is sent to the pitch analysis unit 240 and the detection unit 260.

<Pitch analysis unit 240>
The pitch analysis unit 240 receives the residual signal, analyzes its pitch length, that is, the length of one period of the fundamental frequency of the voice (S240), and sends the pitch to the carrier wave generation unit 250.

<Carrier wave generation unit 250>
The carrier wave generation unit 250 receives the pitch, generates as a carrier wave a sine wave whose frequency is N times the pitch frequency (S250), and outputs it.

<Detection unit 260>
The detection unit 260 receives the residual signal, in which the digital watermark is embedded, and the carrier wave, uses them to detect information corresponding to the parallel information (S260), and outputs it. When communication succeeds, the parallel information embedded by the digital watermark embedding unit 100 matches the information detected in S260. An ordinary synchronous detection method for phase modulation can be used for detection, and the fact that synchronous detection is usable is the key point here. In general, an advanced digital communication network can generate synchronized carrier waves on the transmitting and receiving sides, which makes synchronous detection possible and yields fast, stable digital communication. Simple digital communication systems, by contrast, mainly use delay detection or asynchronous detection, which are inferior to synchronous detection in communication stability. In the present embodiment, synchronous detection is possible despite the simple modulation and demodulation scheme because both the digital watermark embedding unit 100 and the digital watermark reading unit 200 generate the carrier wave in synchronization, in both frequency and phase, with the pitch of the residual signal; as a result, the transmitting and receiving sides generate synchronized carrier waves.
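A minimal sketch of synchronous detection for the BPSK case follows. It assumes an idealized channel in which the receiver recovers the residual exactly, a toy residual made of a unit pulse train plus the modulated wave, and the same bit-to-phase mapping on both sides; the function name `synchronous_detect` and all numeric values are illustrative assumptions rather than the method of the specification.

```python
import math

def synchronous_detect(residual, pitch_hz, n, m, sample_rate):
    """Correlate each symbol interval with the locally generated carrier.

    Returns (bits, baseband): baseband[k] is the correlation value for
    symbol k, and bits[k] is 1 when that value is non-negative.
    """
    step = 2.0 * math.pi * n * pitch_hz / sample_rate
    sps = round(sample_rate / (m * pitch_hz))  # samples per symbol
    bits, baseband = [], []
    for start in range(0, len(residual) - sps + 1, sps):
        acc = sum(residual[i] * math.sin(step * i)
                  for i in range(start, start + sps))
        value = 2.0 * acc / sps
        baseband.append(value)
        bits.append(1 if value >= 0.0 else 0)
    return bits, baseband

# Toy received residual: pitch-period pulse train plus a BPSK wave at 0.2 amplitude.
fs, pitch, N, M = 8000, 125.0, 8, 2
sent = [1, 0, 1, 0]
sps = round(fs / (M * pitch))
step = 2.0 * math.pi * N * pitch / fs
residual = []
for t in range(sps * len(sent)):
    pulse = 1.0 if t % round(fs / pitch) == 0 else 0.0
    phase = 0.0 if sent[t // sps] == 1 else math.pi
    residual.append(pulse + 0.2 * math.sin(step * t + phase))

received, strengths = synchronous_detect(residual, pitch, N, M, fs)
```

In this toy setup the pulses fall on zero crossings of the carrier, so they contribute nothing to the correlation and each baseband value comes out near +0.2 or -0.2 depending on the transmitted bit.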

Details regarding synchronous detection, delay detection, and asynchronous detection are described in Reference 3, for example.
[Reference 3] Yoichi Saito, "Modulation and Demodulation of Digital Wireless Communication", IEICE, 1996, pp.1-35

<Effect>
With the above configuration, the pitch frequency of the input acoustic signal is analyzed, and a phase-modulated signal whose carrier frequency is an integer multiple of the pitch frequency and whose symbol rate is also an integer multiple of the pitch frequency is superimposed on the input acoustic signal as a watermark. This realizes an acoustic digital watermark system that, without necessarily going through a digital communication path, (1) secures the communication speed of the parallel information, (2) secures communication reliability, and (3) maintains sound quality.

The following explains how the present embodiment satisfies (1), (2), and (3) above.

Regarding (1), the bit rate of the parallel information that can be transmitted: assuming a pitch frequency of 100 Hz and a symbol rate of 200 Hz, 200 bps (bits per second) can be transmitted with BPSK and 400 bps with QPSK. Converting speech to text at the speed at which a person talks requires about 100 bps at most, so this rate is sufficient.
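The bit-rate figures above follow from multiplying the symbol rate (M times the pitch frequency) by the number of bits per symbol. The small sketch below reproduces that arithmetic; the function name `parallel_info_bitrate` is a hypothetical helper introduced only for this example.

```python
def parallel_info_bitrate(pitch_hz, m, bits_per_symbol):
    """Bit rate in bps = symbol rate (m * pitch_hz) * bits per symbol."""
    return m * pitch_hz * bits_per_symbol

# Pitch frequency 100 Hz and M = 2 give a symbol rate of 200 Hz.
bpsk_bps = parallel_info_bitrate(100, 2, 1)  # BPSK: 1 bit per symbol
qpsk_bps = parallel_info_bitrate(100, 2, 2)  # QPSK: 2 bits per symbol
```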

Regarding (2), synchronous detection referenced to the pitch is possible, as described above, so the parallel information can be read stably.

The crucial point is (3). It would be pointless if the modulated wave added to the residual signal were heard as a harsh sound when the reproduction signal or the output acoustic signal is played back. A simple example is used below to show that there is no perceptual problem for human hearing.

It is known that the residual signal obtained by linear prediction of a vowel section of speech has a pulse-like waveform. For simplicity, the explanation models it as a simple pulse train whose interval is the pitch length.

FIG. 7(A) shows the residual signal of a vowel section of speech under the assumption that it is a simple pulse train with a pitch frequency of 125 Hz. The pulses are spaced at equal intervals of 8 milliseconds, which corresponds to 125 Hz.

FIG. 7(B) shows the signal obtained by passing the signal of FIG. 7(A) through a synthesis filter whose filter coefficients are the linear prediction coefficients obtained by analyzing an actual utterance of the vowel “a”. Because the residual signal is approximated by simple pulses, the result differs somewhat from the original acoustic signal of “a”, but when the signal of FIG. 7(B) is reproduced through a speaker it is heard as “a”.

FIG. 8 shows the frequency spectrum (power spectrum) obtained by Fourier transform (FFT) of the signal of FIG. 7(B). The spectrum consists of harmonics at integer multiples of 125 Hz, shaped by the spectral envelope that is the frequency characteristic of the synthesis filter.

FIG. 9(A) shows the signal obtained by adding a carrier wave to the signal of FIG. 7(A). The carrier frequency was set to 8 times the pitch frequency, that is, 1 kHz.

FIG. 9(B) shows the signal obtained by passing the signal of FIG. 9(A) through the synthesis filter, and FIG. 10 shows the frequency spectrum of FIG. 9(B).

Comparing FIG. 8 with FIG. 10 shows that the amplitude at 1 kHz is higher by the amount of the carrier wave, but the difference goes unnoticed unless the two are compared closely. When FIG. 7(B) and FIG. 9(B) are actually reproduced through a speaker and compared by ear, a slight difference in timbre can be perceived, but when only FIG. 9(B) is heard, the added carrier wave causes no sense of unnaturalness.

FIG. 11(A), like FIG. 9(A), shows a signal obtained by adding a carrier wave to the signal of FIG. 7(A). The only difference between FIG. 9(A) and FIG. 11(A) is that the phase of the carrier wave differs by π/2. Likewise, FIG. 11(B) shows the signal after the synthesis filter and FIG. 12 shows its frequency spectrum. Because the power spectrum does not depend on phase, FIG. 10 and FIG. 12 are the same, although strictly speaking there are slight differences that arise from the Fourier transform computation. Because the human ear is insensitive to phase, FIG. 9(B) and FIG. 11(B) sound the same.

In the present embodiment, a difference in carrier phase is not perceptible to the human ear. The carrier phase may therefore be arbitrary, provided that, as described above, its relative phase difference from the pitch frequency component of the residual signal is the same in the digital watermark embedding unit 100 and the digital watermark reading unit 200.

FIG. 13(A) shows the signal obtained when the carrier wave of FIG. 9(A) is modulated with parallel information using BPSK. As an example, the parallel information alternates between 1 and 0. The symbol rate was set to 250 Hz, twice the pitch frequency. FIG. 13(B) shows the signal obtained by passing FIG. 13(A) through the synthesis filter, and FIG. 14 shows its frequency spectrum. Comparing FIG. 14 with FIG. 10 shows that the difference is again slight. When FIG. 13(B) is reproduced through a speaker, a slight difference in timbre can be perceived, but nothing sounds unnatural.

Thus, even when a carrier wave at N times the pitch frequency (8 times in the above example) is modulated at a symbol rate of M times the pitch frequency (2 times in the above example) and added to the residual signal, a natural-sounding reproduced sound is obtained. In the examples of FIG. 9(A) to FIG. 13(A), the carrier amplitude was set rather large (-14 dB relative to the amplitude of the residual signal) to make the explanation easy to follow. In actual use, the carrier amplitude can be set sufficiently small relative to the amplitude of the residual signal, which further reduces the difference in timbre when the signal is reproduced through a speaker.
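For orientation, the -14 dB figure can be converted to a linear amplitude ratio. The sketch below assumes the usual amplitude convention of 20·log10(ratio); the helper name `amplitude_ratio_from_db` is introduced here only for illustration.

```python
def amplitude_ratio_from_db(level_db):
    """Convert a level in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)

# -14 dB corresponds to a carrier amplitude of roughly one fifth of the residual's.
carrier_gain = amplitude_ratio_from_db(-14.0)
```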

Next, the case where the carrier frequency or the symbol rate is not an integer multiple of the pitch frequency is described.

FIG. 15(A) shows the signal obtained when the carrier frequency and the symbol rate are set to values that are not integer multiples of the pitch frequency and the modulated wave is added, FIG. 15(B) shows that signal after the synthesis filter, and FIG. 16 shows its frequency spectrum. Comparing FIG. 16 with FIG. 14 shows that the harmonic structure has largely collapsed and the spectrum is completely different. When the signal of FIG. 15(B) is reproduced through a speaker, it sounds very unpleasant, and the effect of the present embodiment is not obtained.

<Second embodiment>
The description focuses on the differences from the first embodiment.

<Acoustic Digital Watermark System According to Second Embodiment>
The acoustic digital watermark system 30 according to the second embodiment includes a transmission unit 31 and a reception unit 34. The transmission unit 31 includes a digital watermark embedding unit 300, and the reception unit 34 includes a digital watermark reading unit 400 (see FIG. 1).

Details of the digital watermark embedding unit 300 are described below.
<Digital Watermark Embedding Unit According to Second Embodiment>
FIG. 17 is a functional block diagram of the digital watermark embedding unit 300, and FIG. 18 shows its processing flow.

The digital watermark embedding unit 300 includes an input buffer 110, an inverse filter 130, a linear prediction analysis unit 120, a pitch analysis unit 340, a carrier wave generation unit 150, a phase modulation unit 160, an adding unit 370, a synthesis filter 180, a determination unit 390, and a switch (SW) 391.

Not every acoustic signal has a frequency characteristic (harmonic structure) like that of a vowel section of speech. For example, if the carrier wave is added to the residual signal in a consonant section or a silent section and the result is passed through the synthesis filter 180, the carrier wave becomes audible and the sound is unnatural. In the present embodiment, therefore, the modulated wave is added to the residual signal only when the residual signal has a harmonic structure.

<Pitch analysis unit 340>
The pitch analysis unit 340 receives the residual signal, calculates the pitch frequency from it, and outputs the pitch frequency to the carrier wave generation unit 150. It also calculates the autocorrelation value of the residual signal at the pitch frequency (S340) and outputs it to the determination unit 390.
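One common way to obtain both a pitch estimate and an autocorrelation value is to search for the lag that maximizes the normalized autocorrelation of the residual. The sketch below takes that approach as an assumption; the specification does not state which pitch analysis method is used, and the function name `analyze_pitch` and the 50 to 400 Hz search range are illustrative choices.

```python
def analyze_pitch(residual, sample_rate, min_hz=50.0, max_hz=400.0):
    """Return (pitch_hz, autocorr) from the best normalized autocorrelation lag."""
    energy = sum(x * x for x in residual)
    if energy == 0.0:
        return 0.0, 0.0  # silent frame: no pitch, zero correlation
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(residual) - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(residual[i] * residual[i + lag]
                   for i in range(len(residual) - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_lag == 0:
        return 0.0, 0.0
    return sample_rate / best_lag, best_corr

# Toy residual: unit pulses every 64 samples (125 Hz at 8 kHz sampling).
frame = [1.0 if i % 64 == 0 else 0.0 for i in range(512)]
pitch_hz, autocorr = analyze_pitch(frame, sample_rate=8000)
```

A strongly periodic residual gives an autocorrelation value close to 1, while noise-like or silent frames give a value near 0, which is what makes the value usable for the harmonic-structure decision.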

<Determination unit 390>
The determination unit 390 receives the autocorrelation value, uses it to determine whether the signal has a frequency characteristic (harmonic structure) like that of a vowel section, and outputs a control signal that turns the SW 391 ON or OFF according to the result. When the signal is determined to have a frequency characteristic like that of a vowel section, the SW 391 is turned ON and the carrier wave is added to the residual signal. When it is determined not to have such a frequency characteristic, the SW 391 is turned OFF and the carrier wave is not added to the residual signal.

<Adding unit 370>
When the residual signal has a harmonic structure, the adding unit 370 adds the modulated wave and the residual signal to obtain the first signal (yes in S390, S370). When the residual signal does not have a harmonic structure, the adding unit 370 uses the residual signal as the first signal (no in S390). The adding unit 370 outputs the first signal to the synthesis filter 180.
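The combined behavior of the determination unit 390, the SW 391, and the adding unit 370 can be sketched as a single gated addition. The threshold value of 0.5 and the function name `make_first_signal` are assumptions for illustration; the specification does not give a concrete decision rule.

```python
def make_first_signal(residual, modulated, autocorr, threshold=0.5):
    """Add the modulated wave only when the frame looks harmonic (switch ON)."""
    switch_on = autocorr >= threshold  # assumed rule for the determination unit
    if switch_on:
        return [r + w for r, w in zip(residual, modulated)]
    return list(residual)  # switch OFF: the residual passes through unchanged

# Hypothetical frame data.
residual = [1.0, 0.0, 0.0, 0.0]
modulated = [0.0, 0.2, 0.0, -0.2]
voiced = make_first_signal(residual, modulated, autocorr=0.9)
unvoiced = make_first_signal(residual, modulated, autocorr=0.1)
```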

<Digital Watermark Reading Unit According to Second Embodiment>
The digital watermark reading unit 400 differs from the digital watermark reading unit 200 in that it includes a detection unit 460 instead of the detection unit 260 (see FIG. 5).

The detection unit 460 receives the residual signal, in which the digital watermark is embedded, and the carrier wave, and, when the modulated wave was added on the transmitting side, uses them to detect information corresponding to the parallel information (S460 in FIG. 6) and outputs it. The digital watermark reading unit 400 must determine from the collected sound signal whether the SW 391 of the digital watermark embedding unit 300 was ON or OFF, that is, whether or not the modulated wave was added. It does not, however, need to perform the same processing as the determination unit 390. Because the carrier wave from the carrier wave generation unit 250 and the residual signal from the inverse filter 230 are uncorrelated, detection by the detection unit 460 yields a baseband signal of sufficient strength if the modulated wave was added and does not if it was not. Therefore, when the obtained baseband signal is at or above a predetermined strength, the digital watermark reading unit 400 judges that the modulated wave was added on the transmitting side, and detects and outputs the information corresponding to the parallel information. When the obtained baseband signal is below the predetermined strength, the digital watermark reading unit 400 judges that no modulated wave was added on the transmitting side and outputs no information corresponding to the parallel information.
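The strength test described above can be sketched as a threshold applied to the baseband values produced by synchronous detection. Applying the threshold per symbol, the threshold value itself, and the function name `read_parallel_bits` are assumptions made for this illustration; the specification only states that a predetermined strength is used.

```python
def read_parallel_bits(baseband, min_strength):
    """Keep only symbols whose baseband magnitude reaches the threshold."""
    bits = []
    for value in baseband:
        if abs(value) >= min_strength:
            # Strong enough: treat as watermarked and decide the bit by sign.
            bits.append(1 if value >= 0.0 else 0)
        # Otherwise assume no modulated wave was added and output nothing.
    return bits

# Hypothetical baseband values: three watermarked symbols and two weak ones.
decoded = read_parallel_bits([0.2, -0.19, 0.01, -0.02, 0.21], min_strength=0.1)
```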

<Effect>
This configuration provides the same effects as the first embodiment. In addition, the digital watermark is embedded only in sections where adding the carrier wave does not make the reproduced sound unnatural.

Beyond vowel sections of speech, the sounds of stringed and wind instruments have the same kind of frequency characteristic (harmonic structure) as vowel sections, so the digital watermark can also be embedded in stringed and wind instrument sounds without making them sound unnatural.

<Other variations>
The digital watermark embedding units 100 and 300 and the digital watermark reading units 200 and 400 may each be configured as separate devices, namely a digital watermark embedding device and a digital watermark reading device.

Communication systems that use digital modulation and demodulation generally also use error correction codes and error detection codes. In the present invention as well, using error correction codes and error detection codes together with the watermark can further increase the reliability of communication.
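As one simple illustration of how an error correction code could protect the parallel information, the sketch below uses a 3-fold repetition code with majority voting. The specification does not name a particular code; the repetition code and both function names are assumptions chosen only because they are easy to follow.

```python
def encode_repetition(bits, repeat=3):
    """Repeat every bit `repeat` times before modulation."""
    return [b for b in bits for _ in range(repeat)]

def decode_repetition(coded, repeat=3):
    """Recover each bit by majority vote over its `repeat` copies."""
    out = []
    for i in range(0, len(coded) - repeat + 1, repeat):
        group = coded[i:i + repeat]
        out.append(1 if sum(group) * 2 > repeat else 0)
    return out

message = [1, 0, 1]
coded = encode_repetition(message)
corrupted = list(coded)
corrupted[4] ^= 1  # flip one detected bit to simulate a reading error
recovered = decode_repetition(corrupted)
```

The cost is a threefold reduction in the usable bit rate, so a practical system would more likely use a stronger code with less overhead.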

The present invention is not limited to the above embodiments and modifications. For example, the various processes described above need not be executed in time series in the order described; they may be executed in parallel or individually, depending on the processing capability of the device that executes them or as needed. Other changes may be made as appropriate without departing from the spirit of the present invention.

<Program and recording medium>
The present invention can also be implemented on a digital signal processor or a dedicated LSI. It can also be implemented as a computer and a computer program. In other words, the processing functions of each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing content of the functions each device should have is described by a program, and executing this program on a computer realizes those processing functions on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers via a network.

A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage unit. When executing the processing, the computer reads the program stored in its own storage unit and executes processing according to the program it has read. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it. The computer may also execute processing according to the received program each time a program is transferred to it from the server computer. Alternatively, the above processing may be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to the computer and the processing functions are realized only through execution instructions and result acquisition. Note that the program includes information that is used for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).

Although each device is described as being configured by executing a predetermined program on a computer, at least part of the processing content may be realized in hardware.

Claims

Including a digital watermark embedding unit and a digital watermark reading unit,
The digital watermark embedding unit includes:
A first inverse filter that obtains a first residual signal by filtering the input acoustic signal using an FIR filter having a first linear prediction coefficient that is a linear prediction coefficient of the input acoustic signal as a filter coefficient;
A first carrier generation unit that generates, as a first carrier wave, a sine wave whose frequency is N times a first fundamental frequency that is the fundamental frequency of the first residual signal, where N is any integer of 2 or more;
A phase modulation unit that obtains a modulated wave by phase-modulating the first carrier wave with parallel information so that the symbol rate is M times the pitch frequency, where M is any integer from 1 to N;
An adder that adds the modulated wave and the first residual signal to obtain a first signal;
A synthesis filter that filters the first signal, using an IIR filter having the first linear prediction coefficient as a filter coefficient, to obtain a reproduction signal;
The digital watermark reading unit
A second inverse filter that filters a collected sound signal, obtained by collecting the reproduced sound of the reproduction signal, using an FIR filter having as a filter coefficient a second linear prediction coefficient that is a linear prediction coefficient of the collected sound signal, to obtain a second residual signal;
A second carrier generation unit that generates, as a second carrier wave, a sine wave that is N times the second fundamental frequency that is the fundamental frequency of the second residual signal;
A detector that detects information corresponding to the parallel information using the second residual signal and the second carrier wave;
Acoustic watermarking system.

The acoustic watermarking system of claim 1,
When the first residual signal has a harmonic structure, the adding unit adds the modulated wave and the first residual signal to obtain the first signal, and when the first residual signal does not have a harmonic structure, the adding unit uses the first residual signal as the first signal,
Acoustic watermarking system.

A first inverse filter that obtains a first residual signal by filtering the input acoustic signal using an FIR filter having a first linear prediction coefficient that is a linear prediction coefficient of the input acoustic signal as a filter coefficient;
A first carrier generation unit that generates, as a first carrier wave, a sine wave whose frequency is N times a first fundamental frequency that is the fundamental frequency of the first residual signal, where N is any integer of 2 or more;
A first phase modulation unit that obtains a modulated wave by phase-modulating the first carrier wave with parallel information so that the symbol rate is M times the pitch frequency, where M is any integer from 1 to N;
An adder that adds the modulated wave and the first residual signal to obtain a first signal;
A synthesis filter that filters the first signal, using an IIR filter having the first linear prediction coefficient as a filter coefficient, to obtain a reproduction signal;
Digital watermark embedding device.

A digital watermark reading device that obtains parallel information using a collected sound signal,
A second inverse filter for filtering the collected sound signal to obtain a second residual signal using an FIR filter having a second linear prediction coefficient that is a linear prediction coefficient of the collected sound signal as a filter coefficient;
A second carrier wave generation unit that generates, as a second carrier wave, a sine wave whose frequency is N times a second fundamental frequency that is the fundamental frequency of the second residual signal, where N is any integer of 2 or more;
A detector for detecting information corresponding to parallel information using the second residual signal and the second carrier wave;
Digital watermark reader.

A first inverse filtering step of filtering the input acoustic signal to obtain a first residual signal using an FIR filter having a first linear prediction coefficient that is a linear prediction coefficient of the input acoustic signal as a filter coefficient;
A first carrier generation step for generating a sine wave that is N times the first fundamental frequency, which is the fundamental frequency of the first residual signal, as a first carrier, where N is any integer greater than or equal to 2;
A first phase modulation step for obtaining a modulated wave by phase-modulating the first carrier wave with parallel information so that the symbol rate is M times the pitch frequency, where M is any integer from 1 to N;
An adding step of adding the modulated wave and the first residual signal to obtain a first signal;
A synthesis filtering step of filtering the first signal, using an IIR filter having the first linear prediction coefficient as a filter coefficient, to obtain a reproduction signal;
A second inverse filtering step of filtering a collected sound signal, obtained by collecting the reproduced sound of the reproduction signal, to obtain a second residual signal, using an FIR filter having as a filter coefficient a second linear prediction coefficient that is a linear prediction coefficient of the collected sound signal;
A second carrier generation step of generating, as a second carrier wave, a sine wave that is N times the second fundamental frequency that is the fundamental frequency of the second residual signal;
A detection step of detecting information corresponding to the parallel information using the second residual signal and the second carrier wave;
Acoustic watermarking method.

A first inverse filtering step of filtering the input acoustic signal to obtain a first residual signal using an FIR filter having a first linear prediction coefficient that is a linear prediction coefficient of the input acoustic signal as a filter coefficient;
A first carrier generation step for generating a sine wave that is N times the first fundamental frequency, which is the fundamental frequency of the first residual signal, as a first carrier, where N is any integer greater than or equal to 2;
A first phase modulation step for obtaining a modulated wave by phase-modulating the first carrier wave with parallel information so that the symbol rate is M times the pitch frequency, where M is any integer from 1 to N;
An adding step of adding the modulated wave and the first residual signal to obtain a first signal;
A synthesis filtering step of filtering the first signal, using an IIR filter having the first linear prediction coefficient as a filter coefficient, to obtain a reproduction signal;
Digital watermark embedding method.
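
The method above presupposes linear prediction coefficients of the input acoustic signal but does not recite how they are computed. A common choice, assumed here purely for illustration, is the autocorrelation method solved with the Levinson-Durbin recursion. The function name lpc and the coefficient convention A(z) = 1 + a[0]z^-1 + a[1]z^-2 + ... are hypothetical, and the sketch assumes a non-silent frame (nonzero energy).

```python
def lpc(x, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns a list a of length `order` such that the inverse filter is
    A(z) = 1 + a[0] z^-1 + ... + a[order-1] z^-order.
    Assumes the frame has nonzero energy (otherwise err is zero).
    """
    # autocorrelation r[0..order]
    r = [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
         for lag in range(order + 1)]
    a = []
    err = r[0]  # prediction error energy
    for i in range(order):
        acc = r[i + 1] + sum(a[k] * r[i - k] for k in range(i))
        k_ref = -acc / err  # reflection coefficient for this stage
        a = [a[k] + k_ref * a[i - 1 - k] for k in range(i)] + [k_ref]
        err *= 1.0 - k_ref * k_ref
    return a
```

For a decaying exponential x[n] = 0.9^n, which is the impulse response of 1/(1 - 0.9 z^-1), lpc(x, 1) yields a coefficient close to -0.9, so the FIR inverse filter built from it whitens the signal as the inverse filtering step intends.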

A digital watermark reading method for obtaining parallel information using a collected sound signal,
A second inverse filtering step of filtering the collected sound signal to obtain a second residual signal using an FIR filter having a second linear prediction coefficient that is a linear prediction coefficient of the collected sound signal as a filter coefficient;
A second carrier generation step of generating, as a second carrier wave, a sine wave whose frequency is N times the second fundamental frequency, which is the fundamental frequency of the second residual signal, where N is any integer of 2 or more;
A detection step of detecting information corresponding to parallel information using the second residual signal and the second carrier wave;
Digital watermark reading method.

A program for causing a computer to function as the digital watermark embedding device according to claim 3 or the digital watermark reading device according to claim 4.