JP2003099077A

JP2003099077A - Electronic watermark embedding device, and extraction device and method

Info

Publication number: JP2003099077A
Application number: JP2001292825A
Authority: JP
Inventors: Naohisa Komatsu; 尚久小松; Masayuki Sudo; 正之須藤
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2001-09-26
Filing date: 2001-09-26
Publication date: 2003-04-04

Abstract

PROBLEM TO BE SOLVED: To provide an electronic watermark embedding device that focuses on the voice generation mechanism of the voice source and embeds and extracts electronic watermark information in correspondence with the sound production structure, so that the information is embedded dispersively in many frequency bands, is not lost even when resampling or compression and restoration of the voice data are repeated many times, and can deal with many kinds of coding. SOLUTION: The electronic watermark embedding device is provided with a voice decomposing section which extracts voice articulatory components and sound source components from inputted voice, an electronic watermark embedding section 13 which embeds the information into the articulatory components, and an electronic watermark voice output section 14 which composites the sound source components with the articulatory components in which the electronic watermark information is embedded, and outputs electronic watermark embedded voice.

Description

Detailed Description of the Invention

【０００１】[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a digital watermark embedding device, a digital watermark extracting device, and corresponding methods.

【０００２】[0002]

2. Description of the Related Art Computer technology has developed rapidly, and with it the information and communication infrastructure has improved and communication costs have become very low, so that information is now actively exchanged in the digital world. Information such as video and audio has been increasingly digitized and is exchanged as digital content over the Internet, satellite broadcasting, and the like. As a result, demand for multimedia has grown and a great deal of information is now exchanged beyond the reach of its authors, which has recently drawn particular attention to the problem of illegal copying of digital content.

[0003] One factor that encourages such illegal copying is that multimedia content in electronic media, such as text, graphics, music, sound, still images, and moving images, is essentially software, and by its nature can easily be copied in a form almost identical to the original. To protect intellectual property rights such as copyright against such illegal copying, security measures for digital information have been studied and implemented in recent years.

[0004] Common security measures of this kind include sealing the recording medium itself and encrypting the digital content in order to protect the content during distribution. However, such methods are insufficient to protect intellectual property rights. Encryption, for example, is almost powerless against illegal copying and redistribution once the content has been decrypted.

[0005] Moreover, locking content with an encryption key or sealing it makes it difficult for users to appreciate the merits of the content, and thus hinders the smooth distribution of good digital content. On the Internet, where the value of content rises and falls with how well it is publicized, concealment is not necessarily a good strategy. Given these characteristics and the current situation, it is desirable to embed in the content itself, by as simple a method as possible, an identification code that identifies the intellectual property rights in the content, and to let users use the content without perceiving the code.

[0006] Such a method protects intellectual property rights and, at the same time, makes it possible to let users experience the usefulness of the content under restricted use and purchase it if they like it. If illegal copying or unauthorized use is discovered, the rights holder can recover the identification code, assert the intellectual property rights, and bring a claim for infringement.

[0007] Because of these advantages, "digital watermarking", a technique for invisible signatures, is attracting attention as a technology for protecting the intellectual property rights in digital content. It is expected that many users of electronic media will master digital watermarking, embed signatures in their own digital works by methods of their own, and then publish that content.

[0008] Under these circumstances, the human "voice", that is, the speech itself, may come to have commercial value as a new electronic medium and become subject to intellectual property rights. For example, a general consumer might purchase the voice of an actor or celebrity, arrange it freely to synthesize phrases, and use the result as a mobile phone ringtone or in an alarm clock. For this reason, digital watermarking for voice has been studied in recent years, and methods for embedding digital watermark information in voice and extracting it from voice have been proposed.

[0009] For example, time masking, frequency masking, dither signal, and adaptive PCM methods have been proposed for uncompressed voice data. Echo, spread spectrum, and modified DCT methods have been proposed for uncompressed music data. For compressed data, vector quantization, polarity code, sound source pulse, and parity bit methods have been proposed.

【００１０】[0010]

PROBLEMS TO BE SOLVED BY THE INVENTION However, with the conventional methods for embedding digital watermark information in voice and extracting it from voice, the digital watermark may be erased by filtering or similar processing. The conventional methods focus on either the voice waveform after utterance or the coding parameters. When a method focuses on the voice waveform, embedded digital watermark information may be erased by filtering or similar processing. When it focuses on coding, the digital watermark information may be removed, for example by re-encoding the reproduced voice with a different scheme, and the method cannot be applied to voice encoded with a different scheme.

[0011] In addition, digital watermark information embedded in a specific frequency band of the voice tends to disappear when resampling, or compression and decompression of the voice data, is repeated. Furthermore, it is difficult for the conventional methods to handle many types of voice data coding.

[0012] An object of the present invention is to solve these problems of the conventional methods and to provide a digital watermark embedding device, a digital watermark extracting device, and corresponding methods that focus on the voice generation mechanism at the voice source and embed and extract digital watermark information in correspondence with the sound production structure, so that the digital watermark information is embedded dispersed over many frequency bands, does not disappear even when resampling or compression and decompression of the voice data are repeated, and can handle many types of coding.

【００１３】[0013]

MEANS FOR SOLVING THE PROBLEMS To this end, the digital watermark embedding device of the present invention includes a voice decomposition unit that extracts an articulatory component and a sound source component from input voice, a digital watermark embedding unit that embeds digital watermark information in the articulatory component, and a digital-watermarked voice output unit that synthesizes the sound source component with the articulatory component in which the digital watermark information is embedded and outputs digital-watermarked voice.

【００１４】本発明の他の電子透かし埋込装置において
は、さらに、前記調音成分は音声の調音を表現するパラ
メータであり、前記音源成分は音声のピッチ成分であ
る。In another digital watermark embedding device of the present invention, the articulation component is a parameter expressing the articulation of voice, and the sound source component is a pitch component of voice.

【００１５】本発明の更に他の電子透かし埋込装置にお
いては、さらに、前記パラメータは、ＬＳＰである。In still another digital watermark embedding device of the present invention, the parameter is LSP.

[0016] Still another digital watermark embedding device of the present invention further includes an original codebook that stores numerical values of vectors corresponding to the LSP, and modified codebooks that store numerical values of those vectors changed so as to correspond to an information code consisting of the logical value 0 or 1.

【００１７】本発明の更に他の電子透かし埋込装置にお
いては、さらに、前記電子透かし埋込部は、前記変更コ
ードブックにアクセスし、前記電子透かし情報の情報コ
ードに対応する数値を取得し、前記入力された音声から
抽出されたＬＳＰに対応するベクトルの数値を変更し
て、前記電子透かし情報が埋め込まれたＬＳＰを作成す
る。In still another digital watermark embedding device of the present invention, the digital watermark embedding unit further accesses the change code book to obtain a numerical value corresponding to the information code of the digital watermark information, The numerical value of the vector corresponding to the LSP extracted from the input voice is changed to create the LSP in which the digital watermark information is embedded.

[0018] The digital watermark extracting device of the present invention includes a voice decomposition unit that extracts an articulatory component from input voice, and a digital watermark extraction unit that extracts digital watermark information from the articulatory component.

【００１９】本発明の他の電子透かし抽出装置において
は、さらに、前記調音成分は音声の調音を表現するパラ
メータである。In another digital watermark extracting apparatus of the present invention, the articulation component is a parameter expressing the articulation of voice.

【００２０】本発明の更に他の電子透かし抽出装置にお
いては、さらに、前記パラメータは、ＬＳＰである。In still another digital watermark extracting device of the present invention, the parameter is LSP.

[0021] Still another digital watermark extracting device of the present invention further includes an original codebook that stores numerical values of vectors corresponding to the LSP, and modified codebooks that store numerical values of those vectors changed so as to correspond to an information code consisting of the logical value 0 or 1.

【００２２】本発明の更に他の電子透かし抽出装置にお
いては、さらに、前記電子透かし抽出部は、前記原コー
ドブック及び変更コードブックにアクセスし、前記入力
された音声から抽出されたＬＳＰに対応するベクトルの
数値と前記原コードブック及び変更コードブックに格納
された数値とを比較し、電子透かし情報の情報コードを
抽出する。In still another digital watermark extracting apparatus of the present invention, the digital watermark extracting section further accesses the original codebook and the modified codebook, and corresponds to the LSP extracted from the input voice. The numerical value of the vector is compared with the numerical values stored in the original codebook and the modified codebook to extract the information code of the digital watermark information.
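
The comparison described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the patent's stated procedure: the decision rule (the codebook holding the nearest code vector wins), the squared Euclidean distance, and the names sq_dist and extract_code are all assumptions, and the toy codebook values are made up.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def extract_code(lsp, cbk, cbk_w0, cbk_w1):
    """Return None, 0, or 1 according to which codebook holds the nearest vector.

    Assumed decision rule: the codebook containing the closest code vector wins;
    a best match in the original codebook is read as "no watermark".
    """
    candidates = [(None, cbk), (0, cbk_w0), (1, cbk_w1)]
    best_code, best_d = None, float("inf")
    for code, book in candidates:
        d = min(sq_dist(lsp, vec) for vec in book)
        if d < best_d:
            best_code, best_d = code, d
    return best_code

# Toy one-level codebooks (hypothetical values, three dimensions only).
cbk = [[0.3, 0.6, 1.0]]
cbk_w0 = [[0.3, 0.6, 0.8]]
cbk_w1 = [[0.3, 0.6, 1.2]]
bit = extract_code([0.31, 0.60, 0.82], cbk, cbk_w0, cbk_w1)
```

Returning None for the original codebook mirrors the idea that an unmodified codebook represents the absence of watermark information.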

【００２３】本発明の電子透かし埋込方法においては、
音声から調音成分及び音源成分を抽出し、前記調音成分
に電子透かし情報を埋め込み、前記音源成分と前記電子
透かし情報が埋め込まれた調音成分とを合成して電子透
かし入り音声を出力する。In the digital watermark embedding method of the present invention,
An articulation component and a sound source component are extracted from the voice, electronic watermark information is embedded in the articulation component, and the sound source component and the articulation component in which the electronic watermark information is embedded are combined to output a digital watermarked voice.

【００２４】本発明の他の電子透かし埋込方法において
は、さらに、前記調音成分は音声の調音を表現するパラ
メータであり、前記音源成分は音声のピッチ成分であ
る。In another digital watermark embedding method of the present invention, the articulation component is a parameter expressing a voice articulation, and the sound source component is a voice pitch component.

【００２５】本発明の更に他の電子透かし埋込方法にお
いては、さらに、前記パラメータは、ＬＳＰである。In still another digital watermark embedding method of the present invention, the parameter is LSP.

[0026] In still another digital watermark embedding method of the present invention, a modified codebook storing numerical values of the vectors changed so as to correspond to an information code is accessed, the numerical value corresponding to the information code of the digital watermark information is obtained, and the numerical value of the vector corresponding to the LSP extracted from the voice is changed to create an LSP in which the digital watermark information is embedded.

【００２７】本発明の電子透かし抽出方法においては、
音声から調音成分を抽出し、前記調音成分から電子透か
し情報を抽出する。In the digital watermark extracting method of the present invention,
An articulation component is extracted from the voice, and electronic watermark information is extracted from the articulation component.

【００２８】本発明の他の電子透かし抽出方法において
は、さらに、前記調音成分は音声の調音を表現するパラ
メータである。In another digital watermark extracting method of the present invention, the articulation component is a parameter expressing the articulation of voice.

【００２９】本発明の更に他の電子透かし抽出方法にお
いては、さらに、前記パラメータは、ＬＳＰである。In still another digital watermark extracting method of the present invention, the parameter is LSP.

[0030] In still another digital watermark extracting method of the present invention, an original codebook storing numerical values of vectors corresponding to the LSP and a modified codebook storing numerical values of those vectors changed so as to correspond to an information code are accessed, the numerical value of the vector corresponding to the LSP extracted from the voice is compared with the numerical values stored in the original codebook and the modified codebook, and the information code of the digital watermark information is extracted.

【００３１】[0031]

【発明の実施の形態】以下、本発明の実施の形態につい
て図面を参照しながら詳細に説明する。BEST MODE FOR CARRYING OUT THE INVENTION Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

【００３２】図２は本発明の実施の形態における基本概
念を示す概念図、図３は本発明の実施の形態における音
声生成モデルを示す概念図である。FIG. 2 is a conceptual diagram showing a basic concept in the embodiment of the present invention, and FIG. 3 is a conceptual diagram showing a voice generation model in the embodiment of the present invention.

【００３３】本実施の形態においては、音声の生成メカ
ニズムに着目して、電子透かし情報を埋め込むようにな
っている。この場合、音声モデルの調音部分に電子透か
し情報の埋込を行うようになっている。In the present embodiment, the digital watermark information is embedded, paying attention to the sound generation mechanism. In this case, the electronic watermark information is embedded in the articulation part of the voice model.

[0034] Voice, the sound uttered by a human, is generated by a combination of sound source, articulation, and radiation, as shown in FIG. 2. The sound source is produced mainly by the intermittent interruption of the expiratory airflow caused by vibration of the vocal cords. Articulation is the action by which the part above the larynx, called the vocal tract, adjusts its shape in order to produce the various speech sounds. Radiation is the emission into space, from the lips, of the voice wave formed in the vocal tract by the sound source and articulation, as an acoustic voice waveform.

[0035] In the present embodiment, therefore, digital watermark information is embedded at articulation, a stage in the process leading up to radiation. That is, the digital watermark information is embedded in the articulatory component. This can be realized by creating a voice generation model of the individual speaker, modifying that model itself, and thereby creating a digital-watermarked voice generation model. Voice radiated by a voice generation model to which the embedding operation has not been applied is the original voice without a digital watermark, and voice radiated by the digital-watermarked voice generation model is the digital-watermarked voice.

[0036] The embedding operation of the present embodiment can be applied to any kind of voice from any speaker, but given the nature of digital watermarks it is best suited to voice whose source needs to be identified and whose identity needs to be verified. Examples include the voices of voice actors in films and animation and the voices of popular idols and singers. Such voices are considered to have commercial value in themselves. They can also be synthesized as appropriate and used as the alert voice of an alarm clock, mobile phone, or the like.

[0037] The voices of witnesses and expert witnesses in court proceedings and similar procedures are also suitable targets for the embedding operation. In this case, embedding digital watermark information is expected to increase the evidentiary value of the testimony. The same is expected of conversations intercepted in the course of investigations by the police or similar bodies.

[0038] Creating a voice generation model requires a database that represents the articulation of each speaker. In the present embodiment, the LSP (Line Spectrum Pair) is selected as the parameter representing articulation, the articulatory component of voice. A codebook is created in advance for each speaker using the LSP, and articulation is modeled on the basis of that codebook. The codebook holds representative values of the speaker's articulation, so changing the codebook changes the speaker's articulation.

[0039] LSPs appear in pairs on either side of a formant, a resonance frequency of the vocal tract (a formant is a peak produced on the frequency spectrum of voice where energy concentrates in a particular frequency band). Because a voice waveform is correlated along the time axis, the voice sample values are assumed to be linearly predictable: each sample is predicted from the neighboring (preceding and following) sample values so that the mean squared error relative to the original voice waveform is minimized, and the resulting prediction coefficients are encoded (parameterized). There are several ways to encode the prediction coefficients, and the LSP is one of them.
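
The least-squares prediction described in this paragraph can be illustrated with a small sketch. It assumes the common autocorrelation method solved by the Levinson-Durbin recursion and predicts each sample from past samples only; the patent does not say which solver or which prediction direction it uses, and the function names are hypothetical. Conversion of the coefficients to LSPs is not shown.

```python
def autocorrelation(frame, max_lag):
    """r[k] = sum over n of frame[n] * frame[n + k], for k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for the predictor coefficients.

    Returns (a, err), where the prediction is
    x_hat[n] = sum over k = 1..order of a[k - 1] * x[n - k]
    and err is the remaining prediction-error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err  # reflection coefficient for model order i + 1
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= (1.0 - k * k)
    return a, err

# A decaying exponential obeys x[n] = 0.9 * x[n - 1], so a second-order
# predictor should recover roughly a = [0.9, 0.0].
x = [0.9 ** n for n in range(200)]
r = autocorrelation(x, 2)
a, err = levinson_durbin(r, 2)
```

In practice each frame would be windowed before the autocorrelation is computed, as the frame-based analysis later in the description suggests.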

【００４０】本実施の形態においては、原音声の音声波
形から図２に示されるような基本概念を実現するパラメ
ータを抽出し、電子透かし入りの音声として合成し直す
ようになっている。そして、前記パラメータには様々な
ものがあるが、本実施の形態においては、音声の調音成
分である調音を表現するパラメータとしてＬＳＰを使用
する。これは、音声を合成して復元する場合、声道情報
を表現するためのパラメータとして、ＬＳＰが最適なた
めである。また、音声には、図３に示されるように、音
源成分としてのピッチ成分も含まれる。そのため、音声
を復元するためには、ＬＳＰだけでなく、ピッチ成分と
してのピッチ（Ｐｉｔｃｈ）と呼ばれるパラメータも必
要となる。なお、ピッチは音声合成のためだけに使用さ
れ、ピッチを対象とした電子透かし情報の埋込操作は行
われない。In the present embodiment, parameters for realizing the basic concept as shown in FIG. 2 are extracted from the voice waveform of the original voice and are resynthesized as voice with digital watermark. There are various parameters, but in the present embodiment, LSP is used as a parameter expressing articulation, which is the articulation component of voice. This is because LSP is optimal as a parameter for expressing vocal tract information when synthesizing and restoring speech. Further, the voice also includes a pitch component as a sound source component, as shown in FIG. Therefore, in order to restore the voice, not only the LSP but also a parameter called pitch as a pitch component is necessary. It should be noted that the pitch is used only for voice synthesis, and no embedding operation of digital watermark information for the pitch is performed.

[0041] Pitch is a parameter representing the height of the voice. When the vocal cords are under high tension and the air pressure from the lungs is high, the opening and closing period of the vocal cords, that is, the vibration period, becomes shorter and the sound source becomes higher, so the pitch rises. In the opposite case the pitch falls. The vibration period of the vocal cords is called the fundamental period, and its reciprocal is called the fundamental frequency. Changes in the fundamental period over time give the voice its accent and intonation.
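
The relation between the fundamental period and the fundamental frequency can be made concrete with a short sketch. The autocorrelation-peak estimator below is one common way to measure the period and is an assumption here, since the text does not say how the pitch is extracted; estimate_f0, the search range, and the synthetic tone are hypothetical.

```python
import math

def estimate_f0(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    # Fundamental period T0 = best_lag / sample_rate, and f0 = 1 / T0.
    return sample_rate / best_lag

# Synthetic 200 Hz tone sampled at 8 kHz: the fundamental period is 40 samples.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
f0 = estimate_f0(tone, 8000)
```

Real speech would need voiced/unvoiced detection and some protection against octave errors, which this sketch leaves out.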

【００４２】なお、音声の音源、調音及び放射と前記パ
ラメータとは、図３に示されるような関係を有する。Note that the sound source, articulation, and radiation of the voice and the parameters have the relationship shown in FIG.

【００４３】次に、電子透かし埋込装置及びその動作に
ついて説明する。Next, the digital watermark embedding device and its operation will be described.

【００４４】図１は本発明の実施の形態における電子透
かし埋込装置の動作を示すブロック図、図４は本発明の
実施の形態における電子透かし埋込装置の構成を示すブ
ロック図、図５は本発明の実施の形態におけるコードブ
ックの変更方法を示す図、図６は本発明の実施の形態に
おけるベクトル量子化においてコードブックを用いる順
を示す図である。FIG. 1 is a block diagram showing the operation of the digital watermark embedding device according to the embodiment of the present invention, FIG. 4 is a block diagram showing the configuration of the digital watermark embedding device according to the embodiment of the present invention, and FIG. FIG. 6 is a diagram showing a codebook changing method in the embodiment of the present invention, and FIG. 6 is a diagram showing an order in which the codebook is used in the vector quantization in the embodiment of the present invention.

[0045] In FIG. 4, reference numeral 10 denotes the digital watermark embedding device, a computer equipped with arithmetic means such as a CPU or MPU, storage means such as semiconductor memory and a magnetic disk, input means such as a keyboard, mouse, and microphone, output means such as a CRT, liquid crystal display, printer, and loudspeaker, a communication interface, and the like. Functionally, the digital watermark embedding device 10 has an original voice input unit 11 to which the original voice of a speaker is input; a voice decomposition unit 12 that decomposes the original voice, as the input voice, and extracts the LSP as the articulatory component and the pitch component as the sound source component; a digital watermark embedding unit 13 that embeds digital watermark information in the extracted LSP; and a digital-watermarked voice output unit 14 that synthesizes the LSP in which the digital watermark information is embedded with the pitch component and outputs digital-watermarked voice.

[0046] The digital watermark embedding device 10 has a codebook database 20 that stores the codebooks. The codebook database 20 may be built in the storage means of the digital watermark embedding device 10 or in external storage means. The codebook database 20 comprises CBK 21, a file that stores the original codebook created using LSPs extracted from the speaker's voice; CBK_w0 22, a file that stores the modified codebook representing "0", created by modifying the original codebook stored in CBK 21; and CBK_w1 23, a file that stores the modified codebook representing "1", likewise created by modifying the original codebook stored in CBK 21. Here "0" and "1" are the information codes of the digital watermark information. CBK 21, CBK_w0 22, and CBK_w1 23 store, respectively, the original codebook and the modified codebooks created for each individual speaker.

[0047] As shown in FIG. 1, the original codebook and the modified codebooks are created in advance for each individual speaker. First, as preprocessing, LSPs are extracted from training voice of a speaker i (1 ≤ i ≤ n), and an original codebook CBK_i storing the numerical values of the vectors corresponding to the LSPs is created and stored in CBK 21. The values stored in the original codebook CBK_i are then modified to create a modified codebook CBK_iw0 representing "0" and a modified codebook CBK_iw1 representing "1", which are stored in CBK_w0 22 and CBK_w1 23, respectively.

[0048] To extract the LSPs, the voice waveform of speaker i is first windowed to determine the processing unit (frame). Linear predictive analysis (Linear Predictive Coding) is then performed frame by frame, and the LSP is extracted from the resulting linear prediction coefficients. LSPs are extracted for all frames, and a codebook is created using the LBG + splitting algorithm. This step determines representative values for the vectors corresponding to the LSPs computed from each frame, that is, the LSP vectors; the resulting collection of representative vectors, each with as many components as there are dimensions, is the original codebook CBK_i.
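
The codebook construction named in this paragraph can be sketched as follows. This is a simplified rendering of LBG with splitting under stated assumptions: squared Euclidean distortion, a fixed iteration count rather than a convergence threshold, a multiplicative split that presumes nonzero components, and a power-of-two number of levels. The names nearest, centroid, and lbg_split and the toy training vectors are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the code vector closest to vec (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for idx, code in enumerate(codebook):
        d = sum((c - v) ** 2 for c, v in zip(code, vec))
        if d < best_d:
            best, best_d = idx, d
    return best

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg_split(training, levels, eps=0.01, iterations=20):
    """Grow a codebook by repeated splitting; levels must be a power of two."""
    codebook = [centroid(training)]
    while len(codebook) < levels:
        # Split every code vector into a slightly perturbed pair.
        codebook = [[c * (1 + s) for c in code]
                    for code in codebook for s in (eps, -eps)]
        for _ in range(iterations):
            cells = [[] for _ in codebook]
            for vec in training:
                cells[nearest(codebook, vec)].append(vec)
            # Move each code vector to its cell centroid; keep it if the cell is empty.
            codebook = [centroid(cell) if cell else code
                        for cell, code in zip(cells, codebook)]
    return codebook

# Toy two-dimensional "LSP vectors" forming two clusters.
training = [[0.1, 0.1], [0.1, 0.3], [0.9, 0.9], [0.9, 0.7]]
codebook = lbg_split(training, levels=2)
```

With the 16 dimensions and 256 levels used in the example of FIG. 5, the same routine would be called with 16-component training vectors and levels=256.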

[0049] The content of the created original codebook CBK_i is then modified to create modified codebooks for embedding digital watermark information. In this case, one location is changed at each level, and two kinds of change are applied to create the modified codebooks CBK_iw0 and CBK_iw1. The change is made at every level. All locations that are not changed keep the same values as in the original codebook CBK_i, and the unmodified original codebook CBK_i is used to represent digital watermark information "none".

[0050] For example, as shown in FIG. 5, consider creating the modified codebooks CBK_iw0 and CBK_iw1 by changing one specific dimension at every level of the original codebook CBK_i. Assume that the LSP has 16 dimensions and levels 1 to 256, that the modified codebook CBK_iw0 is created by moving the value of the third dimension of the original codebook CBK_i by 0.2 in the negative direction, and that the modified codebook CBK_iw1 is created by moving it by 0.2 in the positive direction.

【００５１】これにより、図５（ａ）に示されるような
オリジナルコードブックＣＢＫ_iの３次元目の数値１．
０が０．８に変更され、図５（ｂ）に示されるような変
更コードブックＣＢＫ_iw0が作成される。また、前記オ
リジナルコードブックＣＢＫ _iの３次元目の数値１．０
が１．２に変更され、図５（ｃ）に示されるような変更
コードブックＣＢＫ_iw1が作成される。そのため、図５
（ｄ）に示されるようなＬＳＰスペクトルの分布が、図
５（ｅ）に示されるようなものに変更される。As a result, the third-dimension value 1.0 of the original codebook CBK_i shown in FIG. 5(a) is changed to 0.8, creating the modified codebook CBK_iw0 shown in FIG. 5(b). Likewise, the third-dimension value 1.0 of the original codebook CBK_i is changed to 1.2, creating the modified codebook CBK_iw1 shown in FIG. 5(c). Consequently, the distribution of the LSP spectrum shown in FIG. 5(d) is changed to that shown in FIG. 5(e).
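
The codebook modification in this example can be sketched as follows. The helper name `shift_dimension` and the toy 2-level, 4-dimensional codebook are assumptions for illustration; the shifted dimension (the third) and the shift of 0.2 follow the example in the text.

```python
def shift_dimension(codebook, dim, delta):
    """Copy of `codebook` with `delta` added to component `dim` at every level."""
    return [[x + delta if k == dim else x for k, x in enumerate(level)]
            for level in codebook]

# Toy original codebook: 2 levels x 4 dimensions (the text uses 256 x 16).
cbk_i = [[0.3, 0.6, 1.0, 1.4],
         [0.4, 0.7, 1.0, 1.6]]

DIM = 2      # third dimension, as a zero-based index
DELTA = 0.2  # shift used in the example of the text

cbk_iw0 = shift_dimension(cbk_i, DIM, -DELTA)  # represents bit "0"
cbk_iw1 = shift_dimension(cbk_i, DIM, +DELTA)  # represents bit "1"
```

All components other than the third are left equal to the original codebook, which matches the requirement that unchanged locations keep their original values.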

【００５２】このようにして、ある話者ｉの学習用音声
について、オリジナルコードブックＣＢＫ_i、「０」を
表現する変更コードブックＣＢＫ_iw0、及び、「１」を
表現する変更コードブックＣＢＫ_iw1が作成され、あら
かじめＣＢＫ２１、ＣＢＫ_w0２２及びＣＢＫ_w1２３にそ
れぞれ格納される。In this way, for the training voice of a speaker i, the original codebook CBK_i, the modified codebook CBK_iw0 representing "0", and the modified codebook CBK_iw1 representing "1" are created and stored in advance in CBK 21, CBK_w0 22, and CBK_w1 23, respectively.

【００５３】次に、前記話者ｉの原音声が原音声入力部
１１から入力されると、音声分解部１２は前記原音声か
らＬＳＰ及びピッチ成分を抽出する。そして、電子透か
し埋込部１３は前記ＬＳＰに対して、電子透かし情報埋
込操作を行う。この場合、電子透かし埋込部１３はコー
ドブックデータベース２０にアクセスし、抽出した前記
ＬＳＰに対応するベクトルを、オリジナルコードブック
ＣＢＫ_iを用いてベクトル量子化を行い、オリジナルコ
ードブックＣＢＫ_iが表現する代表値に置き換える。Next, when the original voice of the speaker i is input from the original voice input unit 11, the voice decomposition unit 12 extracts the LSP and the pitch component from the original voice. The digital watermark embedding unit 13 then performs the digital watermark information embedding operation on the LSP. In this case, the digital watermark embedding unit 13 accesses the codebook database 20, vector-quantizes the vector corresponding to the extracted LSP using the original codebook CBK_i, and replaces it with the representative value expressed by the original codebook CBK_i.

【００５４】そして、電子透かし埋込部１３は、電子透
かし情報の情報コードとしてのビット列、例えば、図１
に示されるように、「１０１０１１０・・・」に合わせ
て、オリジナルコードブックＣＢＫ_iによってベクトル
量子化していたＬＳＰ列を「０」を表現する変更コード
ブックＣＢＫ_iw0、及び、「１」を表現する変更コード
ブックＣＢＫ_iw1を用いてベクトル量子化する。この場
合、図６（ａ）に示されるようなオリジナルコードブッ
クＣＢＫ_iの３次元目の数値１．０が０．８及び１．２
に変更された「０」を表現する変更コードブックＣＢＫ
_iw0、及び、「１」を表現する変更コードブックＣＢＫ
_iw1が、図６（ｂ）に示されるような順に用いられる。
これにより、電子透かし情報埋込操作が完了し、電子透
かし情報が埋め込まれた電子透かし埋込ＬＳＰとしての
ＬＳＰ’が作成される。Then, in accordance with a bit string serving as the information code of the digital watermark information, for example "1010110..." as shown in FIG. 1, the digital watermark embedding unit 13 vector-quantizes the LSP sequence that had been vector-quantized with the original codebook CBK_i, using the modified codebook CBK_iw0 representing "0" and the modified codebook CBK_iw1 representing "1". In this case, the modified codebook CBK_iw0 representing "0" and the modified codebook CBK_iw1 representing "1", in which the third-dimension value 1.0 of the original codebook CBK_i shown in FIG. 6(a) has been changed to 0.8 and 1.2 respectively, are used in the order shown in FIG. 6(b). This completes the digital watermark information embedding operation and creates LSP' as the digital-watermark-embedded LSP in which the digital watermark information is embedded.
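
The embedding step can be sketched as below. This is an illustration under assumptions, not the embodiment itself: it assumes one bit is carried per frame, and the function names and toy codebooks (2 levels, 4 dimensions, third dimension shifted by 0.2) are hypothetical.

```python
def quantize(vec, codebook):
    """Return the codebook entry nearest to vec (squared Euclidean distance)."""
    return min(codebook, key=lambda c: sum((x - y) ** 2 for x, y in zip(vec, c)))

def embed(lsp_frames, bits, cbk, cbk_w0, cbk_w1):
    """Return LSP' frames; frame n is re-quantized with the codebook for bit n."""
    marked = []
    for vec, bit in zip(lsp_frames, bits):
        base = quantize(vec, cbk)                  # step 1: original codebook
        target = cbk_w1 if bit == "1" else cbk_w0  # step 2: codebook chosen by the bit
        marked.append(list(quantize(base, target)))
    return marked

# Toy codebooks: third dimension of the original moved by -0.2 / +0.2.
cbk_i   = [[0.3, 0.6, 1.0, 1.4], [0.4, 0.7, 1.0, 1.6]]
cbk_iw0 = [[0.3, 0.6, 0.8, 1.4], [0.4, 0.7, 0.8, 1.6]]
cbk_iw1 = [[0.3, 0.6, 1.2, 1.4], [0.4, 0.7, 1.2, 1.6]]

lsp_frames = [[0.31, 0.61, 1.02, 1.41],
              [0.42, 0.69, 0.97, 1.58],
              [0.29, 0.60, 1.00, 1.39]]
lsp_marked = embed(lsp_frames, "101", cbk_i, cbk_iw0, cbk_iw1)
```

After embedding, the third component of each frame is 1.2 where the bit is "1" and 0.8 where it is "0", while the other components equal the original codebook entry.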

【００５５】なお、ベクトル量子化とは、対象となるベ
クトルｘ₁（０）、ｘ₁（１）、…、ｘ₁（Ｌ）とオリ
ジナルコードブックＣＢＫ_i、変更コードブックＣＢＫ
_iw0、変更コードブックＣＢＫ_iw1等のコードブックに
登録されているベクトルとのユークリッド距離ｄ_i、Note that vector quantization means, given the Euclidean distance d_i between the target vector x₁(0), x₁(1), ..., x₁(L) and the vectors registered in a codebook such as the original codebook CBK_i, the modified codebook CBK_iw0, or the modified codebook CBK_iw1,

【００５６】[0056]

【式１】 [Formula 1]

【００５７】が最小になるインデックスｉを計算するこ
とである。ここで、ｃ_iは前記コードブックに登録され
ているベクトルである。calculating the index i that minimizes that distance. Here, c_i is a vector registered in the codebook.
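
Formula 1 itself is not reproduced in this text. Assuming it is the ordinary Euclidean distance between the target vector and the i-th codebook vector c_i, with components indexed 0 to L as in the paragraph above, it could be written as follows. The component index k, the symbol i* for the selected index, and the dropped subscript on x are notational assumptions.

```latex
d_i = \sqrt{\sum_{k=0}^{L} \bigl( x(k) - c_i(k) \bigr)^2},
\qquad
i^{*} = \operatorname*{arg\,min}_{i} \, d_i
```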

【００５８】続いて、前記電子透かし入り音声出力部１
４は、前記ＬＳＰ’とピッチ成分とを合成して電子透か
し入り音声を作成する。なお、前記ピッチ成分は、音声
分解部１２によって抽出されたピッチ成分であり、変更
されていない。最後に、前記電子透かし入り音声は、電
子透かし入り音声出力部１４によって、話者ｉの電子透
かし入り音声として出力される。Next, the digital watermark embedded voice output unit 14 synthesizes the LSP' and the pitch component to create a digital watermark embedded voice. The pitch component is the one extracted by the voice decomposition unit 12 and is not changed. Finally, the digital watermark embedded voice is output by the digital watermark embedded voice output unit 14 as the digital watermark embedded voice of the speaker i.

【００５９】次に、電子透かし抽出装置及びその動作に
ついて説明する。Next, the digital watermark extracting device and its operation will be described.

【００６０】図７は本発明の実施の形態における電子透
かし抽出装置の構成を示すブロック図、図８は本発明の
実施の形態における電子透かし抽出装置の動作を示すブ
ロック図、図９は本発明の実施の形態における電子透か
し情報抽出の手順を示す図である。FIG. 7 is a block diagram showing the configuration of the digital watermark extracting device according to the embodiment of the present invention, FIG. 8 is a block diagram showing the operation of the digital watermark extracting device according to the embodiment of the present invention, and FIG. 9 is a diagram showing the procedure for extracting digital watermark information according to the embodiment of the present invention.

【００６１】図７において、３０は電子透かし抽出装置
であり、ＣＰＵ、ＭＰＵ等の演算手段、半導体メモリ、
磁気ディスク等の記憶手段、キーボード、マウス、マイ
クロフォン等の入力手段、ＣＲＴ、液晶ディスプレイ、
プリンタ、ラウドスピーカ等の出力手段、通信インター
フェイス等を備えるコンピュータである。なお、電子透
かし抽出装置３０は前記電子透かし埋込装置１０と一体
的に構成されていてもよい。そして、電子透かし抽出装
置３０は、機能の面から、電子透かし情報を抽出する対
象となる音声が入力される音声入力部３１、入力された
電子透かし入り音声を分解して、ＬＳＰを抽出する音声
分解部３２、及び、抽出されたＬＳＰから電子透かし情
報を抽出する電子透かし抽出部３３を有する。In FIG. 7, reference numeral 30 denotes a digital watermark extracting device, which is a computer equipped with arithmetic means such as a CPU or MPU, storage means such as a semiconductor memory or magnetic disk, input means such as a keyboard, mouse, or microphone, output means such as a CRT, liquid crystal display, printer, or loudspeaker, a communication interface, and the like. The digital watermark extracting device 30 may be configured integrally with the digital watermark embedding device 10. In functional terms, the digital watermark extracting device 30 has a voice input unit 31 to which the voice from which digital watermark information is to be extracted is input, a voice decomposition unit 32 that decomposes the input digital watermark embedded voice and extracts the LSP, and a digital watermark extraction unit 33 that extracts the digital watermark information from the extracted LSP.

【００６２】ここで、前記電子透かし抽出装置３０は、
前記コードブックを格納するコードブックデータベース
２０を有する。なお、該コードブックデータベース２０
は、前記電子透かし埋込装置１０のコードブックデータ
ベース２０と同一の構成を有し、同一のデータを格納す
る。そのため、該コードブックデータベース２０並びに
ファイルＣＢＫ２１、ＣＢＫ_w0２２及びＣＢＫ_w1２３の
説明は省略する。そして、前記コードブックデータベー
ス２０は電子透かし抽出装置３０の記憶手段内に構築さ
れたデータベースであってもよいし、外部の記憶手段内
に構築されたデータベースであってもよい。また、前記
電子透かし埋込装置１０のコードブックデータベース２
０を共用するようにしてもよい。Here, the digital watermark extracting device 30 has a codebook database 20 that stores the codebooks. This codebook database 20 has the same configuration as the codebook database 20 of the digital watermark embedding device 10 and stores the same data, so the description of the codebook database 20 and the files CBK 21, CBK_w0 22, and CBK_w1 23 is omitted. The codebook database 20 may be a database built in the storage means of the digital watermark extracting device 30 or a database built in external storage means. The codebook database 20 of the digital watermark embedding device 10 may also be shared.

【００６３】そして、電子透かし情報を抽出する場合、
電子透かし情報が埋め込まれているか否かの検査対象と
なる話者ｉの音声が音声入力部３１から入力されると、
音声分解部３２は前記音声からＬＳＰを抽出する。この
場合、電子透かし情報を埋め込む場合のようにピッチを
抽出する必要はない。続いて、電子透かし抽出部３３
は、コードブックデータベース２０にアクセスし、前記
ＬＳＰに対して、電子透かし抽出操作を行う。When extracting the digital watermark information, the voice of the speaker i to be inspected for embedded digital watermark information is input from the voice input unit 31, and the voice decomposition unit 32 extracts the LSP from that voice. In this case, unlike when embedding the digital watermark information, there is no need to extract the pitch. The digital watermark extraction unit 33 then accesses the codebook database 20 and performs the digital watermark extraction operation on the LSP.

【００６４】ここで、電子透かし情報の抽出は、検査対
象となる音声から該音声のＬＳＰであるＬＳＰ’を抽出
してコードブックとの距離で判定する。まず、図９に示
されるように、フレーム毎に１６次元のＬＳＰ’を抽出
する。そして、コードブックのどのレベルを利用して合
成したのかを調べるために、選択レベルを設定する。こ
の場合、抽出したＬＳＰ’に対応するベクトル、すなわ
ち、１６次元のＬＳＰ’ベクトルの中から一つの次元を
除き、前記ファイルＣＢＫ２１、ＣＢＫ_w0２２及びＣＢ
Ｋ_w1２３に格納されている３つのコードブック、すなわ
ち、オリジナルコードブックＣＢＫ_i、「０」を表現す
る変更コードブックＣＢＫ_iw0、及び、「１」を表現す
る変更コードブックＣＢＫ_iw1とのユークリッド距離ｄ
_iを計算する。なお、該計算は、すべての１６次元に対
して行われる。そして、前記ＣＢＫ_iＣＢＫ_iw0及びＣ
ＢＫ_iw1とのユークリッド距離ｄ_iに基づいて、最も距
離が短いレベル（１〜２５６）を決定し、該レベルが前
記ＣＢＫ_iＣＢＫ_iw0及びＣＢＫ_iw1において同じ値で
ある場合は、そのレベルに電子透かし情報が埋め込まれ
ているものと判断する。Here, the digital watermark information is extracted by extracting LSP', the LSP of the voice under inspection, from that voice and judging by its distance from the codebooks. First, as shown in FIG. 9, a 16-dimensional LSP' is extracted for each frame. A selection level is then set in order to find out which level of the codebook was used for the synthesis. In this case, one dimension is removed from the vector corresponding to the extracted LSP', that is, the 16-dimensional LSP' vector, and the Euclidean distance d_i is calculated to the three codebooks stored in the files CBK 21, CBK_w0 22, and CBK_w1 23, that is, the original codebook CBK_i, the modified codebook CBK_iw0 representing "0", and the modified codebook CBK_iw1 representing "1". This calculation is performed for each of the 16 dimensions. Then, based on the Euclidean distances d_i to CBK_i, CBK_iw0, and CBK_iw1, the level (1 to 256) with the shortest distance is determined, and when that level is the same in CBK_i, CBK_iw0, and CBK_iw1, it is judged that the digital watermark information is embedded at that level.

【００６５】その後、すべての１６次元のＬＳＰ’とオ
リジナルコードブックＣＢＫ_iとの間の距離を計算し、
電子透かし埋込装置１０において電子透かし情報の埋込
の際に、変更コードブックＣＢＫ_iw0及びＣＢＫ_iw1を
作成するために変更した値、例えば、０．２に合わせて
閾（しきい）値を設定する。そして、検査対象音声の１
６次元のＬＳＰ’ベクトルの数値（ＬＳＰ’０〜ＬＳ
Ｐ’１５）とオリジナルコードブックＣＢＫ_iに格納さ
れている１６次元のＬＳＰベクトルの数値（ＬＳＰ０〜
ＬＳＰ１５）とを比較して、図９に示されるように、前
記ＬＳＰ’ベクトルの数値とＬＳＰベクトルの数値との
差の絶対値が閾値より大きい場合に、電子透かし情報が
入っていると判断する。さらに、電子透かし情報が入っ
ていると判断された場合、前記ＬＳＰ’ベクトルの数値
（ＬＳＰ’０〜ＬＳＰ’１５）と変更コードブックＣＢ
Ｋ_iw0及びＣＢＫ_iw1に格納されている１６次元のＬＳ
Ｐベクトルの数値（ＬＳＰ０_w0〜ＬＳＰ１５_w0）及び
（ＬＳＰ０_w1〜ＬＳＰ１５_w1）との距離を計算し、距離
の近い方のコードブックが表現する情報に置き換え、
「０」か「１」かの判定を行う。After that, the distance between the full 16-dimensional LSP' and the original codebook CBK_i is calculated, and a threshold is set in accordance with the value by which the codebook was changed to create the modified codebooks CBK_iw0 and CBK_iw1 when the digital watermark embedding device 10 embedded the digital watermark information, for example 0.2. The values of the 16-dimensional LSP' vector of the voice under inspection (LSP'0 to LSP'15) are then compared with the values of the 16-dimensional LSP vector stored in the original codebook CBK_i (LSP0 to LSP15), and, as shown in FIG. 9, when the absolute value of the difference between a value of the LSP' vector and the corresponding value of the LSP vector is larger than the threshold, it is judged that digital watermark information is present. When digital watermark information is judged to be present, the distances between the values of the LSP' vector (LSP'0 to LSP'15) and the values of the 16-dimensional LSP vectors stored in the modified codebooks CBK_iw0 and CBK_iw1 (LSP0_w0 to LSP15_w0 and LSP0_w1 to LSP15_w1) are calculated, the frame is assigned the information represented by the closer codebook, and it is thereby determined whether the bit is "0" or "1".
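
The extraction decision for a single frame can be sketched as follows. This is a simplified illustration under assumptions: the modified dimension is taken as known, the threshold of 0.1 (half of the 0.2 shift) is a hypothetical choice, and the function names and toy codebooks are not from the source.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def extract_bit(vec, cbk, cbk_w0, cbk_w1, dim, threshold):
    """Return "0", "1", or None (no watermark detected) for one LSP' frame."""
    # Choose the level while ignoring the modified dimension, so that the
    # original and both modified codebooks agree on the selected level.
    level = min(range(len(cbk)),
                key=lambda i: sum((x - c) ** 2
                                  for k, (x, c) in enumerate(zip(vec, cbk[i]))
                                  if k != dim))
    # A watermark is judged present only if some component differs from the
    # original codebook entry by more than the threshold.
    if not any(abs(x - c) > threshold for x, c in zip(vec, cbk[level])):
        return None
    # Decide the bit by the closer of the two modified codebook entries.
    d0 = dist(vec, cbk_w0[level])
    d1 = dist(vec, cbk_w1[level])
    return "0" if d0 < d1 else "1"

# Toy codebooks: third dimension of the original moved by -0.2 / +0.2.
cbk_i   = [[0.3, 0.6, 1.0, 1.4], [0.4, 0.7, 1.0, 1.6]]
cbk_iw0 = [[0.3, 0.6, 0.8, 1.4], [0.4, 0.7, 0.8, 1.6]]
cbk_iw1 = [[0.3, 0.6, 1.2, 1.4], [0.4, 0.7, 1.2, 1.6]]

frames = [[0.3, 0.6, 1.2, 1.4],   # carries "1"
          [0.4, 0.7, 0.8, 1.6],   # carries "0"
          [0.3, 0.6, 1.0, 1.4]]   # unmarked
bits = [extract_bit(f, cbk_i, cbk_iw0, cbk_iw1, dim=2, threshold=0.1)
        for f in frames]
```

A frame that matches the original codebook entry yields `None`, which corresponds to the "none" state that the unmodified codebook CBK_i represents.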

【００６６】その結果、例えば、図８に示されるよう
に、「１０１０１１０・・・」という電子透かし情報の
情報コードが抽出される。そして、該情報コード列が、
電子透かし埋込装置１０における電子透かし情報埋込操
作によって埋め込まれた電子透かし情報の情報コードと
同一のものであるか否かを判断する。As a result, for example, as shown in FIG. 8, the information code "1010110..." of the digital watermark information is extracted. It is then determined whether this information code string is identical to the information code of the digital watermark information embedded by the digital watermark information embedding operation in the digital watermark embedding device 10.

【００６７】なお、本実施の形態における電子透かし埋
込装置１０のように、コードブックを変更することによ
って電子透かし情報の埋込操作を行う場合、コードブッ
クの内容を変更することはＬＳＰ周波数を変更すること
と等しいので、前記コードブックの変更によって合成音
声の音質が変化することもあり得る。When the digital watermark information is embedded by changing the codebook, as in the digital watermark embedding device 10 of the present embodiment, changing the contents of the codebook is equivalent to changing the LSP frequencies, so the change to the codebook may alter the sound quality of the synthesized voice.

【００６８】そのため、電子透かし情報を「コンテンツ
の品質を劣化させずに埋め込む」ためには、原音声の波
形から取り出したＬＳＰをどの程度まで変更することが
できるのか、また、変更したＬＳＰを用いて合成した音
声がどの程度の音質の変化を伴うものであるかを評価す
る必要がある。Therefore, in order to "embed digital watermark information without degrading the quality of the content", it is necessary to evaluate how far the LSP extracted from the waveform of the original voice can be changed, and how much change in sound quality accompanies voice synthesized from the changed LSP.

【００６９】本実施の形態における電子透かし埋込装置
１０のように、ＬＳＰ分析次数を１６次として、１６次
元のＬＳＰを変更する場合、１６次元のＬＳＰのうち、
どの次数のＬＳＰ周波数を変更するか、どのように組合
せて埋込操作をするか、また、ＬＳＰ周波数を変更する
方向（＋、−）等を考慮する必要がある。また、すべて
のフレームのうち、どのぐらいの割合で埋め込むかを関
数によって制御することや、聴覚マスキング特性を利用
したバースト的な埋込操作等が考えられる。When the LSP analysis order is 16 and the 16-dimensional LSP is changed, as in the digital watermark embedding device 10 of the present embodiment, it is necessary to consider which orders of the 16-dimensional LSP have their LSP frequencies changed, how they are combined for the embedding operation, the direction (+, −) in which the LSP frequency is changed, and so on. Other conceivable options include using a function to control the proportion of all frames in which information is embedded, and burst-like embedding that exploits auditory masking characteristics.

【００７０】このように、本実施の形態において、電子
透かし埋込装置１０は、入力された音声から調音成分と
してＬＳＰを、又、音源成分としてピッチ成分を抽出す
る音声分解部１２と、ＬＳＰに電子透かし情報を埋め込
む電子透かし埋込部１３と、電子透かし情報を埋め込ま
れたＬＳＰと前記ピッチ成分とを合成して電子透かし入
り音声を出力する電子透かし入り音声出力部１４とを有
する。As described above, in the present embodiment, the digital watermark embedding device 10 has a voice decomposition unit 12 that extracts the LSP as the articulatory component and the pitch component as the sound source component from the input voice, a digital watermark embedding unit 13 that embeds the digital watermark information in the LSP, and a digital watermark embedded voice output unit 14 that synthesizes the LSP in which the digital watermark information is embedded with the pitch component and outputs a digital watermark embedded voice.

【００７１】また、本実施の形態において、電子透かし
抽出装置３０は、入力された音声から調音成分としてＬ
ＳＰを抽出する音声分解部３２と、前記ＬＳＰから電子
透かし情報を抽出する電子透かし抽出部３３とを有す
る。Further, in the present embodiment, the digital watermark extracting device 30 has a voice decomposition unit 32 that extracts the LSP as the articulatory component from the input voice, and a digital watermark extraction unit 33 that extracts the digital watermark information from the LSP.

【００７２】したがって、電子透かし情報は、音声の発
音構造に対応して埋め込まれ、また、抽出されるので、
多数の周波数帯域に分散される。したがって、音声が多
数回リサンプリングされても、また、音声に多数回圧縮
処理及び復元処理が施されても、電子透かし情報が消え
てしまうことがない。また、多数種類の音声信号の符号
化に対応することもできる。さらに、音声の質が低下す
ることもない。Therefore, because the digital watermark information is embedded and extracted in correspondence with the sound production structure of the voice, it is distributed over many frequency bands. As a result, the digital watermark information is not lost even if the voice is resampled many times or subjected to compression and decompression many times. Many types of voice signal coding can also be supported. Furthermore, the quality of the voice does not deteriorate.

【００７３】なお、本発明は前記実施の形態に限定され
るものではなく、本発明の趣旨に基づいて種々変形させ
ることが可能であり、それらを本発明の範囲から排除す
るものではない。The present invention is not limited to the above-mentioned embodiments, but can be variously modified within the scope of the present invention, and they are not excluded from the scope of the present invention.

【００７４】[0074]

【発明の効果】以上詳細に説明したように、本発明によ
れば、音声の発音構造に対応して電子透かし情報を埋め
込み、また、抽出することによって、電子透かし情報が
多数の周波数帯域に分散して埋め込まれ、リサンプリン
グや音声データの圧縮及び復元を繰り返しても消えてし
まうことがなく、かつ、多数種類の符号化に対応するこ
とができる。As described above in detail, according to the present invention, by embedding and extracting the digital watermark information corresponding to the pronunciation structure of the voice, the digital watermark information is dispersed in many frequency bands. Since it is embedded, it does not disappear even if resampling and compression / decompression of audio data are repeated, and it is possible to cope with many types of encoding.

[Brief description of drawings]

【図１】本発明の実施の形態における電子透かし埋込装
置の動作を示すブロック図である。FIG. 1 is a block diagram showing an operation of a digital watermark embedding device according to an embodiment of the present invention.

【図２】本発明の実施の形態における基本概念を示す概
念図である。FIG. 2 is a conceptual diagram showing a basic concept in the embodiment of the present invention.

【図３】本発明の実施の形態における音声生成モデルを
示す概念図である。FIG. 3 is a conceptual diagram showing a voice generation model in the embodiment of the present invention.

【図４】本発明の実施の形態における電子透かし埋込装
置の構成を示すブロック図である。FIG. 4 is a block diagram showing a configuration of a digital watermark embedding device according to an embodiment of the present invention.

【図５】本発明の実施の形態におけるコードブックの変
更方法を示す図である。FIG. 5 is a diagram showing a codebook changing method according to the embodiment of the present invention.

【図６】本発明の実施の形態におけるベクトル量子化に
おいてコードブックを用いる順を示す図である。FIG. 6 is a diagram showing an order of using a codebook in vector quantization according to the embodiment of the present invention.

【図７】本発明の実施の形態における電子透かし抽出装
置の構成を示すブロック図である。FIG. 7 is a block diagram showing a configuration of a digital watermark extracting device according to an embodiment of the present invention.

【図８】本発明の実施の形態における電子透かし抽出装
置の動作を示すブロック図である。FIG. 8 is a block diagram showing an operation of the digital watermark extracting device in the embodiment of the present invention.

【図９】本発明の実施の形態における電子透かし情報抽
出の手順を示す図である。FIG. 9 is a diagram showing a procedure of digital watermark information extraction according to the embodiment of the present invention.

[Explanation of symbols]

１０ 電子透かし埋込装置
１２、３２ 音声分解部
１３ 電子透かし埋込部
１４ 電子透かし入り音声出力部
３０ 電子透かし抽出装置
３３ 電子透かし抽出部
10 Digital watermark embedding device
12, 32 Voice decomposition unit
13 Digital watermark embedding unit
14 Digital watermark embedded voice output unit
30 Digital watermark extracting device
33 Digital watermark extraction unit

Claims

[Claims]

1. A digital watermark embedding device comprising: (a) a voice decomposition unit that extracts an articulatory component and a sound source component from an input voice; (b) a digital watermark embedding unit that embeds digital watermark information in the articulatory component; and (c) a digital watermark embedded voice output unit that synthesizes the sound source component and the articulatory component in which the digital watermark information is embedded, and outputs a digital watermark embedded voice.

2. The digital watermark embedding device according to claim 1, wherein the articulation component is a parameter expressing a voice articulation, and the sound source component is a voice pitch component.

3. The digital watermark embedding device according to claim 2, wherein the parameter is LSP.

4. The digital watermark embedding device according to claim 3, having an original codebook in which numerical values of vectors corresponding to the LSP are stored, and a modified codebook in which numerical values of the vectors modified to correspond to an information code having a logical value of 0 or 1 are stored.

5. The digital watermark embedding device according to claim 4, wherein the digital watermark embedding unit accesses the modified codebook, obtains the numerical value corresponding to the information code of the digital watermark information, and changes the numerical value of the vector corresponding to the LSP extracted from the input voice, thereby creating an LSP in which the digital watermark information is embedded.

6. A digital watermark extracting device comprising: (a) a voice decomposition unit that extracts an articulatory component from an input voice; and (b) a digital watermark extraction unit that extracts digital watermark information from the articulatory component.

7. The digital watermark extracting apparatus according to claim 6, wherein the articulation component is a parameter expressing articulation of voice.

8. The digital watermark extracting apparatus according to claim 7, wherein the parameter is LSP.

9. The digital watermark extracting device according to claim 8, having an original codebook in which numerical values of vectors corresponding to the LSP are stored, and a modified codebook in which numerical values of the vectors modified to correspond to an information code having a logical value of 0 or 1 are stored.

10. The digital watermark extracting device according to claim 9, wherein the digital watermark extraction unit accesses the original codebook and the modified codebook, and extracts the information code of the digital watermark information by comparing the numerical value of the vector corresponding to the LSP extracted from the input voice with the numerical values stored in the original codebook and the modified codebook.

11. A digital watermark embedding method comprising: (a) extracting an articulatory component and a sound source component from a voice; (b) embedding digital watermark information in the articulatory component; and (c) synthesizing the sound source component and the articulatory component in which the digital watermark information is embedded, and outputting a digital watermark embedded voice.

12. The digital watermark embedding method according to claim 11, wherein the articulatory component is a parameter expressing the articulation of voice, and the sound source component is a pitch component of voice.

13. The digital watermark embedding method according to claim 12, wherein the parameter is LSP.

14. The digital watermark embedding method according to claim 13, wherein a modified codebook, in which numerical values of vectors corresponding to the LSP modified to correspond to an information code are stored, is accessed to obtain the numerical value corresponding to the information code of the digital watermark information, and the numerical value of the vector corresponding to the LSP extracted from the voice is changed, thereby creating an LSP in which the digital watermark information is embedded.

15. A digital watermark extracting method comprising: (a) extracting an articulatory component from a voice; and (b) extracting digital watermark information from the articulatory component.

16. The digital watermark extracting method according to claim 15, wherein the articulation component is a parameter expressing articulation of voice.

17. The digital watermark extracting method according to claim 16, wherein the parameter is LSP.

18. The digital watermark extracting method according to claim 17, wherein an original codebook in which numerical values of vectors corresponding to the LSP are stored and a modified codebook in which numerical values of the vectors modified to correspond to an information code are stored are accessed, and the numerical value of the vector corresponding to the extracted LSP is compared with the numerical values stored in the original codebook and the modified codebook to extract the information code of the digital watermark information.