JP2003228388A

JP2003228388A - Method and device for voice code conversion

Info

Publication number: JP2003228388A
Application number: JP2002026957A
Authority: JP
Inventors: Yoshiteru Tsuchinaga; 義照土永; Takashi Ota; 恭士大田; Masanao Suzuki; 政直鈴木; Masakiyo Tanaka; 正清田中
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2002-02-04
Filing date: 2002-02-04
Publication date: 2003-08-15
Anticipated expiration: 2022-02-04
Also published as: JP4330303B2

Abstract

PROBLEM TO BE SOLVED: To convert a voice code in which data is embedded into another voice code without impairing the embedded data.

SOLUTION: A voice code conversion device converts a first voice code, generated by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method. When arbitrary data is embedded in the received first voice code, a code conversion part converts the first voice code into the second voice code, an embedded data extraction part extracts the embedded data from the first voice code, and a data embedding part embeds the extracted data in the second voice code and sends it out.

COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

【０００１】[0001]

[Technical Field of the Invention] The present invention relates to a voice code conversion method and a voice code conversion apparatus, and more particularly to a voice code conversion method and apparatus that convert a voice code, encoded by a voice encoding apparatus used in a network such as the Internet or in an automobile/mobile telephone system or the like, into a voice code of a different encoding method.

【０００２】[0002]

[Description of the Related Art] In recent years, the diversification of mobile phone systems, the explosive increase in subscribers, and the spread of voice communication over the Internet (Voice over IP: VoIP) are expected to drive ever more traffic between different communication systems. Voice communication systems such as mobile phones and VoIP use voice encoding techniques that compress voice in order to use communication lines efficiently. Mobile phones use different voice encoding techniques depending on the country or the system; W-CDMA has adopted the AMR (Adaptive Multi-Rate) method as a worldwide common voice encoding method. In VoIP, on the other hand, ITU-T Recommendation G.729A is widely used as the voice encoding method. The following describes the G.729A encoding and decoding methods and then the differences between G.729A and AMR.

【０００３】The G.729A encoding and decoding methods are as follows.

・Encoder configuration and operation

FIG. 18 is a block diagram of an encoder conforming to ITU-T Recommendation G.729A. In FIG. 18, an input signal (voice signal) X with a predetermined number of samples per frame (= N) is input to the LPC analysis unit 1 frame by frame. With a sampling rate of 8 kHz and a frame period of 10 msec, one frame is 80 samples. The LPC analysis unit 1 regards the human vocal tract as an all-pole filter expressed by

$$H(z) = \frac{1}{1 + \sum_{i=1}^{p} \alpha_i z^{-i}} \tag{1}$$

and obtains the coefficients αi (i = 1, ..., p) of this filter, where p is the filter order. For telephone-band voice, p is generally 10 to 12. The LPC (linear prediction) analysis unit 1 performs LPC analysis on a total of 240 samples, namely the 80 samples of the input signal, 40 look-ahead samples, and 120 samples of the past signal, to obtain the LPC coefficients.
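As a rough illustration of the analysis step, the sketch below reproduces the window arithmetic stated above (80 + 40 + 120 = 240 samples) and estimates the coefficients αi of equation (1). The autocorrelation method with the Levinson-Durbin recursion is an assumption made for illustration; the text only states that LPC analysis is performed, and the function names are hypothetical. Windowing and other refinements of G.729A are omitted.

```python
SAMPLE_RATE_HZ = 8000
FRAME_MS = 10
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000   # 80 samples per frame
LOOKAHEAD = 40                                  # look-ahead samples
HISTORY = 120                                   # past samples
WINDOW_LEN = FRAME_LEN + LOOKAHEAD + HISTORY    # 240 samples analysed


def autocorrelation(x, max_lag):
    """r[k] = sum_n x[n] * x[n-k] for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for alpha_1..alpha_p of H(z) = 1 / (1 + sum_i alpha_i z^-i).

    Returns (alpha, residual_error). Assumes r[0] > 0 (non-silent input).
    """
    alpha = [0.0] * (order + 1)   # alpha[0] is unused padding
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(alpha[j] * r[i - j] for j in range(1, i))
        k = -acc / err            # reflection coefficient of stage i
        prev = alpha[:]
        for j in range(1, i):
            alpha[j] = prev[j] + k * prev[i - j]
        alpha[i] = k
        err *= 1.0 - k * k        # prediction error shrinks at each stage
    return alpha[1:], err
```

For a non-silent 240-sample window `w`, `levinson_durbin(autocorrelation(w, 10), 10)` would return a 10th-order coefficient set of the kind described for telephone-band voice.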

【０００４】The parameter conversion unit 2 converts the LPC coefficients into LSP (line spectrum pair) parameters. LSP parameters are frequency-domain parameters that can be converted to and from LPC coefficients; because their quantization characteristics are better than those of LPC coefficients, quantization is performed in the LSP domain. The LSP quantization unit 3 quantizes the converted LSP parameters to obtain an LSP code and LSP dequantized values. The LSP interpolation unit 4 obtains LSP interpolated values from the LSP dequantized values of the current frame and those of the previous frame. That is, one frame is divided into two 5 msec subframes, a first and a second; the LPC analysis unit 1 determines the LPC coefficients of the second subframe but not those of the first. The LSP interpolation unit 4 therefore predicts the LSP dequantized values of the first subframe by interpolating between the LSP dequantized values of the current frame and those of the previous frame.
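The interpolation for the first subframe can be sketched as follows. The equal 0.5/0.5 weighting is an assumption chosen for illustration; the text above says only that an interpolation between the previous-frame and current-frame values is used. The function names are hypothetical.

```python
def interpolate_lsp(prev_lsp, curr_lsp, weight=0.5):
    """Estimate first-subframe LSPs from two sets of dequantized LSPs.

    `weight` is the share given to the current frame (assumed 0.5 here).
    """
    if len(prev_lsp) != len(curr_lsp):
        raise ValueError("LSP vectors must have the same order")
    return [(1.0 - weight) * p + weight * c
            for p, c in zip(prev_lsp, curr_lsp)]


def subframe_lsps(prev_lsp, curr_lsp):
    """Return [first-subframe LSPs, second-subframe LSPs] for one frame."""
    # First subframe: interpolated; second subframe: current dequantized values.
    return [interpolate_lsp(prev_lsp, curr_lsp), list(curr_lsp)]
```

Each of the two returned vectors would then be converted back to LPC coefficients for the corresponding subframe.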

【０００５】The parameter inverse conversion unit 5 converts the LSP dequantized values and the LSP interpolated values into LPC coefficients and sets them in the LPC synthesis filter 6. As the filter coefficients of the LPC synthesis filter 6, the LPC coefficients converted from the LSP interpolated values are used in the first subframe of the frame, and the LPC coefficients converted from the LSP dequantized values are used in the second subframe. In what follows, the leading character of subscripted symbols such as lspi, li(n), ... is the lowercase letter L, not the digit 1. The LSP parameters lspi (i = 1, ..., p) are quantized in the LSP quantization unit 3 by scalar quantization, vector quantization, or the like, after which the quantization index (LSP code) is transmitted to the decoder side.

【０００６】Next, the excitation and gain are searched. The excitation and gain are processed subframe by subframe. First, the excitation signal is divided into a pitch period component and a noise component. The adaptive codebook 7, which stores the past excitation signal sequence, is used to quantize the pitch period component, and an algebraic codebook, noise codebook, or the like is used to quantize the noise component. The following describes a voice encoding method that uses two excitation codebooks, the adaptive codebook 7 and the algebraic codebook 8.

【０００７】The adaptive codebook 7 outputs N samples of excitation signal (called a periodic signal), each successively delayed by one sample, corresponding to indexes 1 to L. N is the number of samples in one subframe (N = 40), and the codebook has a buffer that stores the pitch period component of the most recent (L + 39) samples. Index 1 specifies the periodic signal consisting of the 1st to 40th samples, index 2 specifies the periodic signal consisting of the 2nd to 41st samples, and so on up to index L, which specifies the periodic signal consisting of the Lth to (L + 39)th samples. In the initial state, the adaptive codebook 7 contains a signal whose amplitudes are all 0. In each subframe, the oldest signal is discarded by one subframe length and the excitation signal obtained in the current subframe is stored in the adaptive codebook 7.
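The buffer behaviour described above can be sketched as a small class. The class and method names are hypothetical, and placing the oldest sample at the front of the buffer is an assumption; the text numbers the samples 1 to L + 39 without fixing their direction in time.

```python
class AdaptiveCodebook:
    """Sliding buffer of past excitation, addressed by indexes 1..L."""

    def __init__(self, max_index, subframe_len=40):
        self.n = subframe_len
        self.max_index = max_index
        # Holds the most recent (L + N - 1) samples, all zero initially.
        self.buf = [0.0] * (max_index + subframe_len - 1)

    def vector(self, index):
        """Index k selects samples k .. k+N-1 (1-based, as in the text)."""
        if not 1 <= index <= self.max_index:
            raise ValueError("index out of range")
        start = index - 1
        return self.buf[start:start + self.n]

    def update(self, excitation):
        """Drop the oldest N samples and append the new subframe."""
        if len(excitation) != self.n:
            raise ValueError("excitation must be one subframe long")
        self.buf = self.buf[self.n:] + list(excitation)
```

With `max_index=5` and `subframe_len=4`, the buffer holds 8 samples; after one `update([1, 2, 3, 4])`, index 5 returns the subframe just stored and index 3 returns a vector that straddles the old zeros and the new samples.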

【０００８】The adaptive codebook search identifies the periodic component of the excitation signal using the adaptive codebook 7, which stores past excitation signals. That is, while shifting the read start point in the adaptive codebook 7 one sample at a time, a subframe length (= 40 samples) of the past excitation signal is extracted from the adaptive codebook 7 and input to the LPC synthesis filter 6 to create a pitch synthesis signal βAP_L. Here, P_L is the past pitch periodic signal (adaptive code vector) corresponding to the delay L extracted from the adaptive codebook 7, A is the impulse response of the LPC synthesis filter 6, and β is the adaptive codebook gain.

【０００９】The calculation unit 9 obtains the error power E_L between the input voice X and βAP_L by

$$E_L = |X - \beta A P_L|^2 \tag{2}$$

Let AP_L be the weighted synthesis output of the adaptive codebook output, Rpp the autocorrelation of AP_L, and Rxp the cross-correlation between AP_L and the input signal X. The adaptive code vector P_L at the pitch lag Lopt that minimizes the error power of equation (2) is then expressed by

$$P_L = \arg\max \left( \frac{R_{xp}^2}{R_{pp}} \right) \tag{3}$$

That is, the optimum read start point is the one that maximizes the cross-correlation Rxp between the pitch synthesis signal AP_L and the input signal X, normalized by the autocorrelation Rpp of the pitch synthesis signal as in equation (3). The error power evaluation unit 10 thus obtains the pitch lag Lopt that satisfies equation (3). The optimum pitch gain βopt is then given by

$$\beta_{opt} = \frac{R_{xp}}{R_{pp}} \tag{4}$$
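Substituting the gain of equation (4) into equation (2) gives E_L = |X|² − Rxp²/Rpp, which is why minimizing the error power is equivalent to maximizing Rxp²/Rpp. The sketch below applies that criterion to a set of candidate vectors. It is a simplified illustration: the candidate dictionary, the truncated convolution, and the function names are assumptions, and perceptual weighting is left out.

```python
def convolve_truncated(vec, impulse_response):
    """Filter `vec` with impulse response A, keeping len(vec) output samples."""
    h = impulse_response
    return [sum(vec[j] * h[i - j] for j in range(i + 1) if i - j < len(h))
            for i in range(len(vec))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_pitch(target, candidates, impulse_response):
    """Return (best_lag, best_gain) maximizing Rxp^2 / Rpp.

    `candidates` maps each lag L to its adaptive code vector P_L.
    """
    best_lag, best_score, best_gain = None, -1.0, 0.0
    for lag, p in candidates.items():
        ap = convolve_truncated(p, impulse_response)   # A P_L
        rpp = dot(ap, ap)                              # autocorrelation
        if rpp == 0.0:
            continue                                   # silent candidate
        rxp = dot(target, ap)                          # cross-correlation
        score = rxp * rxp / rpp                        # criterion of eq. (3)
        if score > best_score:
            best_lag, best_score, best_gain = lag, score, rxp / rpp  # eq. (4)
    return best_lag, best_gain
```

With an identity impulse response `[1.0]`, target `[2, 0, 2, 0]`, and candidates `{1: [1, 0, 0, 0], 2: [1, 0, 1, 0], 3: [0, 1, 0, 1]}`, lag 2 wins with a gain of 2.0, since that candidate scaled by 2 reproduces the target exactly.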

【００１０】Next, the noise component contained in the excitation signal is quantized using the algebraic codebook 8. The algebraic codebook 8 consists of a plurality of pulses of amplitude 1 or −1. As an example, Table 1 shows the pulse positions for a subframe length of 40 samples.

[Table 1]

The algebraic codebook 8 divides the N (= 40) sample points that make up one subframe into a plurality of pulse system groups 1 to 4 and, for every combination obtained by taking one sample point from each pulse system group, sequentially outputs as the noise component a pulsed signal having a pulse of +1 or −1 at each of those sample points. In this example, basically four pulses are placed per subframe.

【００１１】FIG. 19 illustrates the sample points assigned to the pulse system groups 1 to 4:
(1) pulse system group 1 is assigned eight sample points 0, 5, 10, 15, 20, 25, 30, 35;
(2) pulse system group 2 is assigned eight sample points 1, 6, 11, 16, 21, 26, 31, 36;
(3) pulse system group 3 is assigned eight sample points 2, 7, 12, 17, 22, 27, 32, 37;
(4) pulse system group 4 is assigned 16 sample points 3, 4, 8, 9, 13, 14, 18, 19, 23, 24, 28, 29, 33, 34, 38, 39.
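The four groups listed above can be written down directly, which also makes it easy to check that they partition the 40 sample points. The sketch below is illustrative only: the names are hypothetical, and the bit count assumes one position index per group plus one polarity bit, which yields 17 bits in total.

```python
# Sample points assigned to pulse system groups 1 to 4.
TRACKS = {
    1: list(range(0, 40, 5)),                                 # 0, 5, ..., 35
    2: list(range(1, 40, 5)),                                 # 1, 6, ..., 36
    3: list(range(2, 40, 5)),                                 # 2, 7, ..., 37
    4: sorted(list(range(3, 40, 5)) + list(range(4, 40, 5))), # 3, 4, 8, 9, ...
}


def bits_per_group(positions):
    """Bits to index one position plus one bit for the pulse polarity."""
    return (len(positions) - 1).bit_length() + 1


TOTAL_BITS = sum(bits_per_group(p) for p in TRACKS.values())


def pulse_signal(choices, length=40):
    """Build a pulsed signal from {group: (position, sign)}, sign in {+1, -1}."""
    out = [0] * length
    for group, (pos, sign) in choices.items():
        if pos not in TRACKS[group]:
            raise ValueError("position not allowed in this group")
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        out[pos] += sign
    return out
```

For example, `pulse_signal({1: (5, 1), 2: (1, -1), 3: (37, 1), 4: (39, -1)})` yields a 40-sample vector with exactly four non-zero entries.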

【００１２】Expressing a sample point in pulse system groups 1 to 3 takes 3 bits and expressing the polarity of the pulse takes 1 bit, for a total of 4 bits each; expressing a sample point in pulse system group 4 takes 4 bits and the polarity takes 1 bit, for a total of 5 bits. Therefore 17 bits are needed to specify the pulsed signal output from the noise codebook 8 having the pulse arrangement of Table 1, and there are $2^{17}$ ($= 2^4 \times 2^4 \times 2^4 \times 2^5$) kinds of pulsed signal. As Table 1 shows, the pulse positions of each pulse system are restricted, and the algebraic codebook search determines, from among the combinations of pulse positions of the pulse systems, the combination of pulses that minimizes the error power relative to the input voice in the reproduction domain. That is, the adaptive codebook output P_L is multiplied by the optimum pitch gain βopt obtained in the adaptive codebook search and input to the adder 11. At the same time, pulsed signals are input sequentially from the algebraic codebook 8 to the adder 11, and the pulsed signal is identified that minimizes the difference between the input signal X and the reproduced signal obtained by inputting the adder output to the LPC synthesis filter 6. Specifically, a target vector X′ for the algebraic codebook search is first generated from the input signal X, using the optimum adaptive codebook output P_L and the optimum pitch gain βopt obtained in the adaptive codebook search, by the following equation.

【００１３】

$$X' = X - \beta_{opt} A P_L \tag{5}$$

In this example, the pulse positions and amplitudes (polarities) are expressed in 17 bits as described above, so there are $2^{17}$ combinations. Letting C_k be the k-th algebraic code output vector, the algebraic codebook search finds the code vector C_k that minimizes the evaluation-function error power D in

$$D = |X' - G_c A C_k|^2 \tag{6}$$

where G_c is the algebraic codebook gain. In the algebraic codebook search, the error power evaluation unit 10 searches for the combination of pulse positions and polarities that maximizes the normalized cross-correlation value (Rcx·Rcx/Rcc), obtained by normalizing the square of the cross-correlation value Rcx between the algebraic synthesis signal AC_k and the input signal X′ by the autocorrelation value Rcc of the algebraic synthesis signal.

【００１４】Gain quantization is described next. In G.729A the algebraic codebook gain is not quantized directly; instead, the adaptive codebook gain Ga (= βopt) and a correction coefficient γ for the algebraic codebook gain Gc are vector-quantized. The algebraic codebook gain Gc and the correction coefficient γ are related by Gc = g′ × γ, where g′ is the gain of the current frame predicted from the logarithmic gains of the past four subframes. The gain quantization table (not shown) of the gain quantizer 12 holds 128 (= $2^7$) combinations of the adaptive codebook gain Ga and the correction coefficient γ for the algebraic codebook gain. The gain codebook is searched as follows: for the adaptive codebook output vector and the algebraic codebook output vector, one set of table values is taken from the gain quantization table and set in the gain variable units 13 and 14; the gain variable units 13 and 14 multiply the respective vectors by the gains Ga and Gc and input them to the LPC synthesis filter 6; and the error power evaluation unit 10 selects the combination that minimizes the error power relative to the input signal X.
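The table search described above can be sketched as an exhaustive loop over (Ga, γ) pairs. This is an illustration under stated assumptions: the two code vectors are taken to be already passed through the synthesis filter, the table contents are invented for the example (the real table has 128 entries), and the function name is hypothetical.

```python
def search_gain_table(target, filt_adaptive, filt_algebraic,
                      predicted_gain, table):
    """Pick the table index minimizing |X - (Ga*AP + Gc*AC)|^2.

    table: list of (Ga, gamma) pairs; Gc = predicted_gain * gamma,
    where predicted_gain plays the role of g' in the text.
    """
    best_index, best_err = -1, float("inf")
    for index, (ga, gamma) in enumerate(table):
        gc = predicted_gain * gamma
        err = sum((x - ga * p - gc * c) ** 2
                  for x, p, c in zip(target, filt_adaptive, filt_algebraic))
        if err < best_err:
            best_index, best_err = index, err
    return best_index, best_err
```

The returned index is what would be transmitted as the gain code; with a 128-entry table it fits in 7 bits.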

【００１５】The line encoding unit 15 then multiplexes (1) the LSP code, which is the quantization index of the LSP, (2) the pitch lag code Lopt, which is the quantization index of the pitch lag, (3) the algebraic code, which is the algebraic codebook index, and (4) the gain code, which is the quantization index of the gain, to create line data, and transmits it to the decoder.

【００１６】・Decoder configuration and operation

FIG. 20 is a block diagram of a G.729A decoder. The line data sent from the encoder side is input to the line decoding unit 21, which outputs the LSP code, pitch lag code, algebraic code, and gain code. The decoder decodes voice data on the basis of these codes. Because the decoder's functions are contained in the encoder, the description of its operation partly repeats what was said above, but it is summarized briefly below. When the LSP code is input, the LSP dequantization unit 22 dequantizes it and outputs LSP dequantized values. The LSP interpolation unit 23 interpolates the LSP dequantized values of the first subframe of the current frame from the LSP dequantized values of the second subframe of the current frame and those of the second subframe of the previous frame. Next, the parameter inverse conversion unit 24 converts the LSP interpolated values and the LSP dequantized values into LPC synthesis filter coefficients. The G.729A LPC synthesis filter 25 uses the LPC coefficients converted from the LSP interpolated values in the first subframe and those converted from the LSP dequantized values in the following second subframe.

【００１７】The adaptive codebook 26 outputs a pitch signal of subframe length (= 40 samples) from the read start position indicated by the pitch lag code, and the noise codebook 27 outputs the pulse positions and pulse polarities from the read position corresponding to the algebraic code. The gain dequantization unit 28 calculates the adaptive codebook gain dequantized value and the algebraic codebook gain dequantized value from the input gain code and sets them in the gain variable units 29 and 30. The adder 31 adds the signal obtained by multiplying the adaptive codebook output by the adaptive codebook gain dequantized value to the signal obtained by multiplying the algebraic codebook output by the algebraic codebook gain dequantized value to create the excitation signal, and inputs this excitation signal to the LPC synthesis filter 25. Reproduced voice is thereby obtained from the LPC synthesis filter 25. In the initial state, the adaptive codebook 26 on the decoder side contains a signal whose amplitudes are all 0; in each subframe the oldest signal is discarded by one subframe length and the excitation signal obtained in the current subframe is stored in the adaptive codebook 26. In other words, the adaptive codebooks of the encoder and the decoder are always kept in the same, most recent state. This concludes the description of the G.729A encoding and decoding methods. The AMR method, like G.729A, uses the basic algorithm called CELP (Code Excited Linear Prediction), and its differences from G.729A are as follows.
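Before the comparison with AMR, the decoder's excitation reconstruction and synthesis filtering described above can be sketched in a few lines. This is a minimal illustration: the function names and the filter-memory layout are assumptions, and post-processing is omitted. The recursion follows directly from the all-pole form H(z) = 1 / (1 + Σ αi z^-i).

```python
def build_excitation(adaptive_vec, algebraic_vec, ga, gc):
    """Excitation = Ga * adaptive codebook output + Gc * algebraic output."""
    return [ga * a + gc * c for a, c in zip(adaptive_vec, algebraic_vec)]


def lpc_synthesis(excitation, alpha, memory):
    """All-pole synthesis: y[n] = e[n] - sum_i alpha_i * y[n-i].

    `memory` holds the last len(alpha) outputs, most recent first.
    Returns (output samples, updated memory).
    """
    mem = list(memory)
    out = []
    for e in excitation:
        y = e - sum(a * m for a, m in zip(alpha, mem))
        out.append(y)
        mem = [y] + mem[:-1]   # shift the newest output into the memory
    return out, mem
```

After each subframe, the excitation returned by `build_excitation` would also be written back into the adaptive codebook, which is what keeps the encoder and decoder codebooks in the same state.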

【００１８】・Differences between the G.729A and AMR encoding methods

FIG. 21 compares the main specifications of G.729A and AMR. AMR has eight encoding modes in all, but the specifications in FIG. 21 are common to all of them. G.729A and AMR have the same input-signal sampling frequency (= 8 kHz), subframe length (= 5 msec), and linear prediction order (= 10), but differ in frame length and in the number of subframes per frame. As shown in FIG. 22, one frame consists of two subframes, the 0th and 1st, in G.729A, and of four subframes, the 0th to 3rd, in AMR.
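The frame relationship follows from the figures above: with 5 msec subframes, two subframes give a 10 msec G.729A frame and four give a 20 msec AMR frame, so two G.729A frames span the same time as one AMR frame. The sketch below works this out and regroups subframe payloads accordingly; the function names and the string payloads are hypothetical.

```python
from math import gcd

SUBFRAME_MS = 5
SUBFRAMES_PER_FRAME = {"G.729A": 2, "AMR": 4}


def frame_ms(codec):
    """Frame duration implied by the subframe length and count."""
    return SUBFRAME_MS * SUBFRAMES_PER_FRAME[codec]


def aligned_frames(src, dst):
    """Smallest (n_src, n_dst) frame counts covering the same duration."""
    a, b = frame_ms(src), frame_ms(dst)
    span = a * b // gcd(a, b)          # least common multiple in msec
    return span // a, span // b


def regroup_subframes(frames, dst_codec):
    """Flatten source frames (lists of subframes) and regroup for dst_codec."""
    flat = [sf for frame in frames for sf in frame]
    n = SUBFRAMES_PER_FRAME[dst_codec]
    if len(flat) % n:
        raise ValueError("not enough subframes to fill the last frame")
    return [flat[i:i + n] for i in range(0, len(flat), n)]
```

A converter from G.729A to AMR would therefore need to buffer two input frames before it can emit one output frame, and the reverse direction emits two frames per input frame.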

【００１９】FIG. 23 compares the bit allocation of G.729A and AMR; for AMR it shows the 7.95 kbit/s mode, whose bit rate is closest to that of G.729A. As FIG. 23 makes clear, the number of algebraic codebook bits per subframe (= 17 bits) is the same, but the allocation of bits to all the other codes differs. In addition, G.729A vector-quantizes the adaptive codebook gain and the algebraic codebook gain together, so there is one kind of gain code per subframe, whereas AMR requires two kinds per subframe, an adaptive codebook gain and an algebraic codebook gain. As described above, G.729A, which is widely used in VoIP for voice communication over the Internet, and AMR, which has been adopted in mobile phone systems, share the same basic algorithm but differ in frame length and in the number of bits used to express each code.

【００２０】・Voice code conversion

With the spread of the Internet and mobile phones, voice call traffic between Internet users and mobile phone network users is expected to keep increasing. Voice communication between such different communication systems requires a voice code conversion device 53 in between, as shown in FIG. 24. That is, the voice code conversion device 53 converts a voice code encoded according to a first voice encoding method by the encoder 52 of one communication system 51 into a voice code of a second voice encoding method used in the other communication system 54. With this voice code conversion, the decoder 55 of the second voice encoding method in the communication system 54 can correctly reproduce the voice of user 1.

【００２１】Proposed code conversion techniques include (1) a tandem connection method that repeats decoding and encoding using the voice encoding method of each system, and (2) a method that decomposes a voice code into the element codes that compose it and converts each element code individually into a code of the other voice encoding method (see Japanese Patent Application No. 2001-75427). FIG. 25 illustrates the latter method. The encoder 71a of encoding method 1, built into the terminal 71, encodes the voice signal uttered by user A into a voice code of encoding method 1 and sends it to the transmission path 71b. The voice code conversion unit 74 converts the voice code of encoding method 1 input from the transmission path 71b into a voice code of encoding method 2 and sends it to the transmission path 72b. The decoder 72a of the terminal 72 decodes reproduced voice from the voice code of encoding method 2 input via the transmission path 72b, and user B can listen to this reproduced voice.

【００２２】Encoding method 1 encodes a voice signal with (1) a first LSP code obtained by quantizing LSP parameters derived from the linear prediction coefficients (LPC coefficients) obtained by frame-by-frame linear prediction analysis, (2) a first pitch lag code that specifies the output signal of an adaptive codebook for outputting a periodic excitation signal, (3) a first algebraic code (noise code) that specifies the output signal of an algebraic codebook (or noise codebook) for outputting a noise-like excitation signal, and (4) a first gain code obtained by quantizing a pitch gain representing the amplitude of the output signal of the adaptive codebook and an algebraic codebook gain representing the amplitude of the output signal of the algebraic codebook. Encoding method 2 encodes a voice signal with (1) a second LSP code, (2) a second pitch lag code, (3) a second algebraic code (noise code), and (4) a second gain code, each obtained by quantizing with a quantization method different from that of the first voice encoding method.

【００２３】The voice code conversion unit 74 has a code separation unit 74a, an LSP code conversion unit 74b, a pitch lag code conversion unit 74c, an algebraic code conversion unit 74d, a gain code conversion unit 74e, and a code multiplexing unit 74f. The code separation unit 74a separates the voice code of encoding method 1, input from the encoder 71a of terminal 1 via the transmission path 71b, into the codes of the components needed to reproduce the voice signal, namely (1) the LSP code, (2) the pitch lag code, (3) the algebraic code, and (4) the gain code, and inputs each to the corresponding code conversion unit 74b to 74e. The code conversion units 74b to 74e convert the input LSP code, pitch lag code, algebraic code, and gain code of voice encoding method 1 into the LSP code, pitch lag code, algebraic code, and gain code (pitch gain code and algebraic gain code) of voice encoding method 2, respectively, and the code multiplexing unit 74f multiplexes the converted codes of voice encoding method 2 and sends them to the transmission path 72b.
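The separate, convert, and multiplex structure of the voice code conversion unit 74 can be sketched as below. Everything concrete here is hypothetical: the bit-string representation, the field layouts, and the toy converters stand in for the real element converters, which would dequantize each parameter and requantize it with the second method's quantizer.

```python
def separate(bits, layout):
    """Split a bit string into element codes. layout: [(name, width), ...]."""
    codes, pos = {}, 0
    for name, width in layout:
        codes[name] = bits[pos:pos + width]
        pos += width
    if pos != len(bits):
        raise ValueError("layout does not match the code length")
    return codes


def multiplex(codes, layout):
    """Concatenate element codes in layout order, checking field widths."""
    out = ""
    for name, width in layout:
        field = codes[name]
        if len(field) != width:
            raise ValueError("wrong width for field " + name)
        out += field
    return out


def convert_voice_code(bits, layout1, layout2, converters):
    """Separate (74a), convert each element (74b-74e), then multiplex (74f)."""
    codes1 = separate(bits, layout1)
    codes2 = {name: converters[name](codes1[name]) for name, _ in layout1}
    return multiplex(codes2, layout2)
```

Because each element code is converted on its own, the voice is never fully decoded and re-encoded, which is the difference from the tandem connection method.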

【００２４】・Data embedding technology

As computers and the Internet have spread in recent years, "digital watermarking", which embeds special data in multimedia content (still images, video, audio, voice, and so on), has attracted attention. Digital watermarking exploits the characteristics of human perception to embed arbitrary additional information in the multimedia content itself, such as images, video, or voice, with almost no effect on quality. It is most often used for copyright protection, embedding the name of the creator or seller in the content to prevent unauthorized copying or tampering, but it is also used to embed related or supplementary information about the content to make the content more convenient to use.

【００２５】In the field of voice communication as well, attempts have been made to embed such arbitrary information in a voice code and transmit it. FIG. 26 is a conceptual diagram of a voice communication system to which data embedding technology is applied. When encoding the input voice SP into a voice code, the encoder 81 embeds an arbitrary non-voice data sequence DT in the voice code SCD and transmits it to the decoder 82. Because the data is embedded in the voice code itself without changing the voice code format, the amount of information in the voice code does not increase. The decoder 82 reads out the arbitrary data sequence embedded in the voice code and applies ordinary decoder processing to the voice code to output reproduced voice SP′. Because the embedding is done so as to have almost no effect on the quality of the reproduced voice SP′, the reproduced voice is almost indistinguishable from the case with no embedding. This configuration makes it possible to transmit arbitrary data alongside the voice without increasing the amount of transmission. Moreover, to a third party who does not know that data is embedded, the exchange appears to be nothing more than ordinary voice communication.

【００２６】There are various methods of embedding data. In particular, for high-compression speech coding methods based on CELP, several methods of embedding arbitrary information in the encoded voice code have been proposed. For example, for a speech coding method that encodes using an algebraic codebook and an adaptive codebook, a technique has been proposed that embeds arbitrary data in the pitch lag code and the algebraic code. This technique embeds an arbitrary data sequence, according to a certain rule, in the codes quantized with the adaptive codebook or the algebraic codebook (the pitch lag code and the algebraic code). Consider the pitch lag code, which corresponds to the pitch excitation, and the algebraic code, which corresponds to the noise excitation. Their gains (pitch gain and algebraic codebook gain) can be regarded as factors indicating the contribution of each code: when a gain is small, the contribution of the corresponding code is small. The gain is therefore defined as a decision parameter, and when the gain is at or below a certain threshold, the contribution of the corresponding code is judged to be small and the index of that code is replaced with an arbitrary data sequence. This makes it possible to embed arbitrary data while keeping the effect of the replacement small.
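The gain-based rule described above can be illustrated with a minimal Python sketch. It is not the method of any particular codec: the `Subframe` type, the field names, and the value of `GAIN_THRESHOLD` are assumptions made for illustration, since the text only states that a threshold exists.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

GAIN_THRESHOLD = 0.3  # hypothetical value; the text does not give one


@dataclass
class Subframe:
    pitch_gain: float     # dequantized pitch gain (the decision parameter)
    pitch_lag_code: int   # index of the pitch lag code


def embed(sub: Subframe, data_bits: int) -> Tuple[Subframe, bool]:
    """Replace the pitch lag index with data when the gain is small enough."""
    if sub.pitch_gain <= GAIN_THRESHOLD:
        return Subframe(sub.pitch_gain, data_bits), True
    return sub, False  # contribution too large: leave the code untouched


def extract(sub: Subframe) -> Optional[int]:
    """Apply the same gain test; return the embedded data or None."""
    if sub.pitch_gain <= GAIN_THRESHOLD:
        return sub.pitch_lag_code
    return None
```

Because the gain itself is never modified, the receiving side can repeat the same threshold test and decide, without any side information, whether a given subframe carries data.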

【００２７】Communication between communication systems that apply the data embedding technique described above is expected to increase. The voice code conversion device will then need to convert voice codes in which data has been embedded.

【００２８】[0028]

[Problems to be Solved by the Invention]
・Problem 1
FIG. 27 illustrates the principle of code conversion. It shows the case where encoded data Code1 of the first encoding method is converted into encoded data Code2 of the second encoding method. The code conversion unit 91 has a first quantization table 92, used for encoding by the first encoding method, and a second quantization table 93, used for encoding by the second encoding method. The first quantization table 92 and the second quantization table 93 differ in table size and table values, but to simplify the description FIG. 27 shows the case where both tables have the same size of 2 bits.

【００２９】In FIG. 27, the encoded data Code1 of the first encoding method input to the code conversion unit 91 ("01" in the figure) represents an index number of the first quantization table 92. The code conversion unit 91 therefore selects from the second quantization table 93 the value with the smallest error relative to the value of the first quantization table 92 that corresponds to the input Code1 (2.0 in the figure), and outputs the corresponding index number of the second quantization table 93 ("10" in the figure) as the encoded data Code2 of the second encoding method. In this way, the code conversion unit 91 compares the quantization tables of the conversion source and the conversion destination and associates index numbers so that the error is minimized. Now consider the case where the data sequence of the input code Code1 is arbitrary data (assume "01") embedded by the embedding method described above. Because the code conversion unit 91 performs the same conversion processing as before, it converts the input data sequence "01" into "10". The embedded data sequence thus changes from "01" to "10" and is not preserved, so the decoder of the second encoding method on the receiving side cannot correctly restore the embedded data sequence. As described above, with the conventional code conversion method, when an arbitrary data sequence is embedded in the input code, the embedded data sequence cannot be preserved, and as a result the embedded data is corrupted in the code conversion device.
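The failure described for FIG. 27 can be reproduced with a small Python sketch of nearest-value index mapping. Only the entry 2.0 at index "01" of the first table and the resulting index "10" come from the text; every other table value, and the names `TABLE1`, `TABLE2`, and `convert_index`, are assumptions chosen for illustration.

```python
# 2-bit quantization tables. Only TABLE1[0b01] == 2.0 is taken from the
# text; all other entries are hypothetical.
TABLE1 = [1.0, 2.0, 3.0, 4.0]   # first encoding method (table 92)
TABLE2 = [0.5, 1.2, 2.1, 3.5]   # second encoding method (table 93)


def convert_index(code1: int) -> int:
    """Map a table-1 index to the table-2 index whose value is closest."""
    target = TABLE1[code1]
    return min(range(len(TABLE2)), key=lambda i: abs(TABLE2[i] - target))


# If "01" is embedded data rather than a quantized parameter, the
# conversion still treats it as an index and rewrites it.
embedded = 0b01
converted = convert_index(embedded)
print(format(embedded, "02b"), "->", format(converted, "02b"))
```

The mapping is correct for a real parameter index, since 2.1 is the closest available value to 2.0, but it is exactly this minimum-error substitution that destroys a bit pattern carried in the same field.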

【００３０】・Problem 2
Communication systems that handle multimedia information such as data communication in addition to voice communication, as typified by third-generation mobile phone systems, are expected to spread. Communication will therefore take place between a conventional communication system that has only a voice line and a communication system that has a voice line and a separate data line. In that case, voice communication between users is possible over the voice line if a conventional voice code conversion device mutually converts the voice codes of the two communication systems. Data communication between users, however, is impossible because one side has no data line. Thus, between a communication system that has only a voice line and a communication system that has a data line in addition to a voice line, users can carry out only voice communication.

【００３１】In view of the above, a first object of the present invention is to make it possible to convert a voice code encoded by a first encoding method into a voice code of a second encoding method in which the same data is embedded, without damaging the data embedded in the original voice code. A second object of the present invention is to enable both voice communication and data communication between a communication system that has only a voice line and a communication system that has a data line in addition to a voice line.

【００３２】[0032]

[Means for Solving the Problems] A first aspect of the present invention is a voice code conversion method and device for converting a first voice code, obtained by encoding input voice by a first voice encoding method, into a second voice code of a second voice encoding method. In the first voice code conversion device of the present invention, when arbitrary data is embedded in the received first voice code, the code conversion unit converts the first voice code into the second voice code, the embedded data extraction unit extracts the embedded data from the first voice code, and the data embedding unit embeds the extracted data in the second voice code obtained by the conversion and sends it out. By first extracting the embedded data from the source voice code of the first encoding method and then re-embedding it in the voice code of the second encoding method after code conversion, the voice code can be converted into a voice code of the second encoding method carrying the same data, without damaging the data embedded in the voice code of the first encoding method.

【００３３】Suppose that the transmission source embeds data in the first voice code by replacing part of the first voice code with the data whenever a data embedding condition is satisfied. In that case, the embedded data extraction unit refers to the dequantized value of a predetermined element code of the first voice code received from the transmission source and monitors whether the data embedding condition is satisfied. If it is, the unit extracts the embedded data from the first voice code and stores it in a data holding unit. The data embedding unit refers to the dequantized value of a predetermined element code of the second voice code converted from the first voice code and monitors whether the data embedding condition is satisfied. If it is, the unit embeds data in the second voice code by replacing part of the second voice code with the data stored in the data holding unit. When embedding is controlled adaptively in this way at both the conversion source and the conversion destination, the timing of data extraction and the timing of embedding can differ, either because the embedding control methods of the two encoding methods differ or because of conversion error in the conventional voice code conversion unit. The data holding unit absorbs this timing difference, so the voice code can be converted into a voice code of the second encoding method carrying the same data without damaging the data embedded in the voice code of the first encoding method.

【００３４】In the second voice code conversion device of the present invention, the receiving unit receives the first voice code and data separately from the transmission source, the code conversion unit converts the first voice code into the second voice code, and the data embedding unit embeds the data in the second voice code obtained by the conversion and transmits it to the destination. Thus, when the voice code of the first encoding method and the data arrive from the source communication system separately, over separate lines or over a multiplexed line, the voice code conversion unit embeds the data in the converted voice code of the second encoding method, which makes it possible to transmit both to the conversion destination over a voice line alone.

【００３５】In the third voice code conversion device of the present invention, when arbitrary data is embedded in the received first voice code, the code conversion unit converts the first voice code into the second voice code, the embedded data extraction unit extracts the embedded data from the first voice code, and the transmission unit transmits the second voice code obtained by the conversion and the extracted data to the destination separately, over separate lines or over a multiplexed line. This makes it possible to separate the voice information and the data information carried over the source voice line and transmit them over the destination's voice line and data line.

【００３６】[0036]

[Embodiments of the Invention] (A) Outline of the present invention
(a) First system
FIG. 1 is a conceptual diagram of the first system of the present invention. It shows the case where a voice code SP1 of the first encoding method in which arbitrary data DT is embedded is converted into a voice code SP2 of the second encoding method in which the same data DT is embedded. A voice code conversion device 103 is provided between the communication system 101 of the first encoding method and the communication system 102 of the second encoding method. When encoding the input voice SP1, the encoder 104 of the first encoding method in the communication system 101 embeds an arbitrary data sequence DT other than voice data in the voice code SCD1 and sends it out on the transmission line 105. Because the encoder 104 embeds the data in the voice code itself without changing the voice code format, the amount of information in the voice code does not increase.

【００３７】On receiving from the encoder 104 the voice code SCD1 encoded according to the first voice encoding method, the voice code conversion device 103 converts it into a voice code SCD2 of the second voice encoding method used in the communication system 102 and sends it out on the transmission line 106. In doing so, the voice code conversion device 103 performs the voice code conversion without damaging the embedded data. The decoder 107 of the second encoding method in the communication system 102 reads out and outputs the arbitrary data sequence DT embedded in the voice code SCD2, applies normal decoder processing to the voice code, and outputs the reproduced voice SP2. Because the embedding is performed so as to have almost no effect on the quality of the reproduced voice SP2, the reproduced voice is almost indistinguishable from that obtained without embedding.

【００３８】FIG. 2 is a block diagram of the code conversion device 103 in the first system of the present invention. The voice code SCD1, which was encoded at the conversion source according to the first encoding method and has the data DT embedded in it, is input frame by frame, in order, to the code conversion unit 111 and the embedded data extraction unit 112. The code conversion unit 111 has the same configuration as the conventional one shown in FIG. 25 and converts the voice code SCD1 of the first encoding method into a voice code SCD2′ of the second encoding method. The embedded data extraction unit 112 extracts the data DT embedded in the voice code SCD1 and outputs it to the data embedding unit 113. The data extraction method used by the embedded data extraction unit 112 is the same as that of the decoder of the first encoding method. The data embedding unit 113 receives the voice code SCD2′ of the second encoding method converted by the code conversion unit 111 and the data DT extracted from the voice code SCD1, embeds the data DT in the voice code SCD2′ frame by frame, and outputs the result as the voice code SCD2. The data embedding method used by the data embedding unit 113 is the same as that of the encoder of the second encoding method.
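The FIG. 2 data flow (convert, extract in parallel, then re-embed) can be sketched in Python as follows. This is a toy model under stated assumptions, not the actual codec processing: a frame is reduced to two 2-bit fields, `INDEX_MAP` is a hypothetical index mapping standing in for the code conversion unit 111, and the data is assumed to occupy the whole `slot_code` field in both encoding methods.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    param_code: int  # stands in for the codes that carry speech parameters
    slot_code: int   # the code whose index has been replaced by data DT


# Hypothetical minimum-error index mapping from method 1 to method 2.
INDEX_MAP = {0b00: 0b00, 0b01: 0b10, 0b10: 0b11, 0b11: 0b11}


def convert(scd1: Frame) -> Frame:
    """Code conversion unit 111: maps every code, including the data slot."""
    return Frame(INDEX_MAP[scd1.param_code], INDEX_MAP[scd1.slot_code])


def extract_data(scd1: Frame) -> int:
    """Embedded data extraction unit 112: read DT from the source code."""
    return scd1.slot_code


def embed_data(scd2_prime: Frame, dt: int) -> Frame:
    """Data embedding unit 113: overwrite the slot in the converted code."""
    return scd2_prime._replace(slot_code=dt)


def transcode(scd1: Frame) -> Frame:
    scd2_prime = convert(scd1)        # SCD1 -> SCD2'
    dt = extract_data(scd1)           # taken from SCD1, before conversion
    return embed_data(scd2_prime, dt)  # SCD2' + DT -> SCD2
```

For example, `transcode(Frame(0b01, 0b01))` yields `Frame(0b10, 0b01)`: the parameter field is converted as usual, while the data field comes out unchanged because it is copied from the source code rather than from the converted one.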

【００３９】FIG. 3 is a block diagram of another configuration of the code conversion device 103 in the first system of the present invention; parts identical to those of the code conversion device of FIG. 2 are given the same reference numerals. This code conversion device 103 adaptively extracts the embedded data DT from the voice code SCD1 and embeds the data DT in the voice code SCD2′ according to the properties of the voice code. For example, as explained in the description of the prior art, when a gain (pitch gain or algebraic codebook gain) is at or below a certain threshold, the encoder of the first encoding method regards the contribution of the corresponding code (pitch lag code or algebraic code) to the voice as small and replaces the index of that code with an arbitrary data sequence DT. Consequently, depending on the gain, the voice code SCD1 of the first encoding method contains sections in which data is embedded and sections in which it is not.

【００４０】The embedding determination unit 121 determines, for each frame or subframe and on the basis of the gain of the voice code SCD1, whether other data is embedded in the code. If it determines that data is embedded, it closes the switch SW1 so that the voice code SCD1 is input to the embedded data extraction unit 112. The embedded data extraction unit 112 extracts the data from the voice code SCD1 and inputs it to the data holding unit 122, which is configured as a FIFO (first-in, first-out) buffer.

【００４１】The embedding determination unit 123 determines, for each frame or subframe and on the basis of the gain of the voice code SCD2′ of the second encoding method output from the code conversion unit 111, whether to embed data in that voice code. If it determines that data is to be embedded, it closes the switch SW2, and the data holding unit 122 inputs the data it holds, oldest first, to the data embedding unit 113 in frame or subframe units. As a result, the data embedding unit 113 embeds the data DT output from the data holding unit 122 in the voice code SCD2′ of the second encoding method frame by frame and outputs the result as the voice code SCD2.

【００４２】Each embedding determination may use the same method as the corresponding encoding method. When the embedding determination units 121 and 123 use different determination methods, the switches SW1 and SW2 do not necessarily close at the same time. Even when the determination methods are the same, the same thing happens because the voice code differs before and after conversion owing to conversion error in the code conversion unit 111. The data holding unit 122 of FIG. 3 absorbs this difference in switching timing and prevents data from being lost.

【００４３】That is, when the conversion destination is not in an embedding target section, the data holding unit 122 temporarily holds the data DT extracted from the first voice code SCD1. Conversely, when the conversion source is not in an embedding target section, data held in the data holding unit 122 is taken out and embedded in the second voice code SCD2′. Furthermore, when the size of the code data available for embedding at the conversion source is larger than at the conversion destination, only as much data as can be embedded is embedded, and the remainder is temporarily held in the data holding unit 122. When the amount of data held in the data holding unit 122 becomes low, embedding at the conversion destination is suspended until the amount of held data recovers. In this way the difference in switching timing is absorbed and data loss is prevented.
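The role of the FIFO data holding unit 122 in decoupling the two switches can be sketched in Python as below. The function name `relay` and the per-subframe `(extracted, can_embed)` input format are assumptions made for illustration. The sketch covers only the timing mismatch between SW1 and SW2; it does not model differing slot sizes or the rule that suspends embedding when the buffer runs low.

```python
from collections import deque


def relay(frames):
    """Pass embedded data through a FIFO, one subframe at a time.

    frames: iterable of (extracted, can_embed) pairs.
      extracted  data word taken from SCD1, or None when SW1 is open
      can_embed  True when SW2 closes for the converted code SCD2'
    Returns the word embedded in each output subframe (None if nothing
    was embedded) and the words still waiting in the buffer.
    """
    holding = deque()   # data holding unit 122 (first-in, first-out)
    embedded = []
    for extracted, can_embed in frames:
        if extracted is not None:
            holding.append(extracted)            # SW1 closed: store data
        if can_embed and holding:
            embedded.append(holding.popleft())   # SW2 closed: oldest first
        else:
            embedded.append(None)                # nothing embedded this time
    return embedded, holding


# SW1 closes in subframes 0 and 1, SW2 in subframes 1, 2 and 3.
out, left = relay([(0xA, False), (0xB, True), (None, True), (None, True)])
```

In this run the word extracted in subframe 0 waits in the buffer until subframe 1, the second word goes out in subframe 2, and subframe 3 carries no data because the buffer is empty, so both words arrive in their original order.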

【００４４】(b) Second system
FIG. 4 is a conceptual diagram of the second system of the present invention. It shows the case where the source communication system 101 has a voice line 105 and a data line 108, and the destination communication system 102 has only a voice line 106. As shown in the figure, the encoder 104 of the first encoding method in the communication system 101 encodes the input voice SP1 into a voice code SCD1 and sends it out on the voice line 105, and sends an arbitrary data sequence DT other than the voice code out on the data line 108. In practice, the voice code SCD1 and the data sequence DT are time-division multiplexed onto a multiplexed line, separated at an appropriate point, and input to the voice code conversion device 103. The voice code conversion device 103 thus receives the voice code SCD1 from the voice line 105 and the data DT from the data line 108. It converts the voice code SCD1 of the first encoding method into a voice code of the second encoding method, embeds the data DT in that voice code, and transmits the result as the voice code SCD2 to the destination communication system 102 over the voice line 106.

【００４５】The decoder 107 of the second encoding method in the communication system 102 reads out and outputs the arbitrary data sequence DT embedded in the voice code, applies normal decoder processing to the voice code, and outputs the reproduced voice SP2. Because the embedding is performed so as to have almost no effect on the quality of the reproduced voice SP2, the reproduced voice is almost indistinguishable from that obtained without embedding. FIG. 5 is a block diagram of the code conversion device 103 in the second system of the present invention; parts identical to those of the code conversion device in the first system of FIG. 2 are given the same reference numerals. It differs in that the data DT is input over a path separate from the voice code SCD1, and in that there is no embedded data extraction unit, so the data DT to be embedded is input directly to the data embedding unit 113.

【００４６】The source communication system time-division multiplexes the voice code SCD1, encoded according to the first encoding method, and the data DT and sends them out on the multiplexed line 200. The line separation unit 201 separates the voice code SCD1 and the data DT and inputs them to the code conversion device 103 via the voice line 105 and the data line 108. The data embedding unit 113 receives the voice code SCD2′ of the second encoding method converted by the code conversion unit 111 and the data DT, embeds the data DT in the voice code SCD2′ frame by frame, and sends the result out on the voice line 106 as the voice code SCD2.

【００４７】FIG. 6 is a block diagram of another configuration of the code conversion device 103 in the second system of the present invention; parts identical to those of the code conversion device in the first system of FIG. 3 are given the same reference numerals. It differs from FIG. 3 in that the data DT is input over a path separate from the voice code SCD1, and in that there is no embedding determination unit on the input side and no embedded data extraction unit, so the data DT to be embedded is input directly to the data holding unit 122. The source communication system time-division multiplexes the voice code SCD1, encoded according to the first encoding method, and the data DT and sends them out on the multiplexed line 200. The line separation unit 201 separates the voice code SCD1 and the data DT and inputs them to the code conversion device 103 via the voice line 105 and the data line 108.

【００４８】The code conversion device 103 adaptively embeds the data DT in the voice code SCD2′ according to the properties of the voice code. That is, the code conversion unit 111 converts the voice code SCD1 of the first encoding method into the voice code SCD2′ of the second encoding method, and the data holding unit 122, configured as a FIFO buffer, holds the input data DT. The embedding determination unit 123 determines, for each frame or subframe and on the basis of the voice code SCD2′ of the second encoding method output from the code conversion unit 111, whether to embed data in that voice code. If it determines that data is to be embedded, it closes the switch SW2, and the data holding unit 122 inputs the data it holds, oldest first, to the data embedding unit 113 in frame or subframe units. As a result, the data embedding unit 113 embeds the data DT output from the data holding unit 122 in the voice code SCD2′ of the second encoding method frame by frame and sends the result out on the voice line 106 as the voice code SCD2.

【００４９】(c) Third system
FIG. 7 is a conceptual diagram of the third system of the present invention. Contrary to the second system, it shows the case where the source communication system 101 has only a voice line 105 and the destination communication system 102 has a voice line 106 and a data line 109. The encoder 104 of the first encoding method in the communication system 101 encodes the input voice SP1, embeds an arbitrary data sequence DT other than voice data in the code, and sends the result out on the voice line 105 as the voice code SCD1. The voice code conversion device 103 converts the voice code SCD1 of the first encoding method into a voice code SCD2 of the second encoding method, extracts the data DT embedded in the voice code SCD1, and sends the voice code SCD2 and the data DT out on the lines 106 and 109, respectively. The communication system 102 outputs the data received via the data line 109, and the decoder 107 decodes the voice code SCD2 and outputs the reproduced voice SP2. In practice, the voice code SCD2 and the data DT are time-division multiplexed at a suitable point, transmitted to the communication system 102, and separated there.

[0050] FIG. 8 is a block diagram of the code conversion device 103 in the third system of the present invention. Parts identical to those of the code conversion device of the first system in FIG. 2 carry the same reference numerals. The differences are that there is no data embedding unit, so the data DT extracted by the embedded data extraction unit 112 is not embedded in the voice code SCD2 of the second encoding method output from the code conversion unit 111, and that the data DT is sent separately from the voice code SCD2. The voice code SCD1, which was encoded at the conversion source according to the first encoding method and has the data DT embedded in it, is input frame by frame, in order, to both the code conversion unit 111 and the embedded data extraction unit 112. The code conversion unit 111 converts the voice code SCD1 of the first encoding method into the voice code SCD2 of the second encoding method and sends it to the voice line 106. The embedded data extraction unit 112 extracts the data DT embedded in the voice code SCD1 and sends it to the data line 109. The line multiplexing unit 203 time-division multiplexes the voice code SCD2 and the data DT received via the voice line 106 and the data line 109 and sends them to the multiplex line 204.

[0051] FIG. 9 is another block diagram of the code conversion device 103 in the third system of the present invention. Parts identical to those of the code conversion device of the first system in FIG. 3 carry the same reference numerals. It differs from FIG. 3 in that there is no data holding unit, no embedding determination unit on the output side, and no data embedding unit; the data DT is not embedded in the voice code SCD2 output from the code conversion unit 111; and the data DT is sent separately from the voice code SCD2.

[0052] When a gain (pitch gain or algebraic codebook gain) is at or below a certain threshold, the encoder of the transmitting-side communication system regards the contribution of the corresponding code (pitch lag code or algebraic code) to the speech as small and replaces the index of that code with an arbitrary data series DT. As a result, the voice code SCD1 of the first encoding method contains sections in which data is embedded and sections in which it is not. The embedding determination unit 121 determines, per frame or per subframe, from the gain obtained from the voice code SCD1, whether other data is embedded in the code. When it determines that data is embedded, it closes the switch SW1 and inputs the voice code SCD1 to the embedded data extraction unit 112. The embedded data extraction unit 112 extracts the embedded data from the voice code SCD1 and sends it to the data line 109. In parallel, the voice code conversion unit 111 converts the voice code SCD1 of the first encoding method into the voice code SCD2 of the second encoding method and sends it to the voice line 106. The line multiplexing unit 203 time-division multiplexes the voice code SCD2 and the data DT received via the voice line 106 and the data line 109 and sends them to the multiplex line 204.
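The gain-based determination and extraction described in this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Subframe` type, the function names, and the numeric thresholds used in the example are all assumptions, since the source gives no threshold value or data layout.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Subframe:
    """Hypothetical container for one subframe of a decoded voice code."""
    gain: float  # dequantized pitch gain or algebraic codebook gain
    index: int   # pitch lag code or algebraic code index


def has_embedded_data(sf: Subframe, threshold: float) -> bool:
    # Mirrors the encoder-side rule: a gain at or below the threshold
    # means the index was replaced by data.
    return sf.gain <= threshold


def extract_embedded(subframes: Sequence[Subframe], threshold: float) -> List[int]:
    # Role of switch SW1 plus extraction unit 112: only low-gain
    # subframes are passed on as data.
    return [sf.index for sf in subframes if has_embedded_data(sf, threshold)]
```

For instance, with a threshold of 0.10, subframes whose gains are 0.05, 0.80, and 0.10 yield the indices of the first and third subframes as data.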

[0053] (B) Embodiments of the first system
(a) First embodiment
FIG. 10 is a block diagram of the code conversion device in the first system of the present invention and shows a configuration that performs embedding control. The first embodiment is an example of converting an AMR voice code in which arbitrary data is embedded into a G.729A voice code without damaging the embedded data. In the first embodiment, the conversion-source AMR encoder embeds arbitrary data in all 17 bits per subframe assigned to the algebraic code if the algebraic codebook gain is smaller than a set value, and places the original algebraic code data there if the algebraic codebook gain is larger than the set value. Likewise, on the conversion-destination G.729A side, data is embedded in all 17 bits assigned to the algebraic code depending on the algebraic codebook gain.

[0054] In FIG. 10, when the line data bst1(m), which is the AMR encoder output for the m-th frame, is input to the code separation unit 114 through terminal 1, the code separation unit 114 separates the line data bst1(m) into the AMR element codes (LSP code 1, pitch lag code 1, pitch gain code 1, algebraic code 1, and algebraic gain code 1). These element codes are input to the respective code conversion units in the code conversion unit 111 (LSP code conversion unit 111a, pitch lag code conversion unit 111b, pitch gain code conversion unit 111c, algebraic gain code conversion unit 111d, and algebraic code conversion unit 111e). Each of the code conversion units 111a to 111e converts a code of the first encoding method into a code of the second encoding method. Their operation is the same as in the prior art and is not described here. Only the parts related to data embedding are described below.

[0055] The embedding determination unit 121 obtains the dequantized algebraic gain value (algebraic gain) from the algebraic gain code 1 and switches the switch SW1 according to that gain value. That is, when the AMR algebraic gain value is smaller than a certain threshold, it determines that embedded data is present, closes the switch SW1, and inputs the algebraic code 1 to the embedded data extraction unit 112. The embedded data extraction unit 112 extracts the embedded data Dcode contained in the algebraic code and outputs it to the data holding unit 122. In this embodiment, data is embedded in the entire AMR algebraic code (17 bits per subframe), so the 17-bit data series is cut out as it is as the embedded data Dcode. The data holding unit 122, which has a FIFO structure, stores and holds the input data series in order of arrival.

[0056] Meanwhile, the embedding determination unit 123 obtains the dequantized algebraic gain value from the converted G.729A algebraic gain code 2 input from the algebraic gain code conversion unit 111d and switches the switch SW2 according to that gain value. That is, when the G.729A algebraic gain value is smaller than a certain threshold, it decides to embed data, closes the switch SW2, and the data from the data holding unit 122 is input to the data embedding unit 113. In this embodiment, data is embedded in the entire G.729A algebraic code (17 bits per subframe), so the data holding unit 122 inputs 17 bits of data to the data embedding unit 113. The data embedding unit 113 embeds the input data in the 17 bits assigned to the algebraic code 2. In other words, all 17 bits of the G.729A algebraic code are replaced with the 17-bit data series.
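The interplay of the two embedding determination units and the FIFO data holding unit can be sketched as below. This is a hedged illustration under stated assumptions: the class name `EmbeddedDataRelay`, its method, and the threshold arguments are hypothetical, the codes are treated as plain 17-bit integers, and the case where the destination gain is low but no data is held is handled by passing the converted code through, which the source does not specify.

```python
from collections import deque

ALGEBRAIC_CODE_MASK = 0x1FFFF  # 17 bits per subframe, as stated in the text


class EmbeddedDataRelay:
    """Hypothetical model of units 121, 112, 122, 123 and 113 for one stream."""

    def __init__(self, src_threshold: float, dst_threshold: float) -> None:
        self.src_threshold = src_threshold
        self.dst_threshold = dst_threshold
        self.held = deque()  # data holding unit 122 (FIFO)

    def process_subframe(self, src_gain: float, src_code: int,
                         dst_gain: float, dst_code: int) -> int:
        # Units 121 and 112: a low source gain means the whole 17-bit
        # algebraic code is data, so it is cut out and held.
        if src_gain < self.src_threshold:
            self.held.append(src_code & ALGEBRAIC_CODE_MASK)
        # Units 123 and 113: a low destination gain means the converted
        # algebraic code may be replaced by the oldest held data.
        if dst_gain < self.dst_threshold and self.held:
            return self.held.popleft()
        # Otherwise the converted code passes through unchanged (assumption).
        return dst_code
```

Because extraction and embedding are decided independently, data extracted in one subframe may leave the device several subframes later; the queue is what keeps it intact in the meantime.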

[0057] The algebraic code 2 in which the data has been embedded is multiplexed with the other element codes by the code multiplexing unit 115 and output from terminal 2 as the line data bst2(n) of the n-th G.729A frame, which contains the embedded data. According to the first embodiment, when arbitrary data is embedded in the algebraic code of the AMR voice code bst1(m), the voice code can be converted into a voice code bst2(n) in which that data is embedded in the G.729A algebraic code, without damaging the embedded data. This makes it possible to perform data communication in addition to voice communication between AMR and G.729A without changing the voice format. The conversion from AMR to G.729A has been described above, but the configuration of the parts of the first embodiment related to data extraction and data embedding can also be applied to the reverse conversion from G.729A to AMR.

[0058] (b) Second embodiment
FIG. 11 is another block diagram of the code conversion device in the first system of the present invention and shows a configuration that performs embedding control. Parts identical to those of the first embodiment in FIG. 10 carry the same reference numerals. The difference is as follows. In the first embodiment, arbitrary data is embedded in all 17 bits per subframe assigned to the algebraic code if the algebraic gain is smaller than a set value. In the second embodiment, arbitrary data is embedded in all 8 bits or 5 bits per subframe assigned to the pitch lag code if the pitch gain is smaller than a set value.

[0059] The embedding determination unit 121 obtains the dequantized pitch gain value (pitch gain) from the pitch gain code 1 and switches the switch SW1 according to that gain value. That is, when the AMR pitch gain value is smaller than a certain threshold, it determines that embedded data is present, closes the switch SW1, and inputs the pitch lag code 1 to the embedded data extraction unit 112. The embedded data extraction unit 112 extracts the embedded data Dcode contained in the pitch lag code and outputs it to the data holding unit 122. In this embodiment, data is embedded in the entire AMR pitch lag code (8 bits or 6 bits per subframe), so the 8-bit or 6-bit data series is cut out as it is as the embedded data Dcode. The data holding unit 122, which has a FIFO structure, stores and holds the input data series in order of arrival.

[0060] Meanwhile, the embedding determination unit 123 obtains the dequantized pitch gain value from the converted G.729A pitch gain code 2 input from the pitch gain code conversion unit 111c and switches the switch SW2 according to that gain value. That is, when the G.729A pitch gain value is smaller than a certain threshold, it decides to embed data, closes the switch SW2, and the data from the data holding unit 122 is input to the data embedding unit 113. In this embodiment, data is embedded in the entire G.729A pitch lag code (8 bits or 5 bits per subframe), so the data holding unit 122 inputs 8 bits or 5 bits of data, depending on the subframe, to the data embedding unit 113. The data embedding unit 113 embeds the input data in the 8 bits or 5 bits assigned to the pitch lag code 2.
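Since the extracted pitch lag fields (8 or 6 bits) and the fields available for embedding (8 or 5 bits) differ in width, one way to picture the FIFO data holding unit is as a queue of individual bits. The sketch below is an assumption about how such a holder could be organized; the source only states that the unit is a FIFO and that it supplies 8 or 5 bits depending on the subframe. The class and method names are hypothetical.

```python
from collections import deque


class BitFifo:
    """Hypothetical bit-granular FIFO standing in for data holding unit 122."""

    def __init__(self) -> None:
        self._bits = deque()

    def push(self, value: int, width: int) -> None:
        # Store the field most-significant bit first so order is preserved.
        for shift in range(width - 1, -1, -1):
            self._bits.append((value >> shift) & 1)

    def available(self) -> int:
        return len(self._bits)

    def pop(self, width: int) -> int:
        # Return the oldest `width` bits as an integer.
        if width > len(self._bits):
            raise ValueError("not enough held bits")
        value = 0
        for _ in range(width):
            value = (value << 1) | self._bits.popleft()
        return value
```

With this arrangement, an 8-bit and a 6-bit field pushed from the source side can be drained as an 8-bit and a 5-bit field on the destination side, with the remaining bit waiting for the next opportunity.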

[0061] The pitch lag code 2 in which the data has been embedded is multiplexed with the other element codes by the code multiplexing unit 115 and output from terminal 2 as the line data bst2(n) of the n-th G.729A frame, which contains the embedded data. According to the second embodiment, when arbitrary data is embedded in the pitch lag code of the AMR voice code bst1(m), the voice code can be converted into a voice code bst2(n) in which that data is embedded in the G.729A pitch lag code, without damaging the embedded data. This makes it possible to perform data communication in addition to voice communication between AMR (7.95 kbps) and G.729A without changing the voice format. The conversion from AMR to G.729A has been described above, but the configuration of the parts related to data extraction and data embedding can also be applied to the reverse conversion from G.729A to AMR and to other code conversions.

[0062] (c) Third embodiment
FIG. 12 is another block diagram of the code conversion device in the first system of the present invention and shows a configuration that does not perform embedding control. The third embodiment is an example of converting an AMR voice code into a G.729A voice code without damaging the embedded data. As shown in FIGS. 21 to 23, one frame of the AMR voice code is 20 msec long and consists of four 5 msec subframes, each of which has a 17-bit algebraic code. One frame of the G.729A voice code is 10 msec long and consists of two 5 msec subframes, each of which has a 17-bit algebraic code. In both AMR and G.729A, these 17 bits express the pulse positions m0 to m3 and the polarities s0 to s3 of four pulse systems (see Table 1). The bit allocation for the pulse positions m0 to m3 and the polarities s0 to s3 is as shown in FIG. 13.
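The frame arithmetic in this paragraph implies that one 20 msec AMR frame spans two 10 msec G.729A frames, and that subframes line up one to one. A small sketch of that index mapping follows; it assumes both streams start at the same instant and that frame numbering starts at zero, neither of which the source states, and the function name is hypothetical.

```python
from typing import Tuple

AMR_SUBFRAMES_PER_FRAME = 4    # 20 msec frame / 5 msec subframe
G729A_SUBFRAMES_PER_FRAME = 2  # 10 msec frame / 5 msec subframe


def g729a_position(amr_frame: int, amr_subframe: int) -> Tuple[int, int]:
    """Map an AMR (frame, subframe) pair to the G.729A (frame, subframe)
    covering the same 5 msec interval, assuming aligned streams."""
    if not 0 <= amr_subframe < AMR_SUBFRAMES_PER_FRAME:
        raise ValueError("AMR subframe index must be 0..3")
    absolute_subframe = amr_frame * AMR_SUBFRAMES_PER_FRAME + amr_subframe
    return divmod(absolute_subframe, G729A_SUBFRAMES_PER_FRAME)
```

Under these assumptions, subframe 3 of AMR frame 0 corresponds to subframe 1 of G.729A frame 1.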

[0063] In the third embodiment, the conversion-source AMR encoder embeds the data Dcode in, for example, the 5 bits m3 and s3 that indicate the pulse position and polarity of the fourth pulse system. The embedded data extraction unit 112 always extracts the embedded data Dcode contained in the algebraic code 1 and inputs it to the data embedding unit 113. The data embedding unit 113 embeds the input data Dcode in the 5 bits m3 and s3 of the 17 bits assigned to the algebraic code 2. The algebraic code 2 in which the data has been embedded is multiplexed with the other element codes by the code multiplexing unit 115 and output from terminal 2 as the line data bst2(n) of the n-th G.729A frame, which contains the embedded data.
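The fixed-position embedding into the m3 and s3 fields can be illustrated with simple bit masking. FIG. 13 is not reproduced here, so the field offsets below are an assumption made only for illustration: the 17-bit word is taken to hold m0, m1, m2 (3 bits each) and m3 (4 bits) in bits 0 to 12, and s0 to s3 in bits 13 to 16. The function names are likewise hypothetical.

```python
CODE_MASK = 0x1FFFF          # 17-bit algebraic code
M3_SHIFT, M3_WIDTH = 9, 4    # assumed position of the 4-bit m3 field
S3_SHIFT = 16                # assumed position of the 1-bit s3 field
FIELD_MASK = (((1 << M3_WIDTH) - 1) << M3_SHIFT) | (1 << S3_SHIFT)


def embed_5bits(code: int, data: int) -> int:
    """Overwrite the m3 and s3 fields of a 17-bit code with 5 data bits."""
    m3 = data & 0xF          # low 4 data bits go to m3
    s3 = (data >> 4) & 1     # top data bit goes to s3
    cleared = code & CODE_MASK & ~FIELD_MASK
    return cleared | (m3 << M3_SHIFT) | (s3 << S3_SHIFT)


def extract_5bits(code: int) -> int:
    """Read the 5 data bits back from the m3 and s3 fields."""
    m3 = (code >> M3_SHIFT) & 0xF
    s3 = (code >> S3_SHIFT) & 1
    return (s3 << 4) | m3
```

The other twelve bits (m0 to m2 and s0 to s2) are left untouched, so the first three pulse systems still carry the converted excitation.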

[0064] According to the first system described above, the embedded data DT is first extracted from the voice code SCD1 of the conversion-source first encoding method and is then embedded again in the converted voice code SCD2′ of the second encoding method. The voice code SCD1 can thus be converted into a voice code SCD2 carrying the same data without damaging the data DT embedded in SCD1. Further, according to the first system, when embedding is controlled adaptively at both the conversion source and the conversion destination, the data holding unit absorbs the difference between the extraction timing and the embedding timing that arises from differences between the embedding control methods of the two encoding methods or from conversion errors in the conventional voice code conversion unit. The voice code SCD1 can therefore be converted into a voice code SCD2 carrying the same data without damaging the data embedded in SCD1. Further, according to the first system, voice communication systems whose voice lines apply the data embedding technique can communicate both voice and data over the voice line without damaging the embedded data and without changing the voice code format.

[0065] (C) Embodiments of the second system of the present invention
(a) First embodiment
FIG. 14 is a block diagram of the voice code conversion device in the second system of the present invention. It differs from the embodiments of the first system in that the data Dcode is not embedded in the voice code bst1(m) and is instead input to the voice code conversion device over a line separate from the voice code. The line multiplexing unit 201 separates the voice code bst1(m) and the data Dcode from the multiplexed data received via the multiplex line 200, inputs the voice code bst1(m) to the code separation unit 114 from terminal 1, and inputs the data Dcode directly to the data holding unit 122 from terminal 3. The code separation unit 114 separates the line data bst1(m) into the AMR element codes (LSP code 1, pitch lag code 1, pitch gain code 1, algebraic code 1, and algebraic gain code 1) and inputs these element codes to the respective code conversion units in the code conversion unit 111 (LSP code conversion unit 111a, pitch lag code conversion unit 111b, pitch gain code conversion unit 111c, algebraic gain code conversion unit 111d, and algebraic code conversion unit 111e). Each of the code conversion units 111a to 111e converts a code of the first encoding method into a code of the second encoding method.

[0066] The embedding determination unit 123 obtains the dequantized algebraic gain value from the converted G.729A algebraic gain code 2 input from the algebraic gain code conversion unit 111d and switches the switch SW2 according to that gain value. That is, when the G.729A algebraic gain value is smaller than a certain threshold, it decides to embed data, closes the switch SW2, and the data from the data holding unit 122 is input to the data embedding unit 113. The data embedding unit 113 embeds the input data in the 17 bits assigned to the algebraic code 2. The algebraic code 2 in which the data has been embedded is multiplexed with the other element codes by the code multiplexing unit 115 and output from terminal 2 as the line data bst2(n) of the n-th G.729A frame, which contains the embedded data.
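In this embodiment the held data comes from the separate data line rather than from extraction, and only the destination-side gain decides when it is embedded. A minimal sketch of that flow is given below. The function name, the tuple layout of the converted subframes, and the behavior when no data is waiting are assumptions for illustration only.

```python
from collections import deque
from typing import Iterable, List, Sequence, Tuple


def embed_from_data_line(dst_subframes: Sequence[Tuple[float, int]],
                         data_words: Iterable[int],
                         threshold: float) -> Tuple[List[int], List[int]]:
    """dst_subframes holds (algebraic gain, converted 17-bit algebraic code)
    pairs; data_words are 17-bit words received over the data line.
    Returns the output algebraic codes and the words still held."""
    held = deque(data_words)  # data holding unit 122, fed from terminal 3
    output = []
    for gain, code in dst_subframes:
        if gain < threshold and held:
            # Switch SW2 closed: replace the code with the oldest held word.
            output.append(held.popleft())
        else:
            # No embedding: keep the converted algebraic code.
            output.append(code)
    return output, list(held)
```

Words that find no low-gain subframe simply remain queued, which shows why the achievable data rate in this configuration depends on the speech content.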

[0067] According to this embodiment, when the AMR-side communication system has a data line in addition to a voice line, the voice code bst1(m) and the data Dcode, which are input separately via the voice line and the data line, can be converted into a voice code bst2(n) in which the data is embedded and transmitted to the G.729A-side communication system, which has only a voice line. This makes it possible to perform data communication in addition to voice communication from a communication system capable of both voice and data communication, for example a third-generation mobile phone system (which adopts AMR as its voice encoding method), to a communication system that has only a voice line, for example a conventional second-generation mobile phone system (G.729A) that performs only voice communication.

[0068] (b) Second embodiment
FIG. 15 is another block diagram of the voice code conversion device in the second system of the present invention and shows a configuration that does not perform embedding control. In the second embodiment, the data Dcode is not embedded in the voice code bst1(m) and is input to the voice code conversion device over a line separate from the voice code. Since the 17 bits of the G.729A algebraic code express the pulse positions m0 to m3 and the polarities s0 to s3 of the four pulse systems, the second embodiment embeds the data Dcode in, for example, the 5 bits m3 and s3 that indicate the pulse position and polarity of the fourth pulse system. The line multiplexing unit 201 separates the voice code bst1(m) and the data Dcode from the multiplexed data received via the multiplex line 200, inputs the voice code bst1(m) to the code separation unit 114 from terminal 1, and inputs the data Dcode directly to the data embedding unit 113 from terminal 3. The code separation unit 114 separates the line data bst1(m) into the AMR element codes (LSP code 1, pitch lag code 1, pitch gain code 1, algebraic code 1, and algebraic gain code 1) and inputs these element codes to the respective code conversion units in the code conversion unit 111 (LSP code conversion unit 111a, pitch lag code conversion unit 111b, pitch gain code conversion unit 111c, algebraic gain code conversion unit 111d, and algebraic code conversion unit 111e). Each of the code conversion units 111a to 111e converts a code of the first encoding method into a code of the second encoding method.

[0069] The data embedding unit 113 embeds the input data Dcode in the 5 bits m3 and s3 of the 17 bits assigned to the algebraic code 2. The algebraic code 2 in which the data has been embedded is multiplexed with the other element codes by the code multiplexing unit 115 and output from terminal 2 as the line data bst2(n) of the n-th G.729A frame, which contains the embedded data.
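The figures in the text (5 or 17 embeddable bits per 5 msec subframe) translate directly into an upper bound on the embedded data rate. The helper below is a back-of-the-envelope sketch, not something stated in the source; the function name and the `embed_fraction` parameter (the share of subframes that actually carry data when embedding is gain-controlled) are assumptions.

```python
def embedded_bit_rate(bits_per_subframe: int,
                      subframe_ms: float = 5.0,
                      embed_fraction: float = 1.0) -> float:
    """Return the embedded data rate in bit/s.

    bits_per_subframe: number of code bits replaced by data per subframe.
    subframe_ms: subframe length in milliseconds (5 msec for AMR and G.729A).
    embed_fraction: fraction of subframes that carry data (1.0 = always).
    """
    if not 0.0 <= embed_fraction <= 1.0:
        raise ValueError("embed_fraction must be between 0 and 1")
    return bits_per_subframe * embed_fraction * 1000.0 / subframe_ms
```

Embedding 5 bits in every subframe gives 1000 bit/s, while replacing all 17 bits would give at most 3400 bit/s if every subframe qualified.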

[0070] According to the second system described above, voice communication and data communication can be performed from a communication system that has a data line separate from the voice line to a communication system that has only a voice line, without changing the voice code format. The conversion from AMR to G.729A has been described above, but the system can also be applied to the reverse conversion from G.729A to AMR and to other code conversions. Also, the case where data is embedded in the algebraic code according to the algebraic gain has been described above, but data may instead be embedded in the pitch lag code according to the pitch gain.

[0071] (D) Third system of the present invention
(a) First embodiment
FIG. 16 is a block diagram of the voice code conversion device in the third system of the present invention and shows a configuration that extracts embedded data adaptively. In this embodiment, the first encoding method is G.729A and the second encoding method is AMR (7.95 kbps). The code conversion device converts the G.729A voice code into an AMR voice code and transmits it, and also extracts the data embedded in the G.729A voice code and transmits it separately from the voice code. The conversion-source G.729A encoder (not shown) embeds arbitrary data in all 17 bits per subframe assigned to the algebraic code if the algebraic gain is smaller than a set value, and places the original algebraic code data there if the algebraic gain is larger than the set value.

[0072] When the line data bst1(m), which is the G.729A encoder output for the m-th frame, is input to the code separation unit 114 through terminal 1, the code separation unit 114 separates the line data bst1(m) into the G.729A element codes (LSP code 1, pitch lag code 1, pitch gain code 1, algebraic code 1, and algebraic gain code 1). These element codes are input to the respective code conversion units in the code conversion unit 111 (LSP code conversion unit 111a, pitch lag code conversion unit 111b, pitch gain code conversion unit 111c, algebraic gain code conversion unit 111d, and algebraic code conversion unit 111e). The code conversion units 111a to 111e convert the G.729A codes into AMR codes, and the code multiplexing unit 115 multiplexes the AMR codes and inputs them to the line multiplexing unit 203 as the voice code bst2(n).

[0073] In parallel with the above, the embedding determination unit 121 obtains the dequantized algebraic gain value (algebraic gain) from the algebraic gain code 1 and switches the switch SW1 according to that gain value. That is, when the G.729A algebraic gain value is smaller than a certain threshold, it determines that embedded data is present, closes the switch SW1, and inputs the algebraic code 1 to the embedded data extraction unit 112. The embedded data extraction unit 112 extracts the embedded data Dcode contained in the algebraic code and inputs it to the line multiplexing unit 203. Since data is embedded in the entire G.729A algebraic code (17 bits per subframe), the 17-bit data series is cut out as it is as the embedded data Dcode and input to the line multiplexing unit 203. The line multiplexing unit 203 multiplexes the input voice code bst2(n) and the data Dcode and sends them to the multiplex line 204.
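The parallel conversion, adaptive extraction, and multiplexing in this embodiment can be sketched as a single loop that emits tagged records. The record format, the function name, and the `convert` callback (a stand-in for the code conversion unit 111, whose internals the source treats as prior art) are all assumptions for illustration; the source does not define how the multiplex line 204 frames its contents.

```python
from typing import Callable, List, Sequence, Tuple

ALGEBRAIC_CODE_MASK = 0x1FFFF  # 17 bits per subframe


def transcode_third_system(subframes: Sequence[Tuple[float, int]],
                           threshold: float,
                           convert: Callable[[int], int]) -> List[Tuple[str, int]]:
    """subframes holds (algebraic gain, 17-bit algebraic code) pairs from the
    source voice code. Returns tagged records in sending order."""
    multiplexed = []
    for alg_gain, alg_code in subframes:
        # The voice path always runs: converted code goes to the voice line.
        multiplexed.append(("voice", convert(alg_code)))
        # The data path runs only when unit 121 finds a low gain.
        if alg_gain < threshold:
            multiplexed.append(("data", alg_code & ALGEBRAIC_CODE_MASK))
    return multiplexed
```

A receiver on the multiplex line would separate the records by tag, passing voice records to the decoder and data records to the data line.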

[0074] (b) Second embodiment
FIG. 17 is another block diagram of the voice code conversion device in the third system of the present invention and covers the case where the embedded data is always inserted in the algebraic code. In this embodiment, the first encoding method is G.729A and the second encoding method is AMR (7.95 kbps). The voice code conversion device converts the G.729A voice code into an AMR voice code and transmits it, and also extracts the data embedded in the G.729A voice code and transmits it over a line separate from the voice code. The conversion-source G.729A encoder embeds the data Dcode in the 5 bits m3 and s3 of the algebraic code (see FIG. 13).

【００７５】第mフレームのG.729Aの符号器出力である
回線データbst1(m)が端子1を通して符号分離部114に入
力すると、該符号分離部114は、回線データbst1(m)をG.7
29Aの要素符号(LSP符号1、ピッチラグ符号1、ピッチゲ
イン符号1、代数符号1、代数ゲイン符号1)に分離する。
そして、これら要素符号を符号変換部111における各符号
変換部(LSP符号変換部111a、ピッチラグ符号変換部111
b、ピッチゲイン符号変換部111c、代数ゲイン符号変換
部111d、代数符号変換部111e)へ入力する。各符号変換
部111a〜111eはG.729Aの符号をAMRの符号に変換し、符
号多重部115は各AMRの符号を多重して音声符号bst2(n)
として回線多重部203に入力する。When the line data bst1(m), which is the G.729A encoder output for the m-th frame, is input to the code separation unit 114 through terminal 1, the code separation unit 114 separates the line data bst1(m) into the G.729A element codes (LSP code 1, pitch lag code 1, pitch gain code 1, algebraic code 1, algebraic gain code 1). These element codes are then input to the respective code conversion units in the code conversion unit 111 (LSP code conversion unit 111a, pitch lag code conversion unit 111b, pitch gain code conversion unit 111c, algebraic gain code conversion unit 111d, and algebraic code conversion unit 111e). Each of the code conversion units 111a to 111e converts its G.729A code into an AMR code, and the code multiplexing unit 115 multiplexes the AMR codes and inputs the result to the line multiplexing unit 203 as the voice code bst2(n).

【００７６】以上と並行して、埋め込みデータ抽出部112
は、代数符号に含まれる埋め込みデータDcodeを抽出し
て回線多重部203に入力する。G.729Aの代数符号m3,s3ビ
ット位置にデータが埋め込まれているので、該データを
切り取って埋め込みデータDcodeとして回線多重部203に
入力する。回線多重部203は入力する音声符号bst2(n)及
びデータDcode を多重して多重回線204に送出する。第3のシステムによれば、音声回線のみを持つ通信システ
ムから音声回線と別にデータ回線を持つ通信システムへ
音声符号フォーマットを変更することなく、音声通信と
データ通信を行うことが可能となる。以上では、G.729A
→AMRへの変換について説明したが、その他の符号変換
時にも適用可能である。又、以上では、代数ゲインに応じ
て代数符号にデータを埋め込む場合について説明した
が、ピッチゲインに応じてピッチラグ符号にデータを埋
め込むようにすることもできる。In parallel with the above, the embedded data extraction unit 112 extracts the embedded data Dcode contained in the algebraic code and inputs it to the line multiplexing unit 203. Since the data is embedded at the m3 and s3 bit positions of the G.729A algebraic code, those bits are cut out and input to the line multiplexing unit 203 as the embedded data Dcode. The line multiplexing unit 203 multiplexes the input voice code bst2(n) and the data Dcode and sends them to the multiplex line 204. According to the third system, voice communication and data communication can be performed from a communication system having only a voice line to a communication system having a data line separate from the voice line, without changing the voice code format. The above describes conversion from G.729A to AMR, but the same approach is applicable to other code conversions. Also, although the above describes embedding data in the algebraic code according to the algebraic gain, data can also be embedded in the pitch lag code according to the pitch gain.

【００７７】・付記（付記１）入力音声を第1音声符号化方式により符号
化した第1音声符号を第2音声符号化方式による第2音声
符号に変換する音声符号変換方法において、第1音声符
号に任意のデータが埋め込まれている場合、該第1音声
符号を第2音声符号に変換すると共に、該第1音声符号か
ら埋め込みデータを抽出し、前記変換により得られる第
2音声符号に前記抽出したデータを埋め込む、ことを特
徴とする音声符号変換方法。（付記２）送信元において、データ埋め込み条件が満
たされた時、第1音声符号の一部を前記データで置き換え
ることにより、第1音声符号にデータを埋め込んだ場合、
受信した第1音声符号を構成する所定の要素符号の逆量
子化値を参照して前記データ埋め込み条件が満たされて
いるか監視し、データ埋め込み条件が満たされていれば
第1音声符号より前記埋め込みデータを抽出する、こと
を特徴とする付記1記載の音声符号変換方法。（付記３）前記抽出した埋め込みデータをデータ保持
部に保存すると共に、該データ保持部より埋め込みデー
タを読み出して第2音声符号に埋め込む、ことを特徴と
する付記２記載の音声符号変換方法。（付記４）送信元において、データ埋め込み条件が満
たされた時、第1音声符号の一部を前記データで置き換え
ることにより、第1音声符号にデータを埋め込んだ場合、
送信元から受信した第1音声符号を構成する所定の要素
符号の逆量子化値を参照して前記データ埋め込み条件が
満たされているか監視し、データ埋め込み条件が満たさ
れていれば該第1音声符号より前記埋め込みデータを抽
出し、該抽出した埋め込みデータを保持し、前記変換に
より得られた第2音声符号を構成する所定の要素符号の
逆量子化値を参照してデータ埋め込み条件が満たされて
いるか監視し、満たされている場合、前記保持されている
データで該第2音声符号の一部を置き換えることにより
データを第2音声符号に埋め込む、ことを特徴とする付
記１記載の音声符号変換方法。（付記5）入力音声を第1音声符号化方式により符号化
した第1音声符号を第2音声符号化方式による第2音声符
号に変換する音声符号変換方法において、第1音声符号
とデータを送信元から別々に受信し、第1音声符号を第2
音声符号に変換し、該変換により得られた第2音声符号
に前記データを埋め込んで送信先へ送信する、ことを特
徴とする音声符号変換方法。（付記６）前記第1音声符号を音声回線より、前記デー
タをデータ回線よりそれぞれ受信し、前記データが埋め
込まれた第2音声符号を音声回線を介して送信先へ送信
する、ことを特徴とする付記５記載の音声符号変換方
法。（付記７）前記受信したデータをデータ保持部に保存
し、前記第2音声符号を構成する所定の要素符号の逆量子
化値を参照してデータ埋め込み条件が満たされているか
監視し、満たされている場合、前記データ保持部に保存さ
れているデータで第2音声符号の一部を置き換えること
によりデータを第2音声符号に埋め込む、ことを特徴と
する付記５記載の音声符号変換方法。（付記８）入力音声を第1音声符号化方式により符号
化した第1音声符号を第2音声符号化方式による第2音声
符号に変換する音声符号変換方法において、第1音声符
号を受信し、該第1音声符号に任意のデータが埋め込まれ
ている場合、該第1音声符号を第2音声符号に変換すると
共に、該第1音声符号から埋め込みデータを抽出し、前記
変換により得られる第2音声符号と前記抽出したデータ
を別々に送信先に送信する、ことを特徴とする音声符号
変換方法。（付記９）送信元において、データ埋め込み条件が満
たされた時、第1音声符号の一部を前記データで置き換え
ることにより、第1音声符号にデータを埋め込んだ場合、
受信した第1音声符号を構成する所定の要素符号の逆量
子化値を参照して前記データ埋め込み条件が満たされて
いるか監視し、データ埋め込み条件が満たされていれば
該第1音声符号より前記埋め込みデータを抽出する、こ
とを特徴とする付記8記載の音声符号変換方法。（付記１０）入力音声を第1音声符号化方式により符
号化した第1音声符号を第2音声符号化方式による第2音
声符号に変換する音声符号変換装置において、第1音声
符号に任意のデータが埋め込まれている場合、第1音声
符号を第2音声符号に変換する符号変換部、該第1音声符
号から埋め込みデータを抽出する埋め込みデータ抽出
部、前記変換により得られる第2音声符号に前記抽出したデ
ータを埋め込むデータ埋め込み部、を備えたことを特徴
とする音声符号変換装置。（付記１１）送信元において、データ埋め込み条件が
満たされた時、第1音声符号の一部を前記データで置き換
えることにより、第1音声符号にデータを埋め込んだ場
合、前記埋め込みデータ抽出部は受信した第1音声符号を
構成する所定の要素符号の逆量子化値を参照して前記デ
ータ埋め込み条件が満たされているか監視し、データ埋
め込み条件が満たされていれば第1音声符号より前記埋
め込みデータを抽出する、ことを特徴とする付記1０記
載の音声符号変換装置。（付記１２）更に、前記抽出した埋め込みデータを保
存するデータ保持部を備え、前記埋め込みデータ抽出部
は該データ保持部に前記抽出した埋め込みデータを保存
すると共に、前記データ埋め込み部は該データ保持部よ
り埋め込みデータを読み出して第2音声符号に埋め込
む、ことを特徴とする付記１１記載の音声符号変換装
置。（付記１３）前記埋め込みデータ抽出部は、前記第2
音声符号を構成する所定の要素符号の逆量子化値を参照
してデータ埋め込み条件が満たされているか監視し、満
たされている場合、前記データ保持部に保存されている
データで第2音声符号の一部を置き換えることによりデ
ータを第2音声符号に埋め込む、ことを特徴とする付記
１２記載の音声符号変換装置。（付記１４）入力音声を第1音声符号化方式により符
号化した第1音声符号を第2音声符号化方式による第2音
声符号に変換する音声符号変換装置において、第1音声
符号とデータを送信元から別々に受信する受信手段、第1音声符号を第2音声符号に変換する符号変換部、該変
換により得られた第2音声符号に前記データを埋め込ん
で送信先へ送信するデータ埋め込み部、を有することを
特徴とする音声符号変換装置。（付記１５）音声符号変換装置は更に前記データを保
存するデータ保持部を備え、データ埋め込み部は、前記
第2音声符号を構成する所定の要素符号の逆量子化値を
参照してデータ埋め込み条件が満たされているか監視す
る手段、満たされている場合、前記データ保持部に保存さ
れているデータで第2音声符号の一部を置き換えること
によりデータを第2音声符号に埋め込む手段、を有する
ことを特徴とする付記１４記載の音声符号変換装置。（付記１６）入力音声を第1音声符号化方式により符
号化した第1音声符号を第2音声符号化方式による第2音
声符号に変換する音声符号変換装置において、送信元か
ら受信した第1音声符号に任意のデータが埋め込まれて
いる場合、該第1音声符号を第2音声符号に変換する符号
変換部、該第1音声符号から埋め込みデータを抽出する埋め込み
データ抽出部、前記変換により得られる第2音声符号と
前記抽出したデータを別々に送信先に送信する手段、を
備えたことを特徴とする音声符号変換装置。（付記１７）送信元において、データ埋め込み条件が
満たされた時、第1音声符号の一部を前記データで置き換
えることにより、第1音声符号にデータを埋め込んだ場
合、前記埋め込みデータ抽出部は、送信元から受信した1
音声符号を構成する所定の要素符号の逆量子化値を参照
して前記データ埋め込み条件が満たされているか監視
し、データ埋め込み条件が満たされていれば第1音声符号
より前記埋め込みデータを抽出する、ことを特徴とする
付記１６記載の音声符号変換装置。Supplementary notes
(Supplementary note 1) A voice code conversion method for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized in that, when arbitrary data is embedded in the first voice code, the first voice code is converted into the second voice code, the embedded data is extracted from the first voice code, and the extracted data is embedded in the second voice code obtained by the conversion.
(Supplementary note 2) The voice code conversion method according to supplementary note 1, characterized in that, in the case where data has been embedded in the first voice code at the source by replacing part of the first voice code with the data when a data embedding condition is satisfied, whether the data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the received first voice code, and the embedded data is extracted from the first voice code if the condition is satisfied.
(Supplementary note 3) The voice code conversion method according to supplementary note 2, characterized in that the extracted embedded data is stored in a data holding unit, and the embedded data is read from the data holding unit and embedded in the second voice code.
(Supplementary note 4) The voice code conversion method according to supplementary note 1, characterized in that, in the case where data has been embedded in the first voice code at the source by replacing part of the first voice code with the data when a data embedding condition is satisfied, whether the data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the first voice code received from the source; if the condition is satisfied, the embedded data is extracted from the first voice code and held; whether a data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the second voice code obtained by the conversion; and, if that condition is satisfied, the data is embedded in the second voice code by replacing part of the second voice code with the held data.
(Supplementary note 5) A voice code conversion method for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized in that the first voice code and data are received separately from a source, the first voice code is converted into the second voice code, and the data is embedded in the second voice code obtained by the conversion and transmitted to a destination.
(Supplementary note 6) The voice code conversion method according to supplementary note 5, characterized in that the first voice code is received from a voice line and the data is received from a data line, and the second voice code in which the data is embedded is transmitted to the destination via a voice line.
(Supplementary note 7) The voice code conversion method according to supplementary note 5, characterized in that the received data is stored in a data holding unit, whether a data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the second voice code, and, if the condition is satisfied, the data is embedded in the second voice code by replacing part of the second voice code with the data stored in the data holding unit.
(Supplementary note 8) A voice code conversion method for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized in that the first voice code is received and, when arbitrary data is embedded in the first voice code, the first voice code is converted into the second voice code, the embedded data is extracted from the first voice code, and the second voice code obtained by the conversion and the extracted data are transmitted separately to a destination.
(Supplementary note 9) The voice code conversion method according to supplementary note 8, characterized in that, in the case where data has been embedded in the first voice code at the source by replacing part of the first voice code with the data when a data embedding condition is satisfied, whether the data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the received first voice code, and the embedded data is extracted from the first voice code if the condition is satisfied.
(Supplementary note 10) A voice code conversion device for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized by comprising: a code conversion unit that, when arbitrary data is embedded in the first voice code, converts the first voice code into the second voice code; an embedded data extraction unit that extracts the embedded data from the first voice code; and a data embedding unit that embeds the extracted data in the second voice code obtained by the conversion.
(Supplementary note 11) The voice code conversion device according to supplementary note 10, characterized in that, in the case where data has been embedded in the first voice code at the source by replacing part of the first voice code with the data when a data embedding condition is satisfied, the embedded data extraction unit monitors whether the data embedding condition is satisfied by referring to the dequantized value of a predetermined element code constituting the received first voice code, and extracts the embedded data from the first voice code if the condition is satisfied.
(Supplementary note 12) The voice code conversion device according to supplementary note 11, further comprising a data holding unit that stores the extracted embedded data, characterized in that the embedded data extraction unit stores the extracted embedded data in the data holding unit, and the data embedding unit reads the embedded data from the data holding unit and embeds it in the second voice code.
(Supplementary note 13) The voice code conversion device according to supplementary note 12, characterized in that the embedded data extraction unit monitors whether a data embedding condition is satisfied by referring to the dequantized value of a predetermined element code constituting the second voice code and, if the condition is satisfied, embeds the data in the second voice code by replacing part of the second voice code with the data stored in the data holding unit.
(Supplementary note 14) A voice code conversion device for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized by comprising: receiving means for receiving the first voice code and data separately from a source; a code conversion unit that converts the first voice code into the second voice code; and a data embedding unit that embeds the data in the second voice code obtained by the conversion and transmits it to a destination.
(Supplementary note 15) The voice code conversion device according to supplementary note 14, further comprising a data holding unit that stores the data, characterized in that the data embedding unit comprises means for monitoring whether a data embedding condition is satisfied by referring to the dequantized value of a predetermined element code constituting the second voice code, and means for embedding the data in the second voice code, if the condition is satisfied, by replacing part of the second voice code with the data stored in the data holding unit.
(Supplementary note 16) A voice code conversion device for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized by comprising: a code conversion unit that, when arbitrary data is embedded in the first voice code received from a source, converts the first voice code into the second voice code; an embedded data extraction unit that extracts the embedded data from the first voice code; and means for transmitting the second voice code obtained by the conversion and the extracted data separately to a destination.
(Supplementary note 17) The voice code conversion device according to supplementary note 16, characterized in that, in the case where data has been embedded in the first voice code at the source by replacing part of the first voice code with the data when a data embedding condition is satisfied, the embedded data extraction unit monitors whether the data embedding condition is satisfied by referring to the dequantized value of a predetermined element code constituting the first voice code received from the source, and extracts the embedded data from the first voice code if the condition is satisfied.

【００７８】[0078]

【発明の効果】以上、本発明によれば、変換元の第１符
号化方式の音声符号から埋め込みデータを一旦抽出し
て、符号変換後の第２符号化方式の音声符号に該データ
を再度埋め込むことにより、第１符号化方式の音声符号
に埋め込まれたデータを損なうことなく、同データを埋
め込んだ第２符号化方式の音声符号に変換することがで
きる。また、本発明によれば、変換元と変換先で適応的
に埋め込み制御が行われる場合、各符号化方式の埋め込
み制御方法の相違により、あるいは従来の音声符号変換
部での変換誤差により生じるデータ抽出と埋め込みのタ
イミングの差をデータ保持部により吸収することで、第
１符号化方式の音声符号に埋め込まれたデータを損なう
ことなく、同データを埋め込んだ第２符号化方式の音声
符号に変換することができる。As described above, according to the present invention, the embedded data is first extracted from the source voice code of the first encoding method and then embedded again in the voice code of the second encoding method obtained by code conversion. This makes it possible to convert a voice code of the first encoding method into a voice code of the second encoding method carrying the same data, without damaging the embedded data. Further, according to the present invention, when embedding is controlled adaptively at both the conversion source and the conversion destination, the data holding unit absorbs the difference between extraction timing and embedding timing that arises from differences in the embedding control methods of the encoding methods or from conversion errors in the conventional voice code conversion unit. The data embedded in the voice code of the first encoding method can therefore be carried over, undamaged, into the voice code of the second encoding method.
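The timing-absorbing role of the data holding unit can be pictured as a first-in, first-out bit buffer between extraction and embedding. The sketch below is only an illustration under stated assumptions: the class `DataHoldingUnit`, the function `transcode_subframe`, the shared threshold, and the 17-bit payload size are all hypothetical choices, not details fixed by the description.

```python
from collections import deque


class DataHoldingUnit:
    """FIFO bit buffer that decouples extraction timing from embedding timing."""

    def __init__(self) -> None:
        self._bits = deque()

    def store(self, value: int, nbits: int) -> None:
        # Append the nbits of value, most significant bit first.
        for shift in range(nbits - 1, -1, -1):
            self._bits.append((value >> shift) & 1)

    def available(self) -> int:
        return len(self._bits)

    def read(self, nbits: int) -> int:
        if nbits > len(self._bits):
            raise ValueError("not enough held data")
        value = 0
        for _ in range(nbits):
            value = (value << 1) | self._bits.popleft()
        return value


def transcode_subframe(holder, src_gain, src_code, dst_gain, dst_code,
                       threshold=0.5, nbits=17):
    """Return the destination algebraic code, with held data embedded if allowed."""
    if src_gain < threshold:
        # Source-side embedding condition met: extract and hold the data.
        holder.store(src_code & ((1 << nbits) - 1), nbits)
    if dst_gain < threshold and holder.available() >= nbits:
        # Destination-side condition met: replace the code with held data.
        return holder.read(nbits)
    return dst_code
```

If the source-side condition holds in one subframe but the destination-side condition does not, the data simply waits in the buffer and is embedded in a later subframe whose converted gain falls below the threshold.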

【００７９】また、本発明によれば、データ埋め込み技
術を適用した音声回線を持つ音声通信システム間におい
て、埋め込まれたデータを損なうことなく、しかも、音声
符号フォーマットを変更することなく音声回線を介して
音声とデータの両方の通信を行うことが可能となる。ま
た、本発明によれば、変換元のシステムより第１符号化
方式の音声符号とデータが別回線で音声符号変換部に入
力された場合、該音声符号変換部は符号変換後の第２符
号化方式の音声符号に前記データを埋め込むことにより
変換先へ音声回線のみで伝送することが可能となる。ま
た、本発明によれば、変換元のシステムより音声回線を
介して任意のデータDTが埋め込まれた第１符号化方式の
音声符号が入力された場合に、音声符号変換部は該音声
符号から埋め込みデータを抽出してデータ回線に送出す
ると共に第１符号化方式の音声符号を第２符号化方式の
音声符号に変換して音声回線に送出することにより、変
換元の音声回線によって伝送された音声情報とデータ情
報とを変換先の音声回線とデータ回線に分離して伝送す
ることが可能となる。Further, according to the present invention, voice communication systems whose voice lines use the data embedding technique can exchange both voice and data over the voice line without damaging the embedded data and without changing the voice code format. Further, according to the present invention, when a voice code of the first encoding method and data are input to the voice code conversion unit from the source system on separate lines, the voice code conversion unit embeds the data in the voice code of the second encoding method obtained by code conversion, so that both can be transmitted to the conversion destination over the voice line alone. Further, according to the present invention, when a voice code of the first encoding method in which arbitrary data DT is embedded is input from the source system via the voice line, the voice code conversion unit extracts the embedded data from the voice code and sends it to the data line, and converts the voice code of the first encoding method into a voice code of the second encoding method and sends it to the voice line. The voice information and data information carried on the source voice line can thus be separated and transmitted on the voice line and data line of the conversion destination.

【００８０】また、本発明によれば、音声回線のみを持
つ通信システムと音声回線と別にデータ回線を持つ通信
システム間において、音声符号フォーマットを変更する
ことなく、音声通信とデータ通信を行うことが可能とな
る。今後、マルチメディア情報通信の普及を背景に、従
来携帯電話システムと次世代携帯電話システム間の通
信、またはVoIPと携帯電話等のモバイルシステム間の通
信等、多様な通信システム間の通信において、データ埋
め込み技術と音声符号変換技術を併用した技術の必要性
は高いため、本発明の効果は大きい。Further, according to the present invention, voice communication and data communication can be performed between a communication system having only a voice line and a communication system having a data line separate from the voice line, without changing the voice code format. With multimedia information communication becoming widespread, there is a strong need for technology that combines data embedding with voice code conversion in communication between diverse communication systems, such as between conventional and next-generation mobile phone systems or between VoIP and mobile systems such as mobile phones, so the effect of the present invention is significant.

[Brief description of drawings]

【図１】本発明の第1のシステム概念図である。FIG. 1 is a first system conceptual diagram of the present invention.

【図２】本発明の第1システムにおける音声符号変換装
置の構成図である。FIG. 2 is a configuration diagram of a voice code conversion device in the first system of the present invention.

【図３】本発明の第1システムにおける音声符号変換装
置の別の概略構成図である。FIG. 3 is another schematic configuration diagram of the speech code conversion apparatus in the first system of the present invention.

【図４】本発明の第２のシステム概念図である。FIG. 4 is a second system conceptual diagram of the present invention.

【図５】本発明の第２システムにおける音声符号変換装
置の概略構成図である。FIG. 5 is a schematic configuration diagram of a speech code conversion device in the second system of the present invention.

【図６】本発明の第２システムにおける音声符号変換装
置の別の概略構成図である。FIG. 6 is another schematic configuration diagram of the speech code conversion device in the second system of the present invention.

【図７】本発明の第３のシステム概念図である。FIG. 7 is a conceptual diagram of a third system according to the present invention.

【図８】本発明の第３システムにおける音声符号変換装
置の概略構成図である。FIG. 8 is a schematic configuration diagram of a voice code conversion device in a third system of the present invention.

【図９】本発明の第３システムにおける音声符号変換装
置の別の概略構成図である。FIG. 9 is another schematic configuration diagram of the speech code conversion apparatus in the third system of the present invention.

【図１０】本発明の第1システムにおける音声符号変換
装置の構成図である。FIG. 10 is a configuration diagram of a voice code conversion device in the first system of the present invention.

【図１１】本発明の第1システムにおける音声符号変換
装置の別の実施例構成図である。FIG. 11 is a configuration diagram of another embodiment of the voice code conversion device in the first system of the present invention.

【図１２】本発明の第1システムにおける音声符号変換
装置の更に別の実施例構成図である。FIG. 12 is a configuration diagram of still another embodiment of the voice code conversion device in the first system of the present invention.

【図１３】代数符号の構成図である。FIG. 13 is a configuration diagram of an algebraic code.

【図１４】本発明の第2のシステムにおける音声符号変
換装置の実施例構成図である。FIG. 14 is a configuration diagram of an embodiment of a voice code conversion device in the second system of the present invention.

【図１５】本発明の第2のシステムにおける音声符号変
換装置の別の実施例構成図である。FIG. 15 is a configuration diagram of another embodiment of the voice code conversion device in the second system of the present invention.

【図１６】本発明の第3のシステムにおける音声符号変
換装置の実施例構成図である。FIG. 16 is a configuration diagram of an embodiment of a voice code conversion device in the third system of the present invention.

【図１７】本発明の第3のシステムにおける音声符号変
換装置の別の実施例構成図である。FIG. 17 is a configuration diagram of another embodiment of the voice code conversion device in the third system of the present invention.

【図１８】ITU-T勧告G.729A方式の符号器の構成図であ
る。FIG. 18 is a configuration diagram of an encoder conforming to ITU-T Recommendation G.729A.

【図１９】各パルス系統グループ１〜４に割り当てたサ
ンプル点の説明図である。FIG. 19 is an explanatory diagram of sample points assigned to each pulse system group 1 to 4.

【図２０】G.729A方式の復号器のブロック図である。FIG. 20 is a block diagram of a G.729A system decoder.

【図２１】G.729A方式とAMRの主要諸元の比較説明図で
ある。FIG. 21 is an explanatory diagram comparing the main specifications of G.729A and AMR.

【図２２】G.729A方式とAMRのフレーム構成説明図であ
る。FIG. 22 is an explanatory diagram of the frame configurations of G.729A and AMR.

【図２３】G.729A方式とAMR方式におけるビット割り当
ての比較説明図である。FIG. 23 is an explanatory diagram comparing bit allocation in G.729A and AMR.

【図２４】異なる通信システム間での音声符号変換説明
図である。FIG. 24 is an explanatory diagram of voice code conversion between different communication systems.

【図２５】音声符号を別の音声符号化方式の符号に変換
する従来技術の説明図である。FIG. 25 is an explanatory diagram of a conventional technique for converting a voice code into a code of another voice encoding method.

【図２６】データ埋め込み技術を適用した音声通信シス
テムの概念図である。FIG. 26 is a conceptual diagram of a voice communication system to which a data embedding technique is applied.

【図２７】符号変換の原理図である。FIG. 27 is a principle diagram of code conversion.

[Explanation of symbols]

１０３符号変換装置１１１符号変換部１１２埋め込みデータ抽出部１１３データ埋め込み部
103: code conversion device
111: code conversion unit
112: embedded data extraction unit
113: data embedding unit

フロントページの続き (51)Int.Cl.⁷ 識別記号ＦＩテーマコート゛(参考）Ｈ０４Ｂ 14/04 Ｇ１０Ｌ 9/18 Ａ (72)発明者大田恭士神奈川県川崎市中原区上小田中４丁目１番１号富士通株式会社内 (72)発明者鈴木政直神奈川県川崎市中原区上小田中４丁目１番１号富士通株式会社内 (72)発明者田中正清神奈川県川崎市中原区上小田中４丁目１番１号富士通株式会社内Ｆターム(参考） 5D045 CA02 CC03 CC07 DA11 5J064 AA01 BB03 BC01 BC26 BD02 5K041 BB08 CC01 HH21 HH22
Continuation of the front page
(51) Int.Cl.⁷ / identification code / FI / theme code (reference): H04B 14/04, G10L 9/18 A
(72) Inventor: Takashi Ota, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Masanao Suzuki, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Masakiyo Tanaka, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
F-terms (reference): 5D045 CA02 CC03 CC07 DA11; 5J064 AA01 BB03 BC01 BC26 BD02; 5K041 BB08 CC01 HH21 HH22

Claims

[Claims]

1. A voice code conversion method for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized in that, when arbitrary data is embedded in the first voice code, the first voice code is converted into the second voice code, the embedded data is extracted from the first voice code, and the extracted data is embedded in the second voice code obtained by the conversion.

2. The voice code conversion method according to claim 1, characterized in that, in the case where data has been embedded in the first voice code at the source by replacing part of the first voice code with the data when a data embedding condition is satisfied, whether the data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the first voice code received from the source; if the condition is satisfied, the embedded data is extracted from the first voice code and held; whether a data embedding condition is satisfied is monitored by referring to the dequantized value of a predetermined element code constituting the second voice code obtained by the conversion; and, if that condition is satisfied, the data is embedded in the second voice code by replacing part of the second voice code with the held data.

3. A voice code conversion method for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized in that the first voice code and data are received separately from a source, the first voice code is converted into the second voice code, and the data is embedded in the second voice code obtained by the conversion and transmitted to a destination.

4. A voice code conversion method for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized in that the first voice code is received and, when arbitrary data is embedded in the first voice code, the first voice code is converted into the second voice code, the embedded data is extracted from the first voice code, and the second voice code obtained by the conversion and the extracted data are transmitted separately to a destination.

5. A voice code conversion device for converting a first voice code, obtained by encoding input voice with a first voice encoding method, into a second voice code of a second voice encoding method, characterized by comprising: a code conversion unit that, when arbitrary data is embedded in the first voice code, converts the first voice code into the second voice code; an embedded data extraction unit that extracts the embedded data from the first voice code; and a data embedding unit that embeds the extracted data in the second voice code obtained by the conversion.