JP2968109B2 - Code-excited linear prediction encoder and decoder

Info

Publication number: JP2968109B2
Application number: JP3327443A
Authority: JP
Inventors: 賢一郎細田; 浩桂川; 弘美青柳; 義博有山
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1991-12-11
Filing date: 1991-12-11
Publication date: 1999-10-25
Anticipated expiration: 2014-10-25
Also published as: JPH05165497A

Description

DETAILED DESCRIPTION OF THE INVENTION

【０００１】[0001]

[Field of Industrial Application] The present invention relates to an encoder and a decoder that operate according to the code-excited linear prediction (CELP) coding method.

【０００２】[0002]

[Prior Art] Code-excited linear prediction coding, and vector sum excited linear prediction (VSELP) coding, which is a variant of it, have conventionally been adopted as high-efficiency coding methods for speech signals (including acoustic signals) in the field of digital mobile communications.

[0003] The basic framework of a speech coding method is to obtain vocal tract parameters, which represent the vocal tract characteristics of the speech, and sound source parameters, which represent the sound source information. In recent code-excited linear prediction coding, the excitation signal that carries the sound source information is coded as an adaptive code, which contributes to voiced sounds with statistically strong periodicity, and a statistical code, which contributes to random unvoiced sounds with statistically weak periodicity. These codes are stored in respective codebooks, and encoding is performed by finding, in each codebook, the optimal adaptive code and statistical code that minimize the weighted error power sum between the input speech signal and the synthesized speech signal. Whether the method is of the forward type, which obtains the vocal tract parameters from the input speech signal, or of the backward type, which obtains them from the synthesized speech signal, at least the sound source parameters, that is, the information on the optimal adaptive code and statistical code, are transmitted.

[0004] Such code-excited linear prediction coding is known to yield high-quality reproduced speech at coding rates of 6 kbit/s to 8 kbit/s.

【０００５】[0005]

[Problems to Be Solved by the Invention] However, some communication systems require a lower coding rate, for example 4 kbit/s or less. At such low rates, whether the coder is of the forward type, which transmits both vocal tract parameters and sound source parameters, or of the backward type, which transmits sound source parameters, the number of coding bits allocated to the sound source parameters is necessarily small, and so is the number of adaptive codes and statistical codes stored in the adaptive codebook and the statistical codebook. As a result, the quality of the reproduced speech degrades at such low coding rates.

[0006] In addition, because the adaptive codebook is adaptively updated with the combined code of the optimal adaptive code and statistical code, the adaptive code can be said to be formed on the basis of the statistical code. Consequently, the onset of a strongly periodic voiced sound builds up slowly, and even in the steady-state portion of a voiced sound a code with a clear, strongly pulse-like character cannot be formed, so the reproduced speech lacks clarity. One conceivable remedy for this drawback is to use pulse codes consisting of isolated impulses.

[0007] However, statistical codes and pulse codes are fixed codes, and improving the quality of the reproduced speech would require a very large variety of fixed codes such as statistical codes and pulse codes. Doing so, however, is incompatible with a low coding rate.

[0008] On the other hand, if the number of fixed codes is reduced, it becomes difficult to obtain reproduced speech that accurately reflects the input speech signal. For example, the characteristics of the vocal cords (the source of the sound source information) differ from person to person, but such differences cannot be correctly reflected in the excitation signal, and the quality of the reproduced speech degrades.

[0009] The present invention has been made in view of the above points, and aims to provide a code-excited linear prediction encoder and decoder that can improve the quality of reproduced speech, mainly with respect to the sound source parameters, even at a low coding rate.

【００１０】[0010]

[Means for Solving the Problems] To solve this problem, the first aspect of the present invention is a code-excited linear prediction encoder comprising: (1) vocal tract parameter generation means for obtaining vocal tract parameters by linear prediction of an input speech signal; (2) vocal tract parameter quantization means for quantizing the vocal tract parameters and outputting at least the quantized vocal tract parameters; (3) an adaptive codebook storing adaptive codes that represent the optimal codes of past input speech signals; (4) a fixed codebook storing fixed codes consisting of predetermined statistical codes and/or pulse codes; and (5) excitation signal generation means for generating an excitation signal based on the adaptive code and the fixed code; wherein (6) the encoder generates a synthesized speech signal based on the quantized vocal tract parameters and the excitation signal, determines the indices of both codebooks corresponding to the codes optimal for the input speech signal at the current time by evaluating the error between the input speech signal and the synthesized speech signal, and outputs the vocal tract parameters and both indices as an encoded speech signal. The encoder is characterized in that (7) the excitation signal generation means generates a specific impulse response based on the vocal tract parameters, applies convolution processing to the fixed code using that impulse response, and then combines the result with the adaptive code to generate the excitation signal, and (8) the impulse response corresponds to the impulse response of a transfer function that converts to the frequency characteristics of the input speech signal determined at the time the fixed code is output.

[0011] The second aspect of the present invention is a code-excited linear prediction decoder comprising: (1) demultiplexing means for separating an encoded speech signal supplied from the code-excited linear prediction encoder side into vocal tract parameters, an adaptive code index, and a fixed code index; (2) an adaptive codebook that outputs, as the adaptive code, the code of a past input speech signal corresponding to the adaptive code index; (3) a fixed codebook that outputs a predetermined fixed code corresponding to the fixed code index; (4) excitation signal generation means for generating an excitation signal based on the adaptive code and the fixed code; and (5) synthesized speech signal generation means for generating a synthesized speech signal based on the vocal tract parameters and the excitation signal. The decoder is characterized in that (6) the excitation signal generation means generates a specific impulse response based on the vocal tract parameters, applies convolution processing to the fixed code using that impulse response, and then adds the result to the adaptive code to generate the excitation signal, and (7) the impulse response corresponds to the impulse response of a transfer function that converts to the frequency characteristics of the input speech signal determined at the time the fixed code is output.

【００１２】[0012]

[Operation] In the code-excited linear prediction encoder of the first aspect and the code-excited linear prediction decoder of the second aspect, a specific impulse response is generated based on the vocal tract parameters, the fixed code is convolved with that impulse response and then combined with the adaptive code to generate the excitation signal, and the impulse response corresponds to the impulse response of a transfer function that converts to the frequency characteristics of the input speech signal determined at the time the fixed code is output. The frequency characteristics of the input speech signal are therefore reflected in the fixed code, which is a component of the excitation signal, and high-quality reproduced speech can be obtained even without a large number of fixed codes.

[0013] In other words, the frequency characteristics of the fixed code output from the fixed codebook are brought closer to those of the input speech signal. This is done for two reasons. First, although the frequency characteristics of the excitation signal have conventionally been modeled as theoretically white, it has been confirmed experimentally that they are in fact not white but close to the frequency characteristics of the input speech signal; the closer the frequency characteristics of fixed codes such as statistical codes and pulse codes are brought to those of the input speech signal, the higher the quality of the synthesized speech signal that can be obtained. Second, the useful frequency components of the excitation signal become considerably larger than the quantization error signal, so that a masking effect on the quantization error signal is obtained.
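
The operation described in [0012] and [0013] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the truncation of the convolution to the code length, and the two gain arguments are all hypothetical choices made for the example.

```python
def convolve_truncated(code, impulse_response):
    """Convolve `code` with `impulse_response`, keeping only len(code) samples.

    Truncating to the code length is an assumption of this sketch.
    """
    n = len(code)
    out = [0.0] * n
    for t in range(n):
        acc = 0.0
        for k in range(min(t + 1, len(impulse_response))):
            acc += impulse_response[k] * code[t - k]
        out[t] = acc
    return out


def build_excitation(adaptive_code, fixed_code, impulse_response,
                     adaptive_gain=1.0, fixed_gain=1.0):
    """Shape the fixed code by convolution, then add the adaptive code."""
    shaped = convolve_truncated(fixed_code, impulse_response)
    return [adaptive_gain * a + fixed_gain * s
            for a, s in zip(adaptive_code, shaped)]


if __name__ == "__main__":
    # A single pulse as the fixed code is smeared by the impulse response.
    print(build_excitation([0.5, 0.25, 0.0, 0.0],
                           [1.0, 0.0, 0.0, 0.0],
                           [1.0, 0.5, 0.25]))
```

With a white or pulse-like fixed code, the convolution is what imposes the spectral shape of the impulse response on the code before it enters the excitation.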

【００１４】[0014]

[Embodiments]

(A) An embodiment of the code-excited linear prediction encoder

An embodiment of the code-excited linear prediction encoder according to the present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of this embodiment.

[0015] In FIG. 1, the code-excited linear prediction encoder of this embodiment consists broadly of an input speech processing unit 1, an optimal synthesized speech search unit 2, and a multiplexing unit 3.

[0016] The input speech processing unit 1 consists of an LSP (line spectrum pair) analysis unit 11, an LSP parameter encoding unit 12, an LSP parameter decoding unit 13, an LPC coefficient (linear prediction coefficient) conversion unit 14, a weighting filter 15, a synthesis filter zero-input response generation unit 16, a weighting filter zero-input response generation unit 17, and two subtractors 18 and 19. When an input speech signal is supplied, it obtains the vocal tract parameters to be transmitted to the decoder and forms the target speech signal for the synthesized speech signal formed by local reproduction.

[0017] In this embodiment, the digitized, discrete input speech signal sequence is accumulated for a time corresponding to the analysis frame length used to obtain the vocal tract parameters; this analysis frame is further divided into several subframes and processed by the input speech processing unit 1.
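
As a small illustration of the frame and subframe structure in [0017], the sketch below splits one analysis frame into equal subframes. The function name and the requirement that the frame length be an exact multiple of the subframe count are assumptions of this example; the text does not give the actual lengths.

```python
def split_into_subframes(frame, num_subframes):
    """Split one analysis frame into `num_subframes` equal-length subframes."""
    if num_subframes <= 0 or len(frame) % num_subframes != 0:
        raise ValueError("frame length must be a multiple of num_subframes")
    size = len(frame) // num_subframes
    return [frame[i * size:(i + 1) * size] for i in range(num_subframes)]


if __name__ == "__main__":
    print(split_into_subframes(list(range(8)), 4))
```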

[0018] The input speech signal is supplied to the LSP analysis unit 11, which performs LSP analysis and converts it into LSP parameters serving as the vocal tract parameters. The LSP parameters are encoded (for example, vector-quantized) by the LSP parameter encoding unit 12, supplied to the multiplexing unit 3, and transmitted to the code-excited linear prediction decoder side. The encoded LSP parameters are also decoded (inverse vector quantization) by the LSP parameter decoding unit 13 and then converted into LPC coefficients by the LPC coefficient conversion unit 14. The LPC coefficients obtained in this way are used as the tap coefficients of the weighting filter 15, the synthesis filter zero-input response generation unit 16, the weighting filter zero-input response generation unit 17, and the weighted synthesis filter 29 described later. They are also supplied to the frequency characteristic operation unit 28 described later. The LSP parameters output from the LSP analysis unit 11 are not converted directly into LPC coefficients; instead, the LSP parameters that have undergone encoding and decoding are converted, so that local reproduction uses the same LPC coefficients as the decoder and the sound source parameters can be determined appropriately.
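
The encode-then-decode path of [0018] can be sketched with a plain nearest-neighbour vector quantizer. The codebook contents, the squared-error distance, and all function names here are assumptions for illustration; the conversion from LSP parameters to LPC coefficients is deliberately left out.

```python
def vq_encode(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`."""
    def distance(entry):
        return sum((v - c) ** 2 for v, c in zip(vector, entry))
    return min(range(len(codebook)), key=lambda i: distance(codebook[i]))


def vq_decode(index, codebook):
    """Inverse quantization: look the vector up by its index."""
    return list(codebook[index])


def quantize_for_local_decoding(lsp, codebook):
    """Return (index to transmit, decoded LSP used inside the encoder).

    The encoder keeps working with the decoded value so that its filters
    match the ones the decoder will rebuild from the same index.
    """
    index = vq_encode(lsp, codebook)
    return index, vq_decode(index, codebook)


if __name__ == "__main__":
    toy_codebook = [[0.1, 0.2], [0.3, 0.5], [0.6, 0.9]]  # hypothetical values
    print(quantize_for_local_decoding([0.28, 0.52], toy_codebook))
```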

[0019] LSP parameters are used as the vocal tract parameters for transmission because they interpolate the frequency characteristics of the vocal tract well, because they distort the vocal tract spectrum less than LPC parameters and the like even when encoded with a small number of bits, and because they can be encoded efficiently in combination with vector quantization.

[0020] Next, the operation of forming, from the input speech signal, the target speech signal for the locally reproduced synthesized speech signal is described.

[0021] The input speech signal described above is supplied to the weighting filter 15, weighted in consideration of human auditory characteristics, and then supplied to the subtractor 18 as the minuend input. The subtractor 18 also receives, as the subtrahend input, the zero-input response signal of the weighted synthesis filter 29, which the synthesis filter zero-input response generation unit 16 generates using the LPC coefficients as tap coefficients. This yields a speech signal from which the influence of the state of the weighted synthesis filter 29 in the immediately preceding analysis frame has been removed, and this signal is supplied to the subtractor 19 as the minuend input. The subtractor 19 also receives, as the subtrahend input, the zero-input response signal of the weighting filter 15, which the weighting filter zero-input response generation unit 17 generates using the LPC coefficients as tap coefficients. This yields a speech signal from which the influence of the state of the weighting filter 15 in the immediately preceding analysis frame has been removed, and this signal is supplied to the subtractor 30, described later, as the target speech signal.
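
The zero-input response subtraction of [0021] can be sketched as follows. The example assumes an all-pole filter of the form y[n] = x[n] + Σ a_i y[n-i], which is only one possible filter structure, and the names are hypothetical. In the embodiment the subtraction is performed twice (subtractors 18 and 19); the same helper would simply be applied once per filter.

```python
def zero_input_response(a, past_outputs, length):
    """Output of y[n] = x[n] + sum_i a[i-1] * y[n-i] with x[n] = 0.

    `past_outputs` holds y[-1], y[-2], ... (most recent first), i.e. the
    filter state left over from the preceding frame.
    """
    history = list(past_outputs[:len(a)])
    history += [0.0] * (len(a) - len(history))
    out = []
    for _ in range(length):
        y = sum(ai * hi for ai, hi in zip(a, history))
        out.append(y)
        history = [y] + history[:-1]
    return out


def remove_zero_input_response(signal, zir):
    """Subtract the filter's ringing from the (weighted) speech signal."""
    return [s - z for s, z in zip(signal, zir)]


if __name__ == "__main__":
    zir = zero_input_response([0.5], [1.0], 3)
    print(remove_zero_input_response([1.0, 1.0, 1.0], zir))
```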

[0022] The optimal synthesized speech search unit 2 searches for the sound source parameters for which the locally reproduced synthesized speech signal is most similar to the target speech signal. It consists of an adaptive codebook 20, a statistical codebook 21, a pulse codebook 22, a gain codebook 23, gain controllers 24 and 27, an adder 25, a fixed code selection switch 26, a frequency characteristic operation unit 28, a weighted synthesis filter 29, a subtractor 30, an error power sum calculation unit 31, and a minimum error power sum code selection unit 32.

[0023] The adaptive codebook 20, the statistical codebook 21, and the pulse codebook 22 store adaptive codes, statistical codes, and pulse codes, respectively, which are waveform codes for the excitation signal. The gain codebook 23 stores gain codes for the adaptive code and the fixed code (the statistical code and the pulse code are collectively called the fixed code).

[0024] As in the conventional art, the adaptive code is a waveform code that contributes to voiced sounds with statistically strong periodicity, and the statistical code is a waveform code that contributes to random unvoiced sounds with statistically weak periodicity. The adaptive codes in the adaptive codebook 20 are updated adaptively, as described later. The pulse code is a waveform code consisting of isolated impulses; it is intended to contribute to the onset of strongly periodic voiced sounds and to the steady-state portions of voiced sounds with a clear pulse-like character. The gain code is, for example, vector-quantized, with one component of the vector relating to the gain of the adaptive code and the other to the gain of the fixed code.

[0025] Because a pulse-like sound source signal is a simple periodic signal, it could also be produced by a pulse signal generator. However, producing it by coding it and reading it from the codebook 22, as in this embodiment, is preferable for the following reasons: it is easy to synchronize with the output of the adaptive codebook 20, and giving it the same book structure as the statistical codebook 21 simplifies multiplexing and related processing when, as described later, either the statistical code or the pulse code is selected and transmitted to the decoder.

[0026] For each type of code, the optimal code, for which the synthesized speech signal locally reproduced from these codes is most similar to the target speech signal, is found, and its index is supplied to the multiplexing unit 3 and transmitted to the code-excited linear prediction decoder side. Because this embodiment targets low coding rates, either the statistical code or the pulse code is selected as the fixed code and its index is transmitted. Selection information indicating which one was chosen as the fixed code is therefore also transmitted to the code-excited linear prediction decoder side.

[0027] In this embodiment, the search for the optimal codes (including the selection between the statistical code and the pulse code) is carried out in the order: adaptive code, statistical code, pulse code, gain code.

[0028] During the search for the optimal adaptive code, the outputs of the statistical codebook 21 and the pulse codebook 22 are set to 0, and the gain controller 24 multiplies by a suitable gain coefficient (for example, 1). In this state, all adaptive codes stored in the adaptive codebook 20 are output sequentially in time or in parallel and supplied as excitation signals to the weighted synthesis filter 29 through the gain controller 24 and the adder 25. The weighted synthesis filter 29 convolves each excitation signal using the LPC coefficients supplied from the LPC coefficient conversion unit 14 as tap coefficients, and thereby obtains, for every adaptive code, a synthesized speech signal in which only that adaptive code is reflected as the sound source parameter.

[0029] The subtractor 30 obtains, for every adaptive code, the error signal between the target speech signal and the synthesized speech signal reflecting only that adaptive code, and supplies it to the error power sum calculation unit 31. The error power sum calculation unit 31 obtains, for every adaptive code, the sum of squares of the components of the error signal (the error power sum) and supplies it to the minimum error power sum code selection unit 32. The minimum error power sum code selection unit 32 determines the adaptive code with the smallest error power sum to be the optimal one.
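
The exhaustive search of [0028] and [0029] amounts to an argmin over the codebook. The sketch below is a hedged illustration: `synthesize` stands in for the weighted synthesis filter 29 and is passed in as a callable, and all names are hypothetical.

```python
def error_power_sum(target, synthesized):
    """Sum of squared components of the error signal."""
    return sum((t - s) ** 2 for t, s in zip(target, synthesized))


def search_codebook(codebook, target, synthesize):
    """Try every code and return (best index, its error power sum)."""
    best_index, best_error = None, float("inf")
    for index, code in enumerate(codebook):
        error = error_power_sum(target, synthesize(code))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


if __name__ == "__main__":
    # An identity "filter" keeps the toy example easy to follow.
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    print(search_codebook(toy_codebook, [1.0, 1.5], lambda code: code))
```

The same routine can serve the statistical and pulse code searches described afterwards if `synthesize` also applies the frequency characteristic operation before the synthesis filter.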

[0030] Next, the search for the optimal statistical code is carried out. During this search, the fixed code selection switch 26 is switched to the statistical codebook 21 side, and the adaptive codebook 20 outputs 0 (outputting the optimal adaptive code is also conceivable). In this state, all statistical codes stored in the statistical codebook 21 are output sequentially in time or in parallel and input to the frequency characteristic operation unit 28 through the fixed code selection switch 26 and the gain controller 24.

[0031] The frequency characteristic operation unit 28 has the detailed configuration shown in FIG. 3, described later, and transforms the frequency characteristics of each input statistical code, over the time length of the statistical code, so that they approach the frequency characteristics of the input speech signal. All statistical codes whose frequency characteristics have been transformed in this way are supplied as excitation signals to the weighted synthesis filter 29 through the adder 25 (which in this case has no effect). The subsequent processing is the same as in the search for the optimal adaptive code, and the minimum error power sum code selection unit 32 determines the optimal statistical code.

[0032] The frequency characteristic operation unit 28 is provided for the following reason. Although the frequency characteristics of the excitation signal have conventionally been modeled as theoretically white, it has been confirmed experimentally that they are in fact not white but close to the frequency characteristics of the input speech signal. Accordingly, the closer the frequency characteristics of the statistical code or the pulse code are brought to those of the input speech signal, the higher the quality of the synthesized speech signal that can be obtained; in addition, the useful frequency components of the excitation signal become considerably larger than the quantization error signal, so that a masking effect on the quantization error signal is obtained. The frequency characteristic operation unit 28 is provided for this purpose. Information representing the frequency characteristics of the input speech signal includes the LPC coefficients and the information on the optimal adaptive code (including its gain), which carries the pitch prediction information. The frequency characteristic operation unit 28 therefore manipulates the frequency characteristics of the statistical code or the pulse code on the basis of this information.
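
One way to derive a shaping impulse response from the LPC coefficients is sketched below. It assumes the pole-zero form that the description later gives for the first filter 281 in equation (1), with numerator weights γ^i and denominator weights ν^i; the values of `gamma` and `nu` used in the example are purely illustrative, and the pitch-related part of the operation (second filter 282 and pitch lag determination unit 283) is not modeled.

```python
def shaping_impulse_response(a, gamma, nu, length):
    """Impulse response of
        H(z) = (1 - sum_i gamma**i * a_i * z**-i) / (1 - sum_i nu**i * a_i * z**-i)
    for i = 1..M, where a = [a_1, ..., a_M].
    """
    num = [gamma ** i * ai for i, ai in enumerate(a, start=1)]
    den = [nu ** i * ai for i, ai in enumerate(a, start=1)]
    h = []
    for n in range(length):
        value = 1.0 if n == 0 else 0.0      # unit impulse input
        if 1 <= n <= len(num):
            value -= num[n - 1]             # zero (numerator) part
        for i, d in enumerate(den, start=1):
            if n - i >= 0:
                value += d * h[n - i]       # pole (denominator) recursion
        h.append(value)
    return h


if __name__ == "__main__":
    # Hypothetical first-order example: a_1 = 0.5, gamma = 0.5, nu = 1.0.
    print(shaping_impulse_response([0.5], 0.5, 1.0, 4))
```

When `gamma` equals `nu` the numerator and denominator cancel and the response collapses to a single unit sample, so the fixed code passes through unchanged; the difference between the two weights is what sets how strongly the spectral envelope is imposed.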

[0033] When the search for the optimal statistical code is finished, the search for the optimal pulse code is carried out next. During this search, the fixed code selection switch 26 is switched to the pulse codebook 22 side, and the adaptive codebook 20 outputs 0 (outputting the optimal adaptive code is also conceivable). In this state, all pulse codes stored in the pulse codebook 22 are output sequentially in time or in parallel. The subsequent processing is the same as in the search for the optimal statistical code, so its description is omitted.

[0034] Once the optimal pulse code has been determined, the minimum error power sum code selection unit 32 compares the error power sum of the optimal statistical code with that of the optimal pulse code, and selects the one with the smaller error power sum as the fixed code to be transmitted to the code-excited linear prediction decoder side.
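
The selection in [0034] reduces to one comparison. In the sketch below the tuple layout, the string labels, and the tie-breaking rule (the statistical code wins on a tie) are assumptions; the text only says that the code with the smaller error power sum is chosen.

```python
def select_fixed_code(statistical, pulse):
    """Pick the fixed code with the smaller error power sum.

    Each argument is a tuple (index, error_power_sum) from its own search.
    Returns (selected codebook label, index); the label plays the role of
    the fixed code selection switch information.
    """
    if pulse[1] < statistical[1]:
        return "pulse", pulse[0]
    return "statistical", statistical[0]


if __name__ == "__main__":
    print(select_fixed_code((12, 0.8), (3, 0.5)))
```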

[0035] After this, the search for the optimal gain code is carried out. During this search, the adaptive codebook 20 outputs the optimal adaptive code, the fixed code selection switch 26 is switched to whichever of the statistical codebook 21 and the pulse codebook 22 was selected, and the selected fixed codebook 21 or 22 outputs the optimal fixed code. Each gain code in the gain codebook 23 consists of a gain for the adaptive code and a gain for the fixed code; the adaptive-code gain is supplied to the gain controller 24 and the fixed-code gain to the gain controller 27. The gain-controlled optimal adaptive code and the optimal fixed code, which has undergone the frequency characteristic operation and gain control, are then added by the adder 25 and supplied as the excitation signal to the weighted synthesis filter 29. This processing is carried out for all gain codes in the gain codebook 23, sequentially in time or in parallel. The search processing from the weighted synthesis filter 29 onward is the same as in the search for the other codes.
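
The gain code search of [0035] can be sketched as a loop over gain pairs. As before, `synthesize` is a stand-in for the weighted synthesis filter 29, the fixed code is assumed to have already passed through the frequency characteristic operation, and every name is hypothetical.

```python
def search_gain_codebook(gain_codebook, adaptive_code, shaped_fixed_code,
                         target, synthesize):
    """Try every (adaptive gain, fixed gain) pair; return (index, error)."""
    best_index, best_error = None, float("inf")
    for index, (g_adaptive, g_fixed) in enumerate(gain_codebook):
        # Adder 25: gain-scaled adaptive code plus gain-scaled fixed code.
        excitation = [g_adaptive * a + g_fixed * f
                      for a, f in zip(adaptive_code, shaped_fixed_code)]
        synthesized = synthesize(excitation)
        error = sum((t - s) ** 2 for t, s in zip(target, synthesized))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


if __name__ == "__main__":
    gains = [(1.0, 0.0), (0.5, 0.5), (1.0, 1.0)]  # hypothetical gain vectors
    print(search_gain_codebook(gains, [1.0, 0.0], [0.0, 1.0],
                               [1.0, 1.0], lambda e: e))
```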

[0036] When the optimal adaptive code, optimal fixed code, and optimal gain code have been obtained, the minimum error power sum code selection unit 32 supplies their indices to the multiplexing unit 3, together with fixed code selection switch information indicating whether the statistical code or the pulse code was selected. The multiplexing unit 3 multiplexes the LSP parameters supplied from the LSP parameter encoding unit 12 with this information and outputs the result to the code-excited linear prediction decoder side. When vector quantization is applied to the gain code, the transmitted index is the vector number.

【００３７】また、最小誤差電力和コード選択部３２
は、多重化部３に与えるインデックス及び固定コード選
択スイッチ情報を、対応するコードブック（２０及び２
３と、２１又は２２）や固定コード選択スイッチ２６に
与える。このとき、スイッチ２６が切換えられ、各コー
ドブックから最適コードが出力される。これにより、今
回のサブフレーム処理時において最も目標音声信号に近
い合成音声信号を形成できる励振信号が加算器２５から
出力され、これが適応コードブック２０に与えられる。
そして、適応コードブック２０は適応コードの更新処理
を行なう。The minimum error power sum code selecting unit 32 also gives the indices and the fixed code selection switch information that it supplies to the multiplexing unit 3 to the corresponding codebooks (20 and 23, and 21 or 22) and to the fixed code selection switch 26. The switch 26 is then set accordingly, and each codebook outputs its optimum code. As a result, the excitation signal that yields the synthesized speech signal closest to the target speech signal in the current subframe is output from the adder 25 and supplied to the adaptive codebook 20. The adaptive codebook 20 then updates its adaptive codes.
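The adaptive codebook update can be pictured as a sliding buffer of past excitation. The following sketch assumes that representation; the function names, the buffer handling, and the periodic extension for lags shorter than the subframe are illustrative choices, not details given in the patent.

```python
def update_adaptive_codebook(history, excitation):
    """Shift the newest excitation into the buffer, dropping the oldest samples."""
    n = len(excitation)
    if n >= len(history):
        return list(excitation[-len(history):])
    return list(history[n:]) + list(excitation)


def adaptive_code(history, lag, length):
    """Read a code vector starting `lag` samples back from the end of the buffer.

    When lag < length, the last `lag` samples are repeated periodically.
    """
    if not 1 <= lag <= len(history):
        raise ValueError("lag must lie within the buffer")
    start = len(history) - lag
    return [history[start + (n % lag)] for n in range(length)]
```

Because the decoder applies the same update to the same excitation, both sides keep identical buffers without transmitting them.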

【００３８】以上のような符号化処理がサブフレーム毎
に繰返され、符号化音声信号が順次コード励振線形予測
復号化器に送信される。The above-described encoding process is repeated for each subframe, and the encoded speech signal is sequentially transmitted to the code excitation linear prediction decoder.

【００３９】図３は、上述した周波数特性操作部２８の
詳細構成を示すものである。図３において、この周波数
特性操作部２８は、縦続接続された２個のフィルタ２８
１及び２８２と、ピッチラグ決定部２８３とから構成さ
れている。FIG. 3 shows the detailed configuration of the frequency characteristic operation unit 28 described above. In FIG. 3, the frequency characteristic operation unit 28 consists of two cascaded filters 281 and 282 and a pitch lag determining unit 283.

【００４０】固定コード選択スイッチ２６から出力され
た固定コードは、第１のフィルタ２８１に与えられる。
この第１のフィルタ２８１のインパルス応答Ｈ１（ｚ）
は、次式に示すように選定されており、これによって入
力された固定コードに対する周波数変換操作を行なう。The fixed code output from the fixed code selection switch 26 is given to the first filter 281. The transfer function H1(z) of the first filter 281 is chosen as shown in the following equation, and the filter manipulates the frequency characteristic of the input fixed code accordingly.

【００４１】Ｈ１（ｚ）＝（１−Σγⁱａｉｚ^-i）／（１−Σνⁱａｉｚ^-i） …(1) 但し、ａｉ（ｉは１〜Ｍ）は、ＬＰＣ係数変換部１４か
ら供給される合成フィルタ２９に対するタップ係数であ
る。また、γ及びνはそれぞれ、予め定められた０より
大きく１より小さい定数である。$H_1(z) = \dfrac{1 - \sum_{i=1}^{M} \gamma^{i} a_i z^{-i}}{1 - \sum_{i=1}^{M} \nu^{i} a_i z^{-i}}$ (1), where a_i (i = 1 to M) are the tap coefficients for the synthesis filter 29 supplied from the LPC coefficient conversion unit 14, and γ and ν are predetermined constants greater than 0 and less than 1.
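Equation (1) corresponds to the difference equation y[n] = x[n] − Σ γ^i a_i x[n−i] + Σ ν^i a_i y[n−i]. A minimal sketch of the first filter follows; the function name `shape_with_lpc` is hypothetical, and starting each code vector from zero filter state is an assumption of this example.

```python
def shape_with_lpc(fixed_code, lpc, gamma, nu):
    """Apply H1(z) = (1 - sum gamma^i a_i z^-i) / (1 - sum nu^i a_i z^-i).

    fixed_code -- samples of the fixed code
    lpc        -- tap coefficients a_1..a_M
    gamma, nu  -- constants with 0 < gamma < 1 and 0 < nu < 1
    """
    if not (0.0 < gamma < 1.0 and 0.0 < nu < 1.0):
        raise ValueError("gamma and nu must lie strictly between 0 and 1")
    order = len(lpc)
    num = [gamma ** (i + 1) * a for i, a in enumerate(lpc)]  # gamma^i * a_i
    den = [nu ** (i + 1) * a for i, a in enumerate(lpc)]     # nu^i * a_i
    out = []
    for n, x in enumerate(fixed_code):
        y = x
        for i in range(1, order + 1):
            if n - i >= 0:  # zero initial state (assumption)
                y -= num[i - 1] * fixed_code[n - i]
                y += den[i - 1] * out[n - i]
        out.append(y)
    return out
```

When γ equals ν the numerator and denominator cancel and the filter passes the code unchanged, so the difference between the two constants sets how strongly the spectral envelope is imposed.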

【００４２】この第１のフィルタ２８１によって周波数
特性が操作された固定コードが、第２のフィルタ２８２
に入力される。ピッチラグ決定部２８３は適応コードブ
ック２０に対する最適適応コードのインデックスからピ
ッチラグＬを得て第２のフィルタ２８２に与える。この
第２のフィルタ２８２のインパルス応答Ｈ２（ｚ）は、
次式に示すように選定されており、これによって入力さ
れた固定コードに対する周波数変換操作を行なう。The fixed code whose frequency characteristic has been manipulated by the first filter 281 is input to the second filter 282. The pitch lag determining unit 283 obtains the pitch lag L from the index of the optimum adaptive code for the adaptive codebook 20 and supplies it to the second filter 282. The transfer function H2(z) of the second filter 282 is chosen as shown in the following equation, and the filter manipulates the frequency characteristic of the input fixed code accordingly.

【００４３】Ｈ２（ｚ）＝１／（１−εｚ^-L） …(2) 但し、εは予め定められた０より大きく１以下の定数で
ある。この第２のフィルタ２８２の出力が、利得制御器
２７に与えられる。$H_2(z) = \dfrac{1}{1 - \varepsilon z^{-L}}$ (2), where ε is a predetermined constant greater than 0 and not greater than 1. The output of the second filter 282 is given to the gain controller 27.
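Equation (2) is a one-tap recursive comb filter, y[n] = x[n] + ε·y[n−L], which repeats the fixed code at the pitch period. A minimal sketch, with the hypothetical name `pitch_sharpen` and a zero initial state assumed:

```python
def pitch_sharpen(code, lag, epsilon):
    """Apply H2(z) = 1 / (1 - epsilon * z^-lag) to a code vector."""
    if lag < 1:
        raise ValueError("lag must be a positive number of samples")
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("epsilon must satisfy 0 < epsilon <= 1")
    out = []
    for n, x in enumerate(code):
        # Feedback from the output one pitch lag earlier, if it exists yet.
        feedback = epsilon * out[n - lag] if n >= lag else 0.0
        out.append(x + feedback)
    return out
```

A single pulse fed through this filter with lag 3 and ε = 0.5 reappears every three samples at half the previous amplitude, which is how the filter adds pitch harmonics to the fixed code.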

【００４４】このような詳細構成を有する周波数特性操
作部２８によって、上述したように、入力された固定コ
ードの周波数特性を固定コードの時間的な長さに対応し
て入力音声信号の周波数特性に近付けるようにできてい
る。With the frequency characteristic operation unit 28 configured in this way, the frequency characteristic of the input fixed code can, as described above, be brought close to that of the input speech signal over the time length of the fixed code.

【００４５】従って、上記実施例のコード励振線形予測
符号化器によれば、低符号化速度においても高品質の再
生音声を得ることができる。以下、かかる効果が得られ
ることを具体的に説明する。Therefore, according to the code-excited linear predictive encoder of the above-described embodiment, high-quality reproduced speech can be obtained even at a low encoding speed. Hereinafter, it will be specifically described that such effects can be obtained.

【００４６】(1) 低符号化速度を期した場合、音源パラ
メータに割当てられる符号化ビット数が少ないので、用
意される固定コードも少なくなり、入力音声信号に含ま
れているパルス性雑音を明確に再生でき難いが、この実
施例の場合、パルス性コードを利用しているので、この
ような場合の音声の再生品質を高めることができる。(1) When a low coding rate is targeted, few coding bits are allocated to the excitation parameters, so few fixed codes can be provided and pulse-like noise contained in the input speech signal is difficult to reproduce clearly. Because this embodiment uses pulse codes, the reproduction quality of speech in such cases can be improved.

【００４７】また、パルス性コードと統計コードとを切
換えて用いているので、低符号化速度に対応できると共
に、音声の過渡部のようなランダム信号とパルス的信号
が混在する信号に対する再生品質を高めることができ
る。Furthermore, because the pulse code and the statistical code are used selectively by switching, the encoder can operate at a low coding rate while improving the reproduction quality of signals in which random and pulse-like components are mixed, such as speech transients.

【００４８】(2) 低符号化速度を期した場合、音源パラ
メータに対する符号化ビット数も少なくなるが、声道パ
ラメータに対する符号化ビット数も少なくなる。この実
施例の場合、少ない符号化ビット数で符号化してもＬＰ
Ｃ等より声道スペクトルに与える歪みが小さいＬＳＰパ
ラメータを伝送するようにしているので、この点から再
生品質を高めることができる。(2) When a low coding rate is targeted, fewer coding bits are available not only for the excitation parameters but also for the vocal tract parameters. This embodiment transmits LSP parameters, which distort the vocal tract spectrum less than LPC coefficients and the like even when coded with few bits, so the reproduction quality can be improved in this respect as well.

【００４９】(3) 上述のように、実際の励振信号が入力
音声信号の周波数特性に近い周波数特性を有することを
考慮して周波数特性操作部を設けているので、実際に即
している分だけ再生品質を高めることができると共に、
この変換に伴い量子化誤差信号に対するマスキング効果
が生じて再生品質を高めることができる。(3) As described above, the frequency characteristic operation unit is provided in view of the fact that the actual excitation signal has a frequency characteristic close to that of the input speech signal. The reproduction quality improves to the extent that the model matches reality, and the conversion also produces a masking effect on the quantization error signal, which further improves the reproduction quality.

【００５０】（Ｂ）コード励振線形予測復号化器の一実施例次に、本発明によるコード励振線形予測復号化器の一実
施例を図面を参照しながら詳述する。この実施例は、図
１に示すコード励振線形予測符号化器の実施例に対応す
るものである。図２がこの実施例の構成を示すブロック
図である。(B) One Embodiment of Code Excited Linear Predictive Decoder Next, an embodiment of a code excited linear predictive decoder according to the present invention will be described in detail with reference to the drawings. This embodiment corresponds to the embodiment of the code excitation linear prediction encoder shown in FIG. FIG. 2 is a block diagram showing the configuration of this embodiment.

【００５１】図２において、この実施例のコード励振線
形予測復号化器は、多重分離部４０、ＬＳＰパラメータ
復号化部４１、ＬＰＣ係数変換部４２、適応コードブッ
ク４３、統計コードブック４４、パルス性コードブック
４５、利得コードブック４６、利得制御器４７、４９、
固定コード選択スイッチ４８、周波数特性操作部５０、
加算器５１及び重み付き合成フィルタ５２から構成され
ている。Referring to FIG. 2, the code-excited linear prediction decoder of this embodiment consists of a demultiplexing unit 40, an LSP parameter decoding unit 41, an LPC coefficient conversion unit 42, an adaptive codebook 43, a statistical codebook 44, a pulse codebook 45, a gain codebook 46, gain controllers 47 and 49, a fixed code selection switch 48, a frequency characteristic operation unit 50, an adder 51, and a weighted synthesis filter 52.

【００５２】コード励振線形予測符号化器側から与えら
れた符号化音声信号は、多重分離部４０に入力される。
多重分離部４０は、この符号化音声信号を、ＬＳＰパラ
メータ、最適適応コードのインデックス、最適固定コー
ドのインデックス、最適利得コードのインデックス及び
固定コード選択スイッチ情報に分離する。そして、ＬＳ
ＰパラメータはＬＳＰパラメータ復号化部４１に与え、
最適適応コードのインデックスを適応コードブック４３
に与え、最適利得コードのインデックスを利得コードブ
ック４６に与え、固定コード選択スイッチ情報を固定コ
ード選択スイッチ４８に与える。また、最適固定コード
のインデックスを、固定コード選択スイッチ情報に基づ
いて定まる統計コードブック４４又はパルス性コードブ
ック４５に与える。The encoded speech signal supplied from the code-excited linear prediction encoder is input to the demultiplexing unit 40. The demultiplexing unit 40 separates this signal into the LSP parameters, the index of the optimum adaptive code, the index of the optimum fixed code, the index of the optimum gain code, and the fixed code selection switch information. It gives the LSP parameters to the LSP parameter decoding unit 41, the index of the optimum adaptive code to the adaptive codebook 43, the index of the optimum gain code to the gain codebook 46, and the fixed code selection switch information to the fixed code selection switch 48. It gives the index of the optimum fixed code to the statistical codebook 44 or the pulse codebook 45, whichever is designated by the fixed code selection switch information.
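The separation and routing performed by the demultiplexing unit can be sketched as below. The packed-word representation, the field names, and the bit widths in `FIELDS` are assumptions made for this example; the patent does not define the bitstream format.

```python
# Hypothetical field layout (name, bit width), first field in the top bits.
FIELDS = (("lsp", 10), ("adaptive", 7), ("fixed", 7), ("gain", 6), ("switch", 1))


def demultiplex(word):
    """Split one packed subframe word into its named fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


def route_fixed_index(fields):
    """Pick the fixed codebook named by the switch bit and pass it the fixed index."""
    codebook = "pulse" if fields["switch"] else "statistical"
    return codebook, fields["fixed"]
```

The switch bit is read before the fixed index is used, mirroring the order in which the decoder must decide which fixed codebook receives that index.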

【００５３】ＬＳＰパラメータ復号化部４１は、与えら
れた符号化されているＬＳＰパラメータを復号化（例え
ばベクトル逆量子化）し、ＬＰＣ係数変換部４２はこの
ＬＳＰパラメータをＬＰＣ係数に変換する。このように
変換されたＬＰＣ係数が、周波数特性操作部５０及び重
付き合成フィルタ５２に声道パラメータ情報として与え
られる。The LSP parameter decoding section 41 decodes (for example, vector dequantizes) the given coded LSP parameters, and the LPC coefficient conversion section 42 converts the LSP parameters into LPC coefficients. The LPC coefficients thus converted are provided to the frequency characteristic operation unit 50 and the weighted synthesis filter 52 as vocal tract parameter information.
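The patent does not describe how the LPC coefficient conversion unit derives the coefficients, so the following is only a sketch of the textbook construction. It assumes an even order M, LSP frequencies in radians, and the sign convention A(z) = 1 − Σ a_i z^-i used by the synthesis filter. The sum polynomial P(z) carries the odd-numbered frequencies and a root at z = −1, the difference polynomial Q(z) carries the even-numbered ones and a root at z = +1, and A(z) = (P(z) + Q(z)) / 2.

```python
import math


def _poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def lsp_to_lpc(lsp):
    """Convert LSP frequencies (radians) to tap coefficients a_1..a_M."""
    if len(lsp) % 2:
        raise ValueError("this sketch handles even orders only")
    p, q = [1.0, 1.0], [1.0, -1.0]  # fixed roots at z = -1 and z = +1
    for k, omega in enumerate(sorted(lsp)):
        section = [1.0, -2.0 * math.cos(omega), 1.0]
        if k % 2 == 0:
            p = _poly_mul(p, section)  # 1st, 3rd, ... frequencies
        else:
            q = _poly_mul(q, section)  # 2nd, 4th, ... frequencies
    order = len(lsp)
    # A(z) = (P + Q) / 2 = 1 - sum a_i z^-i, so a_i is minus the i-th coefficient.
    return [-(p[i] + q[i]) / 2.0 for i in range(1, order + 1)]
```

For a second-order check, the frequencies arccos(0.85) and arccos(0.05) give back a_1 = 0.9 and a_2 = −0.2, that is, the stable filter 1/(1 − 0.9z^-1 + 0.2z^-2).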

【００５４】利得コードブック４６は、与えられたイン
デックスで定まる利得コードを、適応コード用と固定コ
ード用とに分離し、それぞれ適応コード用の利得制御器
４７、固定コード用の利得制御器４９に与える。The gain codebook 46 separates the gain code determined by the given index into a gain for the adaptive code and a gain for the fixed code, and gives them to the gain controller 47 for the adaptive code and the gain controller 49 for the fixed code, respectively.

【００５５】適応コードブック４３は、与えられたイン
デックスで定まる適応コードを出力し、この適応コード
が利得制御器４７を介して利得制御されて加算器５１に
与えられる。また、適応コードブック４３は適応コード
を周波数特性操作部５０に与える。The adaptive codebook 43 outputs an adaptive code determined by a given index, and the adaptive code is gain-controlled via a gain controller 47 and is applied to an adder 51. The adaptive code book 43 gives the adaptive code to the frequency characteristic operation unit 50.

【００５６】統計コードブック４４又はパルス性コード
ブック４５は、与えられたインデックスに対応する統計
コード又はパルス性コードを固定コード選択スイッチ４
８を介して周波数特性操作部５０に与えられ、周波数特
性操作部５０は、ＬＰＣ係数、適応コードのインデック
スに基づいてその周波数特性を操作する（入力音声信号
の周波数特性に近くなるように操作する）。周波数特性
操作部５０の詳細構成は、上述した図３に示す構成を有
している。このように周波数特性が操作された固定コー
ドが、利得制御器４９で利得制御されて加算器５１に与
えられる。The statistical codebook 44 or the pulse codebook 45 supplies the statistical code or pulse code corresponding to the given index to the frequency characteristic operation unit 50 via the fixed code selection switch 48. The frequency characteristic operation unit 50 manipulates the frequency characteristic of this code on the basis of the LPC coefficients and the index of the adaptive code, bringing it close to the frequency characteristic of the input speech signal. The frequency characteristic operation unit 50 has the detailed configuration shown in FIG. 3 described above. The fixed code whose frequency characteristic has been manipulated in this way is gain-controlled by the gain controller 49 and given to the adder 51.

【００５７】加算器５１は、与えられた適応コードと固
定コードを加算してその加算信号を励振信号として重み
付き合成フィルタ５２に与える。重み付き合成フィルタ
５２は、この励振信号をＬＰＣ係数で畳み込んで合成音
声信号を得て出力する。The adder 51 adds the given adaptive code and fixed code, and supplies the added signal to the weighted synthesis filter 52 as an excitation signal. The weighted synthesis filter 52 convolves the excitation signal with LPC coefficients to obtain and output a synthesized speech signal.
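The adder and synthesis filter stage of the decoder can be sketched as follows. The function name `decode_subframe` and the `memory` argument are hypothetical, and the filter is written as the plain all-pole filter 1/(1 − Σ a_i z^-i); any additional weighting inside the weighted synthesis filter 52 is not modeled here.

```python
def decode_subframe(adaptive, shaped_fixed, g_adaptive, g_fixed, lpc, memory=None):
    """Form the excitation and run it through an all-pole synthesis filter.

    memory holds the last M output samples of the previous subframe, most
    recent last (zeros if omitted). Returns (excitation, speech, new_memory).
    """
    order = len(lpc)
    past = list(memory) if memory is not None else [0.0] * order
    # Adder 51: gain-scaled adaptive code plus gain-scaled, shaped fixed code.
    excitation = [g_adaptive * a + g_fixed * f for a, f in zip(adaptive, shaped_fixed)]
    speech = []
    for e in excitation:
        # s[n] = e[n] + sum a_i * s[n - i]
        s = e + sum(lpc[i] * past[-1 - i] for i in range(order))
        speech.append(s)
        if order:
            past = past[1:] + [s]
    return excitation, speech, past
```

The returned excitation is what would be fed back to the adaptive codebook, and the returned memory lets the filter continue smoothly into the next subframe.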

【００５８】加算器５１から出力された励振信号は、ま
た適応コードブック４３に与えられる。このとき、適応
コードブック４３は、この励振信号を用いて適応コード
の更新を行なう。The excitation signal output from the adder 51 is also supplied to the adaptive codebook 43, which uses it to update its adaptive codes.

【００５９】コード励振線形予測復号化器は、以上のよ
うな処理を符号化音声信号が与えられる毎に、従ってサ
ブフレーム毎に行なう。The code-excited linear predictive decoder performs the above-described processing every time a coded speech signal is supplied, and thus for each subframe.

【００６０】従って、この実施例のコード励振線形予測
復号化器によれば、与えられたＬＳＰパラメータを処理
する構成を有し、音源としてパルス性コードブック４５
を有し、入力音声信号の周波数特性に固定音源の周波数
特性を近付ける周波数特性操作部５０を有するので、こ
れにより、上述したコード励振線形予測符号化器につい
ての効果が実際のものとなる。Accordingly, the code-excited linear prediction decoder of this embodiment is configured to process the supplied LSP parameters, has the pulse codebook 45 as an excitation source, and has the frequency characteristic operation unit 50, which brings the frequency characteristic of the fixed excitation close to that of the input speech signal. The effects described above for the code-excited linear prediction encoder are therefore realized in practice.

【００６１】（Ｃ）他の実施例上記実施例は、フォワード型のコード励振線形予測符号
化器及び復号化器に関するものであるが、本発明を、バ
ックワード型のコード励振線形予測符号化器及び復号化
器に適用することもできる。(C) Other Embodiments The above embodiments relate to a forward-type code-excited linear prediction encoder and decoder, but the present invention can also be applied to a backward-type code-excited linear prediction encoder and decoder.

【００６２】さらに、上記実施例は、４bit/s 以下の符
号化速度を意識して構成されたものであるが、これより
高い符号化速度の符号化器及び復号化器に適用できるこ
とは勿論である。符号化速度が許すならば、統計コード
ブック及びパルス性コードブックを選択的ではなく、常
に両者を有効に動作させるものであっても良い。Further, although the above embodiments were designed with coding rates of 4 bit/s or less in mind, the invention can of course be applied to encoders and decoders with higher coding rates. If the coding rate permits, the statistical codebook and the pulse codebook may both be kept active at all times instead of being used selectively.

【００６３】上記実施例の周波数特性操作部２８及び５
０は、ＬＰＣ係数を利用した周波数特性操作とピッチラ
グを利用した周波数特性操作の両方を行なうものであっ
たが、少なくとも一方だけの操作を行なうものであって
も良い。The frequency characteristic operation units 28 and 50 of the above embodiments perform both the frequency characteristic manipulation using the LPC coefficients and the one using the pitch lag, but they may perform only one of the two.

【００６４】[0064]

【発明の効果】本発明のコード励振線形予測符号化器及
び復号化器によれば、固定コードブックから出力された
固定コードの周波数特性を、入力音声信号の周波数特性
に近付ける操作を行なうので、音源パラメータが実際の
音源と良く対応がとれ、符号化速度の高低を問わず、こ
の点から再生音声の品質を高めることができる。According to the code-excited linear prediction encoder and decoder of the present invention, the frequency characteristic of the fixed code output from the fixed codebook is manipulated so as to approach the frequency characteristic of the input speech signal. The excitation parameters therefore correspond well to the actual excitation, and the quality of the reproduced speech can be improved in this respect regardless of the coding rate.

[Brief description of the drawings]

【図１】コード励振線形予測符号化器の一実施例を示す
ブロック図である。FIG. 1 is a block diagram showing an embodiment of a code excitation linear prediction encoder.

【図２】コード励振線形予測復号化器の一実施例を示す
ブロック図である。FIG. 2 is a block diagram showing an embodiment of a code excitation linear prediction decoder.

【図３】実施例の周波数特性操作部２８（５０）の詳細
構成を示すブロック図である。FIG. 3 is a block diagram showing the detailed configuration of the frequency characteristic operation unit 28 (50) of the embodiments.

[Explanation of symbols]

３…多重化部、１１…ＬＳＰパラメータ分析部、１２…
ＬＳＰパラメータ符号化部、１３、４１…ＬＳＰパラメ
ータ復号化部、１４、４２…ＬＰＣ係数変換部、２０、
４３…適応コードブック、２１、４４…統計コードブッ
ク、２２、４５…パルス性コードブック、２５、５１…
加算器、２６、４８…固定コード選択スイッチ、２８、
５０…周波数特性操作部、２９、５２…重み付き合成フ
ィルタ、３０…減算器、３１…誤差電力和計算部、３２
…最小誤差電力コード選択部、４０…多重分離部。3: multiplexing unit; 11: LSP parameter analysis unit; 12: LSP parameter encoding unit; 13, 41: LSP parameter decoding unit; 14, 42: LPC coefficient conversion unit; 20, 43: adaptive codebook; 21, 44: statistical codebook; 22, 45: pulse codebook; 25, 51: adder; 26, 48: fixed code selection switch; 28, 50: frequency characteristic operation unit; 29, 52: weighted synthesis filter; 30: subtractor; 31: error power sum calculation unit; 32: minimum error power code selecting unit; 40: demultiplexing unit.

フロントページの続き (72)発明者有山義博東京都港区虎ノ門１丁目７番12号沖電気工業株式会社内 (56)参考文献特開平２−282800（ＪＰ，Ａ) 特開平２−148926（ＪＰ，Ａ) 特開平１−54497（ＪＰ，Ａ) 特開平３−101800（ＪＰ，Ａ) (58)調査した分野(Int.Cl.⁶，ＤＢ名) G10L 3/00 - 9/20 H03M 7/30 H04B 14/04 ＪＩＣＳＴファイル（ＪＯＩＳ) Continuation of front page (72) Inventor: Yoshihiro Ariyama, c/o Oki Electric Industry Co., Ltd., 1-7-12 Toranomon, Minato-ku, Tokyo (56) References: JP-A-2-282800 (JP, A); JP-A-2-148926 (JP, A); JP-A-1-54497 (JP, A); JP-A-3-101800 (JP, A) (58) Fields searched (Int. Cl.⁶, DB name): G10L 3/00 - 9/20; H03M 7/30; H04B 14/04; JICST file (JOIS)

Claims

(57) [Claims]

1. A code-excited linear prediction encoder comprising:
vocal tract parameter generating means for performing linear prediction on an input speech signal to obtain vocal tract parameters;
vocal tract parameter quantizing means for quantizing the vocal tract parameters and outputting at least quantized vocal tract parameters;
an adaptive codebook storing adaptive codes that represent optimum codes of the past input speech signal;
a fixed codebook storing predetermined statistical codes and/or pulse codes as fixed codes;
a frequency characteristic operation unit for performing convolution processing on the fixed code using the impulse response of a transfer function that converts the fixed code, on the basis of the vocal tract parameters, to the frequency characteristic of the input speech signal determined at that time;
gain controllers for multiplying the adaptive code and the convolution-processed fixed code by respective gains;
excitation signal generating means for generating an excitation signal by adding the adaptive code and the convolution-processed fixed code that have been multiplied by the gains in the gain controllers;
a synthesis filter for creating a synthesized speech signal from the quantized vocal tract parameters and the excitation signal;
a code selecting unit for evaluating the error between the input speech signal and the synthesized speech signal, selecting the adaptive code and the fixed code that are optimum for the input speech signal at the current time, and determining the corresponding indices; and
a multiplexing unit for multiplexing the vocal tract parameters with the indices of the adaptive code and the fixed code and outputting the result as an encoded speech signal.

2. The code-excited linear prediction encoder according to claim 1, wherein the impulse response is the impulse response of a transfer function H(z) given by the following equation: $H(z) = \dfrac{1}{1 - \varepsilon z^{-L}}$, where ε is a constant satisfying 0 < ε ≤ 1, and L is a pitch lag calculated from the index of the adaptive code.

3. The code-excited linear prediction encoder according to claim 1, wherein the impulse response is the impulse response of a transfer function H(z) given by the following equation: $H(z) = \dfrac{1 - \sum_{i=1}^{M} \gamma^{i} a_i z^{-i}}{1 - \sum_{i=1}^{M} \nu^{i} a_i z^{-i}}$, where a_i (i = 1 to M) are tap coefficients, and γ and ν are constants satisfying 0 < γ < 1 and 0 < ν < 1.

4. The code-excited linear prediction encoder according to claim 1, wherein the impulse response is an impulse response obtained by cascading the impulse response of a transfer function H1(z) and the impulse response of a transfer function H2(z) given by the following equations: $H_1(z) = \dfrac{1 - \sum_{i=1}^{M} \gamma^{i} a_i z^{-i}}{1 - \sum_{i=1}^{M} \nu^{i} a_i z^{-i}}$ and $H_2(z) = \dfrac{1}{1 - \varepsilon z^{-L}}$, where a_i (i = 1 to M) are tap coefficients, γ and ν are constants satisfying 0 < γ < 1 and 0 < ν < 1, ε is a constant satisfying 0 < ε ≤ 1, and L is a pitch lag calculated from the index of the adaptive code.

5. A code-excited linear prediction decoder comprising:
demultiplexing means for separating the signal supplied from the code-excited linear prediction encoder according to claim 1 into the vocal tract parameters, the index of the adaptive code, and the index of the fixed code;
an adaptive codebook that outputs, as an adaptive code, the code of the past input speech signal corresponding to the index of the adaptive code;
a fixed codebook that outputs a predetermined fixed code corresponding to the index of the fixed code;
a frequency characteristic operation unit for performing convolution processing on the fixed code using the impulse response of a transfer function that converts the fixed code, on the basis of the vocal tract parameters, to the frequency characteristic of the input speech signal determined at the time selected by the code-excited linear prediction encoder;
gain controllers for multiplying the adaptive code and the convolution-processed fixed code by respective gains;
excitation signal generating means for generating an excitation signal by adding the adaptive code and the convolution-processed fixed code that have been multiplied by the gains in the gain controllers; and
a synthesis filter for generating a synthesized speech signal from the vocal tract parameters and the excitation signal.

6. The code-excited linear prediction decoder according to claim 5, wherein the impulse response is the impulse response of a transfer function H(z) given by the following equation: $H(z) = \dfrac{1}{1 - \varepsilon z^{-L}}$, where ε is a constant satisfying 0 < ε ≤ 1, and L is a pitch lag calculated from the index of the adaptive code.

7. The code-excited linear prediction decoder according to claim 5, wherein the impulse response is the impulse response of a transfer function H(z) given by the following equation: $H(z) = \dfrac{1 - \sum_{i=1}^{M} \gamma^{i} a_i z^{-i}}{1 - \sum_{i=1}^{M} \nu^{i} a_i z^{-i}}$, where a_i (i = 1 to M) are tap coefficients, and γ and ν are constants satisfying 0 < γ < 1 and 0 < ν < 1.

8. The code-excited linear prediction decoder according to claim 5, wherein the impulse response is an impulse response obtained by cascading the impulse response of a transfer function H1(z) and the impulse response of a transfer function H2(z) given by the following equations: $H_1(z) = \dfrac{1 - \sum_{i=1}^{M} \gamma^{i} a_i z^{-i}}{1 - \sum_{i=1}^{M} \nu^{i} a_i z^{-i}}$ and $H_2(z) = \dfrac{1}{1 - \varepsilon z^{-L}}$, where a_i (i = 1 to M) are tap coefficients, γ and ν are constants satisfying 0 < γ < 1 and 0 < ν < 1, ε is a constant satisfying 0 < ε ≤ 1, and L is a pitch lag calculated from the index of the adaptive code.