JPH0561499A

JPH0561499A - Voice encoding/decoding method

Info

Publication number: JPH0561499A
Application number: JP3267112A
Authority: JP
Inventors: Hideaki Kurihara; 秀明栗原; Tomohiko Taniguchi; 智彦谷口; Takashi Ota; 恭士大田; Yoshihiro Sakai; 良広坂井; Yoshiaki Tanaka; 良紀田中
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-09-18
Filing date: 1991-09-18
Publication date: 1993-03-12
Anticipated expiration: 2015-10-16
Also published as: JP3100082B2

Abstract

PURPOSE: To reduce the amount of computation by calculating a time-reversed perceptually weighted input speech signal vector and multiplying it by each pitch prediction residual vector of an adaptive codebook to generate the correlation value between the two. CONSTITUTION: A calculating means 21 calculates a time-reversed perceptually weighted input speech signal vector ^tAAX from a perceptually weighted input speech signal vector AX. A multiplication unit 22 multiplies the vector ^tAAX by each pitch prediction residual vector P of an adaptive codebook 1 to generate their correlation value ^t(AP)AX. A filter operation unit 23 then obtains the autocorrelation value ^t(AP)AP of the vector AP obtained by perceptually weighted reproduction of each pitch prediction residual vector P of the adaptive codebook 1. Based on both correlation values, an evaluation unit 10 selects the optimum pitch prediction residual vector P_opt and gain b_opt that minimize the power of the error signal E with respect to the perceptually weighted input speech signal vector AX.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech coding/decoding system, and more particularly to a highly efficient speech coding/decoding system that compresses the information of a speech signal by using vector quantization.

[0002] In recent years, vector quantization, which compresses the information of a speech signal while preserving its quality, has come into use in intra-company communication systems, digital mobile radio systems, and the like. In vector quantization, as is well known, predictive weighting is applied to each signal vector of a codebook to produce a reproduced signal, the error power between the reproduced signal and the input speech signal is evaluated, and the number (index) of the signal vector with the smallest error is determined. To compress speech information still further, demand is growing for schemes that advance this vector quantization approach.

[0003]

[Prior Art] FIGS. 12 and 13 show high-efficiency speech coding schemes called CELP (Code Excited LPC) that use vector quantization. FIG. 12 shows the scheme called sequential optimization CELP, and FIG. 13 shows the scheme called simultaneous optimization CELP. In the following description, the symbols P, X, Y, C, and E each denote a vector (in the drawings they are shown with vector-specific notation).

[0004] In FIG. 12, the adaptive codebook 1a stores, while changing adaptively, N-dimensional pitch prediction residual vectors, each corresponding to N samples of the speech signal with the pitch period delayed by one sample at a time. The fixed codebook 2 holds 2^m preset patterns of N-dimensional code vectors, likewise corresponding to N samples and generated from white noise; these represent the aperiodic components other than the periodic components held in the adaptive codebook 1a.

[0005] First, each pitch prediction residual vector P of the adaptive codebook 1a is perceptually weighted by the perceptually weighted linear prediction reproduction filter 3, represented by A = 1/A'(Z) (where A'(Z) is the perceptually weighted linear prediction analysis filter), to generate a pitch prediction vector AP. The gain unit 5 then multiplies AP by the gain b to generate the pitch prediction reproduction signal vector bAP.

[0006] The input speech signal vector is perceptually weighted by the perceptual weighting filter 7, represented by A(Z)/A'(Z) (where A(Z) is the linear prediction analysis filter), to give AX. The subtraction unit 8 obtains the perceptually weighted pitch prediction error signal vector AY between the pitch prediction reproduction signal vector bAP and AX. For each frame, the evaluation unit 10 selects the optimum pitch prediction residual vector P from the codebook 1a, together with the optimum gain b, so that the power of the pitch prediction error signal vector AY is minimized:

$$P = \arg\min\,\lvert AY\rvert^{2} = \arg\min\,\lvert AX - bAP\rvert^{2}$$

[0007] Similarly, each code vector C of the white-noise fixed codebook 2 is perceptually weighted by the linear prediction reproduction filter 4 to generate the perceptually weighted reproduced code vector AC, which the gain unit 6 multiplies by the gain g to generate the linear prediction reproduction signal vector gAC.

[0008] The subtraction unit 9 obtains the error signal vector E between the linear prediction reproduction signal vector gAC and the pitch prediction error signal vector AY. For each frame, the evaluation unit 11 selects the optimum code vector C from the codebook 2, together with the optimum gain g, so that the power of the error signal vector E is minimized:

$$C = \arg\min\,\lvert E\rvert^{2} = \arg\min\,\lvert AY - gAC\rvert^{2}$$
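The two-stage (sequential) search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vectors are taken to be already perceptually weighted, the gains are unquantized, and the names `dot`, `best_entry`, `AP_list`, and `AC_list`, as well as the toy values, are hypothetical and not part of the patent. The helper relies on the standard fact that, for a free gain, minimizing the error power is the same as maximizing the squared correlation divided by the energy.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def best_entry(target, candidates):
    """Return (index, gain) minimising |target - gain * candidate|^2.

    For an unconstrained gain this equals picking the candidate with the
    largest corr^2 / energy and setting gain = corr / energy.
    """
    best_index, best_gain, best_score = None, 0.0, -1.0
    for i, v in enumerate(candidates):
        energy = dot(v, v)
        if energy == 0.0:
            continue  # an all-zero candidate cannot reduce the error
        corr = dot(v, target)
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = i, corr / energy, score
    return best_index, best_gain


# Toy, already-weighted data (hypothetical values): AX and the AP / AC vectors.
AX = [1.0, 2.0, 0.0, -1.0]
AP_list = [[1.0, 0.0, 0.0, 0.0], [1.0, 2.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
AC_list = [[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0]]

# Stage 1: adaptive codebook, minimise |AX - b*AP|^2.
p_idx, b = best_entry(AX, AP_list)
AY = [x - b * p for x, p in zip(AX, AP_list[p_idx])]

# Stage 2: fixed codebook, minimise |AY - g*AC|^2.
c_idx, g = best_entry(AY, AC_list)
E = [y - g * c for y, c in zip(AY, AC_list[c_idx])]
```

In this toy case the second adaptive entry and the first fixed entry are chosen, and the final error vector E is all zeros; real codebooks would of course leave a nonzero residual.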

[0009] The adaptive codebook 1a is adapted (updated) as follows: the addition unit 12 obtains the optimum excitation signal bAP + gAC, the perceptually weighted linear prediction analysis filter (A'(Z)) 13 converts it back to bP + gC, the delay unit 14 delays the result by one frame, and the delayed signal is stored as the adaptive codebook (pitch prediction codebook) for the next frame.

[0010] Thus, in the sequential optimization CELP scheme of FIG. 12 the gains b and g are controlled separately. In the simultaneous optimization CELP scheme of FIG. 13, by contrast, the addition unit 15 adds bAP and gAC to obtain AX' = bAP + gAC, and the subtraction unit 8 obtains the error signal vector E between AX' and the perceptually weighted input speech signal vector AX from the filter 7, in the same manner as in the equation for E given above. The evaluation unit 16 then selects from the fixed codebook 2 the code vector C that minimizes the power of E, while selecting and controlling the optimum gains b and g simultaneously.

[0011] In this case, combining the two equations above gives:

$$C = \arg\min\,\lvert E\rvert^{2} = \arg\min\,\lvert AX - bAP - gAC\rvert^{2}$$

[0012] In this case, the adaptive codebook 1a is adapted in the same way, using AX', which corresponds to the output of the addition unit 12 in FIG. 12. The filters 3 and 4 may also be replaced by a single common filter placed after the addition unit 15, in which case the inverse filter 13 is unnecessary.

[0013] In practice, the codebook search is performed in two stages: a search of the adaptive codebook 1a and a search of the fixed codebook 2. Even in the simultaneous optimization case, where |AX - bAP - gAC|² is minimized, the pitch search of the adaptive codebook 1a is carried out by minimizing |AX - bAP|², as in the sequential case.

[0014] That is, the gain b that minimizes the power of the error vector AX - bAP is obtained by setting the partial derivative with respect to b to zero:

$$0 = \frac{\partial\,\lvert AX - bAP\rvert^{2}}{\partial b} = 2\,{}^{t}(-AP)\,(AX - bAP)$$

which gives

$$b = \frac{{}^{t}(AP)\,AX}{{}^{t}(AP)\,AP}$$

Here "^t" denotes the transpose.
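As a quick numerical check of the gain formula, the following sketch computes b = ^t(AP)AX / ^t(AP)AP for a small example and confirms that nudging b in either direction increases the error power. The function names and the sample vectors are illustrative assumptions only.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def optimal_gain(AP, AX):
    """Gain b = t(AP)AX / t(AP)AP that minimises |AX - b*AP|^2."""
    return dot(AP, AX) / dot(AP, AP)


def error_power(AX, AP, b):
    """Error power |AX - b*AP|^2 for a given gain b."""
    return sum((x - b * p) ** 2 for x, p in zip(AX, AP))


# Hypothetical weighted target and weighted codebook vector.
AX = [3.0, 1.0, -2.0]
AP = [1.0, 1.0, 0.0]

b = optimal_gain(AP, AX)           # correlation 4.0 over energy 2.0
power_at_b = error_power(AX, AP, b)
power_below = error_power(AX, AP, b - 0.1)
power_above = error_power(AX, AP, b + 0.1)
```

Because the error power is a quadratic in b with a positive leading coefficient, the stationary point found by the derivative is the unique minimum, which is what the comparison with the two neighbouring gains illustrates.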

[0015] Accordingly, in the pitch-period optimization algorithm shown in FIG. 14, the multiplication unit 41 multiplies the perceptually weighted input speech signal vector AX by the code vector AP, obtained by passing each pitch prediction residual vector P of the adaptive codebook 1a through the perceptually weighted linear prediction reproduction filter 4, to generate their correlation value ^t(AP)AX. The multiplication unit 42 obtains the autocorrelation value ^t(AP)AP of the perceptually weighted reproduced pitch prediction residual vector AP.

[0016] Based on the two correlation values ^t(AP)AX and ^t(AP)AP, the evaluation unit 10 then uses the above expression for b to select the optimum pitch prediction residual signal vector P and gain b that minimize the power of the error signal vector E = AY with respect to the perceptually weighted input signal vector AX.

[0017] If the gain b is determined for each pitch prediction residual signal vector P so as to minimize |AX - bAP|², and the gain is quantized in open loop, this is equivalent to maximizing the correlation ratio (^t(AX)AP)² / ^t(AP)AP.

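[0018] That is, substituting the optimum gain b into the error power gives

$$\lvert AX - bAP\rvert^{2} = \lvert AX\rvert^{2} - \frac{\left({}^{t}(AX)\,AP\right)^{2}}{{}^{t}(AP)\,AP}$$

so it suffices to maximize the ratio in the second term on the right-hand side.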

[0019]

[Problems to Be Solved by the Invention] In such a pitch search of the adaptive codebook 1a, the filter 4 convolves the impulse response of the perceptually weighted reproduction filter with each pitch prediction residual signal vector P of the adaptive codebook 1a. Let N (typically N = 40 to 60) be the dimension of each of the M (M = 128 to 256) pitch prediction residual signal vectors of the adaptive codebook 1a, and let N_P be the order of the perceptual weighting filter 4 (N_P = 10 for an IIR filter). The amount of computation in the multiplication unit 41 for each vector is then the sum of N × N_P for the perceptual weighting filter and N for the inner product of the vectors.

[0020] To determine the optimum pitch vector P, this computation is required for every one of the M pitch vectors in the adaptive codebook 1a, so the total amount of computation becomes enormous.
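To make the scale concrete, the operation count stated above (N × N_P for filtering plus N for the inner product, repeated for all M vectors) can be evaluated for the parameter ranges the text gives. The function name `pitch_search_ops` is a hypothetical helper, and the count is only the rough multiply count used in the text, not a cycle-accurate estimate.

```python
def pitch_search_ops(N, N_P, M):
    """Rough operation count for the conventional pitch search.

    Per vector: N * N_P for the perceptual weighting filter plus N for the
    inner product; the whole search repeats this for all M codebook vectors.
    """
    per_vector = N * N_P + N
    return per_vector, per_vector * M


# Lower and upper ends of the ranges quoted in the text.
low = pitch_search_ops(N=40, N_P=10, M=128)    # (440, 56320)
high = pitch_search_ops(N=60, N_P=10, M=256)   # (660, 168960)
```

Under these assumptions a single frame needs roughly 56,000 to 169,000 operations for the adaptive codebook search alone, and the filtering term N × N_P dominates.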

[0021] Furthermore, because the adaptive codebook 1a is updated by feeding back the optimum excitation signal of the past frame as it is, a signal that also contains the code vector component from the fixed codebook 2 is fed back, as shown in FIG. 15. Aperiodic noise components, which are undesirable for the adaptive codebook, are thereby superimposed, degrading the coded speech quality, particularly in voiced segments where the excitation signal has strong pitch periodicity.

[0022] Accordingly, an object of the present invention is, in a CELP-type speech coding/decoding system that performs long-term prediction by pitch period search using such an adaptive codebook, to reduce the computation required for the pitch period search as far as possible and to prevent aperiodic noise components from the fixed codebook from leaking into the adaptive codebook.

[0023]

[Means for Solving the Problems] FIG. 1 conceptually shows the optimization algorithm for selecting the optimum pitch vector P and gain b of the adaptive codebook 1 in the speech coding system according to the present invention, which solves the above problems. It corresponds to an improvement of the conventional optimization algorithm shown in FIG. 14.

[0024] In the present invention, the adaptive codebook 1 is a sparse codebook in which all elements are zero except for certain elements. It is updated as follows: the optimum pitch vector b_opt P_opt selected from it by pitch search is sparsified by the sparsification circuit 17, added to the optimum code vector g_opt C_opt selected from the fixed codebook 2 by codebook search, delayed by one frame in the delay unit 14, and supplied to the codebook. The sparsification in the sparsification circuit 17 can be based on a fixed threshold Th or on an adaptive threshold Th that depends on the average signal amplitude over a predetermined number of samples.
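The update path of FIG. 1 can be pictured with a small sketch. It assumes, purely for illustration, that the adaptive codebook is held as a history buffer of past excitation samples, that a codebook vector for a given lag is a window into that buffer, and that the lag is at least one frame long; the names `update_adaptive_codebook` and `pitch_vector` and the sample values are hypothetical and not taken from the patent.

```python
def update_adaptive_codebook(history, bP_sparse, gC):
    """Append the new excitation frame and drop the oldest samples.

    bP_sparse is the already-sparsified pitch contribution b_opt * P_opt,
    gC is the fixed-codebook contribution g_opt * C_opt (same length).
    """
    new_frame = [p + c for p, c in zip(bP_sparse, gC)]
    return history[len(new_frame):] + new_frame


def pitch_vector(history, lag, n):
    """Return the n-sample vector that starts `lag` samples before the end.

    Assumes lag >= n so that the window lies entirely inside the history.
    """
    start = len(history) - lag
    return history[start:start + n]


# Empty history of two frames, frame length 4 (toy sizes).
history = [0.0] * 8
bP_sparse = [0.0, 2.0, 0.0, 0.0]     # only one sample survived sparsification
gC = [0.5, 0.0, 0.0, -1.0]

history = update_adaptive_codebook(history, bP_sparse, gC)
lag4 = pitch_vector(history, lag=4, n=4)
lag5 = pitch_vector(history, lag=5, n=4)
```

Replacing the history by the returned list plays the role of the one-frame delay: the frame written now only becomes available as codebook content when the next frame is searched.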

[0025] The invention further comprises: a calculating means 21 that calculates the time-reversed perceptually weighted input speech signal vector ^tAAX from the perceptually weighted input speech signal vector AX; a multiplication unit 22 that multiplies ^tAAX by each pitch prediction residual vector P of the adaptive codebook 1 to generate their correlation value ^t(AP)AX; a filter operation unit 23 that obtains the autocorrelation value ^t(AP)AP of the perceptually weighted reproduced vector AP for each pitch prediction residual vector P of the adaptive codebook 1; and an evaluation unit 10 that, based on both correlation values, selects the optimum pitch prediction residual vector P_opt and gain b_opt that minimize the power of the error signal E with respect to the perceptually weighted input speech signal vector AX.
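The saving comes from the identity ^t(AP)AX = ^tP(^tAAX): the vector ^tAAX is computed once per frame, after which each correlation is a plain inner product that can skip the zero samples of a sparse P. The sketch below checks this numerically. It assumes that A is the lower-triangular Toeplitz matrix built from a short impulse response `h`, and it stores a sparse vector as an index-to-value dictionary; these representations and all names are illustrative assumptions.

```python
def weight(h, x):
    """y = A x, with A lower-triangular Toeplitz built from impulse response h."""
    n = len(x)
    return [sum(h[i - j] * x[j] for j in range(max(0, i - len(h) + 1), i + 1))
            for i in range(n)]


def weight_transposed(h, x):
    """y = tA x, the transposed (time-reversed) filtering of x."""
    n = len(x)
    return [sum(h[j - i] * x[j] for j in range(i, min(n, i + len(h))))
            for i in range(n)]


def sparse_dot(sparse_p, dense):
    """Inner product that only touches the nonzero samples of P."""
    return sum(value * dense[index] for index, value in sparse_p.items())


h = [1.0, 0.5, 0.25]                      # toy impulse response
AX = [1.0, -2.0, 0.5, 3.0, 0.0, 1.0]      # weighted input vector
P_sparse = {1: 2.0, 4: -1.0}              # sparse codebook vector
P_dense = [P_sparse.get(i, 0.0) for i in range(len(AX))]

# Conventional route: filter every P, then correlate with AX.
AP = weight(h, P_dense)
conventional = sum(a * x for a, x in zip(AP, AX))

# Proposed route: compute tA(AX) once, then a sparse inner product per P.
tAAX = weight_transposed(h, AX)
proposed = sparse_dot(P_sparse, tAAX)
```

Both routes give the same correlation value, but the second needs only as many multiplications per codebook vector as P has nonzero samples (two here instead of a filtering pass plus six).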

[0026] In correspondence with the coding side of FIG. 1, the decoding side of the present invention, as shown in FIG. 2, is provided with the same sparse adaptive codebook 1, fixed codebook 2, sparsification circuit 17, and delay unit 14 as the coding side. The optimum code vector b_opt P_opt, obtained by multiplying the selected pitch prediction residual vector P_opt of the adaptive codebook 1 by the optimum gain b_opt, is sparsified by the sparsification circuit 17. It is added to the optimum code vector g_opt C_opt, obtained by multiplying the selected code vector C_opt of the fixed codebook 2 by the optimum gain g_opt, and the resulting code vector X is passed through the linear prediction reproduction filter 200 to obtain the reproduced signal and is also supplied to the sparsification circuit 17. Here too, the sparsification in the sparsification circuit 17 can be based on a fixed threshold Th or on an adaptive threshold Th that depends on the average signal amplitude.

[0027] The sparsification circuit 17 of FIGS. 1 and 2 may instead be applied, as shown in FIGS. 3 and 4 respectively, not to the optimum pitch vector b_opt P_opt but to the sum of the optimum pitch vector b_opt P_opt and the optimum code vector g_opt C_opt. In this case, the threshold calculation circuit 18 generates a threshold Th corresponding to the proportion of the total power accounted for by the optimum code vector g_opt C_opt and supplies it to the sparsification circuit 17; the sparsified signal is then sent to the delay unit 14.

[0028] In such a CELP scheme, as shown in FIG. 5(a), the calculating means 21 can be configured to multiply by the transposed matrix ^tA of the FIR perceptual weighting filter matrix.

[0029] Alternatively, as shown in FIG. 5(b), the calculating means 21 can be configured to reverse the input signal on the time axis, apply IIR perceptual weighting filtering (1/A'(Z)), and then reverse the result on the time axis again.

[0030]

[Operation] First, in the CELP-type speech coding system of the present invention shown in FIG. 1, the adaptive codebook 1 is updated with the sparsified optimum excitation signal, so the stored pitch prediction residual signal vectors are always in a sparse (thinned-out) state in which all samples are zero except for certain samples.

[0031] Of the two values supplied to the evaluation unit 10, the autocorrelation value ^t(AP)AP is computed in the same way as in the conventional example of FIG. 14. The correlation value ^t(AP)AX, however, is obtained by first converting the perceptually weighted input speech signal vector AX into ^tAAX in the calculating means 21 and then supplying the pitch prediction residual signal vector P of the sparse adaptive codebook 1 directly to the multiplication unit 22. The multiplication can therefore exploit the sparseness of the adaptive codebook 1 produced by the sparsification circuit 17 (that is, multiplications for samples whose value is "0" are skipped), which reduces the amount of computation. This applies equally to the sequential optimization and simultaneous optimization CELP schemes, and also to the pitch-orthogonalization optimization CELP scheme that combines the two.

[0032] In addition, the sparsification circuit 17 compares the signal amplitude of each sample with a threshold and replaces with zero the value of every sample that does not exceed the threshold, which also prevents aperiodic components from leaking into the adaptive codebook 1.

[0033] Furthermore, the sparsification circuit 17 may be applied, as shown in FIGS. 3 and 4, not to the optimum pitch vector b_opt P_opt but to the sum of the optimum pitch vector b_opt P_opt and the optimum code vector g_opt C_opt. If the threshold calculation circuit 18 then generates a threshold Th corresponding to the ratio of the power of the fixed codebook 2 contribution to the total power, supplies it to the sparsification circuit 17, and the sparsified signal is sent to the delay unit 14, leakage of aperiodic components into the adaptive codebook 1 can be suppressed still further.

[0034] That is, for voiced speech, the signal with a certain fixed period (the pitch period) becomes large, and its amplitude difference from the other, aperiodic components grows. Conversely, for unvoiced speech, the pitch-period component almost disappears, the aperiodic components dominate, and the amplitude difference vanishes.

[0035] Accordingly, by adaptively varying the threshold of the sparsification circuit 17 according to the difference between the pitch-periodic and aperiodic components, that is, the signal power difference between the adaptive codebook and the fixed codebook (the ratio of the power of the fixed codebook 2 to the total power), leakage of aperiodic components into the adaptive codebook 1 can be reduced.

[0036] In correspondence with the coding side of FIG. 1, on the decoding side of the present invention, as shown in FIG. 2, the optimum code vector b_opt P_opt is obtained by multiplying the pitch prediction residual vector P_opt of the adaptive codebook 1 indicated by the coding side by the optimum gain b_opt, and the optimum code vector g_opt C_opt is obtained by multiplying the code vector C_opt of the fixed codebook 2, likewise indicated by the coding side, by the optimum gain g_opt. The two are added, the resulting code vector X is passed through the linear prediction reproduction filter 200 to obtain the reproduced signal, and the adaptive codebook 1 is updated accordingly.

[0037] In this case, when the calculating means 21 is configured as in FIG. 5(b), reversing the input signal on the time axis, applying IIR perceptual weighting filtering (1/A'(Z)), and reversing the result again, the amount of computation is lower than with the configuration of FIG. 5(a), which multiplies by the transposed matrix ^tA of the FIR perceptual weighting filter matrix, owing to the difference between IIR and FIR filtering.

[0038]

[Embodiments] FIG. 6 shows an embodiment of the sparsification circuit 17 of FIGS. 1 and 2. In this embodiment, as shown in FIG. 6(a), samples with values of the fixed threshold Th or more are output unchanged, while samples with values of Th or less are replaced with zero, thereby sparsifying the signal.

[0039] The sparsification circuit 17 in this case is therefore a circuit with the center-clipping characteristic shown in FIG. 6(b). Two methods, for example, can be considered for realizing such a center-clipping circuit.
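The center-clipping characteristic itself is a one-line operation. The sketch below assumes that the comparison is made on the absolute value of each sample and that a sample exactly equal to the threshold is zeroed (the text leaves the tie case open); `center_clip` is a hypothetical name.

```python
def center_clip(x, th):
    """Keep samples whose magnitude exceeds th; replace the rest with zero."""
    return [v if abs(v) > th else 0.0 for v in x]


frame = [0.1, -0.8, 0.5, -0.2, 1.2]
clipped = center_clip(frame, th=0.5)
nonzero = sum(1 for v in clipped if v != 0.0)
```

With a threshold of 0.5, only the two largest-magnitude samples of this toy frame survive, so later inner products over the frame need two multiplications instead of five.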

[0040] First, in the embodiment shown in FIG. 7, the sample values of the input signal (the optimum pitch prediction residual signal) are ranked in descending order of absolute value (signal amplitude). The samples from the top rank down to a desired number of samples (corresponding to the fixed threshold Th) are output unchanged, and all other samples are replaced with zero. This allows precise control of the number of nonzero samples (the degree of sparseness), which directly determines the amount of computation in the pitch search.
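A minimal sketch of this rank-based variant follows. It assumes that ties in magnitude are broken in favour of the earlier sample (a consequence of Python's stable sort, not something the text specifies); `keep_largest` and the sample frame are hypothetical.

```python
def keep_largest(x, k):
    """Keep the k samples of largest magnitude; replace the others with zero."""
    # Indices ordered by descending magnitude; the sort is stable, so equal
    # magnitudes keep their original order.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


frame = [0.1, -0.8, 0.5, -0.2, 1.2]
top2 = keep_largest(frame, k=2)
top3 = keep_largest(frame, k=3)
```

The number of surviving samples is fixed by `k` regardless of the signal level, which is the exact control over sparseness that the text describes, at the cost of a sort per frame.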

[0041] In the embodiment shown in FIG. 8, by contrast, the average signal amplitude V_AV over a predetermined number of samples is calculated for the input signal, this value is multiplied by a coefficient λ to determine the threshold Th = V_AV·λ, and center clipping is performed with this threshold. In this case the degree of sparseness of the adaptive codebook 1 varies somewhat with the nature of the input signal, but because the computation needed to rank the samples in the embodiment of FIG. 7 is not required, less computation suffices.
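The average-amplitude variant can be sketched as follows. It assumes that "average signal amplitude" means the mean absolute value over the frame and that samples not exceeding the threshold are zeroed; `adaptive_clip`, the frame, and the values of λ are illustrative assumptions.

```python
def adaptive_clip(x, lam):
    """Center-clip x with threshold Th = V_AV * lam.

    V_AV is taken here as the mean absolute value of the samples in x.
    Returns the threshold and the clipped frame.
    """
    v_av = sum(abs(v) for v in x) / len(x)
    th = v_av * lam
    return th, [v if abs(v) > th else 0.0 for v in x]


frame = [1.0, -3.0, 0.5, 3.5]            # mean absolute value is 2.0
th_a, clipped_a = adaptive_clip(frame, lam=1.0)
th_b, clipped_b = adaptive_clip(frame, lam=0.25)
```

No sorting is needed, only one pass to compute the average and one to clip, but the number of surviving samples now depends on the frame: two samples survive with λ = 1.0 and three with λ = 0.25 in this example.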

[0042] FIG. 9 shows an embodiment of the threshold calculation circuit 18 of FIGS. 3 and 4. In this embodiment, the powers |b_opt P_opt|² and |g_opt C_opt|² of the pitch vector and code vector components of the optimum excitation signal fed back to the adaptive codebook are calculated (by vector inner products), and the proportion k_C of the total accounted for by the code vector component power |g_opt C_opt|² is obtained as:

$$k_C = \left(\frac{\lvert g_{opt}C_{opt}\rvert^{2}}{\lvert b_{opt}P_{opt}\rvert^{2} + \lvert g_{opt}C_{opt}\rvert^{2}}\right)^{1/2}$$

where 0 ≤ k_C ≤ 1.

[0043] The threshold is then determined as a function f(k_C) of k_C, namely Th = λ/k_C, as shown in FIG. 9, where λ is an empirically determined constant.

[0044] Likewise, the proportion k_P of the total accounted for by the pitch vector component power |b_opt P_opt|² is:

$$k_P = \left(\frac{\lvert b_{opt}P_{opt}\rvert^{2}}{\lvert b_{opt}P_{opt}\rvert^{2} + \lvert g_{opt}C_{opt}\rvert^{2}}\right)^{1/2}$$

so that k_C² + k_P² = 1.
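The ratios k_C and k_P and the threshold Th = λ/k_C can be computed directly from the two excitation components. In this sketch the function name `threshold`, the value of λ, and the toy vectors are assumptions; the case k_C = 0 (no fixed-codebook contribution) is mapped to an infinite threshold, which is a choice made here and not stated in the text.

```python
import math


def power(v):
    """Signal power of a vector, computed as its inner product with itself."""
    return sum(x * x for x in v)


def threshold(bP, gC, lam):
    """Return (k_C, k_P, Th) for pitch part bP and code part gC.

    k_C and k_P are the square roots of each component's share of the total
    power, and Th = lam / k_C.
    """
    p_pitch, p_code = power(bP), power(gC)
    total = p_pitch + p_code
    k_c = math.sqrt(p_code / total)
    k_p = math.sqrt(p_pitch / total)
    th = lam / k_c if k_c > 0.0 else math.inf
    return k_c, k_p, th


# Weak pitch component: the code vector carries most of the power.
k_c1, k_p1, th1 = threshold([3.0, 0.0], [0.0, 4.0], lam=0.4)
# Strong pitch component: the code vector carries less of the power.
k_c2, k_p2, th2 = threshold([4.0, 0.0], [0.0, 3.0], lam=0.4)
```

In the second case k_C drops from about 0.8 to about 0.6, so the threshold rises from about 0.5 to about 0.67, and more of the low-amplitude (aperiodic) samples are clipped before the signal is fed back.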

[0045] The values k_C and k_P are complementary. When the pitch component accounts for a large share of the excitation and the adaptive codebook is tracking the pitch periodicity of the input signal well, k_C is small, so the threshold Th is large. Only the pitch component then remains while the other signal components are clipped, and the aperiodic components fed back to the adaptive codebook are suppressed.

[0046] Conversely, when the pitch component accounts for a small share and the adaptive codebook is not tracking the pitch periodicity of the input signal, k_C is large and the threshold Th is small, so the optimum excitation signal components are fed back to the adaptive codebook essentially unchanged.

[0047] In this way, the amount of aperiodic component fed back within the excitation signal used to update the adaptive codebook can be controlled according to how well the adaptive codebook is tracking.

[0048] As another way of evaluating the proportions of the pitch vector and the code vector, k_C' may be obtained, instead of k_C above, from the components after the weighting synthesis filter has been applied to each:

$$k_C' = \left(\frac{\lvert g_{opt}AC_{opt}\rvert^{2}}{\lvert b_{opt}AP_{opt}\rvert^{2} + \lvert g_{opt}AC_{opt}\rvert^{2}}\right)^{1/2}$$

[0049] The threshold Th can also be obtained from the value of k_C by table lookup.

[0050] FIG. 10 shows an embodiment of the calculating means of FIG. 5(a) used in the speech coding system according to the present invention. Let A be the FIR (finite impulse response) perceptual weighting filter matrix, and let its transposed matrix ^tA be the N-dimensional matrix shown in FIG. 10(a), matching the codebook dimension N. For the CELP scheme of FIG. 1, if the weighted input signal vector AX is as shown in FIG. 10(b), then the time-reversed perceptually weighted input signal vector ^tAAX, obtained by multiplying AX by the transposed matrix ^tA, is as shown in FIG. 10(c). In the figure, * denotes multiplication.

【００５１】FIG. 11 shows an embodiment of the calculating means used in the speech coding method according to the present invention shown in FIG. 5(b). First, in the case of the CELP method shown in FIG. 1, suppose the weighted input signal vector AX is as shown in FIG. 11(a) (the same as in FIG. 10(b)). Rearranging it in reverse order on the time axis gives the vector (AX)_TR shown in FIG. 11(b).

【００５２】When this vector (AX)_TR is passed through the IIR (infinite impulse response) auditory-weighted linear prediction reproduction filter A, whose auditory weighting filter function is 1/A'(Z), the result A(AX)_TR is, for example, as shown in FIG. 11(c).

【００５３】In this case, the matrix A is the transpose of the transposed matrix ^tA shown in FIG. 10(a). Therefore, when A(AX)_TR is again rearranged in reverse order on the time axis to restore the original ordering, the result is as shown in FIG. 11(d), which is the same as the time-reversed auditory-weighted input signal vector ^tAAX shown in FIG. 10(c).

【００５４】Thus, it can be seen that the embodiment of FIG. 10 and the embodiment of FIG. 11 perform the same function.
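The equivalence of the two embodiments can be checked numerically. The sketch below is an illustration under stated assumptions, not the patent's implementation: the filter coefficients, the frame values, and all function names are hypothetical, the filter starts from a zero state, and the matrix A is taken to be the lower-triangular Toeplitz matrix built from the impulse response of 1/A'(z) truncated to the frame length N.

```python
def iir_filter(x, a):
    """All-pole filter 1/A'(z), A'(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..., zero state."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y


def transposed_matrix_product(h, v):
    """Compute tA * v, where A[i][j] = h[i - j] for i >= j and 0 otherwise."""
    n = len(v)
    return [sum(h[i - j] * v[i] for i in range(j, n)) for j in range(n)]


def reverse_filter_reverse(v, a):
    """Time-reverse, filter with 1/A'(z), then time-reverse again (FIG. 11 style)."""
    return iir_filter(v[::-1], a)[::-1]


a = [-0.9, 0.2]                          # hypothetical, stable coefficients
ax = [1.0, -0.5, 0.25, 2.0, 0.0, -1.0]   # stands in for the weighted input AX
n = len(ax)
h = iir_filter([1.0] + [0.0] * (n - 1), a)  # impulse response truncated to N

via_matrix = transposed_matrix_product(h, ax)   # FIG. 10 style
via_filter = reverse_filter_reverse(ax, a)      # FIG. 11 style
max_diff = max(abs(p - q) for p, q in zip(via_matrix, via_filter))
```

Under these assumptions `max_diff` is zero up to floating-point rounding, because reversing the time axis before and after a causal filter turns it into the corresponding anti-causal (transposed) operation.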

【００５５】In the embodiment of FIG. 11 the filter matrix A is an IIR filter, but an FIR filter may be used instead. With an FIR filter, however, the total number of multiplications is N²/2 (plus 2N move operations), the same as in the embodiment of FIG. 10. With an IIR filter, for example in the case of 10th-order linear prediction analysis, only 10N multiplications and 2N move operations are required.
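To make the comparison concrete, the two multiplication counts can be evaluated for a given frame length. The frame length N = 40 used below is a hypothetical value chosen only for illustration; the formulas N²/2 and 10N are the ones stated in the text.

```python
def fir_multiplications(n):
    """Multiplications for the FIR / transposed-matrix form: N^2 / 2."""
    return n * n // 2


def iir_multiplications(n, order=10):
    """Multiplications for the IIR form with a given prediction order: order * N."""
    return order * n


n = 40  # hypothetical codebook dimension
fir_count = fir_multiplications(n)   # 800
iir_count = iir_multiplications(n)   # 400
```

The IIR form needs fewer multiplications whenever N exceeds twice the prediction order, that is, N > 20 for 10th-order analysis.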

【００５６】[0056]

[Effects of the Invention] As described above, according to the present invention, a sparse codebook in which all elements other than predetermined elements are zero is used as the adaptive codebook, and it is updated with the optimum pitch vector after that vector has been sparsified by a sparsification circuit. In addition, when the correlation value to be given to the evaluation unit is obtained, the time-reversed auditory-weighted input speech signal vector is calculated from the auditory-weighted input speech signal vector and multiplied by each pitch prediction residual vector of the adaptive codebook to generate the correlation value of the two. The multiplications required for encoding and decoding can therefore be carried out in a form that takes full advantage of the sparse codebook, and the amount of computation can be reduced.

【００５７】Further, the threshold used for sparsification is made variable so that, according to the tracking state of the adaptive codebook, it controls how much of the aperiodic component contributed by the code vector from the fixed codebook is fed back, out of the driving excitation signal used to update the adaptive codebook. Periodicity is therefore better preserved than in the conventional method, and as a result the quality of the encoded and decoded speech can be improved for speech whose driving excitation has strong pitch periodicity, such as voiced sounds.

[Brief description of drawings]

FIG. 1 is a block diagram conceptually showing the optimization algorithm for the pitch search in the speech coding method according to the present invention.

FIG. 2 is a block diagram conceptually showing the reproduction algorithm of the speech decoding method according to the present invention.

FIG. 3 is a block diagram conceptually showing the optimization algorithm when the pitch search of the speech coding method according to the present invention is executed with an alternative sparsification.

FIG. 4 is a block diagram conceptually showing the reproduction algorithm when the speech decoding method according to the present invention is executed with the alternative sparsification.

FIG. 5 is a diagram conceptually showing configuration examples of the calculating means used in the speech encoding/decoding method according to the present invention.

FIG. 6 is a graph conceptually illustrating an embodiment of the sparsification circuit used in the speech encoding/decoding method according to the present invention.

FIG. 7 is a graph conceptually illustrating center clipping in order of signal amplitude in the sparsification circuit used in the speech encoding/decoding method according to the present invention.

FIG. 8 is a graph conceptually illustrating center clipping with an averaged threshold in the sparsification circuit used in the speech encoding/decoding method according to the present invention.

FIG. 9 is a diagram showing an embodiment of the threshold calculation circuit used in the speech encoding/decoding method according to the present invention.

FIG. 10 is a diagram for explaining an embodiment of the calculating means used in the present invention.

FIG. 11 is a diagram for explaining another embodiment of the calculating means used in the present invention.

FIG. 12 is a block diagram schematically showing a general sequential-optimization CELP method.

FIG. 13 is a block diagram schematically showing a general simultaneous-optimization CELP method.

FIG. 14 is a block diagram conceptually showing a conventional optimization algorithm for the pitch search.

FIG. 15 is a block diagram for explaining the problems of the conventional method.

[Explanation of symbols]

1: sparse (pitch period) adaptive codebook
2: fixed codebook
10: evaluation unit
14: frame delay device
17: sparsification circuit
18: threshold calculation circuit
21: calculating means
22: multiplication unit
23: filter calculation unit
In the drawings, the same reference numerals indicate the same or corresponding parts.

Continuation of the front page
(72) Inventor: Yoshihiro Sakai, c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki, Kanagawa
(72) Inventor: Yoshiaki Tanaka, c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki, Kanagawa

Claims

[Claims]

1. A CELP-type speech coding method in which encoding is performed by carrying out a pitch search and a codebook search, using two codebooks, an adaptive codebook (1) and a white-noise fixed codebook (2), to find an optimum driving excitation signal, characterized in that the adaptive codebook (1) is a sparse codebook in which all elements other than predetermined elements are zero, and is updated by sparsifying the optimum pitch vector (b_opt P_opt) with a sparsification circuit (17), adding the optimum code vector (g_opt C_opt) selected from the fixed codebook (2) by the codebook search, and delaying the result by one frame with a delay device (14), the method further comprising: calculating means (21) for calculating a time-reversed auditory-weighted input speech signal vector (^tAAX) from the auditory-weighted input speech signal vector (AX); a multiplication unit (22) for multiplying the time-reversed auditory-weighted input speech signal vector (^tAAX) by each pitch prediction residual vector (P) of the adaptive codebook (1) to generate a correlation value (^t(AP)AX) of the two; a filter calculation unit (23) for obtaining the autocorrelation value (^t(AP)AP) of the vector (AP) obtained by auditory-weighted reproduction of each pitch prediction residual vector (P) of the adaptive codebook (1); and an evaluation unit (10) for selecting, based on both correlation values, the optimum pitch prediction residual vector (P_opt) and gain (b_opt) that minimize the power of the error signal (E) with respect to the auditory-weighted input speech signal vector (AX).

2. The speech coding method according to claim 1, wherein the sparsification circuit (17) performs sparsification with reference to a constant threshold (Th).

3. The speech coding method according to claim 1, wherein the sparsification circuit (17) performs sparsification based on an adaptive threshold (Th) corresponding to the average signal amplitude of a predetermined number of samples.

4. The speech coding method according to claim 1, wherein the sparsification circuit (17) is applied not to the optimum pitch vector (b_opt P_opt) alone but to the optimum pitch vector (b_opt P_opt) together with the optimum code vector (g_opt C_opt), a threshold (Th) corresponding to the power ratio of the optimum pitch vector (b_opt P_opt) and the optimum code vector (g_opt C_opt) is generated by a threshold calculation circuit (18) and given to the sparsification circuit (17) for the sparsification, and the sparsified result is then sent to the delay device (14).

5. The speech coding method according to claim 1, wherein the calculating means (21) multiplies by the transposed matrix (^tA) of the FIR auditory weighting filter matrix.

6. The speech coding method according to claim 1, wherein the calculating means (21) rearranges the input signal in reverse order on the time axis, applies IIR auditory weighting filter processing (1/A'(Z)), and then rearranges the result in reverse order on the time axis again.

7. A speech decoding method comprising the same sparse adaptive codebook (1), fixed codebook (2), sparsification circuit (17), delay device (14), and calculating means (21) as on the encoding side according to claim 1, wherein the optimum pitch vector (b_opt P_opt), obtained by multiplying the optimally selected pitch prediction residual vector (P_opt) of the adaptive codebook (1) by the optimum gain (b_opt), is sparsified by the sparsification circuit (17) and added to the optimum code vector (g_opt C_opt), obtained by multiplying the optimally selected code vector (C_opt) of the fixed codebook (2) by the optimum gain (g_opt), and the resulting code vector (X) is passed through a linear prediction reproduction filter (200) to obtain a reproduced signal.