JPH0981191A

JPH0981191A - Voice coding/decoding device and voice decoding device

Info

Publication number: JPH0981191A
Application number: JP7231120A
Authority: JP
Inventors: Tomokazu Morio; 智一森尾
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1995-09-08
Filing date: 1995-09-08
Publication date: 1997-03-28
Anticipated expiration: 2015-09-08
Also published as: JP3229784B2

Abstract

PROBLEM TO BE SOLVED: To improve pitch prediction accuracy by making the update period of the pitch length shorter than the update period of the pitch prediction coefficient. SOLUTION: Frequency dividers 102 and 107 are newly added so that the pitch length is updated at a shorter period than the pitch prediction coefficient. The instruction to analyze the pitch prediction coefficient is given to the pitch prediction analysis filter and the pitch prediction synthesis filter at the same period as in the conventional apparatus, whereas the instruction to analyze the pitch length is given, by way of the frequency dividers 102 and 107, at a shorter period. The update period of the pitch length thereby becomes shorter than that of the pitch prediction coefficient, which improves pitch prediction accuracy in a speech coding/decoding apparatus that performs pitch prediction in units of a plurality of sample lengths.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to an apparatus that compresses speech waveforms for transmission or storage.

[0002]

[Description of the Related Art] FIG. 4 shows a conventional speech coding/decoding apparatus that includes pitch prediction processing.

[0003] The encoder comprises: a speech input terminal 400; a linear prediction analysis filter 403 that performs linear prediction analysis on the input speech, encodes the linear prediction coefficients, and outputs a linear prediction residual signal; a pitch prediction analysis filter 404 that receives the linear prediction residual signal, performs pitch prediction analysis, encodes the pitch length and the pitch prediction coefficient, and outputs a pitch prediction residual signal; a prediction residual quantizer 405 that receives and quantizes the pitch prediction residual signal; and a controller 401 that controls the operation of the linear prediction analysis filter, the pitch prediction analysis filter, and the prediction residual quantizer.

[0004] The decoder operates in the reverse order of the encoder. It comprises: a prediction residual inverse quantizer 408 that generates a pitch prediction residual signal from the coded information transmitted from the encoder; a pitch prediction synthesis filter 409 that generates a signal with the pitch structure of speech; a linear prediction synthesis filter 410; an output terminal 411 for the synthesized speech signal; and a controller 406 that controls the operation of the linear prediction synthesis filter, the pitch prediction synthesis filter, and the prediction residual inverse quantizer.

[0005] In this speech coding/decoding apparatus, the controller instructs each processing block to execute its processing once per unit of a plurality of samples. Following those instructions, both the pitch prediction analysis filter and the pitch prediction synthesis filter calculate and process the pitch length and the pitch prediction coefficient at synchronized timing.

[0006] Next, a specific example of conventional pitch prediction processing is described.

[0007] Code-excited linear prediction (CELP) coding is one method of efficiently compressing speech waveforms for transmission or storage (for example, "Code-Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates", M. R. Schroeder and B. S. Atal, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 937-940, 1985). FIG. 5 shows a block diagram of a known CELP encoder. This encoder comprises: a speech input terminal 501; a perceptual weighting filter 502 for perceptually masking the error signal caused by coding; an adaptive codebook 503 for long-term prediction (pitch prediction); an excitation codebook 505 that stores a plurality of excitation waveforms; amplifiers 504 and 506 that amplify the signals generated from the adaptive codebook 503 and the excitation codebook 505 with gains g_a and g_s, respectively; an adder 507; a synthesis filter 508 in which a speech linear prediction synthesis filter and a perceptual weighting filter are connected in cascade; a subtractor 509; and an error energy minimization determiner 510 that finds the minimum of the energy of the error signal generated by the subtractor 509.

[0008] CELP coding typically performs processing such as linear prediction analysis for every 160 samples (called a frame), and searches the adaptive codebook 503 and the excitation codebook 505 for every 40 samples (called a subframe), obtained by dividing the frame into four.

[0009] The purpose of CELP codebook selection (for both the adaptive codebook and the excitation codebook) is to select from the codebook the index k that minimizes the error energy E_k shown in Equation (1). This is briefly explained below for the case of the adaptive codebook.

[0010]

[Equation 1]

[0011] Here, X is the column vector of the input signal processed by the perceptual weighting filter 502 and is given by Equation (2). H is a lower triangular matrix whose elements are the impulse response of the synthesis filter 508 and is given by Equation (3). P_k is the column vector generated by the k-th index of the adaptive codebook 503 and is given by Equation (4). This prediction using the adaptive codebook 503 is also called pitch prediction. g is the scalar gain (pitch prediction coefficient) applied to the amplifier 504.

[0012]

[Equation 2]

[0013]

[Equation 3]

[0014]

[Equation 4]

[0015] Here, N denotes the vector length of the excitation signal (the subframe length), and ^T denotes transposition.

[0016] Selecting the optimum pitch length that minimizes E_k amounts to selecting, from the adaptive codebook 503, the index k (pitch length) that maximizes S_k shown in Equation (5).

[0017]

[Equation 5]
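The selection rule can be sketched in code. This is a minimal illustration, not the patent's implementation. Equations (1) and (5) are not reproduced in this text, so the sketch assumes the standard CELP forms E_k = |X - g H P_k|^2 and S_k = (X^T H P_k)^2 / |H P_k|^2, which agree with the definitions of X, H, P_k and g given above. All function names and the toy data are hypothetical.

```python
def filter_lower_triangular(h, p):
    """Compute y = H p, where H is the lower triangular Toeplitz matrix built
    from the impulse response h (requires len(h) >= len(p))."""
    n = len(p)
    return [sum(h[i - j] * p[j] for j in range(i + 1)) for i in range(n)]


def score(x, h, p):
    """S_k under the assumed form of Equation (5): (x . Hp)^2 / (Hp . Hp)."""
    y = filter_lower_triangular(h, p)
    numerator = sum(a * b for a, b in zip(x, y)) ** 2
    denominator = sum(b * b for b in y)
    return numerator / denominator if denominator > 0.0 else 0.0


def best_index(x, h, candidates):
    """Return the key k whose candidate vector P_k gives the largest S_k."""
    return max(candidates, key=lambda k: score(x, h, candidates[k]))


# Toy data (hypothetical): the target is a filtered, scaled copy of candidate 21.
h = [1.0, 0.5, 0.25, 0.125]
candidates = {20: [1.0, 0.0, 0.0, 0.0], 21: [0.0, 1.0, 0.0, 0.0]}
x = filter_lower_triangular(h, [0.0, 2.0, 0.0, 0.0])
print(best_index(x, h, candidates))
```

Maximizing S_k avoids computing the gain g for every candidate, because under the assumed form the optimal g for each k has already been substituted into E_k.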

[0018] FIG. 6 shows the process of generating a signal from the adaptive codebook. The adaptive codebook 601 is implemented as a memory, from which a vector of subframe length (N) is extracted starting at the position given by the pitch length (k). Here, the extracted signal is stored in the pitch signal storage 602.
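The extraction step can be sketched as follows. This is an illustration under stated assumptions: the memory is modeled as a Python list of past excitation samples with the most recent sample last, and the pitch length k is read as "k samples back from the end". The function name is hypothetical, and the sketch covers only k >= N.

```python
def extract_pitch_vector(memory, k, n):
    """Return n samples starting k samples before the end of memory.

    Only the case n <= k <= len(memory) is handled in this sketch.
    """
    if not n <= k <= len(memory):
        raise ValueError("this sketch only handles n <= k <= len(memory)")
    start = len(memory) - k
    return memory[start:start + n]


memory = list(range(100))                        # stand-in for past excitation samples
vector = extract_pitch_vector(memory, 60, 40)    # pitch length 60, subframe length 40
print(vector[0], vector[-1])
```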

[0019] Various methods have been proposed to improve pitch prediction accuracy, such as handling the case where the pitch length (k) is shorter than the subframe length (N) and oversampling the adaptive codebook signal (for example, "Analysis and Improvement of the Vector Quantization in SELP", W. B. Kleijn, Signal Processing IV, pp. 1043-1046, 1988; "Pitch Predictors with High Temporal Resolution", P. Kroon, ICASSP, pp. 661-664, 1990).

[0020] The parameters obtained by the adaptive codebook selection are the optimum pitch length (k_opt) and the pitch prediction coefficient (g_a). (Selection of the pitch prediction coefficient is not described here, because various methods are known.)

[0021] To improve pitch prediction accuracy while keeping the increase in transmitted information small, it has also been proposed to make the update period of the pitch prediction coefficient shorter than the update period of the pitch length (for example, Japanese Patent Laid-Open No. 3-33898, "Pitch Prediction Method", Taniguchi et al.).

[0022] In addition, a signal reproduced by coding and decoding has a spectrum and pitch structure that are smoothed compared with the original signal. A device that enhances these in the decoding process is generally called a postfilter. The processing that enhances the pitch structure is basically the same as the pitch prediction technique, and generally performs filtering subframe by subframe using information such as the pitch length and pitch prediction coefficient encoded in the coding process (for example, "Pitch Synchronous Innovation CELP (PSI-CELP) -PDC half-rate speech CODEC-", Ohya, Suda, Miki, IEICE Technical Report RCS93-78, pp. 63-70, 1993).

[0023] To raise the compression ratio further, lengthening processing units such as the frame length and the subframe length is widely used. For example, in PSI-CELP, the Japanese half-rate digital cellular standard, the subframe length is 80 samples, and the adaptive codebook search sets the pitch length and the pitch prediction coefficient for an 80-sample signal.

[0024]

[Problems to Be Solved by the Invention] However, when the processing unit becomes long, the pitch length may change within the analysis range, particularly for the high pitch frequencies of female voices. For example, with a sampling frequency of 8 kHz and a pitch frequency of 400 Hz, an 80-sample processing unit contains four pitch periods, and it is quite possible for the pitch length to change across these four periods. A pitch length error caused by such a change within the subframe severely degrades pitch prediction performance. Moreover, the method of updating the pitch prediction coefficient, as described in the related art, does not achieve sufficient prediction performance.
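The arithmetic behind this example can be checked with a short sketch. The function name is hypothetical; the numbers are those given in the text.

```python
def pitch_periods_per_unit(sample_rate_hz, pitch_hz, unit_len):
    """Return (pitch length in samples, number of pitch periods per processing unit)."""
    pitch_len = sample_rate_hz / pitch_hz    # samples per pitch period
    return pitch_len, unit_len / pitch_len


# 8 kHz sampling, 400 Hz pitch, 80-sample processing unit
print(pitch_periods_per_unit(8000, 400, 80))
```

With these values each pitch period spans 20 samples, so one 80-sample unit holds four periods that all share a single pitch length in the conventional scheme.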

[0025]

[Means for Solving the Problems] The present invention comprises: a speech input terminal; a linear prediction analyzer for the speech signal; a pitch prediction analyzer that predicts the pitch signal of the speech; a prediction residual quantizer that quantizes the prediction residual signal; a controller that operates the linear prediction analyzer, the pitch prediction analyzer, and the prediction residual quantizer in units of a plurality of sample lengths; a speech output terminal; a linear prediction synthesizer; a pitch prediction synthesizer; a prediction residual inverse quantizer; a controller that operates the linear prediction synthesizer, the pitch prediction synthesizer, and the prediction residual inverse quantizer in units of a plurality of sample lengths; and means for making the update period of the pitch length shorter than the update period of the pitch prediction coefficient.

[0026]

[Embodiments of the Invention] FIG. 1 shows a block diagram of the speech coding/decoding apparatus of the present invention. Description of the parts shared with FIG. 4, described in the related art, is omitted. The difference is the addition of frequency dividers 102 and 107, which are the means for making the update period of the pitch length shorter than that of the pitch prediction coefficient.

[0027] The analysis instruction for the pitch prediction coefficient is given to the pitch prediction analysis filter and the pitch prediction synthesis filter at the same period as in the conventional apparatus, whereas the analysis instruction for the pitch length is given, by way of the frequency divider, at a shorter period than that for the pitch prediction coefficient. In this way, processing is performed with the update period of the pitch length shorter than the update period of the pitch prediction coefficient.
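The two instruction periods can be pictured as two update schedules. The sketch below is only an illustration of the timing relationship; it does not model the divider hardware, and the function name and the 160-sample span are assumptions chosen for the example.

```python
def update_schedule(total_len, subframe_len, divisions):
    """Return the sample indices at which each parameter is updated.

    The pitch prediction coefficient is updated once per subframe; the pitch
    length is updated `divisions` times per subframe.
    """
    coefficient_updates = list(range(0, total_len, subframe_len))
    length_updates = list(range(0, total_len, subframe_len // divisions))
    return coefficient_updates, length_updates


# Two 80-sample subframes, pitch length updated twice per subframe
print(update_schedule(160, 80, 2))
```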

[0028] Next, the part of the present invention that generates the pitch prediction signal is described with reference to FIG. 2, which corresponds to FIG. 6 described in the related art. For convenience, the update unit of the pitch length is called a "sub-subframe". FIG. 2 shows an example with subframe length N and sub-subframe length N/2. In this case, the subframe is divided equally into a first half and a second half, and the pitch length can be changed between them.

[0029] Reference numeral 201 denotes the adaptive codebook, and 202 denotes the pitch signal storage that stores the signal generated from the adaptive codebook. k1 and k2 denote the pitch length of the first half and the pitch length of the second half, respectively.

[0030] FIG. 2 shows an example in which the extraction of a signal of the processing unit length (N) from the adaptive codebook 201 is carried out in two steps.

[0031] The procedure for determining the pitch prediction parameters is the same as that of Equations (1) to (5) described in the related art. The difference is that the generation of the pitch prediction signal, given by Equation (4), is changed as shown in Equation (6) below.

[0032] The signal generated by the means of the present invention is given by Equation (6).

[0033]

[Equation 6]
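Equation (6) is not reproduced in this text. The sketch below shows one plausible reading of the two-step extraction described above, and should be taken as an assumption rather than the patent's exact indexing: sample i of the output is read k1 samples back for the first half of the subframe and k2 samples back for the second half. The memory layout (most recent sample last) and the function name are hypothetical, and only pitch lengths of at least N are handled.

```python
def extract_split_pitch_vector(memory, k1, k2, n):
    """Build an n-sample vector whose first half uses pitch length k1 and
    whose second half uses pitch length k2 (assumed reading of Equation (6))."""
    for k in (k1, k2):
        if not n <= k <= len(memory):
            raise ValueError("this sketch only handles n <= k <= len(memory)")
    half = n // 2
    base = len(memory)
    first = [memory[base - k1 + i] for i in range(half)]
    second = [memory[base - k2 + i] for i in range(half, n)]
    return first + second


memory = list(range(100))    # stand-in for past excitation samples
print(extract_split_pitch_vector(memory, 60, 61, 40))
```

When k1 equals k2, this reduces to the single-lag extraction of the conventional scheme.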

[0034] This makes it possible to follow a change in pitch length within the subframe. Various processes are conceivable for searching the adaptive codebook for the optimum pitch lengths (the optimum k1 and k2 in the example above). The method with the highest prediction accuracy, though computationally expensive, is to generate, based on Equation (6), every combination of pitch lengths (k1, k2) in the allowable range. As with Equation (5), the pitch lengths are determined by selecting from the adaptive codebook the indices k1 and k2 that maximize S_{k1,k2} shown in Equation (7).

[0035]

[Equation 7]

[0036] In the case of Equation (7), if each of the pitch lengths (k1, k2) can take K values, the amount of information needed to transmit the pitch lengths is 2*log2(K) bits. With this method, however, the amount of information transmitted for the pitch length becomes very large.

[0037] Next, an embodiment according to claim 2 is described.

[0038] The pitch frequency of speech generally changes gradually, and the amount of change in pitch length within a subframe is often confined to a very small range. This property can be used to limit the range over which the pitch is updated in each sub-subframe. For example, when the subframe is divided into two, the following is conceivable.

[0039]

[Equation 8]

[0040] This is the case where the pitch length in the second sub-subframe of the subframe is restricted to change by at most ±1 sample from the pitch length of the first half. In the case of Equation (8), if the pitch length (k1) can take K values, the amount of information needed to transmit the pitch lengths is log2(3K) bits, so the amount of information needed for transmission is reduced.
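The restricted candidate set can be enumerated directly. This is a minimal sketch based on the ±1 sample restriction described above; the function name and the pitch length range 20 to 147 (K = 128 values) are hypothetical choices for the example.

```python
def constrained_pairs(k_min, k_max, max_delta=1):
    """List all (k1, k2) pairs with k1 in [k_min, k_max] and |k2 - k1| <= max_delta.

    k2 may fall one step outside [k_min, k_max]; a real codec would need to
    decide how to treat those edge cases.
    """
    pairs = []
    for k1 in range(k_min, k_max + 1):
        for delta in range(-max_delta, max_delta + 1):
            pairs.append((k1, k1 + delta))
    return pairs


pairs = constrained_pairs(20, 147)    # K = 128 values of k1
print(len(pairs))                     # 3K candidate pairs
```

Indexing one of 3K pairs is what leads to the log2(3K) bit count, compared with K*K pairs for an unrestricted search.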

[0041] As another example of the pitch length search, one can first select an average pitch length for the subframe and then determine the pitch lengths of the sub-subframes. When the subframe is divided into two and the average pitch length is km, the range of values that k1 and k2 can take may be, for example, as in Equation (9).

[0042]

[Equation 9]

[0043] In the example of Equation (9), the information on the average pitch length km is used to restrict the movement of the pitch length within the subframe to seven patterns. In the case of Equation (9), if the pitch length (km) can take K values, the amount of information needed to transmit the pitch lengths is log2(7K) bits. FIG. 3 shows this movement of the pitch length schematically.
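The three bit counts given in the text can be compared numerically. The sketch below assumes K = 128 pitch length values purely as an example; the patent does not state a value for K, and the function name is hypothetical.

```python
import math


def pitch_bits(num_lengths, patterns_per_length=None):
    """Bits needed to transmit the pitch lengths of one subframe.

    With patterns_per_length=None, k1 and k2 are sent independently:
    2 * log2(K). Otherwise one of (patterns_per_length * K) combinations
    is sent: log2(patterns * K).
    """
    if patterns_per_length is None:
        return 2 * math.log2(num_lengths)
    return math.log2(patterns_per_length * num_lengths)


K = 128    # assumed number of pitch length values
print(pitch_bits(K))       # independent k1, k2 (Equation (7) case)
print(pitch_bits(K, 3))    # second half within +/-1 sample (Equation (8) case)
print(pitch_bits(K, 7))    # seven patterns around km (Equation (9) case)
```

For K = 128 the unrestricted search costs 14 bits, while the restricted schemes cost fewer than 10 bits, compared with 7 bits for a single pitch length per subframe.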

[0044] In FIG. 3, the patterns of pitch length change across the two sub-subframes are shown by seven arrows.

[0045] As in the examples above, even when the change in pitch length within a subframe is restricted, most actual pitch length changes in speech can still be represented, so pitch prediction accuracy can be improved while keeping the increase in transmitted information small.

[0046] Various variations of such a restriction on the update width of the pitch length are conceivable, such as applying the oversampling technique described in the related art.

[0047] Next, a method according to claim 3 for reducing the amount of processing needed to search for a pitch length that changes within the subframe is described for the case where the change in pitch length within the subframe is restricted.

[0048] The search carries out the computation of Equation (7); in this computation, the change in pitch length is restricted, for example, to seven patterns as in Equation (9).

[0049] First, consider the numerator of Equation (7). If the X^T H part of this term is computed first, the term becomes an inner product of (X^T H) and P_{k1,k2}. This is a process generally called "backward filtering" (for example, "Fast CELP coding based on algebraic codes", J-P. Adoul et al., Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 1957-1960, 1987). If the subframe length is N, the amount of computation for the numerator is about 7N. When the subframe is divided into two, however, this inner product can also be computed separately for the two halves, and the result obtained by adding the first-half and second-half inner product values according to the combination of pitch length movements. The amount of computation for the inner products therefore becomes about 3N (= 6*N/2), a reduction to about half.
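The split inner product can be sketched as follows. This is an illustration, not the patent's implementation. It assumes that the "6" in 6*N/2 comes from three first-half candidates and three second-half candidates, which is consistent with the three values of k1 listed later in the text but cannot be confirmed because Equation (9) is not reproduced here. Function names and toy data are hypothetical.

```python
def backward_filter(h, x):
    """Return b = H^T x, so that (x^T H) . p == b . p for any vector p.

    H is the lower triangular Toeplitz matrix built from impulse response h.
    """
    n = len(x)
    return [sum(h[i - j] * x[i] for i in range(j, n)) for j in range(n)]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def split_numerators(b, first_halves, second_halves, patterns):
    """Compute b . P for each (k1, k2) pattern from half-length inner products.

    first_halves / second_halves map a pitch length to its half-length vector.
    Each half-length inner product is computed once and reused across patterns.
    """
    half = len(b) // 2
    c1 = {k: dot(b[:half], v) for k, v in first_halves.items()}
    c2 = {k: dot(b[half:], v) for k, v in second_halves.items()}
    return {(k1, k2): c1[k1] + c2[k2] for (k1, k2) in patterns}


# Toy data (hypothetical), N = 4
h = [1.0, 0.5, 0.25, 0.125]
x = [1.0, 2.0, 3.0, 4.0]
b = backward_filter(h, x)
result = split_numerators(b, {20: [1.0, -1.0]}, {21: [2.0, 0.5]}, [(20, 21)])
print(result)
```

The value returned for each pattern is the inner product before squaring; the numerator of the assumed form of Equation (7) is its square.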

[0050] Next, consider the denominator of Equation (7). This term represents the energy of the filtered P_{k1,k2}; if the filter order is P (generally about 10), the amount of computation for the denominator is about 7NP. When the subframe is divided into two, however, the computation of this energy can also be reduced as follows.

[0051] First, the signal obtained by filtering a signal in which only the first sub-subframe is set to its values and the second sub-subframe is all zeros is expressed by Equation (10).

[0052]

[Equation 10]

[0053] Here, the symbol @ denotes concatenation of vectors; the first vector of length N/2 is P0_{k1}, and the second vector of length N/2 is R_{k1}. Similarly, the signal obtained by filtering only the signal of the second sub-subframe is expressed by Equation (11).

[0054]

[Equation 11]

[0055] Here, P1_{k2} is a vector of length N/2.

[0056] The energy of the filtered P_{k1,k2} is computed separately for the first and second halves of the subframe. For the first half, only the computation of P0_{k1} (k1 = km+1, km, km-1) needs to be considered, and the amount of processing is about 3NP/2. For the second half, the energy of the vector obtained by adding R_{k1} and P1_{k2} must be computed. As shown in Equation (12), this energy is obtained from the energy of each vector and the inner product of the two vectors.

[0057]

[Equation 12]

[0058] Because the vector R_{k1} corresponds to a zero-input response (filtering with an input signal of zero), its energy computation can be truncated at a length M (for example, 10) shorter than N/2 samples. Consequently, the amount of processing needed to compute the energy of the second half is about 3MP + 3NP/2 + 7N/2 (the 7 in this expression is the number of combinations of k1 and k2).
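The second-half energy computation can be sketched as below. Equation (12) is not reproduced in this text; the sketch assumes it is the usual expansion |R + P1|^2 = |R|^2 + |P1|^2 + 2 R.P1, which matches the description of "the energy of each vector and the inner product of the two vectors". The function names and toy data are hypothetical.

```python
def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def second_half_energy(r, p1, m=None):
    """Energy of (r + p1) from per-vector energies and one cross term.

    If m is given, only the first m samples of the zero-input response r
    are used, which approximates the exact value when r decays quickly.
    """
    if m is not None:
        r = r[:m]
    # dot(r, p1) stops at the shorter vector, so truncation applies there too.
    return dot(r, r) + dot(p1, p1) + 2.0 * dot(r, p1)


r = [0.5, 0.25, 0.125, 0.0625]    # decaying zero-input response (toy data)
p1 = [1.0, 2.0, 3.0, 4.0]
print(second_half_energy(r, p1))          # exact
print(second_half_energy(r, p1, m=2))     # truncated at M = 2
```

The benefit of this form is reuse: the energies of R_{k1} and P1_{k2} are each computed once per candidate, and only the cross term has to be evaluated for every (k1, k2) combination.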

[0059] Combining the first and second halves, the amount of processing is about 3P(N+M) + 7N/2, which is less than the amount of processing without dividing the subframe (7NP).
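The operation counts can be evaluated for concrete values. The sketch below uses N = 80 (the PSI-CELP subframe length mentioned earlier), P = 10 and M = 10 (the example values in the text); combining them in one calculation is an assumption made for illustration. The function name is hypothetical, and the comments give one interpretation of each term.

```python
def search_cost(n, p, m, patterns=7, candidates=3):
    """Approximate denominator operation counts from the text (n must be even).

    Returns (cost without dividing the subframe, cost with the subframe divided).
    """
    undivided = patterns * n * p                  # 7NP
    first_half = candidates * n * p // 2          # 3NP/2: filtering P0_k1
    second_half = (candidates * m * p             # 3MP: R_k1 truncated to M samples
                   + candidates * n * p // 2      # 3NP/2: filtering P1_k2
                   + patterns * n // 2)           # 7N/2: one cross term per pattern
    return undivided, first_half + second_half


print(search_cost(80, 10, 10))
```

For these values the divided computation needs 2980 operations against 5600, a little over half.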

[0060] Next, an embodiment according to claim 4 is described.

[0061] The present technique can also be applied to a pitch enhancement filter that enhances the pitch structure of coded speech. In general, a pitch enhancement filter uses the same technique as pitch prediction, and its pitch length and pitch prediction coefficient are updated every subframe length using the information transmitted from the encoder.

[0062] When the encoder includes the pitch prediction filter of the present invention, the pitch length information changes within the subframe, so the pitch enhancement filter can likewise operate with the update period of the pitch length shorter than that of the pitch prediction coefficient.

[0063] Even when the pitch prediction filter on the encoder side calculates the pitch length and the pitch prediction coefficient subframe by subframe, as described in the related art, performing the analysis and search for the optimum pitch length again in the decoding process makes it possible to vary the pitch length within the subframe and to perform pitch enhancement filtering with the update period of the pitch length shorter than that of the pitch prediction coefficient.

[0064] The pitch predictor is not limited to CELP as the speech coding method and is applicable to any speech coding/decoding method that includes pitch prediction.

[0065]

[Effects of the Invention] In a speech coding/decoding apparatus that performs pitch prediction in units of a plurality of sample lengths, making the update period of the pitch length shorter than the update period of the pitch prediction coefficient improves pitch prediction accuracy. Providing means for limiting the update amount of the pitch length to a certain range keeps the amount of information needed to encode the pitch length small. The amount of processing needed for the pitch length search can also be reduced. Furthermore, applying the present technique to a pitch enhancement filter improves the quality of the coded and decoded speech.

[Brief description of drawings]

FIG. 1 is a diagram illustrating a speech coding/decoding apparatus including the pitch predictor of the present invention.

FIG. 2 is a diagram illustrating the process of generating a prediction signal from the pitch predictor of the present invention.

FIG. 3 is a diagram schematically illustrating an example of the movement of the pitch length in the pitch predictor of the present invention when the change in pitch length is restricted.

FIG. 4 is a diagram illustrating a speech coding/decoding apparatus including a conventional pitch predictor.

FIG. 5 is a block diagram of conventional speech coding processing.

FIG. 6 is a diagram illustrating the process of generating a prediction signal from a conventional pitch predictor.

[Explanation of symbols]

201, 503, 601: Adaptive codebook
202, 602: Pitch signal storage
100, 501, 400: Input terminal
502: Perceptual weighting filter
504, 506: Amplifier
505: Excitation codebook
507: Adder
508: Synthesis filter
509: Subtractor
510: Error energy minimization determiner
101, 106, 401, 406: Controller
102, 107: Frequency divider
103, 403: Linear prediction analysis filter
104, 404: Pitch prediction analysis filter
105, 405: Prediction residual quantizer
108, 408: Prediction residual inverse quantizer
109, 409: Pitch prediction synthesis filter
110, 410: Linear prediction synthesis filter

Claims

[Claims]

1. A speech coding/decoding apparatus comprising: a speech input terminal; a linear prediction analyzer for a speech signal; a pitch prediction analyzer that predicts a pitch signal of the speech; a prediction residual quantizer that quantizes a prediction residual signal; a controller that operates the linear prediction analyzer, the pitch prediction analyzer, and the prediction residual quantizer in units of a plurality of sample lengths; a speech output terminal; a linear prediction synthesizer; a pitch prediction synthesizer; a prediction residual inverse quantizer; and a controller that operates the linear prediction synthesizer, the pitch prediction synthesizer, and the prediction residual inverse quantizer in units of a plurality of sample lengths, the apparatus being characterized by comprising means for making the update period of the pitch length shorter than the update period of the pitch prediction coefficient.

2. The speech coding / decoding apparatus according to claim 1, wherein the pitch length update amount is limited to a certain range.

3. The speech coding/decoding apparatus according to claim 1, wherein the pitch length search computation is performed on signals obtained by dividing the unit of a plurality of sample lengths, and the computation for the whole unit of a plurality of sample lengths is obtained by combining the divided computation results.

4. A speech decoding apparatus having a pitch enhancement filter that enhances the pitch structure of decoded speech, wherein the update period of the pitch length is shorter than the update period of the pitch prediction coefficient.