JP3462958B2

JP3462958B2 - Audio encoding device and recording medium

Info

Publication number: JP3462958B2
Application number: JP17148496A
Authority: JP
Inventors: 中隆太朗山
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1996-07-01
Filing date: 1996-07-01
Publication date: 2003-11-05
Anticipated expiration: 2016-07-01
Also published as: JPH1020894A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a speech encoding device and to a recording medium on which a program implementing the device in software is recorded.

[0002]

[Description of the Related Art] Conjugate-Structure Algebraic-Code-Excited Linear-Predictive (CS-ACELP) coding, standardized by ITU-T (Draft Recommendation G.729), is known as an 8 kbit/s speech coding scheme. FIG. 6 shows a block diagram of CS-ACELP. In FIG. 6, 31 is a buffer memory that buffers the input signal, 32 is an LPC analysis circuit that performs LPC analysis of the input signal, 33 is a quantizer that quantizes the LPC coefficients, 34 is a synthesis filter that reproduces synthesized speech from an excitation signal, 35 is an adder that subtracts the synthesized speech from the input speech signal to obtain a residual signal, 36 is a weighting circuit that applies perceptual weighting to the residual signal, 37 is an adaptive codebook search circuit that searches for an adaptive code vector in an adaptive codebook 38 storing past excitation signals, 39 is a fixed codebook search circuit that searches for a fixed code vector in a fixed codebook 40 storing fixed excitation vectors such as noise excitations, 41 is a gain quantizer that performs gain prediction using a gain codebook 42, and 43 is a multiplexer that multiplexes and encodes the quantized LPC coefficients, the searched code vectors, and the quantized gains. In this scheme the frame length is 80 dimensions (samples; 10 ms), and the excitation codebooks 38 and 40 are searched for every subframe of 40 dimensions (5 ms). Seventeen bits are allocated to the excitation information, and these 17 bits represent the positions and signs of four excitation pulses.

[0003]

[Problems to be Solved by the Invention] Suppose the bit rate of CS-ACELP is lowered further, to 4 kbit/s, by setting the frame length to 160 dimensions (20 ms) and the subframe length to 80 dimensions (10 ms) while leaving the rest of the configuration exactly as in the 8 kbit/s case. The problem is then that only four excitation pulses can be searched and quantized per subframe, which limits how faithfully the input speech can be reproduced.

[0004] Accordingly, low-bit-rate speech coding at about 4 kbit/s requires a coding scheme that reproduces the input speech faithfully at as low a bit rate as possible.

[0005] An object of the present invention is to construct excitation code vectors that are faithful to the input speech at a relatively low bit rate.

[0006]

[Means for Solving the Problems] To achieve this object, the present invention exploits the redundancy of the residual signal in the vicinity of the pitch pulse. Pitch adaptive excitation code vectors are obtained by a training method that is centered on the pitch pulse and takes the relative distance from the pitch pulse into account, and a pitch adaptive excitation codebook is built from them. The excitation codebook is thus configured so that, when speech is encoded at a low bit rate of about 4 kbit/s, excitation code vectors faithful to the input speech are constructed at as low a bit rate as possible.

[0007] Furthermore, dividing the pitch adaptive excitation codebook into a plurality of codebooks requires less memory and less computation than the excitation codebook of the conventional CELP scheme, and yields excitation code vectors that reproduce the input speech more faithfully than those of conventional CS-ACELP reduced to 4 kbit/s.

[0008]

[Embodiments of the Invention] The present invention provides a speech encoding device comprising: a buffer memory that buffers an input signal; an LPC analysis circuit that performs LPC analysis of the input signal; a quantizer that quantizes the LPC coefficients; a synthesis filter that reproduces synthesized speech from an excitation signal; a weighting circuit that applies perceptual weighting to the residual signal obtained by subtracting the synthesized speech from the input speech; an adaptive codebook search circuit that predicts the pitch period from the weighted residual signal and searches for an adaptive code vector using an adaptive codebook; a pitch adaptive noise source that generates and quantizes a noise source adapted to the pitch; a gain quantizer and a gain codebook that perform gain prediction; and a multiplexer that multiplexes the quantized parameters. The device is characterized in that the pitch adaptive noise source divides the excitation codebook into a plurality of codebooks; the code vectors of each codebook are orthogonal to the code vectors of the other codebooks; one part of each code vector is trained in the negative direction of the time axis with the pitch position as the origin, and the other part is trained in the positive direction; and the synthesized speech is produced by shifting the code vectors to the pitch position, adding them together, and discarding any portion that extends beyond the subframe length.

[0009] By exploiting the redundancy of the residual signal in the vicinity of the pitch pulse, the speech encoding device of the present invention can encode speech faithfully to the input speech at a low bit rate of about 4 kbit/s. The pitch adaptive noise source consists of two parts: a pitch adaptive excitation codebook, trained with the pitch position fixed and with the noise pulses in its vicinity indexed by their relative distance from the pitch position (taken as the horizontal axis), and a pitch adaptive excitation codebook search circuit that searches this codebook and generates a pitch adaptive excitation code vector. This configuration also makes it possible to encode speech faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0010] The invention of claim 2 is the speech encoding device of claim 1 in which the pitch adaptive excitation codebook search circuit consists of a pitch position detection circuit and a codebook search circuit. The codebook search is performed using the pitch position detected by the pitch position detection circuit, and an excitation code vector adapted to the pitch is generated, so that speech can be encoded faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0011] The invention of claim 3 is the speech encoding device of claim 2 in which the pitch position detection circuit searches for the position at which the amplitude of the adaptive code vector is maximum. This makes the search for the pitch position easy.

[0012] The invention of claim 4 is the speech encoding device of any one of claims 1 to 3, which represents the pitch adaptive excitation code vector by a combination of n codebooks, places pulses at every n-th position so that the code vectors of each codebook are orthogonal to those of the other codebooks, and has codebooks whose vector length is compressed to 1/n of the subframe length. Searching the codebooks separately keeps the computational cost of the search low, and setting the vector length to 1/n of the subframe length keeps the memory requirement low.

[0013] The invention of claim 5 is the speech encoding device of claim 2 or 3 in which the codebook search circuit searches for and generates the pitch adaptive excitation code vector that minimizes the mean square error of the weighted residual signal, that is, the residual signal obtained by subtracting the synthesized speech from the input speech, with weighting applied. The pitch adaptive excitation code vectors are shifted by the detected pitch position and the optimal vector is searched for among them, so that speech can be encoded faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0014] The invention of claim 6 is the speech encoding device of claim 2 or 3 that uses the pitch adaptive excitation codebook of claim 4, searches each codebook with the codebook search circuit of claim 5, and obtains the pitch adaptive excitation code vector by taking the linear sum of the n code vectors thus obtained. It can encode speech faithfully to the input speech at a low bit rate of about 4 kbit/s with low computational cost and low memory.

[0015] The invention of claim 7 is a recording medium, such as a magnetic disk, a magneto-optical disk, or a ROM cartridge, on which is recorded a program that implements the speech encoding device of any one of claims 1 to 6 in software. Loading such a recording medium into, for example, a personal computer allows the speech encoding device of any one of claims 1 to 6 to be realized in software.

[0016] An embodiment of the present invention is described below with reference to FIGS. 1 to 4.

(Embodiment 1) In FIG. 1, 11 is a buffer memory that buffers the input signal, 12 is an LPC analysis circuit that performs LPC analysis of the input signal, 13 is a quantizer that quantizes the LPC coefficients, 14 is a synthesis filter that reproduces synthesized speech from an excitation signal, 15 is an adder that subtracts the synthesized speech from the input speech signal to obtain a residual signal, 16 is a weighting circuit that applies perceptual weighting to the residual signal, and 17 is an adaptive codebook search circuit that searches for an adaptive code vector in an adaptive codebook 18 storing past excitation signals. 19 is a pitch adaptive excitation codebook search circuit that uses the pitch adaptive excitation codebook 20 to search for the vector minimizing the mean square error of the weighted residual signal; it consists of a pitch position detection circuit 19A and a codebook search circuit 19B. 21 is a gain quantizer that performs gain prediction using a gain codebook 22, and 23 is a multiplexer that multiplexes and encodes the quantized LPC coefficients, the searched code vectors, and the quantized gains.

[0017] The operation of this embodiment is described next. The input signal is buffered in the buffer memory 11 and divided into subframes. For the speech signal divided into subframes, the LPC analysis circuit 12 computes the LPC coefficients and the quantizer 13 quantizes them. One output of the quantizer is fed to the multiplexer 23 and encoded; the other is dequantized back into LPC coefficients and supplied as the coefficients of the synthesis filter 14. The synthesis filter 14 receives the sum of the adaptive code vector and the pitch adaptive excitation vector, each scaled by the gain quantizer 21, and produces the synthesized speech. The adder 15 subtracts the synthesized speech from the input speech to give the residual signal, and passing this residual signal through the weighting circuit 16 gives the weighted residual signal. The adaptive codebook search circuit 17 takes the weighted residual signal as input and computes the pitch period that minimizes its mean square error. The computed pitch period is fed to the multiplexer 23 and encoded. An adaptive code vector is then generated from the adaptive codebook 18 on the basis of this pitch period.
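The pitch-period search in the paragraph above can be sketched as follows. This is a simplified illustration, not the device's actual procedure: it measures the error directly against a target vector and omits the synthesis and weighting filters, and the names `adaptive_vector` and `search_pitch_lag` and the lag range are hypothetical.

```python
def adaptive_vector(past_excitation, lag, subframe_len):
    """Build an adaptive code vector by repeating the last `lag` past samples."""
    segment = past_excitation[-lag:]
    return [segment[i % lag] for i in range(subframe_len)]


def search_pitch_lag(target, past_excitation, min_lag, max_lag):
    """Return (lag, gain, error) minimising sum((target - gain * v)^2).

    Assumes max_lag <= len(past_excitation). Filters are omitted for brevity.
    """
    target_energy = sum(t * t for t in target)
    best = None
    for lag in range(min_lag, max_lag + 1):
        v = adaptive_vector(past_excitation, lag, len(target))
        energy = sum(x * x for x in v)
        if energy == 0:
            continue  # an all-zero candidate cannot reduce the error
        corr = sum(t * x for t, x in zip(target, v))
        gain = corr / energy                      # optimal gain for this lag
        error = target_energy - corr * corr / energy
        if best is None or error < best[2]:
            best = (lag, gain, error)
    return best
```

For each candidate lag the optimal gain has a closed form, so the search reduces to comparing one correlation and one energy per lag.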

[0018] The pitch adaptive excitation codebook search circuit 19 receives the adaptive code vector and the weighted residual signal. First, the pulse position with the maximum amplitude in the adaptive code vector is found and taken as the pitch position. Next, a search is performed using the pitch adaptive excitation codebooks 20_1 to 20_n, which were trained with the relative distance from the pitch position as the horizontal axis, with their code vectors shifted to the pitch position. Within each of the codebooks 20_1 to 20_n, the search looks for the vector that minimizes the mean square error of the weighted residual signal. The search yields n excitation code vectors and their indices; the indices are fed to the multiplexer 23 and encoded. The linear sum of the n code vectors is taken to generate the final pitch adaptive excitation code vector. The gain quantizer 21 receives the adaptive code vector, the pitch adaptive excitation code vector, and the weighted residual signal, and determines the gains of the adaptive code vector and the pitch adaptive excitation code vector so that the mean square error of the weighted residual signal is minimized. The gains are quantized with the gain codebook 22 and output to the multiplexer 23; each vector is also scaled by its gain, and the two are added to generate the excitation signal. Passing the excitation signal through the synthesis filter 14 gives the synthesized speech.
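The gain step above amounts to a two-variable least-squares problem. The sketch below solves it in closed form under a simplifying assumption: the error is measured directly between a target vector and the scaled sum, without the synthesis and weighting filters or the gain codebook quantization. The function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def optimal_gains(target, adaptive_vec, pitch_adaptive_vec):
    """Gains (g_a, g_c) minimising ||target - g_a * a - g_c * c||^2."""
    a11 = dot(adaptive_vec, adaptive_vec)
    a12 = dot(adaptive_vec, pitch_adaptive_vec)
    a22 = dot(pitch_adaptive_vec, pitch_adaptive_vec)
    b1 = dot(target, adaptive_vec)
    b2 = dot(target, pitch_adaptive_vec)
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("vectors are linearly dependent; gains are not unique")
    # Cramer's rule on the 2x2 normal equations
    g_a = (b1 * a22 - b2 * a12) / det
    g_c = (a11 * b2 - a12 * b1) / det
    return g_a, g_c


def build_excitation(g_a, adaptive_vec, g_c, pitch_adaptive_vec):
    """Scale each vector by its gain and add them to form the excitation."""
    return [g_a * a + g_c * c for a, c in zip(adaptive_vec, pitch_adaptive_vec)]
```

In the device the resulting gains would then be quantized with the gain codebook; the sketch stops at the unquantized values.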

[0019] The pitch adaptive excitation codebook search circuit 19 is described in detail next. FIG. 2 shows its processing procedure as flowcharts. FIG. 2(a) shows the overall processing flow of the pitch adaptive excitation codebook search circuit 19: step S1 is the operation of the pitch position detection circuit 19A, and steps S2 to S6 are the operation of the codebook search circuit 19B. FIG. 2(b) shows the operation of the pitch position detection circuit 19A in detail: it searches the input adaptive code vector for the position at which the amplitude is maximum.
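The pitch position detection just described can be illustrated in a few lines. The sketch assumes that "amplitude" means absolute value and that ties resolve to the earliest position; neither detail is stated in the text, and the function name is hypothetical.

```python
def detect_pitch_position(adaptive_code_vector):
    """Index of the sample with the largest absolute amplitude (first on ties)."""
    return max(range(len(adaptive_code_vector)),
               key=lambda i: abs(adaptive_code_vector[i]))
```

For example, `detect_pitch_position([0.1, -0.4, 0.9, -1.3, 0.2])` picks index 3, the negative peak.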

[0020] The accuracy of pitch position detection can be improved by using, for example, a method that detects the pitch position minimizing the waveform distortion between a pulse train and the adaptive code vector, or a method that works in the synthesized-waveform domain and detects the pulse position minimizing the waveform distortion between the waveform synthesized from the pulse train and the waveform synthesized from the adaptive code vector. In addition, making the obtained pitch adaptive excitation code vector periodic at the pitch period improves performance in voiced segments.

[0021] FIG. 3 shows the processing procedure of the codebook search circuit 19B. Using the detected pitch position, the first codebook is shifted to the pitch position and weighted, and the code vector that minimizes the mean square error with respect to the input speech is searched for.

[0022] The second and subsequent codebooks are processed in the same way, except that their searches use the vector formed by adding each candidate to the code vectors already determined. Repeating this yields the optimal code vector in each codebook, and the sum of these vectors is the final pitch adaptive excitation code vector.
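The codebook-by-codebook search can be sketched as a greedy loop. The sketch assumes the code vectors have already been shifted to the pitch position and expanded to subframe length, and it compares them with the target directly, leaving out the weighting and synthesis filtering. `sequential_search` and `squared_error` are hypothetical names.

```python
def squared_error(target, vec):
    """Sum of squared differences between target and vec."""
    return sum((t - v) ** 2 for t, v in zip(target, vec))


def sequential_search(target, codebooks):
    """Pick one vector per codebook in order, each time adding the candidate
    to the sum of the vectors already chosen before measuring the error."""
    partial = [0.0] * len(target)
    indices = []
    for codebook in codebooks:
        best_idx, best_sum, best_err = None, None, None
        for idx, code_vector in enumerate(codebook):
            candidate = [p + c for p, c in zip(partial, code_vector)]
            err = squared_error(target, candidate)
            if best_err is None or err < best_err:
                best_idx, best_sum, best_err = idx, candidate, err
        indices.append(best_idx)
        partial = best_sum
    return indices, partial
```

The returned indices correspond to what would be sent to the multiplexer, and the returned sum plays the role of the final pitch adaptive excitation code vector.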

[0023] FIG. 4 illustrates the above processing; it shows that the generated pitch adaptive excitation code vector is expressed as a sum of mutually orthogonal vectors.

[0024] In the above, the code vectors are determined one at a time starting from the first codebook. A more nearly optimal pitch adaptive excitation code vector can be obtained by adding a preselection step: first, several of the best candidates are picked from each codebook; then the best combination is found by exhaustive search among those candidates.
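A minimal sketch of the preselection idea follows, under the same simplifications as before (vectors already shifted and expanded, plain squared error, no filtering). The shortlist size `num_candidates` and the function names are hypothetical; the text only says "several" candidates.

```python
from itertools import product


def squared_error(target, vec):
    """Sum of squared differences between target and vec."""
    return sum((t - v) ** 2 for t, v in zip(target, vec))


def preselect_and_search(target, codebooks, num_candidates):
    """Shortlist the best few vectors per codebook, then search all combinations."""
    shortlists = []
    for codebook in codebooks:
        ranked = sorted(range(len(codebook)),
                        key=lambda i: squared_error(target, codebook[i]))
        shortlists.append(ranked[:num_candidates])

    best_combo, best_err = None, None
    for combo in product(*shortlists):
        summed = [sum(codebooks[k][idx][j] for k, idx in enumerate(combo))
                  for j in range(len(target))]
        err = squared_error(target, summed)
        if best_err is None or err < best_err:
            best_combo, best_err = combo, err
    return list(best_combo), best_err
```

With m candidates kept from each of n codebooks, the exhaustive stage evaluates m**n combinations, so the shortlist size trades search quality against computation.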

[0025] FIG. 5 illustrates how a code vector is shifted to the pitch position. Because each code vector was trained with the relative distance from the pitch position as the horizontal axis, once the pitch position is known the excitation vector is obtained by shifting each code vector to the pitch position and discarding the components that extend beyond the subframe length.
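The shift-and-truncate operation can be sketched as below. The parameter `origin`, the index inside the stored vector that corresponds to the pitch position, is a hypothetical way of expressing "relative distance from the pitch position"; the function name is also an assumption.

```python
def shift_to_pitch(code_vector, origin, pitch_position, subframe_len):
    """Place code_vector so that its `origin` sample lands on pitch_position.

    Samples that would fall outside [0, subframe_len) are discarded.
    """
    out = [0.0] * subframe_len
    for i, amplitude in enumerate(code_vector):
        pos = i - origin + pitch_position
        if 0 <= pos < subframe_len:
            out[pos] += amplitude
    return out
```

With a pitch position near either end of the subframe, part of the trained shape is simply lost, which is the truncation the paragraph describes.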

[0026] In addition, the following conversion formula maps a pulse at position i in the memory area to position i' at search time. With it, a code vector that is compressed to 1/n of the subframe length in memory can be restored to the subframe length at search time.

[0027]

[Equation 1]

where i is the pulse position in the memory area, i' is the pulse position at search time, F is the subframe length, n is the number of codebook divisions, k is the codebook number (0, 1, ..., n-1), and L is the pitch position.

[0028] FIG. 5 is an example in which the subframe length is 80 dimensions, the pitch adaptive excitation codebook is divided into five, and the pitch position is L = 58. In this case the code vector length in the memory area is 80 ÷ 5 = 16. With the pitch position as the time origin, pulse positions (0, 1, ..., 7) are trained as the negative region of the time axis and pulse positions (8, 9, ..., 15) as the origin or the positive region. When a code vector is converted at search time, therefore, the first term on the right-hand side of Equation (1), i - 8, subtracts 8 from the pulse position in the memory area. The second term, k - 1, shifts each vector by its codebook number minus 1, which guarantees the orthogonality, and the third term applies the pitch adaptation by shifting by the pitch position L.
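Equation (1) itself is not reproduced in this text, so the mapping below is only an assumed form chosen to be consistent with the three terms described: i' = n(i - F/(2n)) + k + L, with k counted from 0 as in the symbol list. The text writes the second term as k - 1, which would match if codebooks were numbered from 1. The function names are hypothetical.

```python
def expand_position(i, k, pitch_pos, subframe_len, n):
    """Assumed mapping from stored pulse index i of codebook k (0-based)
    to a position in the subframe: n * (i - F / (2n)) + k + L."""
    half = subframe_len // (2 * n)  # 8 when F = 80 and n = 5
    return n * (i - half) + k + pitch_pos


def expand_code_vector(stored, k, pitch_pos, subframe_len, n):
    """Restore a compressed code vector to subframe length, dropping overflow."""
    out = [0.0] * subframe_len
    for i, amplitude in enumerate(stored):
        pos = expand_position(i, k, pitch_pos, subframe_len, n)
        if 0 <= pos < subframe_len:
            out[pos] = amplitude
    return out
```

Under this assumed form, codebook k only ever occupies positions congruent to k + L modulo n, so vectors from different codebooks never overlap and their inner product is zero, which is one way the stated orthogonality could arise.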

[0029]

[Effects of the Invention] As described above, the present invention exploits the redundancy of the residual signal in the vicinity of the pitch pulse. For example, with a subframe length of 80 dimensions, the excitation codebook divided into five, and 3 bits (8 entries) allocated to each excitation codebook, an excitation code vector adapted to the pitch can be represented with 15 bits per subframe. This gives the advantageous effect that, when speech is encoded at a low bit rate of about 4 kbit/s, excitation vectors faithful to the input speech can be constructed with low computational cost and low memory.
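The bit budget in this example can be checked with simple arithmetic. The sketch uses the 10 ms subframe duration given for the 4 kbit/s configuration in paragraph [0003] and, for comparison, the 17 bits per 5 ms subframe quoted for CS-ACELP in paragraph [0002]; the function names are hypothetical.

```python
def excitation_bits(num_codebooks, bits_per_codebook):
    """Total excitation index bits per subframe."""
    return num_codebooks * bits_per_codebook


def excitation_rate_kbps(bits_per_subframe, subframe_ms):
    """Bits per millisecond equals kbit/s."""
    return bits_per_subframe / subframe_ms


proposed_bits = excitation_bits(5, 3)                   # 5 codebooks x 3 bits
proposed_rate = excitation_rate_kbps(proposed_bits, 10)
celp_rate = excitation_rate_kbps(17, 5)                 # 8 kbit/s CS-ACELP figures
```

On these figures the excitation indices take 15 bits per subframe, or 1.5 kbit/s, compared with 3.4 kbit/s for the 17-bit pulse description at the 5 ms subframe of the 8 kbit/s scheme.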

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a speech encoding device according to an embodiment of the present invention.

FIG. 2(a) is a flowchart showing the search procedure of the pitch adaptive excitation codebook search circuit in the device. FIG. 2(b) is a flowchart showing the processing procedure of the pitch position detection circuit in the pitch adaptive excitation codebook search circuit.

FIG. 3 is a flowchart showing the processing procedure of the codebook search circuit in the pitch adaptive excitation codebook search circuit.

FIG. 4 is a diagram listing the adaptive excitation code vectors expressed as sums of code vectors.

FIG. 5 is a transition diagram showing how a pitch adaptive excitation code vector is shifted to the pitch position L.

FIG. 6 is a block diagram showing the configuration of a conventional CS-ACELP speech encoding device.

[Explanation of symbols]

11 buffer memory
12 LPC analysis circuit
13 quantizer
14 synthesis filter
15 adder
16 weighting circuit
17 adaptive codebook search circuit
18 adaptive codebook
19 pitch adaptive excitation codebook search circuit
19A pitch position detection circuit
19B codebook search circuit
20_1 to 20_n pitch adaptive excitation codebooks
21 gain quantizer
22 gain codebook
23 multiplexer

Continuation of front page

(58) Fields searched (Int. Cl.7, DB name): G10L 19/12

Claims

(57) [Claims]

1. A speech encoding device comprising: a buffer memory that buffers an input signal; an LPC analysis circuit that performs LPC analysis of the input signal; a quantizer that quantizes the LPC coefficients; a synthesis filter that reproduces synthesized speech from an excitation signal; a weighting circuit that applies perceptual weighting to the residual signal obtained by subtracting the synthesized speech from the input speech; an adaptive codebook search circuit that predicts the pitch period from the weighted residual signal and searches for an adaptive code vector using an adaptive codebook; a pitch adaptive noise source that generates and quantizes a noise source adapted to the pitch; a gain quantizer and a gain codebook that perform gain prediction; and a multiplexer that multiplexes the quantized parameters, wherein the pitch adaptive noise source divides the excitation codebook into a plurality of codebooks, the code vectors of each codebook are orthogonal to the code vectors of the other codebooks, one part of each code vector is trained in the negative direction of the time axis with the pitch position as the origin and the other part is trained in the positive direction, and the synthesized speech is produced by shifting the code vectors to the pitch position, adding them together, and discarding any portion that extends beyond the subframe length.

2. The speech encoding device according to claim 1, wherein the pitch adaptive excitation codebook search circuit comprises a pitch position detection circuit and a codebook search circuit.

3. The speech encoding device according to claim 2, wherein the pitch position detection circuit searches for the position at which the amplitude of the adaptive code vector is maximum.

4. The speech encoding device according to claim 2 or 3, wherein a pitch adaptive excitation code vector is represented by a combination of n codebooks, pulses are placed at every n-th position so that the code vectors of each codebook are orthogonal to those of the other codebooks, and the device has codebooks whose vector length is compressed to 1/n of the subframe length.

5. The speech encoding device according to claim 2 or 3, wherein the codebook search circuit searches for and generates the pitch adaptive excitation code vector that minimizes the mean square error of the weighted residual signal obtained by weighting the residual signal that results from subtracting the synthesized speech from the input speech.

6. The speech encoding device according to claim 2 or 3, wherein the pitch adaptive excitation codebook according to claim 4 is used, each codebook is searched by the codebook search circuit according to claim 5, and a pitch adaptive excitation code vector is obtained by taking the linear sum of the n code vectors thus obtained.

7. A recording medium on which is recorded a program that implements, in software, the speech encoding device according to any one of claims 1 to 6.