JPH11259098A

JPH11259098A - Method of speech encoding/decoding

Info

Publication number: JPH11259098A
Application number: JP10367836A
Authority: JP
Inventors: Ko Amada (天田 皇); Kimio Miseki (三関 公生)
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1997-12-24
Filing date: 1998-12-24
Publication date: 1999-09-24
Anticipated expiration: 2018-12-24
Also published as: JP3579276B2

Abstract

PROBLEM TO BE SOLVED: To achieve speech encoding with good speech quality even when using an algebraic structure codebook in which the number of pulse positions and the number of pulses have been reduced to lower the coding rate. SOLUTION: A speech signal is encoded by representing it with information that expresses at least the characteristics of an LPC synthesis unit 120 and with a drive signal, consisting of a pitch vector and a noise vector, that drives the LPC synthesis unit 120. The pitch vector is searched for in an adaptive codebook 141, and a pulse position candidate search unit 142 finds pulse position candidates whose positions vary according to the shape of the pitch vector. An adaptive algebraic structure codebook 143 produces a pulse train by placing pulses at pulse positions chosen from these candidates, and the drive signal of the LPC synthesis unit 120 is constructed from the pitch vector and the pulse train.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] [Technical Field of the Invention] The present invention relates to a low-coding-rate speech encoding/decoding method used in digital telephones, voice memos, and the like.

[0002] [Description of the Related Art] In recent years, the CELP (Code Excited Linear Prediction) method (M.R. Schroeder and B.S. Atal, "Code Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates," Proc. ICASSP, pp. 937-940, 1985 (Reference 1), and W.S. Kleijin, D.J. Krasinski et al., "Improved Speech Quality and Efficient Vector Quantization in SELP," Proc. ICASSP, pp. 155-158, 1988 (Reference 2)) has been widely used as a coding technique for compressing speech and musical sound into a small amount of information for transmission and storage, for example over mobile phones and the Internet.

[0003] CELP is a coding method based on linear prediction analysis. Linear prediction analysis splits the input speech signal into linear prediction coefficients, which represent phonemic information, and a prediction residual signal, which represents pitch and similar information. A recursive digital filter called a synthesis filter is constructed from the linear prediction coefficients, and the original input speech signal can be restored by feeding the prediction residual signal into this synthesis filter as the drive signal.

[0004] To encode at a low rate, both the linear prediction coefficients, which are the synthesis filter information representing the characteristics of the synthesis filter, and the prediction residual signal, which is the drive signal that drives the synthesis filter, must be encoded with less information. In the CELP method, a coded version of the prediction residual signal is generated as the drive signal by multiplying two kinds of signals, a pitch vector and a noise vector, by appropriate gains and then adding them together. A method of generating the pitch vector is described in, for example, Reference 2.
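The drive-signal construction and synthesis filtering described in [0003] and [0004] can be sketched as follows. This is only an illustrative sketch: the function names, the sign convention A(z) = 1 + a1 z^-1 + ... for the predictor, and the use of plain Python lists are assumptions made here and do not come from the patent text.

```python
def build_drive_signal(pitch_vec, noise_vec, g0, g1):
    """Gain-scale the pitch vector and the noise vector, then add them sample by sample."""
    return [g0 * p + g1 * c for p, c in zip(pitch_vec, noise_vec)]


def lpc_synthesize(drive, a, state=None):
    """Recursive (all-pole) synthesis filter 1/A(z).

    Assumed convention: A(z) = 1 + a[0] z^-1 + ... + a[P-1] z^-P.
    `state` holds the previous outputs, most recent first.
    """
    mem = list(state) if state is not None else [0.0] * len(a)
    out = []
    for x in drive:
        # y[n] = x[n] - sum_k a[k] * y[n-1-k]
        y = x - sum(ak * yk for ak, yk in zip(a, mem))
        out.append(y)
        mem = ([y] + mem[:-1])[:len(a)]
    return out, mem
```

Returning the filter memory lets a caller carry the filter state from one subframe to the next, which a frame-based coder would need; that design choice is likewise an assumption of this sketch.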

[0005] Besides the method of Reference 2, other methods have been proposed, such as using a fixed code vector at speech onsets; in the present invention these are collectively called the pitch vector. The noise vector is usually generated by storing many candidates in a noise codebook and selecting the best one among them. To search for the noise vector, every noise vector is added to the pitch vector and passed through the synthesis filter to generate a synthesized speech signal, the distortion of that synthesized speech signal relative to the input speech signal is evaluated, and the noise vector that yields the synthesized speech signal with the least distortion is selected. How efficiently the noise vectors are stored in the noise codebook is therefore a key point of the CELP method.
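The search procedure in [0005] is an exhaustive analysis-by-synthesis loop, which might be sketched as below. The sketch assumes fixed gains g0 and g1, a plain squared-error distortion without perceptual weighting, and zero initial filter memory; all names are hypothetical and chosen only for illustration.

```python
def synthesize(drive, a):
    """All-pole synthesis filter 1/A(z) with zero initial state (assumed convention:
    A(z) = 1 + a[0] z^-1 + ...)."""
    mem = [0.0] * len(a)
    out = []
    for x in drive:
        y = x - sum(ak * yk for ak, yk in zip(a, mem))
        out.append(y)
        mem = ([y] + mem[:-1])[:len(a)]
    return out


def search_noise_codebook(target, pitch_vec, codebook, a, g0, g1):
    """Try every noise vector and return (index, distortion) of the best one."""
    best_index, best_err = -1, float("inf")
    for i, noise_vec in enumerate(codebook):
        # Add the candidate to the pitch vector, then synthesize speech from the sum.
        drive = [g0 * p + g1 * c for p, c in zip(pitch_vec, noise_vec)]
        synth = synthesize(drive, a)
        # Distortion of the synthesized speech relative to the input speech.
        err = sum((t - s) ** 2 for t, s in zip(target, synth))
        if err < best_err:
            best_index, best_err = i, err
    return best_index, best_err
```

The cost of this loop grows with the codebook size, which is one reason the storage and structure of the noise codebook matter so much in CELP.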

[0006] The algebraic structure codebook (Algebraic Codebook) (J-P. Adoul et al., "Fast CELP Coding based on algebraic codes," Proc. ICASSP'87, pp. 1957-1960 (Reference 3)) has a simple structure in which a noise vector is represented solely by the presence or absence of pulses and their polarity (+, −). Compared with a scheme that uses a noise codebook storing multiple noise vectors, the algebraic structure codebook does not need to store code vectors and requires little computation. Because its sound quality is also comparable to that of conventional schemes, it has been adopted in various standards in recent years.

[0007] [Problems to Be Solved by the Invention] However, with an algebraic structure codebook, the degradation of sound quality becomes more noticeable as the coding bit rate (coding rate) decreases. One reason is a shortage of pulse position information. Because the algebraic structure codebook simplifies pulse position information algebraically, it has the advantages described above; at low coding rates, however, position candidates may exist where no pulse is needed and be absent where one is needed, which is not only inefficient but also degrades speech quality.

[0008] Another reason sound quality degrades when an algebraic structure codebook is used is a shortage of pulses. When there are too few pulses, a crackling noise becomes noticeable in the decoded speech. This is because the drive signal is generated from a pulse train, so as the number of pulses decreases, the presence or absence of each pulse becomes easier to perceive. To improve sound quality, this crackling must be reduced.

[0009] As described above, the conventional algebraic structure codebook has the advantages of a simple structure and a small amount of computation, but at low coding rates the sound quality of the decoded speech deteriorates because of the shortage of position information and of pulses in the pulse train that makes up the drive signal of the synthesis filter.

[0010] An object of the present invention is to provide a speech encoding/decoding method that achieves good sound quality even at low coding rates.

[0011] [Means for Solving the Problems] The present invention provides a speech encoding method comprising a step of generating, from a speech signal, information representing at least the characteristics of a synthesis filter, and a step of generating a drive signal for driving the synthesis filter, the drive signal including a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal.

[0012] The present invention also provides a speech decoding method in which a speech signal is decoded by inputting to a synthesis filter a drive signal including a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal.

[0013] In the speech encoding/decoding method according to the present invention, the drive signal that drives the synthesis filter includes a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal. More specifically, the pulse position candidates are arranged so that more candidates exist where the power of the speech signal is larger.

[0014] The drive signal may also include a pulse train generated by placing pulses at all of the pulse position candidates, which change adaptively according to the properties of the speech signal, and optimizing the amplitude of each pulse by predetermined means. In this case, more specifically, the pulse position candidates are arranged so that more candidates exist where the power of the speech signal is larger.

[0015] Furthermore, the drive signal may be generated using either a pulse train generated by placing pulses at a predetermined number of pulse positions selected from first pulse position candidates that change adaptively according to the properties of the speech signal, or a pulse train generated by placing pulses at a predetermined number of pulse positions selected from second pulse position candidates consisting of some or all of the positions not used as first pulse position candidates. In this case, more specifically, the first pulse position candidates are arranged so that more candidates exist where the power of the speech signal is larger.

[0016] When the drive signal consists of a pitch vector and a noise vector, the noise vector is generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change according to the shape of the pitch vector. In this case, more specifically, the pulse position candidates are arranged so that more candidates exist where the power of the pitch vector is larger.

[0017] The noise vector may also be constructed using a pulse train generated by placing pulses at a predetermined number of pulse positions selected from position candidates set on the basis of a position candidate density function obtained from the shape of the pitch vector. In this case, more specifically, the pulse position candidates are arranged so that more candidates exist where the value of the position candidate density function is larger, and the position candidate density function is a function, determined in advance, that relates the power of the pitch vector to the probability that a pulse is placed.

[0018] Furthermore, when correction means such as a pitch period emphasis filter is applied to the noise vector, the noise vector is generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change according to the shape of an inverse-corrected pitch vector, which is obtained by applying to the pitch vector processing based on the inverse of the correction characteristic. In this case, more specifically, the pulse position candidates are arranged so that more candidates exist where the power of the inverse-corrected pitch vector is larger.

[0019] By adaptively changing the pulse position candidates according to properties of the speech signal such as its power distribution, coding efficiency improves even when using an algebraic structure codebook whose pulse positions and pulse count have been reduced for a lower coding rate, so the coding rate can be lowered while maintaining the sound quality of the decoded speech. Moreover, because the pitch vector is used to create the pulse position candidates, the candidates can be adapted without requiring any additional information.

[0020] In another speech encoding/decoding method according to the present invention, when the drive signal consists of a pitch vector and a noise vector, a drive signal is generated that includes a pulse train shaped by pulse shaping means whose characteristics are determined from the shape of the pitch vector.

[0021] With this configuration, the pulse-like noise that appears in the decoded speech when the number of pulses is reduced is mitigated, so even when the pulse positions and pulse count are reduced for a lower coding rate, the coding rate can be lowered while maintaining the sound quality of the decoded speech.

[0022] Furthermore, in the speech encoding/decoding method according to the present invention, a drive signal may be generated that includes a pulse train generated by placing pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal, and this pulse train may be shaped by pulse shaping means whose characteristics are determined from the shape of the pitch vector.

[0023] [Embodiments of the Invention] FIG. 1 shows a speech encoding system to which the speech encoding method according to the first embodiment is applied. This speech encoding system comprises input terminals 101 and 106, an LPC analysis unit 110, an LPC quantization unit 111, an LPC synthesis unit 120, a perceptual weighting unit 130, an adaptive codebook 141, a pulse position candidate search unit 142, an adaptive algebraic structure codebook 143, a code selection unit 150, a pitch period emphasis unit 160, gain multipliers 102 and 103, and adders 104 and 105.

[0024] The input speech signal to be encoded is supplied to input terminal 101 in units of one frame, and in synchronization with this the LPC analysis unit 110 performs linear prediction analysis to obtain linear prediction coefficients (LPC coefficients) corresponding to the vocal tract characteristics. The LPC coefficients are quantized by the LPC quantization unit 111. The quantized values are input to the LPC synthesis unit 120 as synthesis filter information representing its characteristics, and index A indicating the quantized values is output as a coding result to a multiplexing unit (not shown).

[0025] The adaptive codebook 141 stores the drive signals previously input to the LPC synthesis unit 120. The drive signal input to the LPC synthesis unit 120 is a quantized version of the prediction residual signal from linear prediction analysis and corresponds to the vocal cord signal, which carries information such as pitch. The adaptive codebook 141 cuts out from the past drive signal a waveform whose length corresponds to the pitch period and repeats it to generate a pitch vector. The pitch vector is usually obtained for each subframe, a frame being divided into several subframes.
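The cut-and-repeat operation of the adaptive codebook in [0025] can be illustrated with a short sketch. The function and parameter names are hypothetical, the pitch period is assumed to be an integer number of samples, and the most recent samples of the past drive signal are assumed to be the ones cut out.

```python
def pitch_vector(past_drive, pitch_lag, subframe_len):
    """Cut the last `pitch_lag` samples of the past drive signal and repeat
    them until one subframe is filled."""
    if not 1 <= pitch_lag <= len(past_drive):
        raise ValueError("pitch_lag must lie within the stored drive signal")
    segment = past_drive[-pitch_lag:]
    return [segment[n % pitch_lag] for n in range(subframe_len)]
```

For example, with a stored drive signal `[1, 2, 3, 4, 5]`, a pitch period of 3 samples, and a 7-sample subframe, the sketch repeats `[3, 4, 5]` to fill the subframe.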

[0026] Based on the pitch vector obtained from the adaptive codebook 141, the pulse position candidate search unit 142 calculates where in the subframe the pulse position candidates are to be set and outputs the result to the adaptive algebraic structure codebook 143.

[0027] From among the pulse position candidates input from the pulse position candidate search unit 142, the adaptive algebraic structure codebook 143 searches for a predetermined number of pulse positions and their signs so that the distortion relative to the input speech signal with the contribution of the pitch vector removed is minimized under perceptual weighting.

[0028] The pulse train output from the adaptive algebraic structure codebook 143 is, where necessary, made periodic in units of the pitch period by the pitch period emphasis unit 160. The pitch period emphasis unit 160 receives, from input terminal 106, the pitch period information L obtained by searching the adaptive codebook 141, and gives the pulse train the periodicity of the pitch period.
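The text does not specify how the pitch period emphasis unit 160 imposes periodicity. One common way to do so, shown here purely as an assumed illustration and not as the patent's method, is a recursive comb filter that adds a scaled copy of the output one pitch period earlier. The gain `beta` and the function name are hypothetical.

```python
def pitch_emphasize(pulse_train, pitch_lag, beta=1.0):
    """Give a pulse train pitch periodicity with y[n] = x[n] + beta * y[n - pitch_lag].

    Assumed form of the emphasis filter; `beta` controls how strongly
    each pulse is repeated at multiples of the pitch period.
    """
    if pitch_lag < 1:
        raise ValueError("pitch_lag must be at least 1 sample")
    out = [float(x) for x in pulse_train]
    for n in range(pitch_lag, len(out)):
        out[n] += beta * out[n - pitch_lag]
    return out
```

When the pitch period is longer than the subframe, the loop body never runs and the pulse train passes through unchanged, which matches the "where necessary" wording of the paragraph.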

[0029] The pitch vector output from the adaptive codebook 141 and the pulse train output from the adaptive algebraic structure codebook 143 (given periodicity by the pitch period emphasis unit 160 where necessary) are multiplied in gain multipliers 102 and 103 by the gain G0 for the pitch vector and the gain G1 for the noise vector, respectively, then summed by adder 104 and input to the LPC synthesis unit 120 as the drive signal. The gains G0 and G1 are usually chosen as the optimum gains from a gain codebook (not shown) that stores multiple gains.

[0030] The code selection unit 150 outputs index B indicating the pitch vector selected by the search of the adaptive codebook 141, index C indicating the pulse train selected by the search of the adaptive algebraic structure codebook 143, and index G indicating the gains G0 and G1 selected by the search of the gain codebook. These indices B, C, and G, together with index A from the LPC quantization unit 111, which indicates the synthesis filter information (the quantized LPC coefficients), are multiplexed by a multiplexing unit (not shown) and output as a bitstream.

[0031] Next, the pulse position candidate search unit 142 and the adaptive algebraic structure codebook 143, which are the characteristic parts of this embodiment, are described.

[0032] This embodiment aims to reduce the coding rate alone, without the loss of sound quality seen in the conventional approach, even when the positions at which pulses can be placed are restricted at low coding rates. To that end it exploits the property that pulses tend to concentrate where the power of the drive signal is large: pulse position candidates are set for each subframe so that more position candidates are allocated where the power of the drive signal is larger.

[0033] Because the pitch vector resembles the shape of the ideal drive signal, it is effective for the pulse position candidate search unit 142 to set the pulse position candidates based on the pitch vector obtained by searching the adaptive codebook 141. Since the decoder obtains the same pitch vector as the encoder, adapting the pulse position candidates requires no extra additional information.

[0034] When adapting the pulse position candidates, allocating position candidates only where the power is large can degrade sound quality, because runs of samples with no position candidates appear in low-power sections. Various adaptation methods are conceivable; the following method, for example, achieves adaptation with little loss of sound quality.

[0035] The procedure by which the pulse position candidate search unit 142 adapts the pulse position candidates is described with reference to the flowchart of FIG. 2. FIG. 3 shows, in correspondence with the steps of FIG. 2, the input pitch vector waveform (F0), the power of this input pitch vector waveform (F1), the smoothed power (F2), and the value obtained by integrating the smoothed power in the sample direction (F3).

[0036] The same processing can be performed using, instead of power, another measure of the waveform shape, such as the absolute value of the amplitude (the square root of the power). In the present invention, these measures are collectively referred to as power.

[0037] First, the power (F1) of the input pitch vector (F0) of FIG. 3 is calculated (step S1), and then the power (F1) is smoothed to obtain the smoothed power (F2) (step S2). The power can be smoothed by, for example, taking a weighted moving average over a window of several samples.

[0038] Next, the power smoothed in step S2 is integrated in the sample direction (step S3). This is shown in (F3) of FIG. 3. Specifically, letting p(n) be the smoothed power of the n-th sample, q(n) the integrated value of the smoothed power p(n), and L the subframe length, the integrated value q(n) is obtained as q(n) = p(n) + q(n−1) + C (n = 0, …, L−1), where C is a constant that adjusts the degree of bias in the density of the pulse position candidates.

[0039] Next, the pulse position candidates are calculated using this integrated value q(n) (step S4). The integrated value is first normalized so that its value at the final sample equals the desired number of position candidates, M. The position of the m-th candidate can then be obtained as Sm by mapping m to the integrated value, as shown in (F3) of FIG. 3. Repeating this for m = 0, …, M−1 yields M position candidates.
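Steps S1 to S4 can be sketched as below. Several details are not given in the text and are assumptions of this sketch: the three-tap smoothing window, clamping at the subframe edges, taking q(−1) as 0, and mapping candidate m to the first sample whose normalized integral reaches m + 0.5. The function name and parameters are hypothetical.

```python
def pulse_position_candidates(pitch_vec, num_candidates, c, window=(0.25, 0.5, 0.25)):
    """Return sorted sample indices of pulse position candidates (steps S1 to S4)."""
    length = len(pitch_vec)
    if length == 0:
        raise ValueError("pitch vector must not be empty")
    half = len(window) // 2

    # S1: per-sample power of the pitch vector.
    power = [x * x for x in pitch_vec]

    # S2: weighted moving average; indices past the edges are clamped (assumption).
    smooth = []
    for n in range(length):
        acc = 0.0
        for k, w in enumerate(window):
            idx = min(max(n + k - half, 0), length - 1)
            acc += w * power[idx]
        smooth.append(acc)

    # S3: q(n) = p(n) + q(n-1) + C, with q(-1) taken as 0 (assumption).
    q, running = [], 0.0
    for p in smooth:
        running += p + c
        q.append(running)
    total = q[-1]
    if total <= 0.0:
        raise ValueError("integrated power is zero; use C > 0 or a non-silent pitch vector")

    # S4: scale q so that q(L-1) equals M, then map candidate m to the first sample
    # whose scaled integral reaches m + 0.5 (threshold choice is an assumption).
    positions, n = [], 0
    for m in range(num_candidates):
        while q[n] * num_candidates / total < m + 0.5:
            n += 1
        positions.append(n)

    # Duplicates can arise when the power sits in very few samples; collapse them.
    return sorted(set(positions))
```

A larger C flattens the integral toward a straight line, so the candidates approach uniform spacing; a smaller C lets the power distribution dominate and concentrates candidates where the pitch vector is strong. With a flat pitch vector the sketch returns evenly spaced positions.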

[0040] FIG. 4 shows the relationship between the pulse position candidates obtained in this way and the power of the pitch vector. The solid line shows the power envelope of the pitch vector, and the arrows show the pulse position candidates. As the figure shows, the pulse position candidates are densely distributed where the power of the pitch vector is large and become sparser as the power decreases. As a result, pulse positions can be chosen more precisely where the pitch vector power is large, which is where it matters most for sound quality. Moreover, even if the number of pulse position candidates decreases because of a lower coding rate, high-quality coding remains possible by adaptively concentrating the few candidates where the pitch vector power is large.

[0041] Next, the position candidates obtained in this way are distributed among the channels (step S5). Various distribution methods are possible, but as shown in (F4) of FIG. 3, it is desirable to distribute the position candidates so that the channels are interleaved. The adaptive algebraic structure codebook 143 is obtained in this way. In the search, the optimum position and sign are selected for one pulse from each channel (Ch1, Ch2, Ch3) of this adaptive algebraic structure codebook 143, generating a noise vector composed of three pulses.
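The interleaved distribution of step S5 and the construction of a three-pulse noise vector can be sketched as follows. Round-robin assignment is one way to interleave the channels and is an assumption here, as are the function names; the search that picks the best position and sign per channel is omitted.

```python
def distribute_to_channels(candidates, num_channels=3):
    """Assign sorted position candidates to channels in round-robin order,
    so that neighbouring candidates land in different channels."""
    return [candidates[ch::num_channels] for ch in range(num_channels)]


def build_noise_vector(subframe_len, picks):
    """Build a noise vector from one (position, sign) pair per channel.

    `sign` is +1 or -1; all other samples stay at zero.
    """
    vec = [0.0] * subframe_len
    for position, sign in picks:
        vec[position] += float(sign)
    return vec
```

For instance, six candidates `[0, 2, 3, 4, 6, 7]` split into channels `[0, 4]`, `[2, 6]`, and `[3, 7]`, and choosing one signed pulse from each channel gives a vector with exactly three nonzero samples.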

[0042] When the subframe length is 80 samples, hardly any perceptual degradation is noticeable with the above technique even if the pulse position candidates are reduced to about 40 samples in total across all channels.

[0043] In an algebraic structure codebook the pulse amplitude is normally either +1 or −1, but methods using pulses that carry amplitude information have also been proposed. One example, shown in Reference 4 (Chang Deyuan, "An 8kb/s low complexity ACELP speech codec," 1996 3rd International Conference on Signal Processing, pp. 671-4, 1996), selects the pulse amplitude from among 1.0, 0.5, 0, −0.5, and −1.0. Likewise, in the multipulse scheme, a type of pulse excitation shown in Reference 5 (K. Ozawa and T. Araseki, "Low Bit Rate Multi-pulse Speech Coder with Natural Speech Quality," IEEE Proc. ICASSP'86, pp. 457-460, 1986), the drive signal consists of a pulse train with amplitudes. The present invention is also applicable when pulses have amplitudes, as typified by these examples.

[0044] Next, a speech decoding system corresponding to the speech encoding system of FIG. 1 is described with reference to FIG. 5.

[0045] Using the same reference numerals for parts with the same functions as in FIG. 1, the speech decoding system of FIG. 5 comprises an LPC synthesis unit 120, an LPC dequantization unit 121, an adaptive codebook 141, a pulse position candidate search unit 142, an adaptive algebraic structure codebook 143, a pitch period emphasis unit 160, gain multipliers 102 and 103, and an adder 104, and receives the coded stream transmitted from the speech encoding system of FIG. 1.

[0046] The input coded stream is fed to a demultiplexing unit (not shown), which separates and extracts the index A of the synthesis filter information described above, the index B indicating the pitch vector selected by the search of the adaptive codebook 141, the index C indicating the pulse train selected by the search of the adaptive algebraic structure codebook 143, the index G indicating the gains G0 and G1 selected by the search of the gain codebook, and the index L indicating the pitch period.

[0047] Index A is decoded by the LPC inverse quantization unit 121 to obtain the LPC coefficients, which are the synthesis filter information, and these are input to the LPC synthesis unit 120. Indexes B and C are input to the adaptive codebook 141 and the adaptive algebraic structure codebook 143, respectively, which output the pitch vector and the pulse train. Here, the adaptive algebraic structure codebook 143 is the one generated by the pulse position candidate search unit 142 based on the pitch vector input from the adaptive codebook 141; it determines the pulse positions and signs from that codebook and index C, and outputs the pulse train. The pulse train output from the adaptive algebraic structure codebook 143 is given the periodicity of the pitch period L by the pitch period emphasis unit 160 as necessary.

[0048] The pitch vector output from the adaptive codebook 141 and the pulse train output from the adaptive algebraic structure codebook 143, to which periodicity has been given by the pitch period emphasis unit 160 as necessary, are multiplied by gain G0 for the pitch vector and gain G1 for the noise vector in the gain multipliers 102 and 103, respectively. They are then summed in the adder 104 and input to the LPC synthesis unit 120 as the drive signal, and the LPC synthesis unit 120 outputs the reproduced speech signal. Gains G0 and G1 are selected from a gain codebook (not shown) according to index G.
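The decoder signal path of FIG. 5 (gain multiplication, addition, LPC synthesis) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `synthesize`, the sign convention A(z) = 1 + sum of a[k] z^-k for the LPC coefficients, and the zero initial filter state are all hypothetical choices made for the example.

```python
def synthesize(pitch_vec, pulse_train, g0, g1, lpc, state=None):
    """Form the drive signal and pass it through an all-pole filter 1/A(z).

    lpc holds assumed coefficients a[1..p] of A(z) = 1 + sum a[k] z^-k.
    state holds the last p output samples, most recent first.
    """
    p = len(lpc)
    mem = list(state) if state is not None else [0.0] * p
    out = []
    for v, c in zip(pitch_vec, pulse_train):
        e = g0 * v + g1 * c                      # multipliers 102/103, adder 104
        s = e - sum(a * m for a, m in zip(lpc, mem))  # LPC synthesis unit 120
        out.append(s)
        mem = ([s] + mem)[:p]                    # shift the filter memory
    return out, mem


speech, memory = synthesize([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
                            g0=2.0, g1=3.0, lpc=[-0.5])
print(speech)
```

Returning the filter memory lets the caller carry the state over to the next subframe, which a real decoder would need to do.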

[0049] As described above, according to this embodiment, the bit rate alone can be reduced while the speech quality is maintained, so that high-quality speech encoding/decoding can be realized at a low coding rate.

[0050] FIG. 6 shows a speech encoding system according to a second embodiment of the present invention. This system is obtained from the configuration of FIG. 1 of the first embodiment by removing the pulse position candidate search unit 142 and the adaptive algebraic structure codebook 143, providing a general noise codebook 144 in place of the adaptive algebraic structure codebook 143, and adding a pulse shaping filter analysis unit 161 and a pulse shaping unit 162.

[0051] Next, the processing procedure of this embodiment will be described. The steps from LPC analysis and LPC quantization of the input speech signal through the search of the adaptive codebook 141 are the same as in the first embodiment. In this example, the noise codebook 144 is an algebraic structure codebook.

[0052] The pulse shaping filter analysis unit 161 determines and outputs the filter coefficients of the pulse shaping unit 162 based on the pitch vector obtained in the search of the adaptive codebook 141. The pulse shaping unit 162 shapes the output of the noise codebook 144 and outputs the result as the noise vector.

[0053] As in the first embodiment, the noise vector is made periodic by the pitch period emphasis unit 160 as necessary, gains G0 and G1 for the pitch vector and the noise vector are determined, and their index is output. Because the filter coefficients of the pulse shaping unit 162 are derived from the pitch vector, no additional side information is required.

[0054] The feature of this embodiment is that the pulse shaping unit 162 is set based on the waveform of the pitch vector and shapes the pulse train output from the noise codebook 144, which is an algebraic structure codebook. As described in the first embodiment, lowering the coding rate reduces the number of pulse positions and pulses, and the degradation in sound quality becomes noticeable. When the number of pulses decreases, a crackling noise becomes conspicuous in the decoded speech. Using the pulse shaping unit 162 as in this embodiment greatly reduces this crackling.

[0055] Various methods can be used to design the pulse shaping unit 162. A first example exploits the property that phase-equalizing the drive signal of the synthesis filter turns it into a pulse-like signal. Conversely, if the inverse of the phase equalization filter is used, feeding in a pulse-like signal yields a waveform resembling the drive signal. The drawback of a conventional pulse waveform is that it lacks the phase information contained in the ideal drive signal, and this problem becomes pronounced as the number of pulses decreases. By adding phase information in the pulse shaping unit 162 as in this example, a waveform closer to the ideal drive signal can be generated from the pulse waveform.

[0056] In this first example, information on the filter coefficients of the phase equalization inverse filter must be transmitted, which increases the coding rate (bit rate) accordingly. A second example of the pulse shaping unit 162 therefore uses the pitch vector as an approximation of the phase information. In voiced segments and the like, the pitch vector resembles the drive signal in shape, so phase information can be extracted from it.

[0057] As one concrete method, a synchronization point such as the peak position of the pitch vector is found, a waveform of a few samples is extracted starting from this synchronization point, and a pulse shaping filter having this waveform as its impulse response is used. An extracted length of about 2 to 3 samples is sufficient for the effect to appear. It is also effective to apply a window to the extracted samples so that they decay before use. Furthermore, because the decoder obtains the same pitch vector as the encoder, no additional transmission bits are needed. During the search of the noise codebook 144 the pulse shaping unit 162 is fixed, so the amount of computation can be reduced by computing its impulse response in combination with that of the LPC synthesis unit 120 in advance.
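The second design example (taking a few samples of the pitch vector from its peak as the impulse response of the shaping filter) can be sketched as below. The function names, the choice of the largest-magnitude sample as the synchronization point, and the normalization that makes the first tap equal to 1 are assumptions for illustration; the text does not specify them.

```python
def shaping_filter_from_pitch(pitch_vec, length=3, window=None):
    """Extract `length` samples starting at the largest-magnitude sample."""
    if not pitch_vec:
        return [1.0]
    peak = max(range(len(pitch_vec)), key=lambda n: abs(pitch_vec[n]))
    seg = list(pitch_vec[peak:peak + length])
    if window is not None:
        seg = [w * x for w, x in zip(window, seg)]   # optional decaying window
    if seg[0] == 0.0:
        return [1.0]          # degenerate pitch vector: leave pulses unshaped
    return [x / seg[0] for x in seg]   # assumed normalization: first tap is 1


def shape_pulses(pulses, h):
    """Convolve the pulse train with impulse response h, truncated to the subframe."""
    out = [0.0] * len(pulses)
    for n, p in enumerate(pulses):
        if p != 0.0:
            for k, hk in enumerate(h):
                if n + k < len(out):
                    out[n + k] += p * hk
    return out


h = shaping_filter_from_pitch([0.0, 0.5, 2.0, 1.0, -0.5, 0.0])
print(h)
print(shape_pulses([0.0, 1.0, 0.0, 0.0, -1.0, 0.0], h))
```

Because the filter is computed only from the pitch vector, a decoder running the same function on its own pitch vector would arrive at the same taps without any extra bits.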

[0058] FIG. 7 shows a speech decoding system corresponding to the speech encoding system of FIG. 6. With parts having the same functions as in FIG. 6 denoted by the same reference numerals, the speech decoding system of FIG. 7 comprises an LPC synthesis unit 120, an LPC inverse quantization unit 121, an adaptive codebook 141, a noise codebook 144 consisting of an algebraic structure codebook, a pulse shaping filter analysis unit 161, a pulse shaping unit 162, a pitch period emphasis unit 160, gain multipliers 102 and 103, and an adder 104. It receives the coded stream transmitted from the speech encoding system of FIG. 6.

[0059] The input coded stream is fed to a demultiplexing unit (not shown), which separates it into index A of the synthesis filter information described above, index B indicating the pitch vector selected in the search of the adaptive codebook 141, index C representing the pulse train selected in the search of the noise codebook 144, and index G indicating the gains G0 and G1 selected in the search of the gain codebook. The pitch period L is calculated from index B.

[0060] Index A is decoded by the LPC inverse quantization unit 121 into the synthesis filter information, which is input to the LPC synthesis unit 120. Indexes B and C are input to the adaptive codebook 141 and the noise codebook 144, respectively, which output the pitch vector and the pulse train.

[0061] In this case, the pulse train output from the noise codebook 144 is processed by the pulse shaping unit 162, whose coefficients are set by the pulse shaping filter analysis unit 161 based on the pitch vector obtained from the adaptive codebook 141, and is then given the periodicity of the pitch period L by the pitch period emphasis unit 160 as necessary.

[0062] The pitch vector output from the adaptive codebook 141 and the pulse train output from the noise codebook 144, after passing through the pulse shaping unit 162 and the pitch period emphasis unit 160, are multiplied by gain G0 for the pitch vector and gain G1 for the noise vector in the gain multipliers 102 and 103, respectively. They are then summed in the adder 104 and input to the LPC synthesis unit 120 as the drive signal, and the LPC synthesis unit 120 outputs the synthesized decoded speech signal. Gains G0 and G1 are selected from a gain codebook (not shown) according to index G.

[0063] As described above, according to this embodiment, using the pulse shaping unit 162 makes it possible to reduce the coding rate effectively while maintaining the quality of the decoded speech, even when an algebraic structure codebook whose number of pulses has been reduced for a lower coding rate is used as the noise codebook 144.

[0064] FIG. 8 shows a speech encoding system according to a third embodiment of the present invention. This system adds the pulse shaping filter analysis unit 161 and the pulse shaping unit 162 described in the second embodiment to the configuration of the first embodiment.

[0065] Next, the processing procedure of this embodiment will be described. As in the first embodiment, LPC analysis and LPC quantization are performed first, and after the search of the adaptive codebook 141 is completed, the pitch vector is passed to the pulse position candidate search unit 142 and the pulse shaping filter analysis unit 161. The pulse position candidate search unit 142 obtains pulse position candidates by the method described in the first embodiment and creates the adaptive algebraic structure codebook 143. The pulse shaping filter analysis unit 161 obtains the coefficients of the pulse shaping unit 162 as described in the second embodiment.

[0066] In the search of the adaptive algebraic structure codebook 143, the output pulse train is shaped by the pulse shaping unit 162. In an actual search, the impulse responses of the pulse shaping unit 162 and the pitch period emphasis unit 160 are combined with that of the LPC synthesis unit 120 to reduce the amount of computation.
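The saving from combining impulse responses comes from linearity: a cascade of fixed filters can be replaced by one precomputed response, after which each candidate pulse train only needs a single convolution. The sketch below combines a shaping filter with an all-pole LPC filter only (the pitch period emphasis filter could be folded in the same way). The helper names and the convention A(z) = 1 + sum of a[k] z^-k are assumptions for the example.

```python
def convolve(x, h, n):
    """Convolution of x and h truncated to n samples."""
    out = [0.0] * n
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            if i + k < n:
                out[i + k] += xi * hk
    return out


def lpc_impulse_response(lpc, n):
    """First n samples of the impulse response of 1/A(z)."""
    h = []
    for i in range(n):
        s = 1.0 if i == 0 else 0.0
        s -= sum(a * h[i - 1 - k] for k, a in enumerate(lpc) if i - 1 - k >= 0)
        h.append(s)
    return h


def combined_response(shaping_h, lpc, n):
    """Precompute once per subframe: shaping filter followed by LPC synthesis."""
    return convolve(shaping_h, lpc_impulse_response(lpc, n), n)


n = 4
hc = combined_response([1.0, 0.5], [-0.5], n)
pulses = [1.0, 0.0, -1.0, 0.0]
print(convolve(pulses, hc, n))   # synthesized contribution of this candidate
```

In a search loop, `hc` would be computed once and reused for every candidate pulse train in the subframe.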

[0067] FIG. 9 shows a speech decoding system corresponding to the speech encoding system of FIG. 8. Its operation is evident from the operation of the speech decoding systems described in the first and second embodiments, so parts identical to those in FIGS. 1, 7 and 8 are denoted by the same reference numerals and a detailed description is omitted.

[0068] As described above, this embodiment uses the pulse position candidate search unit 142 and the adaptive algebraic structure codebook 143 described in the first embodiment together with the pulse shaping filter analysis unit 161 and the pulse shaping unit 162 described in the second embodiment. This makes it possible to maintain high sound quality even when only a small number of pulses are placed at a limited set of position candidates, realizing a speech coding scheme with high quality at a low coding rate.

[0069] FIG. 10 is a block diagram of a speech encoding system according to a fourth embodiment of the present invention. This system has the same configuration as the first embodiment except that the pulse position candidate search unit of the first embodiment consists of a pitch vector smoothing unit 171, a position candidate density function calculation unit 172, and a position candidate calculation unit 173.

[0070] Next, the processing procedure of this embodiment will be described. As in the first embodiment, after LPC analysis, LPC quantization, and the search of the adaptive codebook 141 are completed, the pitch vector is passed to the pitch vector smoothing unit 171 of the pulse position candidate search unit 142. The pitch vector smoothing unit 171 applies, for example, the processing of steps S1 to S2 in the flowchart of FIG. 2 to the pitch vector to obtain and output its power envelope. The position candidate density function calculation unit 172 converts the power envelope into a position candidate density function and outputs it. The position candidate calculation unit 173 calculates pulse position candidates using this position candidate density function instead of the power envelope, and creates the adaptive algebraic structure codebook 143 according to the resulting candidates. The subsequent processing is the same as in the first embodiment.

[0071] The feature of this embodiment lies in the processing performed by the pulse position candidate search unit 142. Whereas the first embodiment adapts the pulse position candidates using the power envelope of the pitch vector directly, this embodiment first converts the power envelope into a position candidate density function and then uses that function for the adaptation. This is explained in detail with reference to FIG. 11. FIG. 11(a) shows the power envelope of the pitch vector output from the pitch vector smoothing unit 171. The position candidate density function calculation unit 172 generates the position candidate density function (FIG. 11(b)) from the power envelope of the pitch vector (FIG. 11(a)). The conversion uses the function f shown in FIG. 11(c), which maps a power envelope value x to a position candidate density value f(x). The function f can be obtained, for example, statistically by processing a large amount of training speech. Table data or the like can also be used in place of a function.
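The conversion from power envelope to position candidate density, followed by a density-driven placement of candidates, might look like the sketch below. The mapping through `f` follows the text; the allocation rule (placing candidates at equally spaced quantiles of the cumulative density) is purely an assumption for illustration, since this section does not spell out how the position candidate calculation unit 173 turns the density into positions. The envelope values and the squaring function are made-up examples, not a trained f.

```python
def density_from_envelope(envelope, f):
    """Map each power envelope value x to a candidate density f(x)."""
    return [f(x) for x in envelope]


def allocate_candidates(density, count):
    """Assumed rule: place candidates at evenly spaced quantiles of the
    cumulative density. May return fewer than `count` positions when two
    quantiles fall on the same sample."""
    total = sum(density)
    if total <= 0 or count <= 0:
        return []
    positions, acc, k = [], 0.0, 0
    for n, d in enumerate(density):
        acc += d
        while k < count and acc >= (k + 0.5) * total / count:
            if not positions or positions[-1] != n:
                positions.append(n)
            k += 1
    return positions


envelope = [0, 1, 1, 2, 2, 1, 1, 0]
identity = density_from_envelope(envelope, lambda x: x)      # f(x) = x
squared = density_from_envelope(envelope, lambda x: x * x)   # hypothetical f
print(allocate_candidates(identity, 4))
print(allocate_candidates(squared, 4))
```

With the hypothetical squaring function the candidates crowd closer to the envelope peak than with f(x) = x, which is the kind of redistribution the density function is meant to control.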

[0072] The pulse position candidate search unit 142, including the conversion function f, is provided identically in the encoder and the decoder, so no information about the adaptation needs to be transmitted and the bit rate does not increase compared with the case without adaptation.

[0073] FIG. 12 shows the configuration of the speech decoding system of this embodiment, corresponding to the speech encoding system of FIG. 10. Its operation is evident from the operation of the speech decoding systems described in the first to third embodiments, so a detailed description is omitted.

[0074] As described above, this embodiment converts the power envelope value of the pitch vector into a pulse position candidate density by means of the function f. The processing is therefore slightly more complex than in the first embodiment, but the position candidates can be distributed more accurately. The first embodiment can be regarded as the case of this embodiment in which f(x) = x.

[0075] FIG. 13 is a block diagram of a speech encoding system according to a fifth embodiment of the present invention. This system has the same configuration as the first embodiment except that the pulse position candidate search unit of the first embodiment consists of a pitch filter inverse operation unit 174, a smoothing unit 175, and a position candidate calculation unit 173.

[0076] Next, the processing procedure of this embodiment will be described. As in the first embodiment, after LPC analysis, LPC quantization, and the search of the adaptive codebook 141 are completed, the pitch vector is passed to the pitch filter inverse operation unit 174 of the pulse position candidate search unit 142. The pitch filter inverse operation unit 174 performs an operation representing the inverse characteristic of the pitch period emphasis unit 160. For example, when the transfer function P(z) of the pitch filter is given by

P(z) = 1 - a z^{-L}    (1)

the pitch filter inverse operation unit 174 can use a filter whose transfer function Q(z) is given by

Q(z) = 1 / (1 - b a z^{-L})    (2)

Here a is a constant and b represents the degree of the inverse characteristic; when b = 1, Q(z) is the inverse filter of P(z). The input pitch vector is output after the inverse operation, and the smoothing unit 175 obtains its power envelope by the same method as the pitch vector smoothing unit 171 of the fourth embodiment. The position candidate calculation unit 173 selects pulse position candidates according to this power envelope and creates the adaptive algebraic structure codebook 143. The subsequent processing is the same as in the first embodiment.
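Equations (1) and (2) translate directly into difference equations: P(z) gives y[n] = x[n] - a x[n-L], and Q(z) gives y[n] = x[n] + b a y[n-L]. The sketch below implements both and shows that with b = 1 applying P after Q returns the input. The function names and the zero initial state at the start of the subframe are assumptions for the example.

```python
def pitch_filter(x, a, lag):
    """P(z) = 1 - a z^-L as written in equation (1)."""
    return [xn - a * (x[n - lag] if n >= lag else 0.0)
            for n, xn in enumerate(x)]


def pitch_filter_inverse(x, a, b, lag):
    """Q(z) = 1 / (1 - b a z^-L) as written in equation (2)."""
    y = []
    for n, xn in enumerate(x):
        feedback = y[n - lag] if n >= lag else 0.0   # assumed zero initial state
        y.append(xn + b * a * feedback)
    return y


x = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
q = pitch_filter_inverse(x, a=0.5, b=1.0, lag=2)
print(q)
print(pitch_filter(q, a=0.5, lag=2))   # recovers x when b = 1
```

Setting b below 1 weakens the inverse characteristic, and b = 0 leaves the pitch vector unchanged.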

[0077] The feature of this embodiment is that a pitch vector that takes the influence of the pitch period emphasis unit 160 into account is used to adapt the pulse position candidates. The reason this improves efficiency is as follows.

[0078] The noise vector generated from the adaptive algebraic structure codebook is made pitch-periodic by the pitch period emphasis unit 160. When equation (1) is used for this, a pulse near the head of the subframe is repeated many times within the subframe at intervals of the pitch period, whereas pulses later in the subframe are repeated fewer times. Observation of the noise code vectors actually obtained confirms that the stronger the pitch filter, the more the pulses tend to be placed near the head. This shows that the pulse positions are closely related not only to the shape of the pitch vector but also to the pitch filter. By using the pitch filter inverse operation unit 174, this embodiment adapts the pulse position candidates with the influence of the pitch period emphasis unit 160 taken into account.

[0079] Incidentally, in the third embodiment two kinds of filters, the pulse shaping filter and the pitch filter, can be applied to the noise vector. When this embodiment is applied in such a case, ideally the combined characteristic of the two filters is obtained and its inverse is used in the pitch filter inverse operation unit. Since this increases the amount of processing, however, using only the characteristic of the pitch filter, which has the larger influence, is still effective. The order of the pitch filter inverse operation unit 174 and the smoothing unit 175 may also be reversed.

[0080] FIG. 14 shows the configuration of the speech decoding system of this embodiment, corresponding to the speech encoding system of FIG. 13. The operation of this speech decoding system is evident from the operation of the speech decoding systems described in the first to fourth embodiments, so a detailed description is omitted.

[0081] FIG. 15 is a block diagram of a speech encoding system according to a sixth embodiment of the present invention. This system has the same configuration as the first embodiment except that the adaptive algebraic structure codebook of the first embodiment is replaced by a noise vector generation unit 180 and an amplitude codebook 181.

[0082] Next, the processing procedure of this embodiment will be described. As in the first embodiment, after LPC analysis, LPC quantization, and the search of the adaptive codebook 141 are completed, the pitch vector is passed to the pulse position search unit 174. The pulse position search unit 174 obtains pulse positions based on the power envelope of the pitch vector by the same method as in the first embodiment and outputs them to the noise vector generation unit. This embodiment differs from the preceding ones in that the noise vector generation unit places pulses at all of the positions obtained by the pulse position search unit 174. In other words, whereas the preceding embodiments obtain pulse position candidates and select the optimum pulse positions from among them using the adaptive algebraic structure codebook, this embodiment uses all of the pulse position candidates simultaneously. The process of selecting pulse positions is therefore unnecessary. Instead, a process of selecting the amplitude of each pulse from the amplitude codebook 181 is added. As the output signal, information D representing the pulse amplitudes is output in place of information C representing the pulse positions.

[0083] The method of generating the noise vector is described in detail with reference to FIG. 16. In FIG. 16(a), the amplitude pattern obtained from the amplitude codebook is indicated by arrows; here seven pulses are assumed. The waveforms in FIGS. 16(b) and 16(c) are pitch vector power envelopes obtained by the pulse position search unit 174, with the corresponding pulse positions marked by circles. In FIG. 16(b) the power has two peaks, so the seven pulse positions are split between the two, whereas in FIG. 16(c) there is a single peak in the center, so the pulse positions are concentrated there. FIGS. 16(d) and 16(e) show the noise vectors obtained by placing pulses with the amplitudes of FIG. 16(a) at the respective pulse positions. It can be seen that the shape of the drive signal changes to follow the power envelope of the pitch vector. As already noted, the power envelope of the pitch vector does not need to be transmitted, so this embodiment can bring the shape of the noise vector closer to that of the ideal noise vector without increasing the bit rate.
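The construction illustrated by FIG. 16 amounts to writing one fixed amplitude pattern onto whichever positions the envelope analysis produced. A minimal sketch follows; the function name, the subframe length of 16, the amplitude values, and both position sets are hypothetical stand-ins for the figure, not values taken from it.

```python
def build_noise_vector(positions, amplitudes, subframe_len):
    """Place the k-th amplitude of the pattern at the k-th pulse position."""
    if len(positions) != len(amplitudes):
        raise ValueError("one amplitude is needed per pulse position")
    vec = [0.0] * subframe_len
    for pos, amp in zip(positions, amplitudes):
        vec[pos] += amp
    return vec


pattern = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]      # seven pulses, as in FIG. 16(a)
two_peaks = [1, 2, 3, 10, 11, 12, 13]                  # envelope with two peaks
one_peak = [5, 6, 7, 8, 9, 10, 11]                     # envelope with a central peak
print(build_noise_vector(two_peaks, pattern, 16))
print(build_noise_vector(one_peak, pattern, 16))
```

The same transmitted amplitude index therefore yields differently shaped noise vectors depending on the pitch vector, which both encoder and decoder already share.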

[0084] In this embodiment, as the bit rate increases, more pulse amplitude information D can be sent and the quality improves, but the rate of improvement diminishes. At moderately high bit rates, performance may improve more by also including, as search candidates, noise vectors with pulses at positions that were not selected than by increasing the amplitude information. Specifically, the pulse position search unit 174 outputs several different patterns of pulse positions (pulse patterns), and the noise vector generation unit searches the amplitudes for each pulse pattern. In addition to the pulse pattern adapted to the pitch vector as described above, a pulse pattern generated from the pulse positions not selected for that pattern is prepared. One example is to take as a second pulse pattern what remains after removing the sample positions selected by the adaptation from all sample positions of the subframe, and to search the amplitudes for the two pulse patterns. The number of bits assigned to the amplitude information may differ between pulse patterns, and it is usually more efficient to allocate more bits to the adapted pulse pattern. When several pulse patterns are used, information indicating which pulse pattern was used must be included in information D and transmitted, which reduces the bits available for amplitude information accordingly, but the quality is better than when only a single pulse pattern is searched.
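The two-pattern search can be sketched as follows. The complementary pattern is built exactly as the text describes (all sample positions minus the adapted ones). The search itself is deliberately simplified: it compares candidates with the target directly in the excitation domain by squared error, whereas an actual coder would compare synthesized signals. The function names, codebook contents, and targets are hypothetical.

```python
def complementary_pattern(adapted, subframe_len):
    """Second pulse pattern: every sample position not chosen by the adaptation."""
    chosen = set(adapted)
    return [n for n in range(subframe_len) if n not in chosen]


def search_patterns(target, patterns, codebooks):
    """Return (pattern index, amplitude index) with the smallest squared error.

    codebooks[i] lists amplitude vectors for patterns[i]; the sizes may differ,
    mirroring an unequal bit allocation between patterns.
    """
    best = None
    for p_idx, (positions, book) in enumerate(zip(patterns, codebooks)):
        for a_idx, amps in enumerate(book):
            vec = [0.0] * len(target)
            for pos, amp in zip(positions, amps):
                vec[pos] = amp
            err = sum((t - v) ** 2 for t, v in zip(target, vec))
            if best is None or err < best[0]:
                best = (err, p_idx, a_idx)
    return best[1], best[2]


adapted = [2, 3, 4]
patterns = [adapted, complementary_pattern(adapted, 8)]
codebooks = [
    [[1.0, 1.0, 1.0], [1.0, -1.0, 1.0]],     # more entries for the adapted pattern
    [[1.0, 0.0, 0.0, 0.0, -1.0]],            # fewer entries for the complement
]
print(search_patterns([0.0, 0.0, 1.0, -1.0, 1.0, 0.0, 0.0, 0.0], patterns, codebooks))
```

The returned pair corresponds to what information D would have to carry in this scheme: which pattern was used and which amplitude entry was selected.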

[0085] FIG. 17 shows the configuration of the speech decoding system of this embodiment, corresponding to the speech encoding system of FIG. 15. Its operation is evident from the operation of the speech decoding systems described in the first to fifth embodiments, so a detailed description is omitted.

[0086] Although the above embodiments describe speech encoding/decoding methods, the present invention can also be applied to speech synthesis. In that case, in the speech decoding systems shown in FIGS. 5, 7 and 9, each index is supplied according to the reproduced speech signal that is to be synthesized.

[0087]

[Effects of the Invention] As described above, according to the present invention, high-quality speech encoding/decoding can be performed even when using an algebraic structure codebook in which the number of pulse positions and the number of pulses have been reduced because of a lower coding rate.

[Brief description of the drawings]

FIG. 1 is a block diagram of a speech encoding system according to a first embodiment of the present invention.

FIG. 2 is a flowchart showing a procedure for selecting pulse position candidates in the first embodiment.

FIG. 3 is a diagram showing the processing performed in each step of FIG. 2.

FIG. 4 is a diagram showing the relationship between the power envelope of a pitch vector and the pulse position candidates in the first embodiment.

FIG. 5 is a block diagram of a speech decoding system according to the first embodiment.

FIG. 6 is a block diagram of a speech encoding system according to a second embodiment of the present invention.

FIG. 7 is a block diagram of a speech decoding system according to the second embodiment.

FIG. 8 is a block diagram of a speech encoding system according to a third embodiment of the present invention.

FIG. 9 is a block diagram of a speech decoding system according to the third embodiment.

FIG. 10 is a block diagram of a speech encoding system according to a fourth embodiment of the present invention.

FIG. 11 is a set of diagrams showing, respectively, the pitch vector power envelope, the position candidate density function, and the relationship between the value of the power envelope and the value of the position candidate density function.

FIG. 12 is a block diagram of a decoding system according to the fourth embodiment.

FIG. 13 is a block diagram of a speech encoding system according to a fifth embodiment of the present invention.

FIG. 14 is a block diagram of a decoding system according to the fifth embodiment.

FIG. 15 is a block diagram of a speech encoding system according to a sixth embodiment of the present invention.

FIG. 16 is a diagram for explaining a noise vector generation method.

FIG. 17 is a block diagram of a decoding system according to the sixth embodiment.

[Explanation of symbols]

101: speech input terminal
102, 103: gain multiplying unit
104, 105: adding unit
110: LPC analyzing unit
111: LPC quantizing unit
120: LPC synthesizing unit
130: auditory weighting unit
141: adaptive codebook
142: pulse position candidate search unit
143: adaptive algebraic structure codebook
144: noise codebook
150: code selection unit
160: pitch period emphasis unit
161: pulse shaping filter analysis unit
162: pulse shaping unit
171: pitch vector smoothing unit
172: position candidate density function calculation unit
173: position candidate calculating unit
174: pulse position searching unit
180: noise vector generating unit
181: amplitude codebook

Claims

[Claims]

1. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information indicating characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal.

2. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information representing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates arranged such that more candidates are placed where the power of the speech signal is larger.

3. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information representing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal, and by optimizing the amplitude of each pulse by predetermined means.

4. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information indicating characteristics of a synthesis filter and a drive signal for driving the synthesis filter, wherein the drive signal comprises either a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from first pulse position candidates that change adaptively according to the properties of the speech signal, or a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from second pulse position candidates that include some or all of the positions not adopted as the first pulse position candidates.

5. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information indicating characteristics of a synthesis filter and a drive signal including a pitch vector and a noise vector for driving the synthesis filter, wherein the noise vector is configured using a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change according to the shape of the pitch vector.

6. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information indicating characteristics of a synthesis filter and a drive signal including a pitch vector and a noise vector for driving the synthesis filter, wherein the noise vector is configured using a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates arranged such that more candidates are placed where the power of the pitch vector is larger.

7. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information representing the characteristics of a synthesis filter and a drive signal including a pitch vector and a noise vector for driving the synthesis filter, wherein the noise vector is configured using a pulse train generated by obtaining a position candidate density function from the shape of the pitch vector and arranging pulses at a predetermined number of pulse positions selected from position candidates set based on the position candidate density function.

8. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information representing characteristics of a synthesis filter and a drive signal for driving the synthesis filter, the drive signal comprising a pitch vector and a noise vector whose shape is processed by correction means, wherein the noise vector is configured using a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change according to the shape of an inverse-corrected pitch vector obtained by applying to the pitch vector an operation based on the inverse characteristic of the correction means.

9. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information representing characteristics of a synthesis filter and a drive signal including a pitch vector and a noise vector for driving the synthesis filter, wherein the drive signal is configured to include a pulse train shaped by pulse shaping means having characteristics determined based on the shape of the pitch vector.

10. A speech encoding method for encoding a speech signal by expressing the speech signal with at least information representing characteristics of a synthesis filter and a drive signal comprising a pitch vector and a noise vector for driving the synthesis filter, wherein the noise vector is configured to include a pulse train that is generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates arranged such that more candidates are placed where the power of the pitch vector is larger, and that is shaped by pulse shaping means having characteristics determined based on the shape of the pitch vector.

11. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal including a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal.

12. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal comprising a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates arranged such that more candidates are placed where the power of the speech signal is larger.

13. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal including a pulse train generated by arranging pulses having given amplitudes at a predetermined number of pulse positions selected from pulse position candidates that change adaptively according to the properties of the speech signal.

14. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal including either a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from first pulse position candidates that change adaptively according to the properties of the speech signal, or a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from second pulse position candidates that include some or all of the positions not adopted as the first pulse position candidates.

15. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal comprising a pitch vector and a noise vector, the noise vector being configured using a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change according to the shape of the pitch vector.

16. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal comprising a pitch vector and a noise vector, the noise vector being configured using a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates arranged such that more candidates are placed where the power of the pitch vector is larger.

17. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal comprising a pitch vector and a noise vector, the noise vector being configured using a pulse train generated by obtaining a position candidate density function from the shape of the pitch vector and arranging pulses at a predetermined number of pulse positions selected from position candidates set based on the position candidate density function.

18. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal comprising a pitch vector and a noise vector whose shape is processed by correction means, the noise vector being configured using a pulse train generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates that change according to the shape of an inverse-corrected pitch vector obtained by applying to the pitch vector an operation based on the inverse characteristic of the correction means.

19. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal that comprises a pitch vector and a noise vector and that includes a pulse train shaped by pulse shaping means having characteristics determined based on the shape of the pitch vector.

20. A speech decoding method comprising decoding a speech signal by inputting, to a synthesis filter, a drive signal comprising a pitch vector and a noise vector, the noise vector being configured to include a pulse train that is generated by arranging pulses at a predetermined number of pulse positions selected from pulse position candidates arranged such that more candidates are placed where the power of the pitch vector is larger, and that is shaped by pulse shaping means having characteristics determined based on the shape of the pitch vector.