JPH08328595A

JPH08328595A - Speech encoding device

Info

Publication number: JPH08328595A
Application number: JP7131298A
Authority: JP
Inventors: Mitsuo Fujimoto; 光男藤本
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1995-05-30
Filing date: 1995-05-30
Publication date: 1996-12-13
Anticipated expiration: 2019-04-05
Also published as: JP3515215B2

Abstract

PURPOSE: To provide a speech encoding device that can represent the part of the periodic portion of the input speech that an adaptive codebook represents insufficiently, and that thereby improves the quality of the reproduced speech. CONSTITUTION: In a speech encoding device that constructs a speech synthesis filter by linear predictive analysis of the input speech, reproduces speech based on a code vector stored in a codebook and on the speech synthesis filter, and encodes speech based on the reproduced speech and the input speech, a pulse codebook 7 storing plural kinds of code vectors for the pitch waveform of voiced sound is provided, and the code vector read from the pulse codebook is made periodic based on an impulse train selected by an impulse train search.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to speech encoding devices that use schemes such as code-excited linear prediction (CELP) and pitch synchronous innovation code-excited linear prediction (PSI-CELP).

[0002]

[Prior Art] In recent years, low-bit-rate speech coding has attracted attention as a way to use the radio bands of car phones and mobile phones efficiently and to compress the speech portion of multimedia communication.

[0003] Speech coding schemes of this kind that have already been developed include code-excited linear prediction (CELP: Code Excited Linear Prediction) and pitch synchronous innovation code-excited linear prediction (PSI-CELP: Pitch Synchronous Innovation Code Excited Linear Prediction).

[0004] In CELP coding, a linear filter corresponding to the spectral envelope of the input speech is constructed by linear predictive analysis, and speech is reproduced by driving that filter with time-series code vectors stored in a codebook.
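As a minimal sketch of what "driving the filter with a code vector" means, the following Python function runs an excitation through an all-pole synthesis filter. The function name `synthesize`, the `state` argument, and the sign convention s[n] = e[n] + Σ a_i·s[n−i] are assumptions made for illustration; the text does not fix a convention.

```python
def synthesize(excitation, lpc, state=None):
    """Drive an all-pole filter with an excitation (code vector).

    Assumed convention: s[n] = e[n] + sum_i lpc[i-1] * s[n-i].
    `state` optionally holds the previous outputs, most recent first.
    """
    order = len(lpc)
    history = list(state) if state is not None else [0.0] * order
    output = []
    for e in excitation:
        s = e + sum(a * h for a, h in zip(lpc, history))
        output.append(s)
        # Shift the filter memory: newest sample goes to the front.
        history = ([s] + history)[:order]
    return output
```

Feeding a single unit impulse through a first-order filter with coefficient 0.5 yields the decaying sequence 1, 0.5, 0.25, 0.125, which is the filter's impulse response.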

[0005] PSI-CELP coding is based on CELP coding: a linear prediction filter is driven using candidate vectors prepared in advance in a codebook as the excitation source. Its distinguishing feature is that the excitation source is made periodic in synchronization with the adaptive codebook period, which corresponds to the pitch period of the speech.

[0006] FIG. 6 shows an example of a CELP encoding device. First, the continuous input speech signal is divided into sections of a fixed length of about 5 to 10 ms. Such a section is referred to here as a subframe.

[0007] Next, the linear prediction analysis unit 101 performs linear predictive analysis on the input speech subframe by subframe and calculates the P-th order linear prediction coefficients α_i (i = 1, 2, …, P). A linear prediction synthesis filter 102 is then created from the obtained coefficients α_i.
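The text does not say how the coefficients α_i are computed. One widely used approach is the autocorrelation method solved with the Levinson-Durbin recursion, sketched below as an assumption rather than as the method of this document. The function names and the predictor convention x[n] ≈ Σ a_i·x[n−i] are illustrative.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag] with r[k] = sum_n x[n] * x[n-k]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    Assumed convention: x[n] is predicted as sum_i a[i] * x[n-i].
    Returns (coefficients, residual prediction error energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input (e.g. silence): stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err
```

For an autocorrelation sequence of the form 1, 0.5, 0.25 (a first-order process with coefficient 0.5), the recursion recovers a first coefficient of 0.5 and a second coefficient of 0.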

[0008] Next, the adaptive codebook 103 is searched. The adaptive codebook 103 is used to represent the periodic component of speech, that is, the pitch.

[0009] The output code vector corresponding to an input code of the adaptive codebook 103 is created by cutting out, from the end of the excitation signal of the linear prediction synthesis filter 102 for the previous subframe and earlier, a segment whose length corresponds to the input code (hereinafter, the lag), and repeating that segment until the subframe length is reached.

[0010] The linear prediction synthesis filter 102 is driven by the output code vector thus created to produce reproduced speech. The reproduced speech is then multiplied by the gain that theoretically minimizes the distance between the input speech and the reproduced speech (the distortion of the reproduced speech relative to the original speech), after which the distance between the input speech and the reproduced speech is calculated by the distance calculation unit 105.
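If the distance is taken to be the squared Euclidean error (an assumption; the text does not define the distance measure, and practical coders often apply perceptual weighting first), the minimizing gain has a closed form: g = ⟨x, y⟩ / ⟨y, y⟩ for input x and reproduced speech y. A small sketch with hypothetical names:

```python
def optimal_gain_and_distance(target, synth):
    """Return the gain g minimizing sum((target - g*synth)^2) and that minimum.

    Assumes a plain squared-error distance between equal-length sequences.
    """
    energy = sum(y * y for y in synth)
    if energy == 0.0:
        # A silent candidate cannot reduce the error at all.
        return 0.0, sum(x * x for x in target)
    gain = sum(x * y for x, y in zip(target, synth)) / energy
    distance = sum((x - gain * y) ** 2 for x, y in zip(target, synth))
    return gain, distance
```

A candidate that is an exact scaled copy of the target gives zero distance, while a candidate orthogonal to the target gets gain 0 and leaves the full target energy as the distance.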

[0011] This operation is repeated for each input code, and the code of the excitation vector that minimizes the distance is selected.

[0012] After this, the noise codebook 104 is searched. The noise codebook 104 is used to represent the fluctuating portion of speech that the adaptive codebook 103 cannot represent. The noise codebook 104 stores in advance various code vectors (hereinafter, noise code vectors), each one subframe long and usually based on white Gaussian noise.

[0013] First, the noise code vector corresponding to the input code is read from among the noise code vectors stored in the noise codebook 104. Next, to remove the influence of the code vector selected in the adaptive codebook search, the synthesis filter output of the read noise code vector is orthogonalized with respect to the synthesis filter output of the code vector selected in the adaptive codebook search, and reproduced speech is created. The reproduced speech is then multiplied by the gain that theoretically minimizes the distance between the input speech and the reproduced speech, after which that distance is calculated by the distance calculation unit 105.
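The orthogonalization step can be sketched as a single Gram-Schmidt projection: the component of the synthesized noise vector that lies along the synthesized adaptive vector is subtracted. The function and argument names below are hypothetical, and the inner product is assumed to be the ordinary dot product.

```python
def orthogonalize(noise_synth, adaptive_synth):
    """Remove from noise_synth its projection onto adaptive_synth.

    Both arguments are synthesis filter outputs of equal length.
    """
    energy = sum(a * a for a in adaptive_synth)
    if energy == 0.0:
        return list(noise_synth)  # nothing to project onto
    scale = sum(n * a for n, a in zip(noise_synth, adaptive_synth)) / energy
    return [n - scale * a for n, a in zip(noise_synth, adaptive_synth)]
```

After this step the result has zero dot product with the adaptive contribution, so the noise codebook gain can be chosen without disturbing the adaptive codebook gain.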

[0014] This operation is repeated for each input code, and the code of the excitation vector that minimizes the distance is selected.

[0015] The encoded output consists of the input code of the adaptive codebook 103 selected by the adaptive codebook search and a code representing the corresponding gain, the input code of the noise codebook 104 selected by the noise codebook search and a code representing the corresponding gain, and the linear prediction coefficients.

[0016]

[Problems to Be Solved by the Invention] The adaptive codebook 103 efficiently represents the pitch structure of speech in portions that are both voiced and stationary. However, it cannot form an appropriate code vector, and the reproduced sound quality deteriorates, in the following cases: when the excitation signal of the previous subframe has almost no power; when the current subframe is non-stationary speech, such as a speech onset, composed of components different from those of the previous subframe; and when the current subframe is noise-like speech, such as an unvoiced portion, that has no pitch period.

[0017] To deal with this problem, a technique has been proposed in which a codebook that outputs random components is provided to complement the adaptive codebook 103. Like the noise codebook, such a codebook outputs a code vector that has a fixed correspondence to the input code in every subframe, and it is therefore called a fixed codebook.

[0018] The fixed codebook is searched together with the adaptive codebook, and the output vector of one or the other is selected exclusively according to a minimum-distortion criterion. In other words, the adaptive codebook and the fixed codebook complement each other and operate as a single codebook.
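The exclusive selection described here amounts to one search loop over the union of both codebooks. The sketch below assumes a squared-error distortion with an optimal gain per candidate and takes the synthesis filter as a caller-supplied function; all names are hypothetical.

```python
def search_exclusive(target, adaptive_candidates, fixed_candidates, synthesize):
    """Pick the single best candidate across two complementary codebooks.

    Returns (distortion, codebook_name, index, gain) for the winner.
    `synthesize` maps a code vector to reproduced speech.
    """
    best = None
    books = (("adaptive", adaptive_candidates), ("fixed", fixed_candidates))
    for name, candidates in books:
        for index, vector in enumerate(candidates):
            y = synthesize(vector)
            energy = sum(v * v for v in y)
            corr = sum(t * v for t, v in zip(target, y))
            gain = corr / energy if energy > 0.0 else 0.0
            dist = sum((t - gain * v) ** 2 for t, v in zip(target, y))
            if best is None or dist < best[0]:
                best = (dist, name, index, gain)
    return best
```

Because only one winner is returned, the encoder needs to transmit a single index that identifies both the codebook and the entry, which is what lets the two codebooks behave as one.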

[0019] In addition, a technique has already been proposed in which the noise code vector is made periodic in correspondence with the period of the adaptive code vector. Its purpose is to let the noise codebook represent, with small distortion, components that are periodic but cannot be handled by the components of the previous subframe alone, that is, non-stationary components of voiced portions that the adaptive codebook cannot represent.

[0020] However, the code vectors stored in the fixed codebook and the noise codebook are inherently code vectors for noise. Whichever technique is used, therefore, it was sometimes impossible to represent the part of the periodic portion of the input speech that the adaptive codebook had not represented sufficiently.

[0021] An object of the present invention is to provide a speech encoding device that can represent the part of the periodic portion of the input speech that the adaptive codebook has not represented sufficiently, and that can thereby improve the quality of the reproduced speech.

[0022]

[Means for Solving the Problems] A first speech encoding device according to the present invention constructs a speech synthesis filter by linear predictive analysis of the input speech, reproduces speech based on a code vector stored in a codebook and on the speech synthesis filter, and encodes speech based on the reproduced speech and the input speech. It is characterized in that a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound is provided, and in that the code vector read from the pulse codebook is made periodic based on an impulse train selected by an impulse train search.

[0023] A second speech encoding device according to the present invention constructs a speech synthesis filter by linear predictive analysis of the input speech, reproduces speech based on the speech synthesis filter and on a code vector read from codebooks that include an adaptive codebook storing code vectors corresponding to past excitation signals and a noise codebook storing code vectors for noise, and encodes speech based on the reproduced speech and the input speech. It is characterized in that a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound is provided to complement the noise codebook; in that, when reproduced speech is created from a code vector read from the pulse codebook, reproduced speech corresponding to each of plural kinds of impulse trains is created based on the speech synthesis filter and on those impulse trains, each of which has impulses at intervals of the pitch period of the input speech and which differ from one another in initial position; in that the impulse train giving the smallest distortion between the reproduced speech and the input speech is selected; and in that the code vector read from the pulse codebook is made periodic based on the selected impulse train.
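The impulse train search and the periodization can be sketched as follows. This is one plausible reading under stated assumptions, not a definitive implementation: the candidate initial positions are taken to be 0 to pitch period − 1, the distortion is a gain-optimized squared error, and "making the code vector periodic" is interpreted as placing one copy of the pulse code vector (truncated to one pitch period) at each impulse of the selected train. The synthesis filter is a caller-supplied function, and all names are hypothetical.

```python
def impulse_train(pitch_period, initial_position, length):
    """Unit impulses every pitch_period samples, starting at initial_position."""
    train = [0.0] * length
    for n in range(initial_position, length, pitch_period):
        train[n] = 1.0
    return train


def select_initial_position(target, pitch_period, length, synthesize):
    """Return the initial position whose synthesized train best matches target."""
    best_pos, best_dist = None, None
    for pos in range(min(pitch_period, length)):
        y = synthesize(impulse_train(pitch_period, pos, length))
        energy = sum(v * v for v in y)
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy if energy > 0.0 else 0.0
        dist = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best_dist is None or dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos


def periodize(pulse_vector, pitch_period, initial_position, length):
    """Place one pitch-length copy of pulse_vector at each impulse position."""
    shape = pulse_vector[:pitch_period]  # assumed: no overlap between copies
    out = [0.0] * length
    for start in range(initial_position, length, pitch_period):
        for k, value in enumerate(shape):
            if start + k >= length:
                break
            out[start + k] += value
    return out
```

Searching the positions with bare impulses first and applying the pulse shape afterwards keeps the position search independent of the pulse codebook size, which appears to be the motivation for separating the two steps.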

[0024] A third speech encoding device according to the present invention comprises: means for creating a speech synthesis filter by linear predictive analysis of the input speech; first search means that sequentially cuts out a plurality of code vectors, at different cut-out positions, from an adaptive codebook storing code vectors corresponding to past excitation signals, drives the speech synthesis filter with each cut-out code vector to create the reproduced speech corresponding to it, and searches for the code vector giving the smallest distortion between the reproduced speech and the input speech; and second search means that sequentially reads each code vector from a noise codebook storing plural kinds of code vectors for noise and from a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound, creates the reproduced speech corresponding to each read code vector based on that code vector and the speech synthesis filter, and searches for the code vector giving the smallest distortion between the reproduced speech and the input speech. The second search means comprises means that, when creating reproduced speech from a code vector read from the pulse codebook, creates the reproduced speech corresponding to each of plural kinds of impulse trains based on the speech synthesis filter and on those impulse trains, each of which has impulses at intervals of the pitch period of the input speech and which differ from one another in initial position, selects the impulse train giving the smallest distortion between the reproduced speech and the input speech, and makes the code vector read from the pulse codebook periodic based on the selected impulse train.

[0025] A fourth speech encoding device according to the present invention comprises: means for creating a speech synthesis filter by linear predictive analysis of the input speech; first search means; and second search means. The first search means sequentially cuts out a plurality of code vectors, at different cut-out positions, from an adaptive codebook storing code vectors corresponding to past excitation signals, drives the speech synthesis filter with each cut-out code vector to create the reproduced speech corresponding to it, and calculates the distortion between each reproduced speech and the input speech. It also sequentially reads code vectors from a fixed codebook storing plural kinds of code vectors, drives the speech synthesis filter with each read code vector to create the reproduced speech corresponding to it, and calculates the distortion between each reproduced speech and the input speech. It then searches, among the code vectors cut out from the adaptive codebook and those read from the fixed codebook, for the one with the smallest calculated distortion. The second search means sequentially reads each code vector from a noise codebook storing a plurality of code vectors for noise and from a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound, creates the reproduced speech corresponding to each read code vector based on that code vector and the speech synthesis filter, and searches for the code of the code vector giving the smallest distortion between the reproduced speech and the input speech. The second search means comprises means that, when creating reproduced speech from a code vector read from the pulse codebook, creates the reproduced speech corresponding to each of plural kinds of impulse trains based on the speech synthesis filter and on those impulse trains, each of which has impulses at intervals of the pitch period of the input speech and which differ from one another in initial position, selects the impulse train giving the smallest distortion between the reproduced speech and the input speech, and makes the code vector read from the pulse codebook periodic based on the selected impulse train.

[0026]

[Operation] In the first speech encoding device according to the present invention, a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound is provided. The code vector read from the pulse codebook is made periodic based on the impulse train selected by the impulse train search.

[0027] In the second speech encoding device according to the present invention, a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound is provided to complement the noise codebook. The pulse codebook is searched together with the noise codebook, and the output vector of one or the other is selected exclusively according to a distortion-minimization criterion.

[0028] When reproduced speech is created from a code vector read from the pulse codebook, reproduced speech corresponding to each of plural kinds of impulse trains is created based on the speech synthesis filter and on those impulse trains, each of which has impulses at intervals of the pitch period of the input speech and which differ from one another in initial position. The impulse train giving the smallest distortion between the reproduced speech and the input speech is selected. The code vector read from the pulse codebook is then made periodic based on the selected impulse train.

[0029] In the third speech encoding device according to the present invention, a speech synthesis filter is created by linear predictive analysis of the input speech. A plurality of code vectors are sequentially cut out, at different cut-out positions, from an adaptive codebook storing code vectors corresponding to past excitation signals, and the speech synthesis filter is driven with each cut-out code vector to create the reproduced speech corresponding to it. The code vector giving the smallest distortion between the reproduced speech and the input speech is then searched for.

[0030] In addition, each code vector is sequentially read from a noise codebook storing plural kinds of code vectors for noise and from a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound, and the reproduced speech corresponding to each read code vector is created based on that code vector and the speech synthesis filter. The code vector giving the smallest distortion between the reproduced speech and the input speech is then searched for.

[0031] When reproduced speech is created from a code vector read from the pulse codebook, reproduced speech corresponding to each of plural kinds of impulse trains is created based on the speech synthesis filter and on those impulse trains, each of which has impulses at intervals of the pitch period of the input speech and which differ from one another in initial position. The impulse train giving the smallest distortion between the reproduced speech and the input speech is selected. The code vector read from the pulse codebook is then made periodic based on the selected impulse train.

[0032] In the fourth speech encoding device according to the present invention, a speech synthesis filter is created by linear predictive analysis of the input speech. A plurality of code vectors are sequentially cut out, at different cut-out positions, from an adaptive codebook storing code vectors corresponding to past excitation signals, and the speech synthesis filter is driven with each cut-out code vector to create the reproduced speech corresponding to it. The distortion between each reproduced speech and the input speech is calculated. In addition, code vectors are sequentially read from a fixed codebook storing plural kinds of code vectors, and the speech synthesis filter is driven with each read code vector to create the reproduced speech corresponding to it. The distortion between each of these reproduced speech signals and the input speech is also calculated. Then, among the code vectors cut out from the adaptive codebook and those read from the fixed codebook, the one with the smallest calculated distortion is searched for.

[0033] In addition, each code vector is sequentially read from a noise codebook storing plural kinds of code vectors for noise and from a pulse codebook storing plural kinds of code vectors for the pitch waveform of voiced sound, and the reproduced speech corresponding to each read code vector is created based on that code vector and the speech synthesis filter. The code vector giving the smallest distortion between the reproduced speech and the input speech is then searched for.

[0034] When reproduced speech is created from a code vector read from the pulse codebook, reproduced speech corresponding to each of plural kinds of impulse trains is created based on the speech synthesis filter and on those impulse trains, each of which has impulses at intervals of the pitch period of the input speech and which differ from one another in initial position. The impulse train giving the smallest distortion between the reproduced speech and the input speech is selected. The code vector read from the pulse codebook is then made periodic based on the selected impulse train.

[0035]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0036] FIG. 1 shows the configuration of the speech encoding device. In this device, the excitation source of the linear prediction filter consists of two parts. One excitation source consists of an adaptive codebook 4 and a fixed codebook 5, and the other consists of a noise codebook 6 and a pulse codebook 7.

[0037] As already described, the adaptive codebook 4 is used to represent the periodic component of speech, that is, the pitch. The adaptive codebook 4 stores the excitation signal e of the linear prediction filter (the adaptive code vector) for a predetermined length of the past.

[0038] As already described, the fixed codebook 5 is provided to complement the adaptive codebook 4 in cases such as the following: when the excitation signal of the previous subframe has almost no power; when the current subframe is non-stationary speech, such as a speech onset, composed of components different from those of the previous subframe; and when the current subframe is noise-like speech, such as an unvoiced portion, that has no pitch period. The fixed codebook 5 stores various code vectors (fixed code vectors), each of a length equal to the subframe length.

[0039] As already described, the noise codebook 6 is used to represent the aperiodic component of speech. The noise codebook 6 stores various code vectors (noise code vectors), each of a length equal to the subframe length.

[0040] The pulse codebook 7 is used to represent the part of the periodic portion of the input speech that the adaptive codebook has not represented sufficiently. FIG. 2 shows examples of the plurality of code vectors (pulse code vectors) stored in the pulse codebook 7. Code vectors for the pitch waveforms of representative voiced sounds are used as the pulse code vectors.

[0041] The operation of the speech encoding device is described below.

[0042] The continuous input speech signal is divided into sections of a fixed length of about 40 ms. Such a section is referred to here as a frame. The speech signal within one frame is further divided into sections of a fixed length of about 8 ms. Such a section is referred to here as a subframe.
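With the durations given above, a frame holds 40 / 8 = 5 subframes. The sample counts depend on the sampling rate, which the text does not state; the sketch below assumes 8 kHz (a typical rate for telephone-band speech) purely for illustration, and the function name is hypothetical.

```python
def frame_layout(sample_rate_hz=8000, frame_ms=40, subframe_ms=8):
    """Derive frame and subframe sizes. The 8 kHz default is an assumption."""
    if frame_ms % subframe_ms != 0:
        raise ValueError("frame must contain a whole number of subframes")
    return {
        "subframes_per_frame": frame_ms // subframe_ms,
        "samples_per_frame": sample_rate_hz * frame_ms // 1000,
        "samples_per_subframe": sample_rate_hz * subframe_ms // 1000,
    }
```

Under the assumed 8 kHz rate this gives 320 samples per frame and 64 samples per subframe.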

[0043] (1) Linear predictive analysis and creation of the linear prediction synthesis filter

First, the linear prediction analysis unit 1 performs linear predictive analysis on the input speech frame by frame. In this example, the linear prediction analysis unit 1 performs linear predictive analysis twice per frame, and each analysis yields a set of 10th-order linear prediction coefficients, giving two sets in total. From these, the linear prediction coefficients α_i (i = 1, 2, …, 10) for each subframe in the frame are obtained. A linear prediction synthesis filter (speech synthesis filter) 3 is then created for each subframe from the coefficients α_i obtained for that subframe.
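The text does not say how the per-subframe coefficients are derived from the two per-frame coefficient sets. A simple possibility, shown here only as an assumption, is linear interpolation between the two sets with a weight that depends on the subframe position. Practical coders usually interpolate in a transformed domain such as line spectral pairs, because interpolating predictor coefficients directly does not guarantee a stable filter. The function name and the weighting are hypothetical.

```python
def interpolate_coefficients(first_set, second_set, num_subframes):
    """Linearly blend two coefficient sets, one blended set per subframe.

    Assumed weighting: subframe m uses weight (m + 0.5) / num_subframes
    on second_set and the remainder on first_set.
    """
    per_subframe = []
    for m in range(num_subframes):
        w = (m + 0.5) / num_subframes
        per_subframe.append(
            [(1.0 - w) * a + w * b for a, b in zip(first_set, second_set)]
        )
    return per_subframe
```

With five subframes the weights on the second set are 0.1, 0.3, 0.5, 0.7 and 0.9, so the filter moves gradually from the first analysis result toward the second.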

[0044] (2) Pitch extraction

In addition, the pitch extraction unit 2 extracts the pitch period Tp of the input speech frame by frame.
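The extraction method used by the pitch extraction unit 2 is not described. As an illustrative stand-in, not the method of this document, the sketch below picks the lag with the largest autocorrelation inside a search range. The function name and the range parameters are hypothetical.

```python
def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation.

    Unnormalized autocorrelation is used, so ties and near-ties resolve
    toward the shorter lag.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For a signal that repeats every 4 samples, the autocorrelation peaks at lag 4 and again, slightly lower, at lag 8, so the function returns 4.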

[0045] (3) Codebook search

Next, the search of the adaptive codebook 4 and the fixed codebook 5 (adaptive/fixed codebook search) and the search of the noise codebook 6 and the pulse codebook 7 (noise/pulse codebook search) are performed subframe by subframe.

[0046] (3-1) Adaptive/fixed codebook search

[0047] (3-1-1) Distance calculation using the adaptive codebook
In the adaptive/fixed codebook search, the distance calculation using the adaptive codebook 4 is performed first. In this calculation, the output code vector corresponding to an input code of the adaptive codebook 4 is first created as follows.

[0048] The excitation signal (adaptive code vector) of the linear prediction synthesis filter 3 for the previous subframe and earlier, which is stored in the adaptive codebook 4, is cut out from its end by a length corresponding to the input code (hereinafter referred to as the lag).

[0049] If the lag is shorter than the subframe length, the output code vector is created by repeating the cut-out adaptive code vector until the subframe length is reached. If the lag is longer than the subframe length, the output code vector is created by taking, from the head of the cut-out adaptive code vector, a length equal to the subframe length.
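The two cases above can be sketched as follows. The function name `adaptive_code_vector` and the use of a plain list for the stored excitation are illustrative assumptions.

```python
def adaptive_code_vector(past_excitation, lag, subframe_len):
    """Build the adaptive codebook output vector for one lag.

    The last `lag` samples of the stored excitation are cut out. A short
    segment is repeated up to the subframe length; a long segment is
    truncated from its head to the subframe length.
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must be between 1 and the stored excitation length")
    segment = past_excitation[-lag:]
    if lag < subframe_len:
        repeats = -(-subframe_len // lag)  # ceiling division
        return (segment * repeats)[:subframe_len]
    return segment[:subframe_len]


history = [1, 2, 3, 4, 5, 6]
short_lag = adaptive_code_vector(history, lag=2, subframe_len=5)
long_lag = adaptive_code_vector(history, lag=5, subframe_len=3)
```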

[0050] The length (lag) corresponding to each input code is different. These lengths are determined from the length corresponding to the pitch period Tp detected by the pitch extraction unit 2. Letting L₀ denote the length corresponding to the detected pitch period Tp, the length corresponding to each input code is a length selected from a predetermined range centered on L₀.

[0051] The linear prediction synthesis filter 3 is driven by the created output code vector to produce reproduced speech. The reproduced speech is then multiplied by the gain that theoretically minimizes the distance between the input speech and the reproduced speech (the distortion of the reproduced speech relative to the original speech), after which the distance calculation unit 8 calculates the distance between the input speech and the reproduced speech. This operation is repeated for each input code of the adaptive codebook 4, after which the distance calculation using the fixed codebook 5 is performed.
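The gain and distance step can be sketched as below, assuming that the distance is the squared error between the input speech and the gain-scaled reproduced speech. The specification does not define the distance measure, so this choice and the function name `optimal_gain_and_distance` are assumptions. Under that assumption the minimizing gain is the least-squares value ⟨x, y⟩ / ⟨y, y⟩.

```python
def optimal_gain_and_distance(target, synthesized):
    """Return the least-squares gain and the resulting squared-error distance.

    Assumes distance = sum((x - g*y)^2), which is minimized by
    g = <x, y> / <y, y>.
    """
    energy = sum(y * y for y in synthesized)
    if energy == 0.0:
        # A silent candidate cannot reduce the error at any gain.
        return 0.0, sum(x * x for x in target)
    gain = sum(x * y for x, y in zip(target, synthesized)) / energy
    distance = sum((x - gain * y) ** 2 for x, y in zip(target, synthesized))
    return gain, distance


gain_a, dist_a = optimal_gain_and_distance([2.0, 4.0], [1.0, 2.0])
gain_b, dist_b = optimal_gain_and_distance([1.0, 0.0], [1.0, 1.0])
```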

[0052] (3-1-2) Distance calculation using the fixed codebook
In the distance calculation using the fixed codebook 5, the fixed code vector corresponding to an input code of the fixed codebook 5 is read out. The linear prediction synthesis filter 3 is driven by the read fixed code vector to produce reproduced speech. The reproduced speech is then multiplied by the gain that theoretically minimizes the distance between the input speech and the reproduced speech, after which the distance calculation unit 8 calculates that distance. This operation is repeated for each input code of the fixed codebook 5.

[0053] Once the distance calculations using the adaptive codebook and the fixed codebook have been performed in this way, the input code of the excitation vector that gives the smallest calculated distance is selected together with its corresponding gain.

[0054] (3-2) Noise/pulse codebook search

[0055] (3-2-1) Distance calculation using the noise codebook
In the noise/pulse codebook search, the distance calculation using the noise codebook 6 is performed first. In this calculation, the noise code vector corresponding to an input code of the noise codebook 6 is read out. Next, to remove the influence of the code vector selected in the adaptive/fixed codebook search, the synthesis filter output of the read noise code vector is orthogonalized with respect to the synthesis filter output of the code vector selected in the adaptive/fixed codebook search, and reproduced speech is created.
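One common way to perform such an orthogonalization is a single Gram-Schmidt projection step, sketched below. The specification does not say which procedure is used, so this is an assumed technique, and the name `orthogonalize` is illustrative.

```python
def orthogonalize(candidate, selected):
    """Remove from `candidate` its component along `selected`.

    Both arguments are synthesis filter outputs of equal length. The result
    has zero inner product with `selected` (one Gram-Schmidt step).
    """
    energy = sum(v * v for v in selected)
    if energy == 0.0:
        return list(candidate)
    proj = sum(c * v for c, v in zip(candidate, selected)) / energy
    return [c - proj * v for c, v in zip(candidate, selected)]


residual = orthogonalize([1.0, 1.0], [1.0, 0.0])
inner = sum(r * s for r, s in zip(residual, [1.0, 0.0]))
```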

[0056] The reproduced speech is then multiplied by the gain that theoretically minimizes the distance between the input speech and the reproduced speech, after which the distance calculation unit 8 calculates that distance. This operation is repeated for each input code of the noise codebook 6, after which the distance calculation using the pulse codebook 7 is performed.

[0057] (3-2-2) Distance calculation using the pulse codebook
Before the distance calculation using the pulse codebook 7 is performed, an impulse train search is carried out.

[0058] In this impulse train search, an impulse train is first created based on the pitch period Tp extracted by the pitch extraction unit 2. When the length corresponding to the extracted pitch period Tp is shorter than the subframe length Ts, an impulse train P0 is created in which impulses occur at intervals of the extracted pitch period and whose total length equals the subframe length Ts, as shown in FIG. 3.

[0059] When the length corresponding to the pitch period Tp extracted by the pitch extraction unit 2 is longer than the subframe length Ts, an impulse train P0 consisting of a single impulse is created, as shown in FIG. 4.
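The two impulse train cases can be sketched as follows. The unit impulse amplitude, the `start` parameter for the initial position, and the function name `make_impulse_train` are illustrative assumptions. The case where the pitch length equals the subframe length is not covered by the text and is treated here like the longer case.

```python
def make_impulse_train(pitch_len, subframe_len, start=0):
    """Create an impulse train of subframe length.

    If the pitch length is shorter than the subframe, impulses are placed
    every `pitch_len` samples beginning at `start`; otherwise the train
    holds a single impulse at `start`.
    """
    if not 0 <= start < subframe_len:
        raise ValueError("start must lie inside the subframe")
    train = [0.0] * subframe_len
    if pitch_len < subframe_len:
        for pos in range(start, subframe_len, pitch_len):
            train[pos] = 1.0
    else:
        train[start] = 1.0
    return train


periodic = make_impulse_train(pitch_len=3, subframe_len=8)
single = make_impulse_train(pitch_len=10, subframe_len=8, start=2)
```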

[0060] Reproduced speech is then created from the impulse train in the same way that reproduced speech is created from a noise code vector read from the noise codebook 6, and its distance from the input speech is calculated.

[0061] As shown in FIG. 3 or FIG. 4, this processing is performed for a plurality of impulse trains P0 to Pn that differ in the initial position of the impulse train, and the impulse train giving the shortest distance is selected.
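A minimal sketch of this search over initial positions is given below. Several things here are assumptions: the synthesis filter is passed in as a callable named `synthesize`, the distance is taken to be squared error after least-squares gain scaling, the candidate initial positions are taken to be 0 up to the smaller of the pitch length and subframe length, and the orthogonalization step is omitted for brevity.

```python
def search_impulse_train(target, pitch_len, subframe_len, synthesize):
    """Return (best_start, best_distance) over candidate initial positions."""
    best_start, best_dist = None, None
    for start in range(min(pitch_len, subframe_len)):
        # Build the impulse train for this initial position.
        train = [0.0] * subframe_len
        for pos in range(start, subframe_len, pitch_len):
            train[pos] = 1.0
        y = synthesize(train)
        # Least-squares gain, then squared-error distance (assumed measure).
        energy = sum(v * v for v in y)
        gain = sum(x * v for x, v in zip(target, y)) / energy if energy else 0.0
        dist = sum((x - gain * v) ** 2 for x, v in zip(target, y))
        if best_dist is None or dist < best_dist:
            best_start, best_dist = start, dist
    return best_start, best_dist


# Identity "filter" used only to make the example easy to follow.
start, dist = search_impulse_train(
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0], 3, 6, lambda e: list(e)
)
```

Because only unit impulses are synthesized at this stage, the search is inexpensive compared with repeating it for every pulse code vector.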

[0062] After this, the distance calculation using the pulse codebook 7 is performed. In this calculation, the pulse code vector corresponding to an input code of the pulse codebook 7 is read out. Next, as shown for example in FIG. 5, the pulse code vector read from the pulse codebook 7 is placed at each impulse position of the impulse train selected in the impulse train search (FIG. 5(a)), creating a pulse code vector whose length corresponds to the subframe length (FIG. 5(b)).
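Placing the stored waveform at each impulse position can be sketched as below. How overlapping copies are handled is not stated in the specification; in this sketch a later copy overwrites an earlier one and anything beyond the subframe end is discarded, both of which are assumptions. The name `periodicize` is illustrative.

```python
def periodicize(pulse_vector, impulse_positions, subframe_len):
    """Place `pulse_vector` at each impulse position within one subframe.

    Assumptions: later copies overwrite earlier ones where they overlap,
    and samples falling past the subframe end are dropped.
    """
    out = [0.0] * subframe_len
    for pos in impulse_positions:
        for k, value in enumerate(pulse_vector):
            if pos + k >= subframe_len:
                break
            out[pos + k] = value
    return out


repeated = periodicize([1.0, 0.5], [0, 3], 6)
clipped = periodicize([1.0, 0.5, 0.25], [4], 6)
```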

[0063] Next, to remove the influence of the code vector selected in the adaptive/fixed codebook search, the synthesis filter output of the created pulse code vector is orthogonalized with respect to the synthesis filter output of the code vector selected in the adaptive/fixed codebook search, and reproduced speech is created.

[0064] The reproduced speech is then multiplied by the gain that theoretically minimizes the distance between the input speech and the reproduced speech, after which the distance calculation unit 8 calculates that distance. This operation is repeated for each input code of the pulse codebook 7.

[0065] Once the distance calculations using the noise codebook and the pulse codebook have been performed in this way, the input code of the excitation vector that gives the smallest calculated distance is selected together with its corresponding gain.

[0066] The following are output as the encoded output: for each subframe, the input code of the adaptive codebook or fixed codebook selected by the adaptive/fixed codebook search and a code representing its corresponding gain; for each subframe, the input code of the noise codebook or pulse codebook selected by the noise/pulse codebook search and a code representing its corresponding gain; and the two sets of linear prediction coefficients calculated for each frame.

[0067] In the speech encoding device described above, when the current subframe consists of components different from those of the previous subframe, the operation is expected to be, for example, as follows. In that case, the adaptive/fixed codebook search for the current subframe selects an input code of the fixed codebook 5, and the noise/pulse codebook search selects an input code of the pulse codebook 7.

[0068] Consequently, the adaptive codebook 4 newly stores the combined signal of the excitation signal based on the fixed codebook selected by the adaptive/fixed codebook search and the excitation signal based on the pulse codebook selected by the noise/pulse codebook search.

[0069] Then, in the adaptive/fixed codebook search for the next subframe, a code of the adaptive codebook 4 is selected, and in the noise/pulse codebook search, a code of the noise codebook 6 is selected.

[0070] In the above embodiment, the pulse codebook 7, which stores code vectors for typical pitch waveforms of voiced sound, is provided as a complement to the noise codebook 6. This makes it possible to efficiently represent those parts of the periodic portion of the input speech that the adaptive codebook could not represent adequately. As a result, the quality of the reproduced speech improves.

[0071] In addition, the pulse code vector read from the pulse codebook 7 is made periodic, so as to match the pitch period of the input speech, based on the result of a simple impulse train search. This shortens the processing time needed to make the pulse code vector read from the pulse codebook 7 periodic.

[0072] In the adaptive/fixed codebook search and the noise/pulse codebook search, the distance calculation may be based on the value obtained by passing the difference between the original speech and the reproduced speech through a filter matched to the masking characteristic (a perceptual weighting filter). Alternatively, the distance calculation may be based on the difference between the original speech passed through the perceptual weighting filter and the reproduced speech passed through the perceptual weighting filter.
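The two alternatives above give the same distance whenever the weighting filter is linear and both signals are filtered from the same initial state, since filtering a difference then equals the difference of the filtered signals. The sketch below illustrates this with a short FIR filter. The filter taps, the squared-error distance, and all function names are illustrative assumptions; the specification does not give the form of the perceptual weighting filter.

```python
def fir_filter(signal, taps):
    """Apply a causal FIR filter with zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def distance_of_filtered_difference(original, reproduced, taps):
    """First alternative: filter the difference, then sum the squares."""
    diff = [o - r for o, r in zip(original, reproduced)]
    return sum(v * v for v in fir_filter(diff, taps))


def distance_of_difference_of_filtered(original, reproduced, taps):
    """Second alternative: filter each signal, then take the difference."""
    wo = fir_filter(original, taps)
    wr = fir_filter(reproduced, taps)
    return sum((a - b) ** 2 for a, b in zip(wo, wr))


d1 = distance_of_filtered_difference([1, 2, 3], [1, 1, 1], [1, -1])
d2 = distance_of_difference_of_filtered([1, 2, 3], [1, 1, 1], [1, -1])
```

The second form lets the weighted original speech be computed once per subframe and reused for every candidate code vector.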

[0073] The perceptual weighting filter is a filter that weights distortion lightly in regions of the frequency axis where the speech power is large and heavily in regions where the speech power is small. The masking characteristic is the property of human hearing whereby, when a certain frequency component is strong, sounds at nearby frequencies become hard to hear.

[0074]

[Effects of the Invention] According to the present invention, a pulse codebook storing code vectors for typical pitch waveforms of voiced sound is provided as a complement to the noise codebook. This makes it possible to represent those parts of the periodic portion of the input speech that the adaptive codebook could not represent adequately. As a result, the quality of the reproduced speech improves.

[0075] In addition, the pulse code vector read from the pulse codebook is made periodic, so as to match the pitch period of the input speech, based on the result of a simple impulse train search. This shortens the processing time needed to make the pulse code vector read from the pulse codebook periodic.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the speech encoding device.

FIG. 2 is a schematic diagram showing an example of the contents of the pulse codebook.

FIG. 3 is a schematic diagram showing an example of impulse trains.

FIG. 4 is a schematic diagram showing an example of impulse trains.

FIG. 5 is a schematic diagram showing an impulse train selected by the impulse train search and the pulse code vector created by placing a code vector read from the pulse codebook at each impulse position of that impulse train.

FIG. 6 is a block diagram showing a conventional example.

[Explanation of symbols]

1 Linear prediction analysis unit
2 Pitch extraction unit
3 Linear prediction synthesis filter
4 Adaptive codebook
5 Fixed codebook
6 Noise codebook
7 Pulse codebook
8 Distance calculation unit

Claims


1. A speech encoding device that constructs a speech synthesis filter by linear prediction analysis of input speech, reproduces speech based on a code vector stored in a codebook and the speech synthesis filter, and encodes the speech based on the reproduced speech and the input speech, characterized in that a pulse codebook storing a plurality of types of code vectors for pitch waveforms of voiced sound is provided, and a code vector read from the pulse codebook is made periodic based on an impulse train selected by an impulse train search.

2. A speech encoding device that constructs a speech synthesis filter by linear prediction analysis of input speech, reproduces speech based on the speech synthesis filter and a code vector read from codebooks including an adaptive codebook storing code vectors corresponding to past excitation signals and a noise codebook storing code vectors for noise, and encodes the speech based on the reproduced speech and the input speech, characterized in that: a pulse codebook storing a plurality of types of code vectors for pitch waveforms of voiced sound is provided as a complement to the noise codebook; when reproduced speech is created based on a code vector read from the pulse codebook, reproduced speech corresponding to each of a plurality of types of impulse trains, in which impulses occur at intervals of the pitch period of the input speech and whose initial positions differ from one another, is created based on those impulse trains and the speech synthesis filter, and the impulse train giving the smallest distortion between the reproduced speech and the input speech is selected; and the code vector read from the pulse codebook is made periodic based on the selected impulse train.

3. A speech encoding device comprising: means for creating a speech synthesis filter by linear prediction analysis of input speech; first search means for sequentially cutting out a plurality of code vectors at different cut-out positions from an adaptive codebook storing code vectors corresponding to past excitation signals, driving the speech synthesis filter with each cut-out code vector to create reproduced speech corresponding to each cut-out code vector, and searching for the code vector giving the smallest distortion between the reproduced speech and the input speech; and second search means for sequentially reading code vectors from a noise codebook storing a plurality of types of code vectors for noise and from a pulse codebook storing a plurality of types of code vectors for pitch waveforms of voiced sound, creating reproduced speech corresponding to each read code vector based on each read code vector and the speech synthesis filter, and searching for the code vector giving the smallest distortion between the reproduced speech and the input speech, wherein the second search means includes means for, when creating reproduced speech based on a code vector read from the pulse codebook, creating reproduced speech corresponding to each of a plurality of types of impulse trains, in which impulses occur at intervals of the pitch period of the input speech and whose initial positions differ from one another, based on those impulse trains and the speech synthesis filter, selecting the impulse train giving the smallest distortion between the reproduced speech and the input speech, and making the code vector read from the pulse codebook periodic based on the selected impulse train.

4. A speech encoding device comprising: means for creating a speech synthesis filter by linear prediction analysis of input speech; first search means for sequentially cutting out a plurality of code vectors at different cut-out positions from an adaptive codebook storing code vectors corresponding to past excitation signals, driving the speech synthesis filter with each cut-out code vector to create reproduced speech corresponding to each cut-out code vector, and calculating the distortion between each reproduced speech and the input speech, sequentially reading code vectors from a fixed codebook storing a plurality of types of code vectors, driving the speech synthesis filter with each read code vector to create reproduced speech corresponding to each read code vector, and calculating the distortion between each reproduced speech and the input speech, and searching, among the code vectors cut out from the adaptive codebook and the code vectors read from the fixed codebook, for the one giving the smallest calculated distortion; and second search means for sequentially reading code vectors from a noise codebook storing a plurality of types of code vectors for noise and from a pulse codebook storing a plurality of types of code vectors for pitch waveforms of voiced sound, creating reproduced speech corresponding to each read code vector based on each read code vector and the speech synthesis filter, and searching for the code of the code vector giving the smallest distortion between the reproduced speech and the input speech, wherein the second search means includes means for, when creating reproduced speech based on a code vector read from the pulse codebook, creating reproduced speech corresponding to each of a plurality of types of impulse trains, in which impulses occur at intervals of the pitch period of the input speech and whose initial positions differ from one another, based on those impulse trains and the speech synthesis filter, selecting the impulse train giving the smallest distortion between the reproduced speech and the input speech, and making the code vector read from the pulse codebook periodic based on the selected impulse train.