JPH08248999A

JPH08248999A - Voice coding/decoding device

Info

Publication number: JPH08248999A
Application number: JP7074672A
Authority: JP
Inventors: Masaya Takahashi; 真哉高橋; Yoji Maeda; 陽二前田
Original assignee: IDO TSUSHIN SYST KAIHATSU KK
Current assignee: IDO TSUSHIN SYST KAIHATSU KK
Priority date: 1995-03-08
Filing date: 1995-03-08
Publication date: 1996-09-27
Anticipated expiration: 2016-03-26
Also published as: JP3148920B2

Abstract

PURPOSE: To generate a decoded excitation signal with good continuity, and thereby obtain a high-quality decoded speech signal, by providing an excitation shaping unit that performs weighted addition of the representative excitation of a decoded frame and an impulse signal. CONSTITUTION: An excitation shaping unit 25, serving as excitation shaping means, is provided in a decoding unit 120. It shapes the representative excitation S20 by weighted addition of an impulse signal to it. Because this shaping always brings the waveform of the shaped representative excitation S26 closer to an impulse, a smoothly varying decoded excitation signal S22 is generated even when S26 is repeated in a representative excitation repetition unit 21. The decoded excitation signal therefore changes smoothly between frames. Moreover, because the shaping brings the representative excitation still closer to an impulse when the change in the input speech signal is large, the decoded excitation signal can follow even large changes.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech coding/decoding apparatus used when a speech signal is converted into a digital signal for transmission or storage.

[0002]

[Prior Art] A known type of speech coding/decoding apparatus analyzes a speech signal frame by frame, each frame having a predetermined length, separates it into an excitation signal and spectral envelope information, encodes both, and decodes the encoded data to generate a speech signal. One example is described in Document 1 (Takahashi and Iimori, "Vocoder Using Vector Quantization of Phase-Equalized Excitation", Joint Convention of the Hokuriku Chapters of the Electrical Societies, B145, p. 291, October 1992).

[0003] This conventional example exploits the fact that, for voiced speech, similar waveforms repeat at the pitch period: the excitation signal of a frame is represented by a single segment one pitch period long taken from within that frame, which improves the coding characteristics of voiced portions. Applying short-time phase equalization to the excitation signal has the further advantage of stabilizing the extraction of the representative excitation and making its vector quantization more efficient.

[0004] FIG. 4 is a block diagram showing the configuration of the conventional speech coding/decoding apparatus described above. As shown in the figure, the speech coding/decoding apparatus 200 comprises an encoding unit 210 and a decoding unit 220. The encoding unit 210 has a spectral envelope analysis unit 2, an inverse filter unit 4, a phase equalization unit 7, a representative excitation extraction unit 9, a pitch period extraction unit 11, a voiced/unvoiced determination unit 13, and a representative excitation encoding unit 17. The decoding unit 220 has a representative excitation decoding unit 19, a representative excitation repetition unit 21, and a synthesis filter unit 23.

[0005] Next, the operation of the conventional speech coding/decoding apparatus 200 is described, beginning with the encoding unit 210. The spectral envelope analysis unit 2 analyzes the input speech signal S1 frame by frame, obtains the spectral envelope of the speech signal in each frame by, for example, linear prediction analysis, and outputs it to the inverse filter unit 4 as spectral envelope information S3. At the same time, the spectral envelope analysis unit 2 encodes the spectral envelope information S3 and outputs it externally as a spectral envelope code S5.
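As an illustration of the linear prediction analysis mentioned as one option for unit 2, the following minimal sketch computes prediction coefficients for one frame with the autocorrelation method and the Levinson-Durbin recursion. The function names, the default order of 10, and the absence of windowing are assumptions made for this example; the patent does not specify them.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one frame (unnormalised sums of products)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the linear prediction normal equations.

    Returns (a, err) with a[0] == 1.0, where the inverse filter is
    A(z) = sum_k a[k] * z**-k and err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient of stage m
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m
    return a, err


def spectral_envelope(frame, order=10):
    """Hypothetical stand-in for spectral envelope analysis unit 2 (S3)."""
    r = autocorrelation(frame, order)
    a, _ = levinson_durbin(r, order)
    return a
```

The returned list plays the role of the spectral envelope information S3 in this sketch; a real coder would also quantize it to produce the code S5, which is not shown here.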

[0006] The voiced/unvoiced determination unit 13 analyzes, for example, the power of the input speech signal S1, determines whether the input speech signal S1 is voiced or unvoiced, and outputs the result to the pitch period extraction unit 11 as voiced/unvoiced determination information S12. At the same time, the voiced/unvoiced determination unit 13 encodes the voiced/unvoiced determination information S12 and outputs it externally as a voiced/unvoiced determination code S16. When the voiced/unvoiced determination information S12 indicates voiced speech, the pitch period extraction unit 11 performs pitch period analysis on the input speech signal S1 and outputs the resulting pitch period S10 to the representative excitation extraction unit 9. At the same time, the pitch period extraction unit 11 encodes the pitch period S10 and outputs it externally as a pitch period code S15.
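The two decisions described above can be sketched as follows. This is only an illustration under stated assumptions: the power threshold, the lag search range of 20 to 147 samples (roughly 54 to 400 Hz if the sampling rate were 8 kHz, which the text does not state), and the use of a plain autocorrelation peak as the pitch analysis are all hypothetical choices, not taken from the patent.

```python
def frame_power(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def is_voiced(frame, power_threshold=1e-3):
    """Hypothetical voiced/unvoiced decision (S12) based on power alone."""
    return frame_power(frame) > power_threshold


def pitch_period(frame, min_lag=20, max_lag=147):
    """Return the lag with the largest positive autocorrelation, or None."""
    n = len(frame)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if score > best_score:  # strict: the shortest of equal lags wins
            best_lag, best_score = lag, score
    return best_lag


def analyze_frame(frame):
    """Return (voiced, period); period is None for unvoiced frames."""
    voiced = is_voiced(frame)
    return voiced, (pitch_period(frame) if voiced else None)
```

As in the text, the pitch analysis runs only when the frame has been judged voiced.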

[0007] The inverse filter unit 4 uses the spectral envelope information S3 to apply, for example, linear prediction inverse filtering to the input speech signal S1, obtains an excitation signal S6, and outputs it to the phase equalization unit 7. The phase equalization unit 7 performs short-time phase equalization so that the phase of the excitation signal S6 becomes zero at the maximum-amplitude positions (peak positions) that appear on the waveform at pitch period intervals, and outputs a phase-equalized excitation signal S8 to the representative excitation extraction unit 9. When the pitch period S10 is input, that is, when the voiced/unvoiced determination information S12 indicates voiced speech, the representative excitation extraction unit 9 cuts out of the phase-equalized excitation signal S8 a segment one pitch period S10 long, using the maximum-amplitude position of the phase-equalized excitation signal S8 in the frame as the reference, and outputs it to the representative excitation encoding unit 17 as a representative excitation S14. The representative excitation encoding unit 17 encodes the representative excitation S14 by, for example, vector quantization and outputs the resulting representative excitation code S18 externally.
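The inverse filtering and cutting steps described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, phase equalization (unit 7) is not modeled, and placing the cut window so that the peak sits near its center is an assumption, since the text only says that the peak position is used as the reference.

```python
def inverse_filter(signal, a):
    """Prediction residual e[n] = sum_k a[k] * x[n-k] (zero initial history)."""
    return [sum(a[k] * signal[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(signal))]


def cut_representative(excitation, period):
    """Cut one pitch period around the largest-magnitude sample.

    ASSUMPTION: the window starts half a period before the peak and is
    shifted back inside the frame if it would run past either end.
    """
    if period <= 0 or period > len(excitation):
        raise ValueError("period must be between 1 and the frame length")
    peak = max(range(len(excitation)), key=lambda i: abs(excitation[i]))
    start = peak - period // 2
    start = max(0, min(start, len(excitation) - period))
    return excitation[start:start + period]
```

In this sketch the first argument of `cut_representative` stands in for the phase-equalized signal S8, and the returned list stands in for the representative excitation S14 before vector quantization.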

[0008] Next, the operation of the decoding unit is described. The representative excitation decoding unit 19 decodes the representative excitation code S18 and outputs a decoded representative excitation S20 to the representative excitation repetition unit 21. The representative excitation repetition unit 21 decodes the voiced/unvoiced determination code S16 and the pitch period code S15 into voiced/unvoiced determination information and a pitch period, respectively. When the voiced/unvoiced determination information indicates voiced speech, it repeats the decoded representative excitation S20 at pitch period intervals to generate a decoded excitation signal S27 and outputs it to the synthesis filter unit 23. When the voiced/unvoiced determination information indicates unvoiced speech, it generates white noise and outputs it to the synthesis filter unit 23 as the decoded excitation signal S27. The synthesis filter unit 23 decodes the spectral envelope from the spectral envelope code S5 and synthesizes a decoded speech signal S28 from the decoded spectral envelope and the decoded excitation signal S27 by, for example, linear prediction synthesis filtering.
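The conventional decoding path described above can be sketched as follows. The names are hypothetical, the noise gain and the Gaussian noise source are assumptions, and for simplicity each frame is decoded independently (filter memory and pitch phase are not carried across frame boundaries, which a practical decoder would need to do).

```python
import random


def repeat_excitation(representative, frame_len):
    """Tile a one-pitch-period segment across the frame (voiced case)."""
    period = len(representative)
    return [representative[n % period] for n in range(frame_len)]


def white_noise(frame_len, gain=1.0, seed=None):
    """Gaussian white noise used as the excitation of unvoiced frames."""
    rng = random.Random(seed)
    return [gain * rng.gauss(0.0, 1.0) for _ in range(frame_len)]


def synthesis_filter(excitation, a):
    """All-pole filter 1/A(z): y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out.append(acc)
    return out


def decode_frame(voiced, representative, a, frame_len):
    """Build the decoded excitation, then synthesize the decoded speech."""
    if voiced:
        excitation = repeat_excitation(representative, frame_len)
    else:
        excitation = white_noise(frame_len)
    return synthesis_filter(excitation, a)
```

Here `a` is the list of decoded prediction coefficients with `a[0] == 1.0`, so `synthesis_filter` is the inverse of a linear prediction inverse filter that uses the same coefficients.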

[0009]

[Problems to Be Solved by the Invention] However, in the conventional speech coding/decoding apparatus 200 shown in FIG. 4, the representative excitation repetition unit 21 of the decoding unit 220 generates the decoded excitation signal S27 of a frame simply by repeating the decoded representative excitation S20 at pitch period intervals. Consequently, when the shape of the decoded representative excitation of a frame differs beyond a certain degree from that of the immediately preceding or following frame, the decoded excitation signal changes abruptly at the frame boundary and the quality of the decoded speech signal deteriorates (for example, abnormal sounds caused by the abrupt change).

[0010] This is explained with reference to FIG. 5. FIG. 5 illustrates the generation of the decoded excitation signal by simple repetition of the decoded representative excitation in the conventional speech coding/decoding apparatus; FIG. 5(A) shows the representative excitations and FIG. 5(B) shows the decoded excitation signal. As shown in the figure, when the shapes of the decoded representative excitations of frame M and of the immediately preceding and following frames (M-1) and (M+1) differ, the decoded excitation signal changes abruptly at frame boundaries B3 and B4, and as a result the quality of the decoded speech signal deteriorates (for example, abnormal sounds caused by the abrupt change).

[0011] The present invention has been made to solve this problem. Its object is to provide a speech coding/decoding apparatus that generates a decoded excitation signal with good continuity, and thus yields a decoded speech signal of good quality, even when the decoded representative excitations of a frame and of the immediately preceding and following frames differ.

[0012]

[Means for Solving the Problems] To solve the above problem, the invention according to claim 1 is a speech coding/decoding apparatus comprising: an encoding unit that analyzes an input speech signal frame by frame, each frame having a predetermined length, to obtain spectral envelope information, an excitation signal and a pitch period of the input speech signal, that, when the input speech signal is voiced, extracts a representative excitation one pitch period long after performing short-time phase equalization so that the phase of the excitation signal becomes zero at the maximum-amplitude positions appearing on the signal waveform at intervals of the pitch period, and that encodes the representative excitation, the pitch period and the spectral envelope information and outputs them as a spectral envelope code; and a decoding unit that decodes the encoded parameters output from the encoding unit, that, when the input speech signal is voiced, generates a decoded excitation signal by arranging the decoded representative excitation one pitch period long at intervals of the pitch period, and that decodes a decoded speech signal using the decoded excitation signal and the decoded spectral envelope; wherein the decoding unit is provided with excitation shaping means for performing weighted addition of the decoded representative excitation of the frame concerned and an impulse signal, and representative excitation repetition means for generating the decoded excitation signal by arranging the shaped representative excitation output from the excitation shaping means at intervals of the pitch period.

[0013] The invention according to claim 2 is the speech coding/decoding apparatus according to claim 1, wherein the excitation shaping means is configured to change the value of the weighting coefficient used in the weighted addition of the representative excitation and the impulse signal in accordance with the magnitude of the change in the speech signal between the frame concerned and a frame in its vicinity.

[0014]

[Operation] According to the invention of claim 1 configured as above, the excitation shaping means performs weighted addition of the decoded representative excitation of the frame concerned and an impulse signal. The representative excitation repetition means generates the decoded excitation signal by arranging the shaped representative excitation output from the excitation shaping means at intervals of the pitch period.

[0015] According to the invention of claim 2 configured as above, the excitation shaping means changes the value of the weighting coefficient used in the weighted addition of the representative excitation and the impulse signal in accordance with the magnitude of the change in the speech signal between the frame concerned and a frame in its vicinity.

[0016]

[Embodiments]

(First Embodiment) Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a speech coding/decoding apparatus according to a first embodiment of the present invention. As shown in the figure, the speech coding/decoding apparatus 101 comprises an encoding unit 110 and a decoding unit 120. The encoding unit 110 has a spectral envelope analysis unit 2, an inverse filter unit 4, a phase equalization unit 7, a representative excitation extraction unit 9, a pitch period extraction unit 11, a voiced/unvoiced determination unit 13, and a representative excitation encoding unit 17. The decoding unit 120 has a representative excitation decoding unit 19, a representative excitation repetition unit 21, an excitation shaping unit 25, and a synthesis filter unit 23. The speech coding/decoding apparatus 101 of the first embodiment differs in configuration from the conventional speech coding/decoding apparatus 200 in that the decoding unit 120 includes the excitation shaping unit 25 as the excitation shaping means. The excitation shaping unit 25 is placed on the output side of the representative excitation decoding unit 19 and on the input side of the representative excitation repetition unit 21, which serves as the representative excitation repetition means. The components other than the excitation shaping unit 25 have the same configuration and the same operation and effect as those shown in FIG. 4; they are therefore given the same reference numerals as in FIG. 4 and their description is omitted.

[0017] The operation of the speech coding/decoding apparatus 101 is described below. The excitation shaping unit 25 shapes the representative excitation S20 by weighted addition of an impulse signal to it. Because this shaping always brings the waveform of the shaped representative excitation S26 closer to an impulse, a smoothly varying decoded excitation signal S22 can be generated even when S26 is repeated by the representative excitation repetition unit 21. Furthermore, when the input speech signal changes greatly between the current frame and neighboring frames, the weighting coefficient is adjusted so that the representative excitation is brought still closer to an impulse, which smooths the change in the excitation signal.

[0018] The operation of the excitation shaping unit 25 is now described in more detail. Let the representative excitation S20 be RC_i (i is a positive integer from 1 to P), let RI_i be an impulse signal having its pulse at the same position as the maximum-amplitude position of the representative excitation S20, let P be the pitch period, and let α be the weighting coefficient (a real number with 0 ≤ α ≤ 1). The excitation shaping unit 25 then performs weighted addition of the two according to, for example, the following equation:

$$RM_i = (1-\alpha)\,RC_i + \alpha\,RI_i \tag{1}$$

and thereby obtains the signal RM_i, which is the shaped representative excitation S26.

[0019] In equation (1) above, the relation given by the following expression holds:

[Equation 1]

[0020] FIG. 2 illustrates the weighted addition of the representative excitation according to equation (1) in the excitation shaping unit 25. As the figure shows, a shaped representative excitation whose shape is closer to an impulse waveform is obtained.
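The weighted addition of equation (1) can be sketched as follows. One point is an assumption: the text fixes only the position of the pulse in RI (the maximum-amplitude position of RC), so the pulse amplitude used here, the peak sample of RC itself unless the hypothetical `pulse_amplitude` argument overrides it, is a choice made for this example and may differ from the relation given as [Equation 1] in the original.

```python
def shape_excitation(rc, alpha, pulse_amplitude=None):
    """Return RM_i = (1 - alpha) * RC_i + alpha * RI_i for one pitch period.

    rc    : decoded representative excitation (list of P samples)
    alpha : weighting coefficient, 0 <= alpha <= 1
    ASSUMPTION: RI is zero except for one pulse at the peak of rc, whose
    amplitude defaults to the peak sample of rc.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha <= 1")
    peak = max(range(len(rc)), key=lambda i: abs(rc[i]))
    amp = rc[peak] if pulse_amplitude is None else pulse_amplitude
    ri = [0.0] * len(rc)
    ri[peak] = amp
    return [(1.0 - alpha) * c + alpha * r for c, r in zip(rc, ri)]
```

With `alpha = 0` the representative excitation passes through unchanged, and with `alpha = 1` only the pulse remains, so intermediate values move the waveform gradually toward an impulse.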

[0021] The excitation shaping unit 25 also decodes the spectral envelope code S5 to obtain the spectral envelope and computes the distance D between the spectral envelope of the current frame and that of, for example, the immediately preceding frame. When the distance D is large, it sets the weighting coefficient α to a large value so that the shaped representative excitation is brought closer to an impulse.
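One possible way to turn the distance D into a weighting coefficient is sketched below. The text states only that a larger D should give a larger α; the Euclidean distance over envelope parameter vectors, the piecewise-linear mapping, and every threshold and limit value here are hypothetical choices made for illustration.

```python
import math


def envelope_distance(env_now, env_prev):
    """Euclidean distance between two spectral envelope parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(env_now, env_prev)))


def alpha_from_distance(d, d_low=0.1, d_high=1.0,
                        alpha_min=0.2, alpha_max=0.9):
    """Map distance D to alpha, non-decreasing in D (all limits assumed)."""
    if d <= d_low:
        return alpha_min
    if d >= d_high:
        return alpha_max
    t = (d - d_low) / (d_high - d_low)
    return alpha_min + t * (alpha_max - alpha_min)
```

Any non-decreasing mapping that stays within 0 ≤ α ≤ 1 would satisfy the description equally well; the clamped linear ramp is simply the easiest one to tune.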

[0022] The representative excitation repetition unit 21 repeats the shaped representative excitation S26 obtained by the excitation shaping unit 25 at pitch period intervals to obtain the decoded excitation signal S22.

[0023] FIG. 3 shows an example of the decoded excitation signal S22 generated in this embodiment; FIG. 3(A) shows the shaped representative excitations and FIG. 3(B) shows the decoded excitation signal. As shown in the figure, the processing of the excitation shaping unit 25 yields a decoded excitation signal that changes smoothly from frame to frame.

[0024] (Second Embodiment) In the first embodiment described above, the excitation shaping unit 25 uses the distance D between the spectral envelopes of the current frame and the immediately preceding frame to set the weighting coefficient α. The present invention is not limited to this, however, and may be configured, as a second embodiment, to use the cross-correlation value of the representative excitations of the two frames.
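A cross-correlation based choice of α could look like the following sketch. The text does not say how the correlation is computed or mapped to α, so the normalization, the truncation to the shorter of the two pitch periods, the absence of peak alignment, and the linear mapping with its limits are all assumptions made for this example.

```python
import math


def normalized_cross_correlation(rep_now, rep_prev):
    """Zero-lag normalized correlation of two representative excitations.

    ASSUMPTION: the two segments are compared over their common length.
    """
    n = min(len(rep_now), len(rep_prev))
    num = sum(rep_now[i] * rep_prev[i] for i in range(n))
    den = math.sqrt(sum(x * x for x in rep_now[:n]) *
                    sum(y * y for y in rep_prev[:n]))
    return num / den if den > 0.0 else 0.0


def alpha_from_correlation(corr, alpha_min=0.2, alpha_max=0.9):
    """Low similarity gives a large alpha (limits are assumed values)."""
    similarity = max(0.0, min(1.0, corr))
    return alpha_max - similarity * (alpha_max - alpha_min)
```

The mapping runs in the opposite direction to the distance-based one: a low correlation indicates a large change between frames and therefore a larger α.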

[0025] The present invention is not limited to the above embodiments. The embodiments are illustrative; anything that has substantially the same configuration as the technical idea described in the claims of the present invention and that provides the same operation and effect falls within the technical scope of the present invention.

[0026]

[Effects of the Invention] As described above, according to the present invention, the excitation shaping means shapes the representative excitation so that it approaches an impulse, so a decoded excitation signal that changes smoothly between frames is obtained. In addition, when the change in the input speech signal is large, the representative excitation is shaped to approach an impulse still more closely, so a decoded excitation signal that can follow even large changes is obtained. This has the advantage that a decoded speech signal of good quality can be obtained.

[Brief description of drawings]

[FIG. 1] A block diagram showing the configuration of a speech coding/decoding apparatus according to a first embodiment of the present invention.

[FIG. 2] A diagram explaining the operation of the excitation shaping unit in the speech coding/decoding apparatus shown in FIG. 1.

[FIG. 3] A diagram explaining the generation of the decoded excitation signal in the speech coding/decoding apparatus shown in FIG. 1.

[FIG. 4] A block diagram showing the configuration of a conventional speech coding/decoding apparatus.

[FIG. 5] A diagram explaining the generation of the decoded excitation signal in the speech coding/decoding apparatus shown in FIG. 4.

[Explanation of symbols]

2: spectral envelope analysis unit
4: inverse filter unit
7: phase equalization unit
9: representative excitation extraction unit
11: pitch period extraction unit
13: voiced/unvoiced determination unit
17: representative excitation encoding unit
19: representative excitation decoding unit
21: representative excitation repetition unit
23: synthesis filter unit
25: excitation shaping unit
101: speech coding/decoding apparatus
110: encoding unit
120: decoding unit
200: speech coding/decoding apparatus
210: encoding unit
220: decoding unit
S: signals, etc.

Claims

[Claims]

1. A speech coding/decoding apparatus comprising: an encoding unit that analyzes an input speech signal frame by frame, each frame having a predetermined length, to obtain spectral envelope information, an excitation signal and a pitch period of the input speech signal, that, when the input speech signal is voiced, extracts a representative excitation one pitch period long after performing short-time phase equalization so that the phase of the excitation signal becomes zero at the maximum-amplitude positions appearing on the signal waveform at intervals of the pitch period, and that encodes the representative excitation, the pitch period and the spectral envelope information and outputs them as a spectral envelope code; and a decoding unit that decodes the encoded parameters output from the encoding unit, that, when the input speech signal is voiced, generates a decoded excitation signal by arranging the decoded representative excitation one pitch period long at intervals of the pitch period, and that decodes a decoded speech signal using the decoded excitation signal and the decoded spectral envelope; characterized in that the decoding unit comprises excitation shaping means for performing weighted addition of the decoded representative excitation of the frame concerned and an impulse signal, and representative excitation repetition means for generating the decoded excitation signal by arranging the shaped representative excitation output from the excitation shaping means at intervals of the pitch period.

2. The speech coding/decoding apparatus according to claim 1, wherein the excitation shaping means changes the value of the weighting coefficient used in the weighted addition of the representative excitation and the impulse signal in accordance with the magnitude of the change in the speech signal between the frame concerned and a frame in its vicinity.