JP3174781B2

JP3174781B2 - Diffusion sound source vector generation apparatus and diffusion sound source vector generation method

Info

Publication number: JP3174781B2
Application number: JP2000113779A
Authority: JP
Inventors: 和敏安永; 利幸森井
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1996-08-22
Filing date: 2000-04-14
Publication date: 2001-06-11
Anticipated expiration: 2016-08-22
Also published as: JP2000330596A

Abstract

PROBLEM TO BE SOLVED: To obtain good synthesized speech with a small amount of computation for the codebook search and a small ROM capacity by supplying, as the excitation (sound source) vector, the same vectors as the code vectors generated by an algebraic codebook. SOLUTION: An excitation information storage unit 1 stores excitation information, such as the generation information for code vectors identical to those of an algebraic codebook, and outputs that information as an excitation vector. A diffusion vector storage unit 2 stores diffusion information, such as a random sequence, and outputs it as a diffusion vector. A convolution unit 3 takes the excitation vector and the diffusion vector as inputs, convolves them, and outputs the result as a diffused excitation vector. The diffused excitation component of the driving excitation therefore has a random character, which improves the quality of the output speech. Furthermore, because the diffused excitation vector is built from the excitation information in storage unit 1 and the diffusion vector in storage unit 2, the ROM capacity can be reduced.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an apparatus and a method for generating an excitation vector used in a speech encoding/decoding apparatus.

[0002]

[Prior Art] In mobile communications such as digital cellular telephony, low-bit-rate speech compression coding (up to about 8 kbps) is in demand to cope with the growing number of subscribers. In Japan, the speech coding schemes VSELP and PSI-CELP have been adopted and put into practical use as the standard speech coding schemes for full-rate and half-rate digital cellular phones, respectively. Internationally, the CS-ACELP scheme was adopted as the 8 kbps international standard speech coding scheme and became ITU-T Recommendation G.729 (1995). All of these schemes are refinements of the CELP scheme (described in "CODE-EXCITED LINEAR PREDICTION (CELP): HIGH QUALITY SPEECH AT VERY LOW RATES", Manfred R. Schroeder, Bishnu S. Atal, Proc. ICASSP'85, pp. 937-940).

[0003] The basic algorithm of the CELP scheme is described here. CELP encodes speech by separating it into excitation (sound source) information and vocal tract information. The excitation information is encoded as the indices of code vectors stored in codebooks, and the vocal tract information is encoded as LPC (linear prediction coefficients). CELP is further characterized by its use of analysis by synthesis (A-b-S): when the excitation information is encoded, candidates are compared against the input speech with the vocal tract information taken into account. In CELP, the input speech is generally divided into intervals of a fixed duration (called frames) on which LPC analysis is performed, and the excitation search of the adaptive codebook and the stochastic codebook is performed on each of the shorter intervals (called subframes) into which a frame is further divided.

[0004] FIG. 5 is a functional block diagram of a conventional CELP speech decoding apparatus. A parameter decoding unit 152 receives, through a transmission unit 151, the speech code (LPC code, stochastic codebook index, encoded stochastic codebook gain, adaptive codebook index, and encoded adaptive codebook gain) sent from a conventional CELP speech encoding apparatus (FIG. 6, described later). It then decodes the LPC code to obtain decoded LPC coefficients, decodes the encoded stochastic codebook gain to obtain the decoded stochastic codebook gain, and decodes the encoded adaptive codebook gain to obtain the decoded adaptive codebook gain. Finally, it outputs the stochastic codebook index, the decoded stochastic codebook gain, the adaptive codebook index, and the decoded adaptive codebook gain to a driving excitation generation unit 155, and outputs the decoded LPC coefficients to an LPC synthesis unit 156.

[0005] The driving excitation generation unit 155 first reads the adaptive code vector selected by the adaptive codebook index from an adaptive codebook 153 and multiplies it by the decoded adaptive codebook gain to obtain the adaptive codebook component of the driving excitation. Next, it reads the stochastic code vector selected by the stochastic codebook index from a stochastic codebook 154 and multiplies it by the decoded stochastic codebook gain to obtain the stochastic codebook component of the driving excitation. It then adds the two components to obtain the driving excitation and outputs it to the LPC synthesis unit 156 and the adaptive codebook 153. The old code vectors in the adaptive codebook 153 are updated with the driving excitation supplied by the driving excitation generation unit 155.
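The excitation generation described in paragraph [0005] can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function name, the list-based adaptive codebook, and in particular the mapping from the adaptive codebook index to a lag in samples (with the lag assumed to be at least one subframe long) are assumptions introduced here.

```python
def generate_excitation(adaptive_codebook, lag, stochastic_codebook, index,
                        gain_adaptive, gain_stochastic, subframe_len):
    """Return (driving excitation for one subframe, updated adaptive codebook).

    Assumption: the adaptive codebook is a list of past excitation samples and
    the adaptive index has already been mapped to `lag` >= subframe_len.
    """
    # Adaptive code vector: subframe_len samples starting `lag` samples back.
    start = len(adaptive_codebook) - lag
    adaptive_vec = adaptive_codebook[start:start + subframe_len]
    # Stochastic code vector selected by its index.
    stochastic_vec = stochastic_codebook[index]
    # Driving excitation = gain-scaled adaptive part + gain-scaled stochastic part.
    excitation = [gain_adaptive * a + gain_stochastic * c
                  for a, c in zip(adaptive_vec, stochastic_vec)]
    # Update: discard the oldest samples and append the new excitation.
    updated = adaptive_codebook[subframe_len:] + excitation
    return excitation, updated
```

The same update rule applies on the encoder side, which is what keeps the adaptive codebooks of the encoder and decoder in step.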

[0006] The LPC synthesis unit 156 performs LPC synthesis on the driving excitation obtained by the driving excitation generation unit 155, using the decoded LPC coefficients obtained from the parameter decoding unit 152, and sends the result to the output section as digital output speech 157.
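LPC synthesis is commonly realized as an all-pole filter driven by the excitation. The sketch below assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, which the text does not specify; the function name and the explicit filter memory are likewise illustrative.

```python
def lpc_synthesis(excitation, lpc, state=None):
    """All-pole synthesis: y[n] = x[n] - sum_{k=1..p} a_k * y[n-k].

    `lpc` holds a_1..a_p (assumed sign convention). `state` carries
    y[n-1], y[n-2], ... from the previous subframe; zeros if omitted.
    Returns (output samples, filter memory for the next call).
    """
    p = len(lpc)
    memory = list(state) if state is not None else [0.0] * p
    output = []
    for x in excitation:
        y = x - sum(a * m for a, m in zip(lpc, memory))
        output.append(y)
        # Shift the memory so that memory[0] is always the latest output.
        memory = [y] + memory[:-1]
    return output, memory
```

Feeding a unit impulse through this function yields the impulse response of the synthesis filter, the quantity the encoder description below passes between its units.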

[0007] FIG. 6 is a functional block diagram of a conventional CELP speech encoding apparatus. An LPC analysis unit 112 first computes LPC coefficients by performing autocorrelation analysis and linear prediction analysis on a frame of digital input speech 111, quantizes the LPC coefficients to obtain an LPC code, outputs the LPC code to a parameter encoding unit 123, and decodes the LPC code to obtain decoded LPC coefficients. Next, it computes the impulse response of a perceptual weighting filter having characteristics such as pitch emphasis and high-frequency emphasis and outputs it to a perceptual weighting unit 113. It also computes the impulse response of a perceptually weighted LPC synthesis filter and outputs it to a perceptually weighted LPC reverse-order synthesis unit A 114, a perceptually weighted LPC synthesis unit A 116, a perceptually weighted LPC reverse-order synthesis unit B 119, and a perceptually weighted LPC synthesis unit B 121.

[0008] The perceptual weighting unit 113 applies perceptual weighting filtering to the input speech data in each subframe, subtracts the zero-input response of the perceptually weighted LPC synthesis filter from the result to obtain the target signal referred to during the adaptive codebook excitation search, and outputs the target signal to the perceptually weighted LPC reverse-order synthesis unit A 114 and a subtraction unit 118.

[0009] The perceptually weighted LPC reverse-order synthesis unit A 114 time-reverses the target signal obtained by the perceptual weighting unit 113, passes the reversed signal through a perceptually weighted LPC synthesis filter whose coefficients are the impulse response supplied by the LPC analysis unit 112, time-reverses the filter output again, and outputs the result to a comparison unit A 117 as the time-reverse synthesized output of the target signal.
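The point of the reverse, filter, reverse sequence is that the inner product of a raw code vector with the resulting signal equals the inner product of the synthesized code vector with the target, so the numerator of the search criterion needs no per-candidate filtering. A minimal sketch, assuming zero-state filtering truncated to the subframe (function names are illustrative):

```python
def synthesize(signal, h):
    """Zero-state filtering truncated to the subframe:
    y[n] = sum_k h[k] * signal[n - k]."""
    return [sum(h[k] * signal[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(signal))]


def reverse_order_synthesis(target, h):
    """Time-reverse the target, filter it, then time-reverse the output."""
    return synthesize(target[::-1], h)[::-1]


def inner(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))
```

With `r = reverse_order_synthesis(target, h)`, the identity `inner(v, r) == inner(synthesize(v, h), target)` holds for any code vector `v` of subframe length.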

[0010] An adaptive codebook 115 stores the past driving excitation received from an adaptive codebook update unit 124. This past driving excitation is referred to as adaptive code vectors by the perceptually weighted LPC synthesis unit A 116, the comparison unit A 117, and the adaptive codebook update unit 124.

[0011] The perceptually weighted LPC synthesis unit A 116 reads an adaptive code vector from the adaptive codebook 115, synthesizes it with a perceptually weighted LPC synthesis filter whose coefficients are the impulse response obtained from the LPC analysis unit 112, and outputs the result to the comparison unit A 117.

[0012] The comparison unit A 117 first computes the square of the inner product between the adaptive code vector read directly from the adaptive codebook 115 and the time-reverse synthesized output of the target signal obtained by the perceptually weighted LPC reverse-order synthesis unit A 114. Next, it computes the power of the signal, received from the perceptually weighted LPC synthesis unit A 116, that results from applying perceptually weighted LPC synthesis to the adaptive code vector. It then divides the squared inner product by this power to obtain the criterion value of the adaptive codebook search. It determines the index of the adaptive code vector for which the criterion value is largest, computes the optimal gain by which that code vector is to be multiplied, and outputs both to the subtraction unit 118 and the parameter encoding unit 123. This series of operations is called the adaptive codebook search.

[0013] The subtraction unit 118 takes the code vector selected by the adaptive codebook search, applies perceptually weighted LPC synthesis to it, multiplies the result by the gain, and subtracts the resulting signal from the target signal obtained by the perceptual weighting unit 113. It outputs the difference to the perceptually weighted LPC reverse-order synthesis unit B 119 as the target signal referred to during the stochastic codebook search.

[0014] The perceptually weighted LPC reverse-order synthesis unit B 119 time-reverses the target signal for the stochastic codebook excitation search generated by the subtraction unit 118, applies perceptually weighted LPC synthesis to it, time-reverses the output again to obtain the time-reverse synthesized output of the target signal for the stochastic codebook excitation search, and outputs it to a comparison unit B 122.

[0015] A stochastic codebook 120 stores a plurality of code vectors, which are referred to as stochastic code vectors by the perceptually weighted LPC synthesis unit B 121, the comparison unit B 122, and the adaptive codebook update unit 124.

[0016] The perceptually weighted LPC synthesis unit B 121 synthesizes the stochastic code vector read from the stochastic codebook 120 with a perceptually weighted LPC synthesis filter whose coefficients are the impulse response obtained from the LPC analysis unit 112, and outputs the synthesized signal to the comparison unit B 122.

[0017] As shown in (Equation 1), the comparison unit B 122 first computes the square of the inner product between the i-th stochastic code vector V(i,n), read directly from the stochastic codebook 120, and the time-reverse synthesized output r(n) of the target signal obtained by the perceptually weighted LPC reverse-order synthesis unit B 119. Next, it computes the power of the synthesized signal S(i,n) received from the perceptually weighted LPC synthesis unit B 121. It then divides the squared inner product by this power to obtain the criterion value std(i) of the stochastic codebook search. It determines the index identifying the stochastic code vector for which the criterion value is largest, computes the optimal gain by which that stochastic code vector is to be multiplied, and outputs both to the parameter encoding unit 123. This series of operations is called the stochastic codebook search.

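[0018]

(Equation 1)

$$\mathrm{std}(i) = \frac{\left(\sum_{n} r(n)\,V(i,n)\right)^{2}}{\sum_{n} S(i,n)^{2}}$$

Here the sums run over the samples n of one subframe. The equation image is missing from this text; the formula above is reconstructed from the definitions in paragraph [0017].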
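The search that paragraph [0017] describes can be sketched as an exhaustive loop over the codebook. The gain formula used here, inner product divided by power, is the usual least-squares choice and is an assumption, since the text only says that an optimal gain is computed. Function names are illustrative.

```python
def synthesize(signal, h):
    """Zero-state filtering truncated to the subframe."""
    return [sum(h[k] * signal[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(signal))]


def search_codebook(codebook, r, h):
    """Return (index, gain) of the code vector maximizing
    std(i) = (sum_n r[n] V[i][n])^2 / sum_n S[i][n]^2.

    `r` is the time-reverse synthesized target and `h` the impulse response
    of the perceptually weighted LPC synthesis filter.
    """
    best_index, best_std, best_gain = -1, -1.0, 0.0
    for i, v in enumerate(codebook):
        s = synthesize(v, h)                        # S(i, n)
        corr = sum(a * b for a, b in zip(v, r))     # numerator before squaring
        power = sum(x * x for x in s)               # denominator
        if power == 0:
            continue                                # skip all-zero candidates
        std = corr * corr / power
        if std > best_std:
            best_index, best_std = i, std
            best_gain = corr / power                # assumed least-squares gain
    return best_index, best_gain
```

Each candidate costs one filtering pass and two inner products, which is the computational load the remainder of the document sets out to reduce.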

[0019] The parameter encoding unit 123 first encodes the optimal gain for the adaptive code vector obtained from the comparison unit A 117 and the optimal gain for the stochastic code vector obtained by the comparison unit B 122, yielding the encoded adaptive codebook gain and the encoded stochastic codebook gain, respectively. Next, it decodes these encoded gains to obtain the decoded adaptive codebook gain and the decoded stochastic codebook gain. It then outputs the encoded adaptive codebook gain, the encoded stochastic codebook gain, the adaptive codebook index, the stochastic codebook index, and the LPC code to a transmission unit 125, and outputs the decoded adaptive codebook gain, the decoded stochastic codebook gain, the adaptive codebook index, and the stochastic codebook index to the adaptive codebook update unit 124.

[0020] On receiving its input from the parameter encoding unit 123, the adaptive codebook update unit 124 first reads the adaptive code vector selected by the adaptive codebook index from the adaptive codebook 115 and multiplies it by the decoded adaptive codebook gain to obtain the adaptive codebook component of the driving excitation. Next, it reads the stochastic code vector selected by the stochastic codebook index from the stochastic codebook 120 and multiplies it by the decoded stochastic codebook gain to obtain the stochastic codebook component of the driving excitation. It adds the two components to obtain the driving excitation and outputs it to the adaptive codebook 115. The old code vectors in the adaptive codebook 115 are updated with the driving excitation supplied by the adaptive codebook update unit 124.

[0021] Stochastic codebooks used in CELP include the noise codebook and the algebraic codebook (described in "8 KBIT/S ACELP CODING OF SPEECH WITH 10 MS SPEECH-FRAME: A CANDIDATE FOR CCITT STANDARDIZATION", R. Salami, C. Laflamme, J-P. Adoul, ICASSP'94, pp. II-97 to II-100, 1994). Each is briefly described below.

[0022] The "noise codebook" is the most classical codebook and stores random sequences generated from random numbers. Because the codebook is random in character, it yields high-quality synthesized speech. However, all code vectors must be stored in advance, so a large ROM capacity is required. In addition, the codebook search requires perceptually weighted LPC synthesis of every code vector, so a large amount of computation is needed.

[0023] Each code vector of the "algebraic codebook" consists of four pulses of magnitude 1 (amplitude +1 or -1), and the position of each pulse is determined by computation from the index. A ROM for the codebook is therefore unnecessary. However, because each code vector consists of only a few pulses, the quality of the synthesized speech (particularly in unvoiced segments) deteriorates. The foremost feature of the algebraic codebook is that the codebook search can be performed with a small amount of computation. This follows from two facts. First, because V(i,n) contains only four pulses of magnitude 1, the numerator of (Equation 1) can be computed by adding the values of four samples of r(n) (three additions) and squaring the sum (one multiplication). Second, if the autocorrelation matrix of the impulse response of the perceptually weighted LPC synthesis filter is computed beforehand and stored in RAM, the denominator of (Equation 1) can be computed with at most 15 additions (the actual amount of computation is smaller still when the structure of the four-level nested loop and the symmetry of the autocorrelation matrix are exploited).
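The two shortcuts in paragraph [0023] can be illustrated as follows. This is a sketch under assumptions: the pulse positions are taken to be distinct, the autocorrelation matrix is computed for zero-state filtering truncated to the subframe, and the function names are invented for the example.

```python
def autocorrelation_matrix(h, n_len):
    """phi[a][b] = sum_n h[n - a] * h[n - b], summed over the subframe."""
    phi = [[0.0] * n_len for _ in range(n_len)]
    for a in range(n_len):
        for b in range(n_len):
            phi[a][b] = sum(h[n - a] * h[n - b]
                            for n in range(max(a, b), n_len)
                            if n - a < len(h) and n - b < len(h))
    return phi


def fast_criterion(positions, signs, r, phi):
    """Criterion of (Equation 1) for a four-pulse vector, without filtering.

    Numerator: signed sum of four samples of r, squared.
    Denominator: signed sum of entries of the precomputed matrix phi,
    using its symmetry to visit each off-diagonal pair once.
    """
    numerator = sum(s * r[p] for p, s in zip(positions, signs)) ** 2
    denominator = sum(phi[p][p] for p in positions)
    for a in range(4):
        for b in range(a + 1, 4):
            denominator += 2 * signs[a] * signs[b] * phi[positions[a]][positions[b]]
    return numerator / denominator
```

The matrix is computed once per subframe, so the per-candidate cost no longer depends on the length of the impulse response.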

[0024]

[Problems to be Solved by the Invention] With a noise codebook, the perceptually weighted LPC synthesis filter can be driven by an excitation with random characteristics, so the quality of the synthesized speech is high. However, the numerator and denominator of (Equation 1) are expensive to compute, so the codebook search requires a large amount of computation, and the ROM capacity is large because all code vectors must be stored in advance.

[0025] With an algebraic codebook, on the other hand, the numerator and denominator of (Equation 1) are cheap to compute, so the codebook search requires little computation, and the ROM capacity is small because the code vectors need not be stored as they are. However, the perceptually weighted LPC synthesis filter is driven by only a few pulses, so the quality of the synthesized speech (particularly in unvoiced segments) deteriorates.

[0026] An object of the present invention is to provide an excitation vector generation apparatus and method, for use in speech encoding and speech decoding apparatuses, that combine the advantages of a small amount of computation for the codebook search, a small ROM capacity, and good synthesized speech.

[0027]

[Means for Solving the Problems] To solve these problems, the diffused excitation (sound source) vector generation apparatus of the present invention comprises excitation vector supply means for supplying an excitation vector, means for storing a diffusion vector that diffuses the excitation vector, and convolution means for convolving the stored diffusion vector with the excitation vector and outputting a diffused excitation vector, wherein a code vector generated by an algebraic codebook is used as the excitation vector. Likewise, the diffused excitation vector generation method of the present invention comprises a step of supplying an excitation vector, a step of supplying a diffusion vector that diffuses the excitation vector, and a step of convolving the diffusion vector with the excitation vector and outputting a diffused excitation vector, wherein a code vector generated by an algebraic codebook is used as the excitation vector.

[0028]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to FIGS. 1 to 4.

[0029] (Embodiment 1) An embodiment of the present invention is described with reference to FIGS. 1 and 2.

[0030] FIG. 1 is a block diagram of the main part of a speech decoding apparatus according to the present invention. In FIG. 1, 1 is an excitation information storage unit, 2 is a diffusion vector storage unit, and 3 is a convolution unit. The excitation information storage unit 1 stores excitation information, for example the generation information for code vectors identical to those of an algebraic codebook, and outputs that information as an excitation vector. The diffusion vector storage unit 2 stores diffusion information, for example a random sequence, and outputs it as a diffusion vector. The convolution unit 3 takes the excitation vector and the diffusion vector as inputs, convolves them, and outputs the result as a diffused excitation vector.
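The convolution performed by the convolution unit can be sketched in two equivalent ways. Because an algebraic code vector holds only a few signed unit pulses, the convolution reduces to adding shifted, signed copies of the diffusion vector. Truncating the result to the subframe length is an assumption made here, as are all function names; the text does not say how the vector lengths are handled.

```python
def algebraic_code_vector(positions, signs, n_len):
    """Build a sparse pulse vector from pulse positions and signs."""
    v = [0.0] * n_len
    for p, s in zip(positions, signs):
        v[p] += s
    return v


def convolve_truncated(x, d):
    """Direct convolution of x with d, truncated to len(x) samples."""
    n_len = len(x)
    return [sum(x[k] * d[n - k] for k in range(n + 1) if n - k < len(d))
            for n in range(n_len)]


def diffuse_sparse(positions, signs, d, n_len):
    """Same result using only the pulses: add a signed copy of the
    diffusion vector d at every pulse position."""
    out = [0.0] * n_len
    for p, s in zip(positions, signs):
        for j, dj in enumerate(d):
            if p + j >= n_len:
                break
            out[p + j] += s * dj
    return out
```

Only the pulse generation rule and one diffusion vector need to be stored, which is where the ROM saving described in the text comes from.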

[0031] FIG. 2 shows a CELP speech decoding apparatus in which the main part of the speech decoding apparatus of FIG. 1 is used in place of the conventional stochastic codebook section. The excitation information storage unit 1 stores the generation information for code vectors identical to those of an algebraic codebook, and the diffusion vector storage unit 2 stores a random sequence as the diffusion vector. A parameter decoding unit 4 first receives, through a transmission unit 5, the speech code generated by the encoding apparatus (LPC code, diffused excitation index, encoded gain of the diffused excitation vector, adaptive codebook index, and encoded adaptive codebook gain) and decodes the LPC code to obtain decoded LPC coefficients. Next, it decodes the encoded gain of the diffused excitation vector to obtain the decoded gain of the diffused excitation vector, and decodes the encoded adaptive codebook gain to obtain the decoded adaptive codebook gain. It then outputs the diffused excitation index to the convolution unit 3, outputs the adaptive codebook index, the decoded adaptive codebook gain, and the decoded gain of the diffused excitation vector to a driving excitation generation unit 7, and outputs the decoded LPC coefficients to an LPC synthesis unit 8.

[0032] The convolution unit 3 first reads the excitation vector selected by the diffused excitation index from the excitation information storage unit 1, next reads the diffusion vector stored in the diffusion vector storage unit 2, and then convolves the excitation vector with the diffusion vector to generate a diffused excitation vector, which it outputs to the driving excitation generation unit 7. The driving excitation generation unit 7 first reads the adaptive code vector selected by the adaptive codebook index from an adaptive codebook 6 and multiplies it by the decoded adaptive codebook gain to obtain the adaptive codebook component of the driving excitation, and multiplies the diffused excitation vector obtained from the convolution unit 3 by the decoded gain of the diffused excitation vector to obtain the diffused excitation vector component of the driving excitation. Next, it adds the two components to obtain the driving excitation and outputs it to the LPC synthesis unit 8 and the adaptive codebook 6. The old code vectors in the adaptive codebook 6 are updated with the driving excitation supplied by the driving excitation generation unit 7. The LPC synthesis unit 8 synthesizes the driving excitation supplied by the driving excitation generation unit 7 with an LPC synthesis filter having the decoded LPC coefficients obtained from the parameter decoding unit 4, and obtains output speech 9.

[0033] In this embodiment, the diffusion vector storage unit 2 stores a random sequence as the diffusion vector, so the diffused excitation vector component of the driving excitation has a random character, which improves the quality of the output speech (particularly in unvoiced segments). In addition, because the diffused excitation vector can be created from the excitation information in the excitation information storage unit 1 and the diffusion vector in the diffusion vector storage unit 2, the ROM capacity is small.

[0034] This embodiment has been described using an example in which the excitation information storage unit 1 stores the generation information for code vectors identical to those of an algebraic codebook, but the invention can be practiced in the same way when the generation information of another codebook, or another codebook itself, is stored. Likewise, this embodiment has been described using an example in which the diffusion vector storage unit 2 stores a random sequence, but the invention can be practiced in the same way with a sequence obtained by training or a sequence derived from prior knowledge.

[0035] Although the speech decoding apparatus of this embodiment is of the CELP type, the invention is also applicable to other speech decoding apparatuses, such as those of the VOCODER type.

[0036] (Embodiment 2) An embodiment of the present invention is described with reference to FIGS. 3 and 4.

[0037] FIG. 3 is a block diagram of the main part of a speech encoding apparatus according to the present invention. In FIG. 3, 11 is a diffusion vector storage unit, 12 is a digital filter, 13 is a convolution unit, and 14 is an excitation information storage unit. The diffusion vector storage unit 11 stores diffusion information, for example a random sequence, and outputs it as a diffusion vector.

[0038] The digital filter 12 filters an input signal and outputs the result, and also outputs filter information that determines the characteristics of the filter itself. FIG. 3 shows an impulse response or coefficients being output as the filter information. The sound source information storage unit 14 stores sound source information, such as the same code vector generation information as an algebraic codebook, and outputs that sound source information as a sound source vector.

[0039] In FIG. 3(a), the convolution unit 13 receives the diffusion vector and the impulse response or coefficients, convolves them, and outputs a new impulse response or new coefficients. In FIG. 3(b), in addition to the function of FIG. 3(a), the convolution unit 13 receives the sound source vector and the diffusion vector, convolves them, and outputs the result as a diffused sound source vector.
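
The two uses of the convolution unit 13 described above are linked by the associativity of convolution: filtering the sparse sound source vector with the new impulse response gives the same result as filtering the diffused sound source vector with the original impulse response. The following is a minimal sketch of that relationship only. The subframe length, the exponentially decaying impulse response, the random diffusion vector, and the pulse positions are all hypothetical values chosen for illustration and do not come from the specification.

```python
import random

def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

random.seed(0)
N = 16  # hypothetical subframe length

# Hypothetical stand-ins for the signals named in FIG. 3.
impulse_response = [0.9 ** n for n in range(N)]               # from digital filter 12
diffusion_vector = [random.uniform(-1, 1) for _ in range(8)]  # from storage unit 11
sound_source_vector = [0.0] * N                               # from storage unit 14
for position, sign in [(1, 1.0), (5, -1.0), (9, 1.0), (14, -1.0)]:
    sound_source_vector[position] = sign

# FIG. 3(a): diffusion vector * impulse response -> new impulse response.
new_impulse_response = convolve(diffusion_vector, impulse_response)
via_new_response = convolve(sound_source_vector, new_impulse_response)

# FIG. 3(b): sound source vector * diffusion vector -> diffused sound source vector.
diffused_vector = convolve(sound_source_vector, diffusion_vector)
via_diffused_vector = convolve(diffused_vector, impulse_response)

# Largest difference between the two paths (rounding error only).
max_diff = max(abs(x - y) for x, y in zip(via_new_response, via_diffused_vector))
```

Because the two paths agree, a search can operate on the sparse sound source vectors with the new impulse response, and the diffused sound source vector itself only needs to be formed once for the selected index.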

[0040] FIG. 4 shows a CELP-type speech coding apparatus in which the main part of the speech coding apparatus of FIG. 3 is used in place of a conventional stochastic codebook unit. The sound source information storage unit 14 stores the same code vector generation information as an algebraic codebook, and the diffusion vector storage unit 11 stores a random number sequence as the diffusion vector. Reference numeral 15 denotes the input speech, which is digital input speech data.

[0041] The LPC analysis unit 16 calculates LPC coefficients by performing autocorrelation analysis and linear prediction analysis on a frame of the input speech 15, quantizes the LPC coefficients to obtain an LPC code, outputs the LPC code to the parameter coding unit 17, and decodes the LPC code to obtain decoded LPC coefficients. Next, it obtains the impulse response of a perceptual weighting filter having characteristics such as pitch emphasis and high-frequency emphasis and outputs it to the perceptual weighting unit 18. It also obtains the impulse response of the perceptual weighting LPC synthesis filter and outputs it to the perceptual weighting LPC reverse-order synthesis unit A19 and the perceptual weighting LPC synthesis unit A20. The perceptual weighting LPC reverse-order synthesis unit A19 and the perceptual weighting LPC synthesis unit A20 each include a digital filter.
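
The autocorrelation analysis and linear prediction analysis named above can be illustrated with the Levinson-Durbin recursion, a common way to obtain LPC coefficients from autocorrelation values. The specification does not say which algorithm the LPC analysis unit 16 uses, so the choice of Levinson-Durbin, the function names, and the toy autocorrelation values below are assumptions for illustration. Quantization and decoding of the coefficients are not shown.

```python
def autocorrelation(frame, order):
    """r[k] = sum_n frame[n] * frame[n + k] for k = 0..order."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Returns the coefficient list a (with a[0] == 1.0) and the final
    prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Toy check: autocorrelation of a first-order process with pole at 0.5.
a, err = levinson_durbin([1.0, 0.5, 0.25], 2)
r_demo = autocorrelation([1.0, 2.0, 3.0], 1)
```

In the toy check, the recursion yields a first coefficient of -0.5 and a second coefficient of zero, which matches a first-order predictor.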

[0042] The perceptual weighting unit 18 performs perceptual weighting filtering on the input speech data for each subframe and subtracts the zero-input response of the perceptual weighting LPC synthesis filter from the result, thereby obtaining the target signal referenced in the adaptive codebook sound source search. It outputs this target signal to the perceptual weighting LPC reverse-order synthesis unit A19 and the subtraction unit 24. The perceptual weighting LPC reverse-order synthesis unit A19 time-reverses the target signal received from the perceptual weighting unit 18, synthesizes the reversed signal with a perceptual weighting LPC synthesis filter whose coefficients are the impulse response supplied by the LPC analysis unit 16, time-reverses the output signal again to obtain the time-reversed synthesis output of the target signal, and outputs it to the comparison unit A21.
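
The reverse, filter, reverse sequence performed by the reverse-order synthesis unit can be sketched as follows. The point of the operation is that the inner product of a candidate code vector with the time-reversed synthesis output equals the inner product of the target with the synthesized candidate, so candidates need not be synthesized just to compute the correlation. The filter taps, target, and code vector below are hypothetical values, and zero-state filtering truncated to the subframe is an assumption of this sketch.

```python
def synthesize(h, v):
    """Zero-state filtering: out[n] = sum_k h[k] * v[n - k], truncated to len(v)."""
    return [
        sum(h[k] * v[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(v))
    ]

def backward_filter(h, target):
    """Time-reverse the target, filter it, and time-reverse the output again."""
    return synthesize(h, target[::-1])[::-1]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical impulse response, target signal, and candidate code vector.
h = [1.0, 0.6, 0.3, 0.1]
target = [0.2, -1.0, 0.5, 0.7, -0.3, 0.4]
code_vector = [0.0, 1.0, 0.0, -0.5, 0.0, 0.25]

d = backward_filter(h, target)                    # time-reversed synthesis output
direct = dot(target, synthesize(h, code_vector))  # synthesize first, then correlate
shortcut = dot(d, code_vector)                    # correlate with unsynthesized vector
```

The values `direct` and `shortcut` agree up to rounding, which is why the comparison units can take the inner product with code vectors read directly from the codebook.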

[0043] The adaptive codebook 22 stores past driving sound sources received from the adaptive codebook update unit 23. This past driving sound source information is referenced as adaptive code vectors by the perceptual weighting LPC synthesis unit A20, the comparison unit A21, and the adaptive codebook update unit 23. The perceptual weighting LPC synthesis unit A20 first reads an adaptive code vector from the adaptive codebook 22, synthesizes the adaptive code vector with a perceptual weighting LPC synthesis filter whose coefficients are the impulse response obtained from the LPC analysis unit 16, and outputs the result to the comparison unit A21. It then outputs the impulse response of the perceptual weighting LPC synthesis filter to the convolution unit 13.

[0044] The comparison unit A21 first computes the square of the inner product of the adaptive code vector read directly from the adaptive codebook 22 and the time-reversed synthesis output of the target signal obtained by the perceptual weighting LPC reverse-order synthesis unit A19. Next, it computes the power of the signal, received from the perceptual weighting LPC synthesis unit A20, that results from applying perceptual weighting LPC synthesis to the adaptive code vector, and divides the squared inner product by this power to obtain the criterion value for the adaptive code vector search. It then calculates the index of the adaptive code vector read when the criterion value is largest, together with the optimum gain by which that adaptive code vector is multiplied, and outputs them to the subtraction unit 24 and the parameter coding unit 17.
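
The search criterion described above, squared inner product divided by power, can be sketched as a loop over candidate vectors. The specification does not give a formula for the optimum gain, so the least-squares gain used here (inner product divided by power) is an assumption, as are the function names, the filter taps, and the three candidate vectors.

```python
def synthesize(h, v):
    """Zero-state filtering: out[n] = sum_k h[k] * v[n - k], truncated to len(v)."""
    return [
        sum(h[k] * v[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(v))
    ]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search_codebook(candidates, h, target):
    """Return (index, gain) of the candidate that maximizes corr^2 / power."""
    d = synthesize(h, target[::-1])[::-1]   # time-reversed synthesis of the target
    best_index, best_criterion, best_gain = None, -1.0, 0.0
    for index, vector in enumerate(candidates):
        synthesized = synthesize(h, vector)
        power = dot(synthesized, synthesized)
        if power <= 0.0:
            continue                         # skip all-zero candidates
        corr = dot(vector, d)
        criterion = corr * corr / power
        if criterion > best_criterion:
            # Assumed least-squares gain for the selected vector.
            best_index, best_criterion, best_gain = index, criterion, corr / power
    return best_index, best_gain

# Hypothetical data: the target is candidate 1 synthesized and scaled by 2.
h = [1.0, 0.5, 0.25]
candidates = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, -1.0, 0.0],
]
target = [2.0 * v for v in synthesize(h, candidates[1])]
index, gain = search_codebook(candidates, h, target)
```

With this constructed target, the search selects candidate 1 with a gain of 2, since that candidate reproduces the target exactly after scaling.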

[0045] The subtraction unit 24 applies perceptual weighting LPC synthesis to the adaptive code vector selected in the adaptive codebook search, multiplies the synthesized signal by the optimum adaptive codebook gain obtained by the comparison unit A21, and subtracts the resulting signal from the target signal obtained by the perceptual weighting unit 18. It outputs the subtraction result to the perceptual weighting LPC reverse-order synthesis unit B25 as the target signal referenced in the diffused sound source vector search.

[0046] The diffusion vector storage unit 11 stores a random number sequence. The convolution unit 13 convolves the impulse response of the perceptual weighting LPC synthesis filter received from the perceptual weighting LPC synthesis unit A20 with the diffusion vector read from the diffusion vector storage unit 11, and outputs the result as a new impulse response to the perceptual weighting LPC reverse-order synthesis unit B25 and the perceptual weighting LPC synthesis unit B26. The perceptual weighting LPC reverse-order synthesis unit B25 and the perceptual weighting LPC synthesis unit B26 each include a digital filter.

[0047] The perceptual weighting LPC reverse-order synthesis unit B25 time-reverses the target signal for the diffused sound source vector search generated by the subtraction unit 24, and synthesizes the reversed signal with a perceptual weighting LPC synthesis filter whose coefficients are the new impulse response obtained from the convolution unit 13. It then time-reverses the output signal again and outputs it to the comparison unit B27 as the time-reversed synthesis output of the target signal.

[0048] Because the sound source information storage unit 14 stores the same code vector generation information as an algebraic codebook, it can generate code vectors consisting of a small number of pulses (four here). The generated code vector is referenced as the sound source vector by the perceptual weighting LPC synthesis unit B26, the comparison unit B27, and the convolution unit 13. The perceptual weighting LPC synthesis unit B26 synthesizes the sound source vector read from the sound source information storage unit 14 with a perceptual weighting LPC synthesis filter whose coefficients are the new impulse response obtained from the convolution unit 13, obtains the autocorrelation matrix of the synthesized signal, and outputs it to the comparison unit B27.

[0049] The comparison unit B27 first computes the square of the inner product of the sound source vector read directly from the sound source information storage unit 14 and the time-reversed synthesis output of the target signal obtained by the perceptual weighting LPC reverse-order synthesis unit B25. Next, referring to the autocorrelation matrix received from the perceptual weighting LPC synthesis unit B26, it computes the power of the signal obtained by applying perceptual weighting LPC synthesis to the sound source vector, and divides the squared inner product by this power to obtain the criterion value for the diffused sound source vector search. It then calculates the diffused sound source index, which indicates the number of the sound source vector read when the criterion value is largest, together with the optimum gain by which that diffused sound source vector is multiplied, and outputs them to the parameter coding unit 17.
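
Because each sound source vector contains only four pulses, both the inner product and the power in the criterion above reduce to sums over a handful of entries. The sketch below interprets the autocorrelation matrix as the matrix of correlations between shifted copies of the new impulse response, as is common in algebraic codebook searches. That interpretation, the interleaved track layout, the exhaustive search, and all numeric values are assumptions for illustration and are not stated in the specification.

```python
from itertools import product

def backward_filter(h, target):
    """d[n] = sum_k h[k] * target[n + k], equivalent to reverse, filter, reverse."""
    n = len(target)
    return [sum(h[k] * target[i + k] for k in range(min(len(h), n - i)))
            for i in range(n)]

def autocorrelation_matrix(h, n):
    """phi[i][j] = sum_m h[m - i] * h[m - j] over one subframe of length n."""
    hp = list(h[:n]) + [0.0] * max(0, n - len(h))
    return [[sum(hp[m - i] * hp[m - j] for m in range(max(i, j), n))
             for j in range(n)] for i in range(n)]

def search_pulses(tracks, d, phi):
    """Exhaustively pick one signed pulse per track maximizing corr^2 / power."""
    best_criterion, best_choice = -1.0, None
    for positions in product(*tracks):
        for signs in product((1.0, -1.0), repeat=len(tracks)):
            corr = sum(s * d[p] for s, p in zip(signs, positions))
            power = sum(si * sj * phi[pi][pj]
                        for si, pi in zip(signs, positions)
                        for sj, pj in zip(signs, positions))
            if power > 0.0 and corr * corr / power > best_criterion:
                best_criterion = corr * corr / power
                best_choice = (positions, signs)
    return best_choice

N = 16  # hypothetical subframe length
# Hypothetical new impulse response (diffusion vector convolved with the filter response).
new_impulse_response = [0.7 * 0.8 ** n - 0.3 * 0.5 ** n for n in range(N)]
# Hypothetical track layout: four interleaved tracks, one pulse each.
tracks = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15)]

# Build a target from a known pulse pattern so the search has a known answer.
true_positions, true_signs = (4, 1, 10, 15), (1.0, -1.0, 1.0, -1.0)
target = [0.0] * N
for p, s in zip(true_positions, true_signs):
    for n in range(p, N):
        target[n] += s * new_impulse_response[n - p]

d = backward_filter(new_impulse_response, target)
phi = autocorrelation_matrix(new_impulse_response, N)
positions, signs = search_pulses(tracks, d, phi)
```

The criterion is unchanged when every sign is flipped, so the search recovers the pulse positions and the signs up to a common factor of -1; the sign of the inner product would then determine the sign of the gain.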

[0050] The parameter coding unit 17 first encodes the optimum gain for the adaptive code vector obtained by the comparison unit A21 and the optimum gain for the diffused sound source vector obtained by the comparison unit B27, thereby obtaining the coded gain of the adaptive codebook and the coded gain of the diffused sound source vector, respectively. Next, it decodes these coded gains to obtain the decoded gain of the adaptive codebook and the decoded gain of the diffused sound source vector, respectively. It further outputs the coded gain of the adaptive codebook, the coded gain of the diffused sound source vector, the LPC code, the adaptive codebook index, and the diffused sound source index to the transmission unit 28; outputs the decoded gain of the adaptive codebook and the adaptive codebook index to the adaptive codebook update unit 23; and outputs the decoded gain of the diffused sound source vector and the diffused sound source index to the convolution unit 13.

[0051] The convolution unit 13 first reads the sound source vector corresponding to the diffused sound source index from the sound source information storage unit 14, and then reads the diffusion vector stored in the diffusion vector storage unit 11. It then convolves the sound source vector with the diffusion vector to generate the diffused sound source vector, multiplies the diffused sound source vector by the decoded gain to obtain the diffused sound source vector component of the driving sound source, and outputs that component to the adaptive codebook update unit 23. The adaptive codebook update unit 23 first reads from the adaptive codebook 22 the adaptive code vector corresponding to the adaptive codebook index received from the parameter coding unit 17, and multiplies that adaptive code vector by the decoded gain of the adaptive codebook to obtain the adaptive codebook component of the driving sound source. Next, it adds the adaptive codebook component and the diffused sound source vector component input from the convolution unit 13 to generate the driving sound source, and outputs the generated driving sound source to the adaptive codebook 22. The old code vectors in the adaptive codebook 22 are then updated with the driving sound source input from the adaptive codebook update unit 23.
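
The generation of the driving sound source and the update of the adaptive codebook can be sketched as below. The specification does not state how the convolution output is fitted to the subframe length or how the adaptive codebook is laid out in memory, so truncating the convolution to the subframe and treating the adaptive codebook as a shifting history buffer are assumptions of this sketch, as are all names and numeric values.

```python
def diffuse(sound_source_vector, diffusion_vector):
    """Convolve the pulse vector with the diffusion vector, truncated to the subframe."""
    n = len(sound_source_vector)
    out = [0.0] * n
    for i, pulse in enumerate(sound_source_vector):
        if pulse == 0.0:
            continue                     # only the few nonzero pulses contribute
        for j, value in enumerate(diffusion_vector):
            if i + j < n:
                out[i + j] += pulse * value
    return out

def build_driving_source(adaptive_vector, adaptive_gain,
                         sound_source_vector, diffusion_vector, diffused_gain):
    """Sum of the gain-scaled adaptive and diffused sound source components."""
    diffused = diffuse(sound_source_vector, diffusion_vector)
    return [adaptive_gain * a + diffused_gain * s
            for a, s in zip(adaptive_vector, diffused)]

def update_adaptive_codebook(history, driving_source):
    """Drop the oldest samples and append the new driving sound source."""
    return history[len(driving_source):] + driving_source

# Hypothetical values for one four-sample subframe.
history = [0.0] * 8
driving_source = build_driving_source(
    adaptive_vector=[1.0, 1.0, 1.0, 1.0], adaptive_gain=0.5,
    sound_source_vector=[1.0, 0.0, 0.0, 0.0],
    diffusion_vector=[0.5, 0.25], diffused_gain=2.0,
)
history = update_adaptive_codebook(history, driving_source)
```

Since the decoder receives the same indices and gains, it can form the same driving sound source and keep its adaptive codebook in step with the coder.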

[0052] Although the present embodiment has been described using an example in which the sound source information storage unit 14 stores the same code vector generation information as an algebraic codebook, the invention can be implemented in the same manner when generation information for another codebook, or another codebook itself, is stored. Likewise, although the present embodiment has been described using an example in which the diffusion vector storage unit 11 stores a random number sequence, the invention can be implemented in the same manner when a sequence obtained by learning, or a sequence determined from prior findings, is used instead. Although the speech coding apparatus of the present embodiment is of the CELP type, the invention is also applicable to other speech coding apparatuses, such as the VOCODER type.

[0053]

[Effects of the Invention] As described above, the present invention provides the advantageous effects that the amount of computation required for the codebook search is small, the required ROM capacity is small, and a better synthesized sound can be provided.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the main part of a speech decoding apparatus according to an embodiment of the present invention.

FIG. 2 is a block diagram showing a CELP-type speech decoding apparatus according to an embodiment of the present invention.

FIG. 3(a) is a block diagram showing the main part of a speech coding apparatus according to an embodiment of the present invention. FIG. 3(b) is a block diagram showing the main part of a speech coding apparatus according to an embodiment of the present invention.

FIG. 4 is a block diagram showing a CELP-type speech coding apparatus according to an embodiment of the present invention.

FIG. 5 is a block diagram showing a conventional CELP-type speech decoding apparatus.

FIG. 6 is a block diagram showing a conventional CELP-type speech coding apparatus.

[Description of Reference Signs]
1 Sound source information storage unit
2 Diffusion vector storage unit
3 Convolution unit
4 Parameter decoding unit
5 Transmission unit
6 Adaptive codebook
7 Driving sound source generation unit
8 LPC synthesis unit
9 Output speech
11 Diffusion vector storage unit
12 Digital filter
13 Convolution unit
14 Sound source information storage unit
15 Input speech
16 LPC analysis unit
17 Parameter coding unit
18 Perceptual weighting unit
19 Perceptual weighting LPC reverse-order synthesis unit A
20 Perceptual weighting LPC synthesis unit A
21 Comparison unit A
22 Adaptive codebook
23 Adaptive codebook update unit
24 Subtraction unit
25 Perceptual weighting LPC reverse-order synthesis unit B
26 Perceptual weighting LPC synthesis unit B
27 Comparison unit B
28 Transmission unit

Continuation of the front page
(56) References: JP-A-2-280200 (JP, A); JP-A-2-282800 (JP, A); JP-A-6-130994 (JP, A); JP-A-6-195098 (JP, A); JP-A-9-34498 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name): G10L 19/00 - 19/14; H03M 7/30; H04B 14/04

Claims

(57) [Claims]

1. A diffused sound source vector generation apparatus comprising: sound source vector supply means for supplying a sound source vector; means for storing a diffusion vector for diffusing the sound source vector; and a convolution unit that convolves the stored diffusion vector with the sound source vector and outputs a diffused sound source vector, wherein a code vector generated by an algebraic codebook is used as the sound source vector.

2. A diffused sound source vector generation method comprising the steps of: supplying a sound source vector; supplying a diffusion vector for diffusing the sound source vector; and convolving the diffusion vector with the sound source vector to output a diffused sound source vector, wherein a code vector generated by an algebraic codebook is used as the sound source vector.

3. The diffused sound source vector generation apparatus according to claim 1, wherein the stored diffusion vector is a sequence determined by either learning or prior findings.

4. The diffused sound source vector generation method according to claim 2, wherein the supplied diffusion vector is a sequence determined by either learning or prior findings.