JPS60239799A

JPS60239799A - Multipulse type vocoder

Info

Publication number: JPS60239799A
Application number: JP59096038A
Authority: JP
Inventors: 哲田口
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1984-05-14
Filing date: 1984-05-14
Publication date: 1985-11-28
Also published as: JPH043876B2

Abstract

(57) [Abstract] This publication contains application data filed before the introduction of electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] (Technical Field) The present invention relates to a multipulse vocoder, and more particularly to a method for determining the amplitudes and positions of the multipulses on the analysis side.

(Prior Art) Vocoders are well known in which the analysis side analyzes an input speech signal, extracts the spectral envelope information and the excitation (sound source) information that make up its speech information, and sends this information over a transmission path to the synthesis side, where the input speech signal is reproduced.

The spectral envelope information represents the spectral distribution of the vocal tract system that produces the input speech signal. It is usually expressed by LPC coefficients, such as α parameters or K parameters, whose number equals the order of the LPC analysis. The excitation information represents the fine structure of the spectrum. It is what remains when the spectral distribution information is removed from the input speech signal, and is known as the residual signal. It contains the excitation strength, the pitch period, and the voiced/unvoiced decision of the input speech signal. It is also well known that this information is usually extracted from the autocorrelation coefficients of each analysis frame of the input speech signal.

On the synthesis side of the vocoder, the spectral envelope information is used as the coefficients of an LPC synthesizer, which normally uses an all-pole digital filter to approximate the vocal tract system. The excitation information is used as the driving source of this digital filter, and the filter synthesizes the input speech signal.
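The role of the all-pole filter can be illustrated with a minimal Python sketch. It assumes the transfer function 1 / (1 - Σ a_k z^-k), a sign convention chosen here for illustration. The function name `all_pole_synthesize` and the sample values are hypothetical and do not come from the patent.

```python
def all_pole_synthesize(excitation, a):
    """Drive an all-pole filter with an excitation sequence.

    Assumed convention: y(n) = v(n) + sum_{k=1..p} a[k-1] * y(n-k).
    """
    y = []
    for n, v in enumerate(excitation):
        acc = v
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc += a_k * y[n - k]  # feedback from past outputs
        y.append(acc)
    return y


# A single unit impulse through a first-order filter with a_1 = 0.5
out = all_pole_synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
print(out)  # [1.0, 0.5, 0.25, 0.125]
```

With a unit impulse as excitation, the output is simply the impulse response of the filter, which is the quantity h(n) used later in the correlation-based pulse search.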

Conventional LPC vocoders built this way are widely used because they can synthesize speech even at low bit rates of about 4 kb/s or less. Their drawback is that high-quality speech synthesis is difficult even at high bit rates. The cause lies in the simple excitation model they assume: voiced sound is approximated by a single impulse train at the extracted pitch period, and unvoiced sound, which has a random period, is approximated by white noise. The excitation information of the input speech signal is therefore not extracted faithfully, and the waveform information contained in it is neither analyzed nor synthesized.

The multipulse vocoder has recently become well known as one of the vocoders that transmit waveform information when synthesizing the input speech signal, in order to remedy the problems caused by not transmitting it.

FIG. 1 is a block diagram showing the basic configuration of the analysis side of a conventional multipulse vocoder.

The LPC synthesizer 1 has an all-pole digital filter that simulates the vocal tract. Its coefficients are the LPC coefficients obtained by the LPC analyzer 2, which analyzes the input speech signal x(n) (n = 1, 2, 3, ..., N), received through input terminal 2001, for each analysis frame.

The excitation pulse generator 3 obtains from the excitation information of the input speech signal a driving excitation sequence V(n) consisting of a plurality of impulses, that is, multipulses, and supplies it to the LPC synthesizer 1 as the driving source.

The LPC synthesizer 1 uses the LPC coefficients it receives as the coefficients of its synthesis filter, normally an all-pole digital filter, is driven by the multipulses, and outputs the synthesized signal x̂(n). Because the multipulses carry the waveform information of the input speech signal, the LPC synthesizer 1 synthesizes an input speech signal that includes waveform information.

The subtracter 4 then takes the difference between the synthesized signal x̂(n) output by the LPC synthesizer 1 and the input speech signal x(n), and sends the resulting error e(n) to the perceptual weighting unit 5.

The perceptual weighting unit 5 applies perceptual weighting to the error e(n) with a weighting filter having the characteristic w(z) given by equation (1), and sends the result to the squared-error minimizer 6.

$$w(z) = \frac{1 - \sum_{k=1}^{p} a_k z^{-k}}{1 - \sum_{k=1}^{p} a_k r^{k} z^{-k}} \tag{1}$$

In equation (1), a_k are the LPC coefficients used as the coefficients of the all-pole digital filter of the LPC synthesizer 1, p is the order of that filter and hence the LPC analysis order, r is the weighting coefficient, and z is the variable of the z-transform representation of the all-pole digital filter's transfer function H(z^-1), with z = exp(jλ). Here λ = 2πΔTf, where ΔT is the sampling period of the analysis frame and f is the frequency.

The weighting coefficient r in equation (1) is set in the range 0 < r < 1.

The characteristic w(z) of equation (1) varies between w(z) = 1 for r = 1 and w(z) = 1 - P(z) for r = 0, where P(z) denotes the sum of a_k z^-k over k = 1, ..., p. The value of r is set within this range according to how strongly the excessive error levels that appear in the formant regions of the frequency spectrum of e(n) are to be suppressed. It thus provides the perceptual weighting of the signal to be synthesized, and its optimum value is normally chosen in advance by listening tests.
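The two limiting cases described above can be checked with a short Python sketch. It assumes that equation (1) has the pole-zero form w(z) = (1 - Σ a_k z^-k) / (1 - Σ a_k r^k z^-k), which is consistent with the stated limits but is a reconstruction. The function name `perceptual_weight` and the sample data are hypothetical.

```python
def perceptual_weight(e, a, r):
    """Filter the error e(n) with the assumed weighting filter.

    Difference equation (assumed form of equation (1)):
      y(n) = e(n) - sum_k a_k * e(n-k) + sum_k a_k * r**k * y(n-k)
    """
    y = []
    for n in range(len(e)):
        acc = e[n]
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc -= a_k * e[n - k]             # numerator (zeros)
                acc += a_k * (r ** k) * y[n - k]  # denominator (poles)
        y.append(acc)
    return y


e = [1.0, 2.0, 3.0]
a = [0.5]
print(perceptual_weight(e, a, 1.0))  # r = 1: filter is the identity
print(perceptual_weight(e, a, 0.0))  # r = 0: filter is 1 - P(z)
```

For r = 1 the numerator and denominator cancel and the error passes through unchanged. For r = 0 only the inverse filter 1 - P(z) remains. Values between the two trade off how much error is tolerated in the formant regions.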

The weighted error is sent to the squared-error minimizer 6 in order to determine the driving excitation sequence V(n) output by the excitation pulse generator 3, that is, the time positions and amplitudes of the multipulses. The minimizer computes the squared error ε of equation (2), and V(n) is selected so as to minimize ε.

$$\varepsilon = \sum_{n=1}^{N} \left[ e(n) * w(n) \right]^{2} \tag{2}$$

In equation (2), the symbol * denotes convolution with the weighting filter of the perceptual weighting unit 5, and N is the length of the interval over which the multipulses are computed.

This processing is repeated for every pulse of the multipulse sequence, so that synthesis is carried out as part of the analysis for each pulse. This is the Analysis-by-Synthesis method (hereinafter, the A-b-S method). As is clear from the above, the A-b-S method runs a loop of pulse generation, squared-error calculation, and pulse position and amplitude adjustment for each individual pulse. It is an effective technique in the low bit rate range, but it has the drawback that the amount of computation becomes extremely large.

The A-b-S method is described in detail in B. S. Atal et al., "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates," Proc. ICASSP 82, pp. 614-617 (1982), among others.

To address this drawback of the conventional A-b-S method, the following processing algorithm, which computes the optimum multipulses efficiently on the basis of correlation calculations, has recently been introduced.

The input speech signal x(n) is divided into processing frames of N samples each, and the multipulses are computed collectively for each frame.

Suppose that one analysis frame contains K excitation pulses, and that the i-th pulse is located at time position m_i from the start of the frame and has amplitude g_i. The driving excitation d(n) of the LPC synthesis filter is then given by equation (3).

$$d(n) = \sum_{i=1}^{K} g_i \, \delta_{n, m_i} \tag{3}$$

In equation (3), δ_{n,m_i} is the Kronecker delta: δ_{n,m_i} = 1 for n = m_i and δ_{n,m_i} = 0 for n ≠ m_i.
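As a small illustration of equation (3), the following Python sketch builds the excitation sequence d(n) from a list of pulse positions and amplitudes. The function name `build_excitation`, the zero-based positions, and the sample pulses are assumptions made for this example only.

```python
def build_excitation(pulses, frame_len):
    """Place each pulse (position m_i, amplitude g_i) into a frame of zeros.

    Positions are zero-based here, unlike the one-based indexing in the text.
    """
    d = [0.0] * frame_len
    for m_i, g_i in pulses:
        d[m_i] += g_i  # Kronecker delta: only sample n == m_i is affected
    return d


d = build_excitation([(2, 1.5), (5, -0.5)], frame_len=8)
print(d)  # [0.0, 0.0, 1.5, 0.0, 0.0, -0.5, 0.0, 0.0]
```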

The LPC synthesis filter is driven by this excitation d(n) and outputs the synthesized signal x̂(n).

Taking an all-pole digital filter as the LPC synthesis filter, and representing its transfer function by the impulse response h(n) (0 ≤ n ≤ M - 1), the synthesized signal x̂(n) is given by equation (4).

$$\hat{x}(n) = \sum_{l=1}^{N} d(l)\, h(n - l) \tag{4}$$

In equation (4), d(l) is the driving excitation. Let e_w(n) be the perceptually weighted error between the input speech signal x(n) and the synthesized signal x̂(n). Then e_w(n) is given by equation (5).

$$e_w(n) = \{\, x(n) - \hat{x}(n) \,\} * w(n) \tag{5}$$

From equation (5), the squared error can be expressed by equation (6).

$$\varepsilon = \sum_{n=1}^{M} e_w(n)^{2} \tag{6}$$

In equation (6), M is the number of samples in the interval over which the error is minimized. It is chosen, for example, as the length of one analysis frame. The optimum excitation pulse sequence, that is, the multipulses, is obtained by finding the g_i that minimize equation (6). From equations (3), (4), and (6), g_i is derived as equation (7).

$$g_i(m_i) = \frac{\displaystyle \sum_{n=1}^{M} x_w(n)\, h_w(n - m_i) - \sum_{l=1}^{i-1} g_l \sum_{n=1}^{M} h_w(n - m_l)\, h_w(n - m_i)}{\displaystyle \sum_{n=1}^{M} h_w(n - m_i)\, h_w(n - m_i)} \tag{7}$$

In equation (7), x_w(n) denotes x(n) * w(n) and h_w(n) denotes h(n) * w(n). The first term of the numerator on the right-hand side of equation (7) is the cross-correlation function φ_hx(m_i) of x_w(n) and h_w(n) at time lag m_i. In the second term, the sum of h_w(n - m_l) h_w(n - m_i) over n is the covariance function φ_hh(m_l, m_i) (1 ≤ m_l, m_i ≤ M) of h_w(n). The covariance function φ_hh(m_l, m_i) is equal to the autocorrelation function R_hh(|m_l - m_i|), so equation (7) can be written as equation (8).

$$g_i(m_i) = \frac{\varphi_{hx}(m_i) - \displaystyle \sum_{l=1}^{i-1} g_l\, R_{hh}(|m_l - m_i|)}{R_{hh}(0)} \tag{8}$$

According to equation (8), if a pulse is placed at time position m_i, the amplitude g_i(m_i) is determined as the optimum amplitude. In equation (8), 1 ≤ m_i ≤ M.

That is, for each excitation pulse, the amplitude is computed with equation (8) at every time position, and the position at which the absolute value of the amplitude is largest gives the pulse that minimizes the squared error of equation (6). Repeating this procedure yields the plurality of excitation pulses.

This algorithm is described in detail in Ozawa, Araseki, and Ono, "A Study on Multipulse Excited Speech Coding," Technical Group on Communication Systems, Institute of Electronics and Communication Engineers of Japan, March 1983.

When the multipulses are generated with this algorithm, the optimum multipulses can be computed from the cross-correlation function, the autocorrelation function, and a maximum-value search alone. The configuration is therefore greatly simplified, and a multipulse vocoder with a substantially reduced amount of computation can be realized.

However, even the multipulse vocoder improved in this way still has the following drawback.

When the word length of the synthesis filter on the synthesis side is finite, the output of the synthesis filter driven by the multipulse excitation obtained with the above algorithm is disturbed, whenever the speech power is low, such as during silence or at the end of a voiced word, by the noise power of the limit cycles that the finite word length of the synthesis filter produces. The result is markedly unpleasant to the ear.

(Object of the Invention) The object of the present invention is to eliminate this drawback by providing a multipulse vocoder in which the analysis side adds a DC signal to the input speech signal before determining the amplitudes and positions of the multipulses. The synthesis side is thereby supplied with a multipulse excitation that can produce synthesized power sufficiently larger than the noise power due to the limit cycles of the synthesis filter, so that good synthesized speech is obtained even when the speech power is low, such as during silence or at the end of a voiced word.

(Configuration of the Invention) The multipulse vocoder of the present invention is a multipulse vocoder that performs LPC analysis on an input speech signal for each analysis frame and uses the extracted LPC coefficients as spectral envelope information, that represents the excitation information, which together with the spectral envelope information makes up the speech information of the input speech signal, for each analysis frame by a plurality of impulses (multipulses) whose time positions and amplitudes correspond to the characteristics of the excitation information, and that thereby analyzes and synthesizes the input speech signal.

Its analysis side comprises: means for adding a DC signal to the input speech signal; means for calculating the cross-correlation coefficient sequence between the input speech signal with the DC signal added and the impulse response of the speech synthesis filter; means for calculating the autocorrelation coefficient sequence of the impulse response; and means for calculating the amplitudes and positions of the impulse sequence (multipulses) in a feed-forward manner on the basis of the relationship between the cross-correlation coefficient sequence and the autocorrelation coefficient sequence.

(Embodiment) The present invention will now be described in detail with reference to the drawings.

FIG. 2 is a block diagram showing an embodiment of the analysis side of the multipulse vocoder according to the present invention, and FIG. 3 is a block diagram showing an embodiment of its synthesis side.

The analysis side of the multipulse vocoder according to the present invention, shown in FIG. 2, comprises an LPC analyzer 7, a cross-correlation function calculator 8, an encoder (1) 9, an autocorrelation function calculator 10, a multipulse calculator 11, an encoder (2) 12, a DC signal generator 19, a DC adder 20, and a multiplexer 13.

The input speech signal received through input terminal 7001 is supplied to the LPC analyzer 7 and the DC adder 20.

The LPC analyzer 7 quantizes the input speech signal, for each analysis frame, into digital values of a preset number of bits, performs LPC analysis on the quantized speech signal to extract p-th order K parameters (partial autocorrelation coefficients) as the LPC coefficients, and supplies them to the encoder (1) 9 through output line 701. In this embodiment the analysis frame is set to 20 msec.

The encoder (1) 9 quantizes and encodes the LPC coefficients it receives and sends them to the multiplexer 13 through output line 901.

The LPC analyzer 7 also calculates the impulse response h(n) (0 ≤ n ≤ M - 1) from the LPC coefficients and supplies it, by way of output line 702, the encoder (1) 9, and output line 902, to the cross-correlation function calculator 8 and the autocorrelation function calculator 10.
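One possible way to obtain an impulse response from K parameters is sketched below in Python. The patent does not say how the LPC analyzer 7 does this. The sketch assumes the common step-up recursion a_j(i) = a_j(i-1) - k_i a_{i-j}(i-1), a_i(i) = k_i, and a synthesis filter 1 / (1 - Σ a_k z^-k). Sign conventions for K parameters differ between texts, and the function names are hypothetical.

```python
def k_to_alpha(k_params):
    """Step-up recursion from K (PARCOR) parameters to predictor coefficients.

    Assumed convention: a_j(i) = a_j(i-1) - k_i * a_{i-j}(i-1), a_i(i) = k_i.
    """
    a = []
    for k_i in k_params:
        a = [a[j] - k_i * a[len(a) - 1 - j] for j in range(len(a))] + [k_i]
    return a


def impulse_response(a, length):
    """First `length` samples of the impulse response of 1 / (1 - sum a_k z^-k)."""
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0  # unit impulse at n = 0
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc += a_k * h[n - k]
        h.append(acc)
    return h


alpha = k_to_alpha([0.5, 0.2])
h = impulse_response(alpha, 4)
print(alpha, h)
```

The truncation length corresponds to M in the text. How long h(n) needs to be in practice depends on how quickly the filter's response decays.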

The DC signal generator 19 generates a DC signal. Its amplitude is determined empirically in advance in relation to the maximum amplitude of the steady-state voiced portions of the input speech signal. In this embodiment the DC signal generator 19 generates a DC signal whose amplitude is 30 dB below the maximum amplitude of the input speech signal supplied through input terminal 7001, and outputs it to the DC adder 20. The DC adder 20 adds the DC signal to the input speech signal supplied through input terminal 7001 and outputs the sum to the cross-correlation function calculator 8.
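The DC offset step can be sketched in a few lines of Python. The sketch assumes that "30 dB below the maximum amplitude" is an amplitude ratio, so the factor is 10^(-30/20), and that the maximum amplitude is known in advance. The function name `add_dc_offset` and the sample values are hypothetical.

```python
def add_dc_offset(x, max_amplitude, level_db=-30.0):
    """Add a constant offset level_db (relative to max_amplitude) to every sample.

    Assumes an amplitude ratio, i.e. 20*log10(dc / max_amplitude) = level_db.
    """
    dc = max_amplitude * 10.0 ** (level_db / 20.0)
    return [s + dc for s in x], dc


# A silent frame of a signal whose maximum amplitude is 1000
frame, dc = add_dc_offset([0.0, 0.0, 0.0, 0.0], max_amplitude=1000.0)
print(dc)     # about 31.6
print(frame)  # every sample now carries the offset
```

With the offset in place, a frame that would otherwise be silent yields nonzero cross-correlation values, so the pulse search still produces pulses of usable size.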

The cross-correlation function calculator 8 computes the cross-correlation function φ_hx from the input speech signal with the DC signal added and the impulse response h(n), and sends it to the multipulse calculator 11 through output line 801.

The autocorrelation function calculator 10 computes the autocorrelation function R_hh of the impulse response h(n) it receives, and sends it to the multipulse calculator 11 through output line 1001.
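The two correlation functions can be sketched directly from their definitions. The Python sketch below assumes φ_hx(m) = Σ_n x(n) h(n - m) and R_hh(k) = Σ_n h(n) h(n + k), with zero-based indices and a causal, finite-length h(n). The function names and test signals are hypothetical.

```python
def cross_correlation(x, h):
    """phi_hx(m) = sum_n x(n) * h(n - m) for m = 0 .. len(x) - 1."""
    n_x = len(x)
    return [
        sum(x[n] * h[n - m] for n in range(m, min(n_x, m + len(h))))
        for m in range(n_x)
    ]


def autocorrelation(h, max_lag):
    """R_hh(k) = sum_n h(n) * h(n + k) for k = 0 .. max_lag."""
    return [
        sum(h[n] * h[n + k] for n in range(len(h) - k))
        for k in range(max_lag + 1)
    ]


h = [1.0, 0.5, 0.25]
x = [0.0, 0.0, 1.0, 0.5, 0.25, 0.0]  # h(n) delayed by two samples
phi = cross_correlation(x, h)
r = autocorrelation(h, 3)
print(phi.index(max(phi)))  # the peak sits at the delay, lag 2
print(r)
```

Because x(n) here is just a delayed copy of h(n), the peak of φ_hx equals R_hh(0) and sits at the delay. This is the property the pulse search exploits.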

Using the cross-correlation function φ_hx and the autocorrelation function R_hh received for each analysis frame, the multipulse calculator 11 obtains a predetermined number of excitation pulses by the method described below. It sends the amplitude and position information of these pulses through output line 1101 to the encoder (2) 12, which quantizes and encodes them and sends them to the multiplexer 13 through output line 1201.

The LPC coefficients and multipulse data thus quantized, encoded, and sent to the multiplexer 13 represent the spectral envelope and the excitation information of the input speech signal. The multiplexer 13 time-division multiplexes them in a predetermined format, and they are transmitted over transmission path 1301 from the analysis side shown in FIG. 2 to the synthesis side shown in FIG. 3.

The synthesis side shown in FIG. 3 synthesizes the input speech signal from the data transmitted from the analysis side over transmission path 1301. It comprises a demultiplexer 14, a decoder (1) 15, a decoder (2) 16, an LPC synthesizer 17, an LPF (Low Pass Filter) 18, and so on.

The demultiplexer 14 restores the data received over transmission path 1301 to the state they were in before conversion to the time-division transmission format by the multiplexer 13. The LPC coefficient data are supplied to the decoder (1) 15 through output line 141 and the multipulse data to the decoder (2) 16 through output line 142. The decoders decode the data and send them to output lines 151 and 161, respectively.

The LPC synthesizer 17 uses the multipulses it receives as the driving excitation of a p-th order all-pole digital filter, and uses the p-th order LPC coefficient data received through output line 151 as the coefficients of that filter. With the LPC synthesis filter controlled in this way, it synthesizes the input speech signal and sends it through output line 211 to the LPF 18, which applies the prescribed low-pass filtering and sends the result to output line 181 as analog synthesized speech.

The excitation information represented by the multipulses input to the LPC synthesizer 17 has a power that is at least at a level only about 39 dB below that of the steady-state voiced portions, which is necessarily much larger than the power due to the limit cycles of the LPC synthesizer 17. The DC component synthesized by the LPC synthesizer 17 is not perceived by the ear at all.

The method used in the multipulse calculator 11 is described next. The multipulse calculator 11 may use any method that can compute the predetermined number of excitation pulses from the cross-correlation coefficients φ_hx and the autocorrelation coefficients R_hh of each analysis frame. As an example, the case in which the multipulse calculator 11 uses the algorithm of Ozawa et al. mentioned above is described.

First, the cross-correlation coefficient φ_hx with the largest absolute value is located. The first excitation pulse is then given the position and the amplitude (including polarity) of that φ_hx.

Next, to remove the component of the pulse just determined, φ_hx is corrected using R_hh, which is normalized by the autocorrelation coefficient at lag 0, weighted by the amplitude (including polarity) of the first excitation pulse, and aligned so that R_hh(0) coincides with the position found. The corrected φ_hx is then searched for its largest absolute value, the second excitation pulse is determined, and φ_hx is corrected again. These operations are repeated as many times as necessary.
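The search just described can be sketched in Python as follows. The sketch takes the amplitude as g = φ_hx(m) / R_hh(0) and corrects φ_hx by subtracting g·R_hh(|j - m|) around the chosen position. The subtraction and the scaling of the amplitude are assumptions chosen to match the form of equation (8). The function name `multipulse_search`, the zero-based positions, and the test data are hypothetical.

```python
def multipulse_search(phi_hx, r_hh, num_pulses):
    """Greedy pulse search on the cross-correlation sequence.

    phi_hx : cross-correlation values, one per candidate position
    r_hh   : autocorrelation of the impulse response, r_hh[0] is lag 0
    Returns a list of (position, amplitude) pairs in the order found.
    """
    phi = list(phi_hx)  # work on a copy so the caller's data is untouched
    pulses = []
    for _ in range(num_pulses):
        # 1. position with the largest absolute cross-correlation
        m = max(range(len(phi)), key=lambda i: abs(phi[i]))
        # 2. amplitude, keeping polarity
        g = phi[m] / r_hh[0]
        pulses.append((m, g))
        # 3. remove this pulse's contribution from phi
        for j in range(len(phi)):
            lag = abs(j - m)
            if lag < len(r_hh):
                phi[j] -= g * r_hh[lag]
    return pulses


# phi_hx and R_hh for h = [1, 0.5] and pulses of 2.0 at n=2 and -1.0 at n=5
phi_hx = [0.0, 1.0, 2.5, 1.0, -0.5, -1.25, -0.5, 0.0]
r_hh = [1.25, 0.5]
print(multipulse_search(phi_hx, r_hh, 2))  # [(2, 2.0), (5, -1.0)]
```

In this constructed example the two pulses do not overlap, so the search recovers them exactly. When the responses of neighbouring pulses overlap, the greedy search generally gives an approximation rather than the jointly optimal amplitudes.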

A multipulse calculation method that does not rely on the algorithm of Ozawa et al. is the similarity-based method.

Instead of searching for the φ_hx with the largest absolute value, the similarity-based method searches for the largest similarity, for example the cross-correlation coefficient, between φ_hx and R_hh. It is the method described in Japanese Patent Application No. Sho 58-149007.

The above description assumed that the perceptual weighting of equation (1) is not applied, but perceptual weighting can also be applied. In that case a filter with the transfer function of equation (1), with the weighting coefficient r set to, for example, 0.8, is inserted between the DC adder 20 and the cross-correlation function calculator 8. In place of the impulse response calculated by the LPC analyzer 7 as described above, the impulse response h'(n) calculated from LPC coefficients to which the attenuation coefficient r (r = 0.8) has been applied is then used.
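Applying the attenuation coefficient to the LPC coefficients can be sketched as below. The sketch assumes the usual reading in which the k-th coefficient is scaled by the k-th power of the coefficient, giving a_k·r^k, which matches the denominator of the weighting filter when equation (1) has the pole-zero form. The function name `damp_lpc_coefficients` and the sample coefficients are hypothetical.

```python
def damp_lpc_coefficients(a, r):
    """Scale the k-th LPC coefficient by r**k (k = 1 .. p)."""
    return [a_k * r ** k for k, a_k in enumerate(a, start=1)]


a = [1.0, -0.5]
print(damp_lpc_coefficients(a, 0.5))  # [0.5, -0.125]
print(damp_lpc_coefficients(a, 1.0))  # unchanged: [1.0, -0.5]
```

Scaling by r^k moves the poles of the synthesis filter toward the origin, so the impulse response computed from the damped coefficients decays faster than the original h(n).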

In the embodiment of the present invention shown in FIGS. 2 and 3, K parameters are used as the LPC coefficients, but other LPC coefficients, such as α parameters, may be used instead. It is also clear that the encoders and the multiplexer, and the decoders and the demultiplexer, can each be integrated into a single unit with the same effect, and that the LPC synthesis filter can be replaced by a digital filter that is not of the all-pole type with substantially the same effect.

(Effect of the Invention) As described above, according to the present invention the analysis side of a multipulse vocoder has means for adding a DC signal to the input speech signal, means for calculating the cross-correlation coefficient sequence between the input speech signal with the DC signal added and the impulse response of the speech synthesis filter, means for calculating the autocorrelation coefficient sequence of the impulse response, and means for calculating the amplitudes and positions of the impulse sequence (multipulses) in a feed-forward manner on the basis of the relationship between the cross-correlation coefficient sequence and the autocorrelation coefficient sequence. As a result, even for low-power portions of speech such as silence and the ends of voiced words, the excitation level input to the LPC synthesizer can be made sufficiently larger than the limit cycles of the synthesizer, and good synthesized speech can be produced.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the basic configuration of a conventional multipulse vocoder, FIG. 2 is a block diagram showing an embodiment of the analysis side of the multipulse vocoder according to the present invention, and FIG. 3 is a block diagram showing an embodiment of the synthesis side of the multipulse vocoder according to the present invention.

1: LPC synthesizer, 2: LPC analyzer, 3: excitation pulse generator, 4: subtracter, 5: perceptual weighting unit, 6: squared-error minimizer, 7: LPC analyzer, 8: cross-correlation function calculator, 9: encoder (1), 10: autocorrelation function calculator, 11: multipulse calculator, 12: encoder (2), 13: multiplexer, 14: demultiplexer, 15: decoder (1), 16: decoder (2), 17: LPC synthesizer, 18: LPF, 19: DC signal generator, 20: DC adder.

Claims

[Claims]

A multipulse vocoder in which an input speech signal is subjected to LPC (Linear Prediction Coefficient) analysis for each analysis frame and the extracted LPC coefficients (linear prediction coefficients) are used as spectral envelope information, in which the excitation information that together with this spectral envelope information makes up the speech information of the input speech signal is represented for each analysis frame by a plurality of impulses (multipulses) having time positions and amplitudes corresponding to the characteristics of the excitation information, and in which the input speech signal is thereby analyzed and synthesized, characterized in that the analysis side has means for adding a DC signal to the input speech signal and determines the amplitudes and positions of the multipulses.