JPH0229238B2

Info

Publication number: JPH0229238B2
Application number: JP57112222A
Authority: JP
Inventors: Satoru Taguchi; Masanori Kobayashi; Takayuki Ishikawa
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1982-06-29
Filing date: 1982-06-29
Publication date: 1990-06-28
Also published as: JPS593493A

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a band-division vocoder, and in particular to a band-division linear-prediction vocoder that divides an input speech signal into two predetermined speech transmission bands, a low band and a high band, and performs linear prediction analysis on each band.

Linear-prediction vocoders have become well known in recent years. Such a vocoder analyzes the input speech signal by linear prediction analysis (LPC analysis, Linear Prediction Coefficient analysis) and transmits the resulting spectral distribution information as feature parameters in place of the speech signal itself.

Of the feature parameters of input speech, namely spectral distribution information and sound source information, LPC analysis handles the spectral distribution information by approximating the spectral envelope of the speech with an all-pole model. The results obtained by LPC analysis, however, are known to have the following two drawbacks.

The first is what is known empirically as underestimation of formant bandwidth. It occurs particularly often for the first formant, and the estimated bandwidth is frequently only a fraction of that of the actual speech.

The second is that LPC analysis approximates the third formant, which has relatively little energy, less accurately than the first formant.

Human hearing has a logarithmic sensitivity, and intelligibility is markedly impaired unless formants up to the third are preserved. Combined with these facts, the first and second drawbacks degrade the quality of speech analyzed and synthesized by a vocoder.

The first drawback is thought to arise because the poles concentrate excessively in frequency regions where energy is concentrated, such as the first formant of the input speech. To prevent poles from concentrating in a particular frequency region, band-division linear prediction analysis has recently been studied: the speech band is divided into several subbands and LPC analysis is performed on each, so that the poles are distributed appropriately. With a suitable choice of the number of bands and related settings, this method can alleviate the second drawback as well as the first, and it has the potential to become an effective means of improving the quality of analyzed and synthesized speech.

Experiments have shown that the band-division LPC analysis and synthesis method, which can thus remedy the drawbacks of conventional LPC analysis, has the following characteristics.

First, division into two bands gives better analysis results than other numbers of bands, and markedly reduces the two drawbacks peculiar to conventional LPC analysis described above. Second, band-division LPC analysis requires decimation, that is, lowering the sampling rate below the rate used before division so that each band is represented at the minimum necessary sampling frequency. Third, performing this decimation in the frequency domain is advantageous for computation.

Band-division LPC analysis, especially the two-band form, also has a drawback. Among the K parameters obtained by LPC analysis, the second-order K parameter of the low band, K₂, has a markedly skewed distribution compared with the K parameters of other orders in the low and high bands; numerous measurements all show the K₂ values concentrated near -1. The reason is as follows. Band-division LPC analysis represents the spectral envelope of the input speech by a combination of a finite number of pole frequencies, and the boundary frequency is set so that the first formant lies in the low band. The second and higher formants rarely lie in the low band and mostly lie in the high band, so the low band usually contains only one formant. One formant can be represented (approximated) by one pole. In that case the first-order K parameter K₁ is the cosine of the angular frequency of the center frequency of the first formant, taking the boundary frequency as π rad, and K₂ depends on the bandwidth of the first formant: K₂ = -1 when the bandwidth is zero (a line spectrum), K₂ = 0 when the bandwidth is infinite (no pole exists), and -1 < K₂ < 0 when the bandwidth is finite. The interval -1 < K₂ < 0 is therefore the range of K₂ when only the first formant lies in the low band. The first formant usually has a sharp spectral peak and a narrow bandwidth, so the low-band K₂ distribution concentrates near -1.
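The dependence of K₂ on formant bandwidth can be illustrated numerically. The following is a minimal sketch, not part of the original disclosure: it models the single low-band formant as a two-pole resonance (one complex-conjugate pole pair), assumes the common approximation r = exp(-πB/fs) for the pole radius, and uses a sign convention chosen so that K₂ approaches -1 as the bandwidth shrinks, matching the text. The function name and the 2666 Hz low-band rate are illustrative assumptions.

```python
import math

def k_params_single_formant(center_hz, bandwidth_hz, fs_hz):
    """K1 and K2 of a two-pole resonance (assumed model of one formant).

    Sign convention (an assumption): K2 tends to -1 as the bandwidth
    tends to zero, and K1 tends to cos(theta).
    """
    theta = 2.0 * math.pi * center_hz / fs_hz       # pole angle
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius, approximate
    k2 = -r * r
    k1 = 2.0 * r * math.cos(theta) / (1.0 + r * r)
    return k1, k2

fs = 2666.0  # assumed low-band rate: twice the 1333 Hz boundary frequency
for bw in (400.0, 100.0, 25.0):
    k1, k2 = k_params_single_formant(500.0, bw, fs)
    print(f"bandwidth {bw:5.0f} Hz -> K1 = {k1:+.3f}, K2 = {k2:+.3f}")
```

Narrower bandwidths push K₂ toward -1 while K₁ stays near the cosine of the formant's angular frequency, which is the behavior the paragraph above describes.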

As is well known, the K parameters are used on the synthesis side to set the coefficients of the digital synthesis filter used for speech synthesis. For this filter to operate stably, |Kn| < 1 (n = 1, 2, ..., P) must be satisfied, where P is the order of the synthesis filter.

Each K parameter Kn is quantized with a predetermined number of bits and sent from the analysis side to the synthesis side of the vocoder as a quantized transmission code. From the standpoints of suppressing quantization error, of transmission efficiency (the number of quantization bits required and hence the transmission bandwidth), and of suitable spectral sensitivity, it is desirable that the values be distributed roughly evenly on both sides of 0. Conversely, if the distribution of a K parameter of a particular order is concentrated near a limit value such as +1 or -1, the parameter becomes more sensitive to the spectral distortion introduced by coding, the so-called quantization distortion, and the quality of the synthesized speech degrades. Increasing the number of quantization bits to avoid this naturally raises the bit rate and so widens the transmission band.

To avoid these drawbacks when quantizing K parameters with a skewed distribution such as K₂, quantization must take into account the spectral sensitivity and distribution of the particular K parameter concerned. Each K parameter has a different spectral sensitivity depending on its order, where spectral sensitivity is the rate of change of the spectrum for a small change in the K parameter. A parameter such as K₂, which has an extremely skewed distribution and a high spectral sensitivity, must be quantized after the skew of its distribution has been compressed by some means, so as to suppress quantization distortion while lowering the bit rate and narrowing the transmission band.

Fig. 1 is a K parameter distribution diagram showing an example of the distributions when K parameters up to the sixth order are used in a band-division linear-prediction vocoder with low band A and high band B. K parameters K₁ through K₆ are used in both bands, and the figure clearly shows that the distribution of K₂ in low band A in particular is concentrated near -1.

The object of the present invention is to eliminate the drawback described above by providing a band-division vocoder, of the type that divides an input speech signal into predetermined low and high transmission bands and performs linear prediction analysis on each, which by the simple means of nonlinearly quantizing the low-band K₂ parameter on the analysis side before transmission greatly reduces the effect of quantization distortion and markedly lowers the required quantization bit rate and hence the transmission bandwidth.

The vocoder of the present invention is a band-division vocoder that divides an input speech signal into two predetermined speech bands, a low band and a high band, and performs linear prediction analysis on each band, and it comprises K₂ parameter nonlinear quantization and transmission means that nonlinearly quantizes and transmits the second-order K parameter K₂ among the partial autocorrelation coefficients (K parameters) obtained by linear prediction analysis of the low speech transmission band.

The present invention will now be described in detail with reference to the drawings.

Fig. 2 is a block diagram showing an embodiment of the analysis side of the band-division vocoder of the present invention, and Fig. 3 is a block diagram showing an embodiment of its synthesis side.

An input speech signal 1001 received at input terminal 1000 passes through low-pass filter (LPF) 1, which has a cutoff frequency of 3333 Hz, to give LPF output 101 with the high frequencies removed. A/D converter 2 samples this at a sampling frequency of 8 kHz and converts it to digital values of a predetermined number of bits, and the resulting A/D converter output 201 is sent to window processor 3. The cutoff frequency of LPF 1 is set to 3333 Hz as the upper limit of the speech signal to be analyzed.

Window processor 3 first stores the incoming A/D converter output 201 in an internal buffer memory. This memory holds 30 ms of the input signal, that is, 240 samples, which are multiplied by a Hamming function as the window processing. The window processing is performed every 10 ms, and this period is the basic analysis period, the so-called basic frame period.
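The framing described here (a 240-sample Hamming-weighted window advanced every 10 ms at 8 kHz) can be sketched as follows. This is an illustrative sketch only; the function names are hypothetical, and the standard 0.54/0.46 Hamming coefficients are assumed because the text does not list them.

```python
import math

FS_HZ = 8000       # sampling frequency stated in the text
FRAME_LEN = 240    # 30 ms analysis window
HOP_LEN = 80       # 10 ms basic frame period

def hamming(n_points):
    """Standard Hamming window coefficients (assumed form)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def windowed_frames(samples):
    """Yield one Hamming-weighted 240-sample frame every 80 samples."""
    window = hamming(FRAME_LEN)
    for start in range(0, len(samples) - FRAME_LEN + 1, HOP_LEN):
        frame = samples[start:start + FRAME_LEN]
        yield [x * w for x, w in zip(frame, window)]

# 100 ms of a 200 Hz tone as placeholder input
signal = [math.sin(2.0 * math.pi * 200.0 * n / FS_HZ) for n in range(FS_HZ // 10)]
frames = list(windowed_frames(signal))
print(len(frames), len(frames[0]))
```

Because the window is three times longer than the frame period, consecutive frames overlap by 20 ms.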

The windowed speech waveform data 301 is sent every basic frame period to DFT (Discrete Fourier Transformation) circuit 4, voiced/unvoiced discriminator 5, and pitch extractor 6.

DFT circuit 4 Fourier-transforms the input speech waveform data 301 into frequency-domain spectral components and sends them as DFT output 401 to power spectrum measurement circuit 7.

Power spectrum measurement circuit 7 squares the DFT output 401 to obtain the power spectrum component at each frequency and stores the result in its internal memory. The stored power spectrum data 701 can be read freely by low-band autocorrelation coefficient measuring unit 8 and high-band autocorrelation coefficient measuring unit 9, and is used to measure the autocorrelation coefficients of the speech waveform data 301 in the specified low and high bands.

Low-band autocorrelation coefficient measuring unit 8 reads the low-band portion of power spectrum data 701, from 0 to 1333 Hz in this embodiment, applies an inverse discrete Fourier transform to measure the autocorrelation coefficient at each lag within the required range, and sends this low-band autocorrelation coefficient data 801 to low-band linear prediction coefficient analyzer 10.
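Computing band-limited autocorrelation coefficients from a slice of the power spectrum can be sketched as below. This is one possible reading rather than the circuit's actual procedure: the slice is treated as the one-sided spectrum (0 to π rad) of a decimated signal, and the inverse DFT of its symmetric extension is evaluated as a cosine sum. The DFT length is not given in the text; a 240-point DFT (33.3 Hz bin spacing, so that 1333 Hz falls on bin 40) is assumed purely for illustration, as are the function name and the 1/(2M) normalization.

```python
import math

def band_autocorrelation(band_power, n_lags):
    """Autocorrelation at lags 0..n_lags from a one-sided band power spectrum.

    band_power[0] is taken as 0 rad and band_power[-1] as pi rad of the
    decimated band; the result is the inverse DFT of the symmetric extension.
    """
    m_max = len(band_power) - 1
    result = []
    for lag in range(n_lags + 1):
        acc = band_power[0] + band_power[m_max] * math.cos(math.pi * lag)
        for m in range(1, m_max):
            acc += 2.0 * band_power[m] * math.cos(math.pi * m * lag / m_max)
        result.append(acc / (2.0 * m_max))
    return result

# Placeholder one-sided power spectrum for bins 0..120 of the assumed 240-point DFT
power = [1.0 / (1.0 + k) for k in range(121)]
low_band = power[0:41]                      # bins 0..40, i.e. 0 to 1333 Hz
r_low = band_autocorrelation(low_band, n_lags=6)
print(r_low[0])   # lag-zero value, the quantity used as short-time average power
```

Working on the band slice directly means the lags are already in units of the decimated sampling period, so no separate time-domain decimation step is needed.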

Low-band autocorrelation coefficient measuring unit 8 also sends the measured autocorrelation coefficient at lag zero, that is, the short-time average power in the basic frame period, to encoder 11 as low-band short-time average power data 802.

Low-band linear prediction coefficient analyzer 10 extracts K parameters up to a predetermined order from the set of low-band autocorrelation coefficient data 801 by the well-known autocorrelation method, and sends them to encoder 11 as low-band K parameter data 1010. The K₂ parameter alone, however, is not sent directly to encoder 11 but is sent to K₂ parameter memory 12 as K₂ parameter data 1020.
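The text names only the autocorrelation method; the usual way to solve it is the Levinson-Durbin recursion, sketched below as an assumption about how the K parameters could be obtained. The sign convention is chosen so that a narrow low-frequency resonance yields K₂ near -1, consistent with the description above. The function name and the sample autocorrelation values are hypothetical.

```python
def k_parameters(autocorr, order):
    """Levinson-Durbin recursion; returns (K parameters, alpha parameters).

    Convention (assumed): the predictor is x[n] ~ sum_j alpha[j-1] * x[n-j],
    and K1 equals the normalized lag-1 autocorrelation.
    """
    alpha = []             # predictor coefficients of the current order
    error = autocorr[0]    # prediction error energy
    ks = []
    for m in range(1, order + 1):
        acc = autocorr[m] - sum(a * autocorr[m - 1 - j] for j, a in enumerate(alpha))
        k = acc / error
        # Order update: alpha_j <- alpha_j - k * alpha_(m-j), then append k
        alpha = [a - k * b for a, b in zip(alpha, reversed(alpha))] + [k]
        error *= (1.0 - k * k)
        ks.append(k)
    return ks, alpha

ks, alpha = k_parameters([1.0, 0.45, -0.40], order=2)
print(ks, alpha)
```

In the arrangement described, the second element of the returned K list would be routed to the nonlinear conversion path while the remaining K parameters go straight to the encoder.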

Nonlinear conversion circuit 13 reads the K₂ parameter data 1020 from K₂ parameter memory 12 in sequence and, before quantization by encoder 11, applies a nonlinear transformation to the data as follows.

Under the control of a program held in its built-in ROM, nonlinear conversion circuit 13 applies a log area ratio transformation to the K₂ parameter data 1020, which is read in sequence and stored at predetermined addresses, and thereby compresses the skew of the distribution.

The K parameters have values corresponding to the reflection coefficients of a vocal tract acoustic filter, that is, the vocal tract regarded as an equivalent filter. This filter can in turn be regarded as a cascade of n acoustic tubes of equal length and differing cross-sectional area, so it is well known that the K parameters correspond to vocal tract cross-sectional areas. The cross-sectional area of each part of the vocal tract changes continuously with time, and in speech analysis this is expressed by area ratios representing the cross-sectional areas on either side of each sampling point in the vocal tract. It is also well known that, from this relationship, the area ratio Dn is given by equation (1).

Dn = (1 - Kn) / (1 + Kn)   ……(1)

In equation (1) the subscript n is the order of the K parameter. The quantity log Dn, that is, log((1 - Kn) / (1 + Kn)), is called the log area ratio of that K parameter and is often used in speech analysis as a logarithmically transformed counterpart of the K parameter. Using the log area ratio in place of the K parameter markedly compresses the spread of the parameter's distribution.

For the K₂ parameter, the log area ratio, used here as one means of compressing the spread, is given by equation (2).

log D₂ = log((1 - K₂) / (1 + K₂))   ……(2)

In this embodiment too, log D₂ of equation (2) is used in place of the K₂ parameter in order to compress its spread.
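The effect of equation (2) on values crowded near -1 can be seen with a short sketch. The base of the logarithm is not stated in the text, so the natural logarithm is assumed here; the uniform quantizer, its range of 0 to 6, and the 5-bit width are hypothetical choices made only to show the idea.

```python
import math

def log_area_ratio(k):
    """Equation (2): log((1 - K) / (1 + K)), defined for |K| < 1 (natural log assumed)."""
    return math.log((1.0 - k) / (1.0 + k))

def quantize_uniform(value, lo, hi, bits):
    """Index of the nearest level of a uniform quantizer on [lo, hi] (hypothetical)."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)

for k2 in (-0.99, -0.98, -0.90, -0.50):
    lar = log_area_ratio(k2)
    print(f"K2 = {k2:+.2f} -> log D2 = {lar:.3f}, code = {quantize_uniform(lar, 0.0, 6.0, 5)}")
```

K₂ values that differ by only 0.01 near -1 are separated by a much larger step in the log area ratio domain, so a uniform quantizer there spends its levels where the K₂ distribution is dense.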

Under the control of its built-in program, nonlinear conversion circuit 13 of Fig. 2 applies the computation of equation (2) to the incoming K₂ parameter data 1020 to obtain the log area ratio log D₂ as nonlinearly converted K₂ parameter data 1301, and sends it to encoder 11. Encoder 11 quantizes, each with its own specified number of bits, the K parameter data 1010 (all K parameters except K₂) output by low-band linear prediction coefficient analyzer 10 and the nonlinearly compressed K₂ parameter (the K₂ log area ratio) output by nonlinear conversion circuit 13, converts them to transmission codes, and sends them to the synthesis side over transmission line 1101.

High-band autocorrelation coefficient measuring unit 9 reads the 1333 Hz to 3333 Hz components of power spectrum data 701 output by power spectrum measurement circuit 7 as the high band, and applies an inverse Fourier transform to measure the autocorrelation coefficient at each lag within the required range. It sends this high-band autocorrelation coefficient data 901 to high-band linear prediction coefficient analyzer 14, and sends the autocorrelation coefficient at lag zero to encoder 11 as high-band short-time average power data 902 for the basic frame period.

In this inverse Fourier transform, the power spectrum components from 1333 Hz to 3333 Hz are shifted down by 1333 Hz, so that the autocorrelation coefficients are measured on what is equivalently a 0 to 2000 Hz power spectrum.

High-band linear prediction coefficient analyzer 14 extracts K parameters up to a predetermined order from the set of high-band autocorrelation coefficient data 901 in exactly the same way as low-band linear prediction coefficient analyzer 10, and sends them to encoder 11 as high-band K parameter data 1401. Encoder 11 quantizes each K parameter with its specified number of bits, converts it to a transmission code, and sends it over transmission line 1101 to the synthesis side shown in Fig. 3.

In the linear prediction coefficient analyses performed by low-band analyzer 10 and high-band analyzer 14, the bandwidths analyzed are 0 to 1333 Hz for the low band and 0 to 2000 Hz for the shifted high band. Their highest frequencies are limited to 1/3 and 1/2, respectively, of 4000 Hz, which is the highest frequency permitted by the original sampling period and is half of 8 kHz. The analyses are therefore equivalent to using decimated sampling periods of three times and twice the original sampling period, respectively.
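The decimation factors quoted here follow from simple arithmetic, which the sketch below spells out. It reads the rounded figures 1333 Hz and 3333 Hz as 4000/3 Hz and 10000/3 Hz, which is an assumption; the names are illustrative.

```python
FS_HZ = 8000.0
NYQUIST_HZ = FS_HZ / 2.0

# Band edges, reading the rounded 1333 Hz and 3333 Hz as 4000/3 and 10000/3 Hz
BANDS_HZ = {"low": (0.0, 4000.0 / 3.0), "high": (4000.0 / 3.0, 10000.0 / 3.0)}

def decimation_factor(band):
    """Ratio of the original Nyquist frequency to the width of the band."""
    lo, hi = BANDS_HZ[band]
    return NYQUIST_HZ / (hi - lo)

for name in BANDS_HZ:
    factor = decimation_factor(name)
    print(f"{name}: sampling period x{factor:.2f}, effective rate {FS_HZ / factor:.0f} Hz")
```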

As noted above, the speech waveform data 301 output by window processor 3 is also supplied to voiced/unvoiced discriminator 5 and pitch extractor 6.

Voiced/unvoiced discriminator 5 receives the speech waveform data 301 windowed by window processor 3, determines for each basic frame whether the data is voiced or unvoiced, and sends the result to encoder 11 as voiced/unvoiced decision data 501.

Methods of discriminating voiced, unvoiced, and silent states are well known as a technique akin to pattern recognition, and are described in detail in B. S. Atal et al., "A Pattern Recognition Approach to Voiced-Unvoiced-Silence Classification with Application to Speech Recognition", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-24, no. 3, June 1976, and in many other publications.

Pitch extractor 6 extracts pitch frequency data 601 for the basic frame period from the input speech waveform data 301 and sends it to encoder 11. The extraction uses a well-known method: the autocorrelation coefficients of the speech waveform data 301 are computed, relying on the fact that when nearly identical waveforms repeat, as in voiced speech, so that the signal is almost periodic, the autocorrelation coefficient is largest at the lag equal to the pitch period of the signal.
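Picking the pitch from the autocorrelation peak can be sketched as follows. The text gives no search range or refinements, so the 60 to 400 Hz limits and the function names are assumptions, and this bare peak search omits the safeguards a practical extractor would add.

```python
import math

def autocorr_at(frame, lag):
    """Unnormalized autocorrelation of the frame at one lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

def estimate_pitch_hz(frame, fs_hz, f_min=60.0, f_max=400.0):
    """Pitch frequency from the lag that maximizes the autocorrelation.

    The search range f_min..f_max is a hypothetical choice.
    """
    lag_min = int(fs_hz / f_max)
    lag_max = int(fs_hz / f_min)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorr_at(frame, lag))
    return fs_hz / best_lag

# 240-sample test frame: a 100 Hz fundamental plus its second harmonic
frame = [math.sin(2.0 * math.pi * 100.0 * n / 8000.0)
         + 0.5 * math.sin(2.0 * math.pi * 200.0 * n / 8000.0)
         for n in range(240)]
print(estimate_pitch_hz(frame, 8000.0))
```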

Encoder 11 encodes the various kinds of data supplied as described above, assembles them into a transmission frame, and sends one transmission frame per basic frame over transmission line 1101 to the synthesis side shown in Fig. 3.

The operation of the synthesis side will now be described with reference to Fig. 3.

Decoder 21 decodes and reproduces the various coded data sent from encoder 11 on the analysis side shown in Fig. 2, and distributes the decoded data as follows.

The reproduced low-band K parameter data 1010, which excludes the nonlinearly converted K₂ parameter, is supplied to low-band LPC (Linear Prediction Coefficient) filter 22; the reproduced high-band K parameter data 1401 is supplied to high-band LPC filter 23; and the low-band nonlinearly converted K₂ parameter data 1301 is supplied to buffer memory 24. The reproduced voiced/unvoiced decision data 501 is supplied to sound source switch 25, and the reproduced pitch frequency data 601 is supplied to pitch generator 26.

Decoder 21 also supplies the reproduced low-band short-time average power data 802 and high-band short-time average power data 902 to low-band variable gain amplifier 27 and high-band variable gain amplifier 28, respectively.

Of the reproduced data supplied by decoder 21, the pitch frequency data 601 drives pitch generator 26, which generates pitch pulse data 2601, a pulse train whose repetition frequency corresponds to the pitch frequency information, and sends it to sound source switch 25.

Sound source switch 25 selects pitch pulse data 2601 when the separately supplied voiced/unvoiced decision data 501 indicates voiced, and selects white noise signal 2901 output by noise generator 29 when the data indicates unvoiced. The selected signal is sent over output line 2501 to low-band variable gain amplifier 27 and high-band variable gain amplifier 28.

Low-band variable gain amplifier 27 and high-band variable gain amplifier 28 thus receive either pitch pulse data 2601 or white noise signal 2901, according to whether the source is voiced or unvoiced. Each amplifies its input with a variable gain so that it is weighted according to the value of the separately supplied low-band short-time average power data 802 or high-band short-time average power data 902, and the results are sent as low-band excitation signal data 2701 and high-band excitation signal data 2801 to low-band LPC filter 22 and high-band LPC filter 23, respectively.
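The source selection and gain stage can be sketched as below. The text does not state the exact gain law, so scaling the excitation until its mean square equals the transmitted short-time power is an assumption, as are the function names, the Gaussian noise source, and the single 8 kHz rate (the sketch ignores the different rates of the two bands).

```python
import random

def excitation(voiced, pitch_hz, n_samples, fs_hz=8000, rng=None):
    """Pulse train at the pitch period when voiced, white noise otherwise."""
    if voiced:
        period = max(1, round(fs_hz / pitch_hz))
        return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    rng = rng or random.Random(0)   # fixed seed only to keep the sketch repeatable
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def apply_gain(samples, short_time_power):
    """Scale so the mean square of the output equals short_time_power (assumed law)."""
    mean_square = sum(x * x for x in samples) / len(samples)
    gain = (short_time_power / mean_square) ** 0.5 if mean_square > 0 else 0.0
    return [gain * x for x in samples]

source = excitation(voiced=True, pitch_hz=100.0, n_samples=160)
low_excitation = apply_gain(source, short_time_power=0.25)   # stands in for data 802
high_excitation = apply_gain(source, short_time_power=0.04)  # stands in for data 902
print(sum(x * x for x in low_excitation) / len(low_excitation))
```

The same source signal feeds both amplifiers; only the gains differ between the two bands.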

Low-band LPC filter 22, having received from decoder 21 the K parameter data 1010 for the K parameters other than K₂, converts these K parameters to α parameters with its built-in K parameter/α parameter conversion circuit and uses the α parameters as the filter coefficients of the LPC filter.
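A K-to-α conversion is commonly done with the step-up recursion, and the resulting α parameters drive an all-pole synthesis filter. The sketch below shows both under an assumed sign convention (prediction x[n] ≈ Σ α_j x[n-j]); the text does not specify the conversion circuit's internals, and the function names are hypothetical.

```python
def k_to_alpha(ks):
    """Step-up recursion from K parameters to predictor (alpha) coefficients."""
    alpha = []
    for k in ks:
        # Order update: alpha_j <- alpha_j - k * alpha_(m-j), then append k
        alpha = [a - k * b for a, b in zip(alpha, reversed(alpha))] + [k]
    return alpha

def synthesize(alpha, excitation):
    """All-pole synthesis: y[n] = e[n] + sum_j alpha[j-1] * y[n-j]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for j, a in enumerate(alpha, start=1):
            if n - j >= 0:
                acc += a * out[n - j]
        out.append(acc)
    return out

alpha = k_to_alpha([0.45, -0.75])
print(alpha)
print(synthesize(alpha, [1.0, 0.0, 0.0, 0.0]))
```

The recursion needs the K parameters of every order up to the filter order, so in practice the conversion can only be completed once the restored K₂ is available alongside the others.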

Meanwhile, the nonlinearly converted K₂ parameter data 1301 is first stored in the buffer memory 24 for each basic frame. It is then read out successively via the output line 2401 under the control of a program stored in advance in the ROM of the K₂ parameter linear conversion circuit 30. The circuit applies the inverse of the log area ratio conversion performed on the analysis side: a nonlinear operation that recovers D₂, as defined in equation (1), from the log D₂ of equation (2), followed by the arithmetic operations of equation (1) that yield K₂ from D₂. The K₂ parameter data is thereby reproduced. All the programs needed for these operations are preset in the ROM built into the K₂ parameter linear conversion circuit 30. As described later, the nonlinear conversion of the K₂ parameter on the analysis side is not limited to the logarithmically compressing log area ratio function used in this embodiment; various nonlinear functions can be used according to the desired compression of the K₂ parameter variance. The nonlinear operation that reproduces the K₂ parameter in the K₂ parameter linear conversion circuit 30 is therefore, in principle, simply the inverse of the function used for the nonlinear conversion on the analysis side.
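The forward and inverse log area ratio conversions can be illustrated as follows. Equations (1) and (2) are not reproduced in this part of the description, so the sketch assumes the common definition D = (1 + K)/(1 - K) with a natural logarithm; the patent's own equations may differ in sign or base.

```python
import math

def k_to_lar(k):
    """Analysis side (assumed form): log D, with D = (1 + K) / (1 - K)."""
    return math.log((1.0 + k) / (1.0 - k))

def lar_to_k(lar):
    """Synthesis side: recover D from log D, then K = (D - 1) / (D + 1)."""
    d = math.exp(lar)
    return (d - 1.0) / (d + 1.0)

k2 = 0.9
restored = lar_to_k(k_to_lar(k2))  # matches k2 up to rounding error
```

Under this assumed definition, values of K close to ±1 are spread out along the log D axis, which is the compression effect the description relies on.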

The K₂ parameter data 1020 reproduced in this way, that is, the K₂ parameter as it was before the nonlinear conversion, is sent to the low-band LPC filter 22.

On receiving the restored K₂ parameter data 1020, the low-band LPC filter 22 converts it into an α parameter with its built-in K parameter/α parameter conversion circuit and uses it, together with the other α parameters described above, as the filter coefficients of the LPC filter. Having thus obtained the necessary LPC filter coefficients, the low-band LPC filter 22 synthesizes low-band speech waveform data from these coefficients and the low-band excitation signal data 2701 sent from the low-band variable gain amplifier 27, and sends the resulting low-band speech waveform data 2201 to the low-band interpolator 31. Because of the decimation performed on the analysis side shown in FIG. 2, the sampling period of the low-band speech waveform data 2201 is three times the normal sampling period.
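The synthesis step can be sketched as a direct-form all-pole filter driven by the excitation. The recursion below assumes the convention A(z) = 1 + Σ a_i z^-i, so that y[n] = x[n] - Σ a_i y[n-i]; the function name and the list-based implementation are illustrative only.

```python
def lpc_synthesize(excitation, alpha):
    """All-pole synthesis filter 1/A(z), A(z) = 1 + sum(alpha[i] * z^-(i+1)).

    The sign convention of `alpha` is an assumption.
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for i, a in enumerate(alpha, start=1):
            if n - i >= 0:
                acc -= a * out[n - i]  # feedback from past output samples
        out.append(acc)
    return out

# Impulse response of a one-pole filter with a pole at z = 0.5.
response = lpc_synthesize([1.0, 0.0, 0.0, 0.0], [-0.5])
```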

The low-band interpolator 31 interpolates the input low-band speech waveform data 2201 by passing it through a low-pass filter with a cutoff frequency of 1333 Hz, producing speech waveform data with the normal sampling period, and sends this low-band interpolator output data 3101 to the low-band BPF 32.
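Interpolation of this kind is usually realized by inserting zeros between samples and then low-pass filtering. The sketch below assumes a normal sampling rate of 8000 Hz (inferred from the 1333 Hz cutoff and the factor of three, not stated in this passage) and a Hamming-windowed sinc FIR filter; the filter type and length are arbitrary choices for illustration.

```python
import math

def lowpass_taps(cutoff_hz, fs_hz, num_taps=61):
    """Hamming-windowed sinc low-pass FIR, normalized to unit gain at DC."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / fs_hz
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [h / total for h in taps]

def interpolate(x, factor, cutoff_hz, fs_out_hz, num_taps=61):
    """Raise the sampling rate by `factor`: zero-stuff, then low-pass filter."""
    stuffed = []
    for v in x:
        stuffed.append(v)
        stuffed.extend([0.0] * (factor - 1))
    # Gain of `factor` compensates for the inserted zeros.
    taps = [factor * h for h in lowpass_taps(cutoff_hz, fs_out_hz, num_taps)]
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= n - k < len(stuffed):
                acc += h * stuffed[n - k]
        out.append(acc)
    return out

low_band_full_rate = interpolate([1.0] * 100, factor=3, cutoff_hz=1333.0, fs_out_hz=8000.0)
```

The same routine with a factor of 2 and a 2000 Hz cutoff would correspond to the high-band interpolator 34.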

The low-band BPF 32 is a band-pass filter that extracts only the required band from the input low-band interpolator output data 3101. In this embodiment its upper cutoff frequency is 1333 Hz and its lower cutoff frequency is 300 Hz, although the lower cutoff frequency can be set to any desired value. The output of the low-band BPF 32 is sent to the low-band/high-band synthesizer 33 as the low-band speech signal data 3201.

The high-band LPC filter 23, for its part, converts the input high-band K parameter data 1401 into α parameters with its built-in K parameter/α parameter conversion circuit and uses these α parameters as the coefficients of the LPC filter. From these coefficients and the high-band excitation signal data 2801 it synthesizes high-band speech waveform data, and sends this high-band speech waveform data 2301 to the high-band interpolator 34.

As explained with reference to FIG. 2, the high-band K parameter data 1401 obtained on the analysis side is equivalent to the K parameters of speech waveform data in which the 1333 Hz to 3333 Hz components of the original speech waveform data 301 have been frequency-shifted into the band from 0 to 2000 Hz and decimated to twice the normal sampling period. The high-band speech waveform data 2301 synthesized from these K parameters therefore has a sampling period twice the normal one, and its frequencies are shifted downward by 1333 Hz. The high-band interpolator 34 interpolates the input high-band speech waveform data 2301 by passing it through a low-pass filter with a cutoff frequency of 2000 Hz, producing speech waveform data with the normal sampling period, and sends this high-band interpolator output 3401 to the frequency converter 35.

The frequency converter 35 multiplies the input high-band interpolator output 3401 by the 1333 Hz conversion signal 3501, thereby shifting the 0 to 2000 Hz interpolator output upward by 1333 Hz, and sends the resulting frequency converter output 3502 to the high-band BPF 36.
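The effect of multiplying by a conversion signal can be checked numerically. The sketch below assumes a real cosine carrier and an 8000 Hz sampling rate, neither of which is stated in this passage. With a real carrier, a component at f appears at both f + 1333 Hz and |f - 1333| Hz; the lower image lies below 1333 Hz and would be removed by the band-pass filter that follows.

```python
import math

def mix(x, f_shift_hz, fs_hz):
    """Multiply a signal by a cosine carrier (assumed form of the conversion signal)."""
    return [v * math.cos(2.0 * math.pi * f_shift_hz * n / fs_hz) for n, v in enumerate(x)]

def tone_amplitude(x, f_hz, fs_hz):
    """Amplitude of the sinusoidal component at f_hz (single-bin DFT)."""
    re = sum(v * math.cos(2.0 * math.pi * f_hz * n / fs_hz) for n, v in enumerate(x))
    im = sum(v * math.sin(2.0 * math.pi * f_hz * n / fs_hz) for n, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / len(x)

fs = 8000.0
tone = [math.cos(2.0 * math.pi * 500.0 * n / fs) for n in range(8000)]
shifted = mix(tone, 1333.0, fs)
upper = tone_amplitude(shifted, 1833.0, fs)  # wanted component at 500 + 1333 Hz
lower = tone_amplitude(shifted, 833.0, fs)   # image at 1333 - 500 Hz
```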

The high-band BPF 36 is a band-pass filter that extracts the required high-band components, namely those from 1333 Hz to 3333 Hz, from the input frequency converter output 3502. The components obtained in this way are sent to the low-band/high-band synthesizer 33 as the high-band speech signal data 3601.

The low-band/high-band synthesizer 33 adds the input low-band speech signal data 3201 and high-band speech signal data 3601 in an adder circuit and sends the sum to the D/A converter 37 as the synthesized speech waveform data 3301.

The D/A converter 37 converts the input synthesized speech waveform data 3301 from digital to analog form and sends this analog speech signal output 3701 to the low-pass filter LPF 38.

The LPF 38 is a low-pass filter with a cutoff frequency of 3.4 kHz. The analog speech signal 3701 that has passed through it is output as the synthesized speech signal 3801.

The basic feature of the present invention is that, in a band division type vocoder that divides an input speech signal into a predetermined low band and high band and performs linear predictive analysis on each, the K₂ parameter of the low-band linear prediction coefficients, whose distribution is particularly strongly skewed, is nonlinearly quantized and transmitted from the analysis side to the synthesis side. Various modifications of this embodiment are conceivable.

For example, in this embodiment the K₂ parameter is nonlinearly quantized by converting it into a log area ratio, that is, by applying a logarithmic nonlinear conversion before quantization. This is only one example of nonlinear quantization means. In principle, the relevant conditions are the degree of skew in the K₂ parameter distribution, which differs with the type of input speech signal to be processed, the required degree of compression, the required degree of smoothing of the spectral sensitivity, and the desired number of bits, that is, the number of quantization steps, to be assigned in quantization. Taking these conditions into account, approximate function data based on a quadratic or cubic function, a trigonometric function, or any other desired function whose nonlinear conversion characteristic roughly satisfies them can be set in advance in the ROM of the nonlinear conversion circuit 13, and the desired nonlinear conversion and quantization can then be carried out just as easily.
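One way to picture this freedom of choice is a quantizer that takes the nonlinear function and its inverse as parameters and quantizes uniformly in the transformed domain. The sketch below uses the arcsine as a sample trigonometric function; the function names, the clipping range of ±0.99, and the 5-bit resolution are all illustrative assumptions, not values from the patent.

```python
import math

def make_quantizer(forward, inverse, lo, hi, bits):
    """Build encode/decode functions that quantize uniformly in forward(K) space."""
    levels = 2 ** bits
    f_lo, f_hi = forward(lo), forward(hi)
    step = (f_hi - f_lo) / levels

    def encode(k):
        k = min(max(k, lo), hi)  # clip to the representable range
        index = int((forward(k) - f_lo) / step)
        return min(index, levels - 1)

    def decode(index):
        # Reconstruct at the centre of the quantization cell.
        return inverse(f_lo + (index + 0.5) * step)

    return encode, decode

encode, decode = make_quantizer(math.asin, math.sin, -0.99, 0.99, bits=5)
code = encode(0.95)
k2_restored = decode(code)
```

Swapping in a different pair, for example a log area ratio function and its inverse, changes the compression characteristic without changing the rest of the quantizer.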

In the embodiment described above, 1333 Hz is used as the boundary frequency between the low band and the high band, but this frequency can be chosen freely as desired, and the decimation ratios can obviously be chosen accordingly.
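The link between band edges and decimation ratios can be made concrete with a small calculation. The sketch assumes a normal sampling rate of 8000 Hz, which is not stated in this passage, and uses the usual Nyquist condition that the decimated rate must be at least twice the bandwidth of the (baseband-shifted) band.

```python
import math

def max_decimation(fs_hz, bandwidth_hz):
    """Largest integer ratio M such that fs / M >= 2 * bandwidth."""
    return math.floor(fs_hz / (2.0 * bandwidth_hz))

fs = 8000.0                             # assumed normal sampling rate
low_ratio = max_decimation(fs, 1333.0)  # low band: 0 to 1333 Hz
high_ratio = max_decimation(fs, 2000.0) # high band: 1333 to 3333 Hz, shifted to 0 to 2000 Hz
```

Under these assumptions the ratios come out as 3 and 2, which agrees with the sampling periods mentioned for the low-band and high-band waveform data.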

Furthermore, in the embodiment described above, the analysis side shown in FIG. 2 measures the power spectrum over the entire band of the input speech signal and then divides it into the low band and the high band for processing. The processing can instead be carried out in the time domain: the input speech signal is divided by band-pass filters having the desired low-band and high-band pass characteristics, the high band is then shifted so that the low frequencies become approximately baseband, and each band is decimated according to its bandwidth before linear predictive analysis. It is clear that exactly the same result is obtained by this method.

In this embodiment, as shown in FIG. 3, the low-band and high-band LPC filters 22 and 23 are formed separately as synthesis filters, each using independently the low-band or high-band K parameters and the low-band or high-band short-time average power transmitted from the analysis side as its filter coefficients and excitation signal. The output waveform of each filter is obtained in the time domain, and the two are combined in the time domain. It is equally easy, however, to combine the two bands in the frequency domain first and then form the time-domain waveform, as described next.

That is, in FIG. 3, the spectral envelopes of the low band and the high band are computed from the low-band K parameter data and the high-band K parameter data reproduced by the decoder 21, and the autocorrelation coefficients of the entire band are obtained from this result and the low-band and high-band short-time average power data separately input from the decoder 21. Linear predictive analysis is then applied to these full-band autocorrelation coefficients, which cover both the low band and the high band, to obtain the α parameters of the entire band. It is clear that the low-band K parameter that is transmitted on its own, reproduced by the decoder 21, and obtained via the buffer memory 24 and the K₂ parameter linear conversion circuit 30 can easily be incorporated into the low-band spectral envelope computation described above.

After the α parameters of the entire band have been obtained in this way, an LPC filter is constructed with these α parameters as its filter coefficients. In addition, the normalized prediction residual power of the entire band is obtained from the low-band and high-band short-time average power data and the low-band and high-band spectral envelope information, and is used as the gain control signal of the variable gain amplifier that supplies the excitation signal to the LPC filter.
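The frequency-domain variant outlined in the two paragraphs above can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the 8000 Hz sampling rate, the mapping of each band envelope onto its frequency range without spectral inversion, zero power above 3333 Hz, the scaling of each band so that its total equals the transmitted band power, and the convention A(z) = 1 + Σ a_i z^-i. The Levinson-Durbin recursion itself is the standard method for solving the autocorrelation equations.

```python
import cmath
import math

def envelope_power(alpha, omega):
    """Power response 1 / |A(e^{j omega})|^2 of an all-pole model."""
    a = 1.0 + sum(c * cmath.exp(-1j * omega * (i + 1)) for i, c in enumerate(alpha))
    return 1.0 / abs(a) ** 2

def band_spectrum(alpha, power, freqs, f0, f1):
    """Band envelope sampled at `freqs`, scaled so its sum equals `power`."""
    env = [envelope_power(alpha, math.pi * (f - f0) / (f1 - f0)) for f in freqs]
    scale = power / sum(env)
    return [scale * e for e in env]

def fullband_autocorr(alpha_lo, p_lo, alpha_hi, p_hi, order,
                      f_split=1333.0, f_top=3333.0, fs=8000.0, n_bins=240):
    """Autocorrelation r[0..order] of the spectrum stitched from both bands."""
    freqs = [(k + 0.5) * (fs / 2.0) / n_bins for k in range(n_bins)]
    lo = [f for f in freqs if f < f_split]
    hi = [f for f in freqs if f_split <= f < f_top]
    spectrum = (band_spectrum(alpha_lo, p_lo, lo, 0.0, f_split)
                + band_spectrum(alpha_hi, p_hi, hi, f_split, f_top))
    # Bins above f_top carry no power; zip() stops at the end of `spectrum`.
    return [sum(s * math.cos(2.0 * math.pi * f * m / fs) for s, f in zip(spectrum, freqs))
            for m in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin: returns (alpha, normalized prediction residual power)."""
    a, err = [], r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = -acc / err
        a = [a[i] + k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err / r[0]

r = fullband_autocorr([-0.5], 1.0, [0.3], 0.25, order=4)
alpha_full, residual = levinson(r, order=4)
```

In this sketch the second return value of `levinson` plays the role of the normalized prediction residual power that would control the single variable gain amplifier.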

The variable gain amplifier mentioned above covers the entire band, so only one is needed. This amplifier requires sound source information as its input. That information is the same as in the embodiment: either the pitch pulse data 2601 or the white noise signal 2901 output from the sound source information switch 25 is input, according to whether the voiced/unvoiced discrimination signal 501 indicates voiced or unvoiced. In this way, speech synthesis in which the low-band and high-band K parameter information is processed in the frequency domain can also be implemented easily.

All of the modifications described above can be implemented easily without departing from the gist of the present invention.

As explained above, according to the present invention, a band division type vocoder that divides an input speech signal into two predetermined transmission bands, a low band and a high band, and performs linear predictive analysis on each is provided with the simple means of nonlinearly quantizing and transmitting the K₂ parameter obtained by the low-band linear predictive analysis. This significantly reduces the quantization distortion of the speech signal transmitted from the analysis side to the synthesis side, substantially reduces the required number of quantization bits and hence the transmission bandwidth, and greatly improves the smoothing of the spectral sensitivity of the K₂ parameter. The result is a band division type vocoder capable of high-quality speech synthesis.

[Brief explanation of drawings]

FIG. 1 is a K parameter distribution diagram showing an example of the distribution of the K parameters of a band division type linear predictive vocoder. FIG. 2 is a block diagram showing an embodiment of the analysis side of the band division type vocoder according to the present invention. FIG. 3 is a block diagram showing an embodiment of the synthesis side of the band division type vocoder according to the present invention.
1: LPF; 2: A/D converter; 3: window processor; 4: DFT circuit; 5: voiced/unvoiced discriminator; 6: pitch extractor; 7: power spectrum measuring unit; 8: low-band autocorrelation coefficient measuring unit; 9: high-band autocorrelation coefficient measuring unit; 10: low-band linear prediction coefficient analyzer; 11: encoder; 12: K₂ parameter memory; 13: nonlinear conversion circuit; 14: high-band linear prediction coefficient analyzer; 21: decoder; 22: low-band LPC filter; 23: high-band LPC filter; 24: buffer memory; 25: sound source information switch; 26: pitch generator; 27: low-band variable gain amplifier; 28: high-band variable gain amplifier; 29: noise generator; 30: K₂ parameter linear conversion circuit; 31: low-band interpolator; 32: low-band BPF; 33: low-band/high-band synthesizer; 34: high-band interpolator; 35: frequency converter; 36: high-band BPF; 37: D/A converter; 38: LPF.

Claims

[Claims]

1. A band division type vocoder that divides an input speech signal into two predetermined speech transmission bands, a low band and a high band, and performs linear predictive analysis on each band, characterized by comprising K₂ parameter nonlinear quantization transmission means for nonlinearly quantizing and transmitting the second-order K parameter K₂ among the partial autocorrelation coefficients (K parameters) obtained by the linear predictive analysis of the low-band speech transmission band.