JPH0439678B2

Info

Publication number: JPH0439678B2
Application number: JP59160491A
Authority: JP
Priority date: 1984-07-31
Filing date: 1984-07-31
Publication date: 1992-06-30
Also published as: JPS6139099A

Description

DETAILED DESCRIPTION OF THE INVENTION (Technical Field) The present invention relates to the quantization of CSM (Composite Sinusoidal Modeling) parameters, that is, parameters that represent speech with at most 4 to 6 sinusoidal frequencies.

(Prior Art) LPC-type speech synthesizers have been widely used as speech synthesizers, but they generally have a complicated structure. They also have the drawback that the stability of the LPC filter used for synthesis can be lost through errors and the like during parameter transmission.

In contrast, a CSM-type speech synthesizer, which synthesizes speech using CSM, has no filter, as described in detail later, so its structure is very simple and it is inherently free of stability problems during synthesis. In conventional quantization of CSM parameters, however, each amplitude was quantized separately and the relationships among the parameters were not taken into account. As a result, the quantization did not fully exploit the characteristics of the CSM parameters, and quantization efficiency was low.

(Object of the Invention) An object of the present invention is to solve the above problem in quantizing CSM parameters and to provide an efficient quantization method.

(Structure of the Invention) The quantization method of the present invention quantizes CSM parameters, which represent the spectral envelope of speech by a set of a predetermined number n of sinusoids whose frequencies and amplitudes are free, and comprises means for normalizing the set of amplitudes {m_1, m_2, …, m_n} by a normalization coefficient a given by a = max{m_1, m_2, …, m_n}.

(Principle) First, the principle of the CSM-type speech synthesizer is explained.

CSM represents a speech signal as the sum of a specific number of sinusoids whose amplitudes and frequencies are freely selectable parameters. The number of sinusoids is a predetermined value of at most 4 to 6.

Accordingly, to perform CSM speech synthesis, the speech signal must first be represented, by CSM speech analysis, as the sum of a predetermined number of sinusoids. CSM speech analysis is described in detail later; only its main points are explained here.

As in LPC analysis, CSM analysis uses autocorrelation coefficients as intermediate parameters in order to disregard phase information, average out the influence of the excitation source, and avoid instability caused by noise components.

That is, for each analysis frame, CSM analysis determines the frequency and intensity (power amplitude) of each sinusoid to be synthesized so that the N low-order taps of the autocorrelation of the synthesized wave match the N low-order taps of the sample autocorrelation computed directly from the speech waveform to be represented.

Let n be the number of sinusoids to be synthesized, ω_i (i = 1, 2, …, n) the angular frequency of each sinusoid, and m_i its intensity. The CSM composite wave y_t is then

$$y_t = \sum_{i=1}^{n} \sqrt{m_i}\,\sin(\omega_i t + \varphi_i)$$

and its autocorrelation coefficient γ_l at tap l is easily expressed in terms of ω_i and m_i as

$$\gamma_l = \sum_{i=1}^{n} m_i \cos(l\,\omega_i).$$
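The two expressions above can be sketched in a few lines of Python. This is only an illustration under the stated model: the function names and the list-based parameter layout are assumptions, not part of the source.

```python
import math

def csm_wave(m, omega, phi, t):
    """Composite wave y_t: sum of n sinusoids with amplitude sqrt(m_i)."""
    return sum(math.sqrt(mi) * math.sin(wi * t + ph)
               for mi, wi, ph in zip(m, omega, phi))

def csm_autocorr(m, omega, lag):
    """Model autocorrelation gamma_l = sum_i m_i * cos(l * omega_i)."""
    return sum(mi * math.cos(lag * wi) for mi, wi in zip(m, omega))
```

At lag 0 the model autocorrelation reduces to the sum of the intensities, which is why tap 0 carries the frame power.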

On the other hand, letting X_t be the samples of the speech waveform to be represented, the sample autocorrelation coefficient v_l at tap l in a given frame is

$$v_l = \frac{1}{M}\sum_{t=l}^{M-1} X_t X_{t-l}$$

where M is the number of samples in one analysis frame.
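A direct transcription of this sample autocorrelation might look like the following minimal sketch. The function name is an assumption, and the frame is taken to be a plain list of samples with any windowing already applied.

```python
def sample_autocorr(x, lag):
    """v_l = (1/M) * sum_{t=l}^{M-1} x[t] * x[t-l] for one frame x of M samples."""
    M = len(x)
    return sum(x[t] * x[t - lag] for t in range(lag, M)) / M
```

For example, `sample_autocorr(frame, 0)` gives the mean power of the frame, the quantity later used as power information.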

CSM analysis then determines the values of each m_i and ω_i so that the γ_l above equals the given v_l for the N low-order taps.

That is, the values of m_i and ω_i are determined so that γ_l = v_l holds for l = 0, 1, 2, …, N.
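As a small worked illustration (not taken from the source, and assuming the cosine-sum form of the model autocorrelation stated earlier), consider the simplest case of a single sinusoid, n = 1, so that only taps l = 0 and l = 1 are matched:

```latex
\gamma_0 = m_1 = v_0, \qquad
\gamma_1 = m_1 \cos\omega_1 = v_1
\;\Longrightarrow\;
m_1 = v_0, \quad \omega_1 = \arccos\!\left(\frac{v_1}{v_0}\right)
\qquad (v_0 > 0,\ |v_1| \le v_0).
```

The intensity is fixed by the frame power v_0, and the frequency by the normalized first-lag autocorrelation.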

The specific method is described in detail later; here it is assumed that m_i and ω_i of the n sinusoids are obtained successively, one analysis frame after another, in response to the given speech signal.

FIG. 1 shows an example of a speech feature vector pattern formed by the CSM parameters m_i and ω_i obtained in this way.

FIG. 2 shows an example of the correspondence between the 9th-order (N = 9) CSM line spectrum (number of sinusoids n = 5), analyzed with an analysis-frame window length of 30 msec, and the 9th-order LPC spectral envelope (the frequency transfer characteristic of the LPC synthesis filter) obtained from the same speech sample.

As described later, the order N and the number of sinusoids n are related by N = 2n − 1.
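One way to see where this relation could come from (a plausibility sketch, not the derivation promised in the source): the n sinusoids have 2n free parameters, and matching taps l = 0 to N gives N + 1 conditions.

```latex
\underbrace{n}_{\omega_i} + \underbrace{n}_{m_i} = 2n \ \text{unknowns}, \qquad
N + 1 \ \text{equations } (\gamma_l = v_l,\ l = 0, \dots, N)
\;\Longrightarrow\; N = 2n - 1 .
```

With n = 5 this gives N = 9, the case shown in FIG. 2.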

These figures suggest that CSM carries information that captures the features of the original speech to be represented.

However, if the n pairs of m_i and ω_i obtained by CSM analysis are used to generate n sinusoids with the specified intensities (the actual amplitude being √m_i, as noted above) and angular frequencies, and these are simply added together, the human ear hears only a mixture of sinusoids, and the aim of reproducing the original speech is not achieved.

The likely reason is that the spectrum of a signal produced by simply adding sinusoids is nothing more than n discrete spectral lines, whereas the spectrum of a speech signal has a continuous spectral envelope together with a fine spectral structure, expressed by the pitch structure in voiced sounds and by a stochastic process in unvoiced sounds. The spectral structure of simply added CSM is therefore entirely different from that of speech.

To synthesize speech with CSM, it is therefore necessary to spread the line spectrum into a continuous spectrum by some method. In other words, CSM speech synthesis can be regarded as generating a speech spectrum pattern from a speech feature vector pattern expressed as line spectra, such as those shown in FIGS. 1 and 2.

In the present invention, the following technique is used to perform this spectral spreading in CSM speech synthesis.

Since voiced sounds have a distinct pitch structure, the phase of each of the n sinusoids specified as above is reset at every pitch period. This readily produces both the spectral envelope and the fine spectral structure of the pitch.

Furthermore, by applying a special time window, detailed in the description of the embodiment, to the phase-reset waveform, the discontinuity of the synthesized waveform at each phase reset is removed and the continuity inherent in a speech waveform is preserved.

With this processing, the CSM line spectrum shown in FIG. 2 is spread as shown in FIG. 3A into a spectrum having both a spectral envelope and pitch fine structure, and experiments have shown that the resulting sound quality is perceptually adequate for practical use.

For reference, FIG. 3B shows the spectrum of CSM obtained by simple addition without the above processing. As noted earlier, a waveform with such a spectrum is heard merely as a mixture of sinusoids and does not achieve the aim of reproducing speech.

The above applies to voiced sounds. For unvoiced sounds, the phase reset and the special time window processing, which are performed at every pitch period for voiced sounds, are instead performed at each occurrence of a pulse generated randomly as a stochastic process, with a period whose distribution width and lower limit are set.

With the above technique, CSM synthesis that is perceptually adequate for practical use can be achieved. Because this CSM synthesis uses no filter, stability on the synthesis side need not be considered. Consequently, when it is used in a communication system that transmits m_i and ω_i to the synthesis side and reproduces the speech there, it can be expected to give better sound quality than a vocoder when the line quality is relatively poor and errors occur during transmission.

(Embodiment) The present invention is now described in detail with reference to an embodiment.

For convenience, the invention is described using an analysis-synthesis system that incorporates it.

FIG. 4 is a block diagram showing an embodiment of the present invention.

This embodiment consists of a transmitting side 1 and a receiving side 2.

The transmitting side 1 comprises an A/D converter 101, a Hamming window processor 102, an autocorrelation coefficient calculator 103, a CSM analyzer 104, a CSM quantizer 105, a power-correction quantizer 106, a pitch extractor 107, a voiced/unvoiced decision unit 108, and a multiplexer 109.

The receiving side 2 comprises a demultiplexer and decoder 201, an interpolator 202, a voiced/unvoiced switch 203, a period calculator 204, a random number generator 205, n variable-frequency oscillators with a phase reset function 206-1, 206-2, …, 206-n, n variable-gain amplifiers 207-1, 207-2, …, 207-n, a summing synthesizer 208, a variable-length window function generator 209, a multiplier 210, and a multiplier 211.

The embodiment operates as follows. The speech waveform to be transmitted is supplied through an input line 1000 to the A/D converter 101, where it is converted into digital data quantized in amplitude and time. The output is supplied to the inputs of the Hamming window processor 102, the pitch extractor 107, and the voiced/unvoiced decision unit 108.

In the Hamming window processor 102, the digital data is weighted, one predetermined frame at a time, by the well-known Hamming window function, and each frame of data is supplied to the autocorrelation coefficient calculator 103.

For each frame of data thus received, the autocorrelation coefficient calculator 103 obtains the N low-order autocorrelation coefficients v_l (l = 1, 2, …, N) by the computation described earlier.

That is, letting X_t (t = 0, 1, …, M−1) be one frame of data, each of the N coefficients v_l is obtained by computing

$$v_l = \frac{1}{M}\sum_{t=l}^{M-1} X_t X_{t-l}.$$

The set of v_l obtained for each frame is supplied to the CSM analyzer 104 that follows, and v_0 among them, that is,

$$v_0 = \frac{1}{M}\sum_{t=0}^{M-1} X_t^2,$$

is supplied to the power-correction quantizer 106 as the power information of the frame.

On receiving the set of autocorrelation coefficients v_l for each frame, the CSM analyzer 104 performs the computation described in detail later to determine the set of m_i and ω_i (i = 1, 2, …, n) that specify the intensity and angular frequency of each of the n CSM sinusoids for that frame, and supplies this set to the CSM quantizer 105.

The CSM quantizer 105 is the part that directly constitutes the present invention and is described separately in detail; in outline it works as follows.

The CSM quantizer 105 quantizes the set of m_i and ω_i values using means that finds the normalization coefficient a = max{m_1, m_2, …, m_n} from the set of amplitudes {m_1, m_2, …, m_n}, outputs a to the power-correction quantizer 106 as correction data, and normalizes the set {m_1, m_2, …, m_n} by a. The number of quantization bits is chosen as appropriate in view of the required reproduction quality and the transmission capacity of the line. After quantizing the set of m_i and ω_i values, the CSM quantizer 105 supplies it to the multiplexer 109.
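The normalization step can be sketched as follows. Only a = max{m_1, …, m_n} and the division by a come from the text; the uniform quantizer, its bit width, and all function names are assumptions made for illustration, since the source does not specify the quantizer itself.

```python
def normalize_amplitudes(m):
    """Return (a, normalized list) with a = max(m); every result lies in [0, 1]."""
    a = max(m)
    if a <= 0:
        raise ValueError("at least one amplitude must be positive")
    return a, [mi / a for mi in m]

def quantize_uniform(value, bits):
    """Hypothetical uniform quantizer: map a value in [0, 1] to an integer code."""
    levels = (1 << bits) - 1
    return round(value * levels)
```

Because the largest normalized amplitude is always exactly 1, the quantizer range is fully used in every frame regardless of the overall signal level, which is carried separately by a.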

The power-correction quantizer 106, which receives v_0 and the normalization coefficient a, likewise quantizes v_0 with a suitable coarseness determined from the same considerations and supplies it to the multiplexer 109.

The digital data of the original speech signal from the A/D converter 101 is also supplied to the multiplexer 109 as suitably quantized data; given the blocks listed for the transmitting side, this presumably refers to the pitch information produced by the pitch extractor 107. Similarly, the voiced/unvoiced decision unit 108 makes a voiced/unvoiced decision from the digital data it receives and supplies the result to the multiplexer 109 as a binary signal.

The multiplexer 109 combines these signals into a form that is easy to separate on the receiving side and suited to the given transmission path, and transmits the result to the receiving side 2 through a transmission path 1200.

On the receiving side 2, the demultiplexer and decoder 201 decodes and separates the transmitted signal, restoring each of the signals present at the input of the multiplexer 109 on the transmitting side 1.

Each restored signal is supplied to the interpolator 202, which has a memory function, and after any necessary interpolation is used as follows.

First, ω_i (ω_1 to ω_n), which specify the angular frequencies of the n CSM sinusoids, are applied to the frequency control inputs of the n variable-frequency oscillators with phase reset 206-1 to 206-n, setting the output angular frequencies of these oscillators to the specified values ω_1 to ω_n.

Likewise, m_1 to m_n, which specify the intensities (power amplitudes) of the n CSM sinusoids, are supplied to the gain control terminals of the n variable-gain amplifiers 207-1 to 207-n, which control the oscillation power at each frequency to the specified value.

The n outputs thus obtained are summed in the summing synthesizer 208 and then supplied to the multiplier 210.

The pitch period information output from the demultiplexer and decoder 201 is interpolated as necessary in the interpolator 202, which includes a memory, and supplied to the voiced/unvoiced switch 203 as digital data representing the pitch period.

Meanwhile, random numbers generated by the random number generator 205 are supplied to the period calculator 204, where they are transformed so that their distribution width and lower limit take specific values, and are supplied to the other input of the voiced/unvoiced switch 203 as a data sequence that determines the phase reset intervals for unvoiced sounds.

The binary signal (V/U) distinguishing voiced from unvoiced sounds, output from the demultiplexer and decoder 201, is supplied as the switching control signal of the switch 203. For a voiced sound, the switch 203 selects the digital data representing the pitch period output by the interpolator 202 and supplies it to the window function generator 209.

If the binary signal (V/U) indicates an unvoiced sound, the switch 203 instead selects the data sequence from the period calculator 204, which represents random time intervals generated as a stochastic process, and supplies it to the window function generator 209 in place of the pitch period data.

The window function generator 209 generates a window function that removes the discontinuity introduced into the output waveform by phase reset and so preserves the continuity inherent in a speech waveform. It also generates the phase reset pulses, which are closely tied in time to this window function.

As described above, a data sequence specifying the intervals between successive phase reset pulses is input to the window function generator 209 through the switch 203. The window function generator 209 generates impulses one after another at the intervals specified by this data and supplies them through a line 2090 to the phase reset terminals of the variable-frequency oscillators 206-1 to 206-n, thereby resetting the phases of these oscillators. The same pulses are supplied through the line 2090 to the interpolator 202, where they serve as timing signals for interpolating the angular frequency data ω_i and the intensity data m_i.

In synchronization with the phase reset pulses, the window function generator 209 generates a variable-length window function w(x) as follows.

Let T be the interval between phase reset pulses at that moment, as specified by the input data, and let x be the time elapsed since the previous phase reset pulse. The window function generated is

$$w(x) = 0.5 + 0.5\cos\!\left(\frac{\pi x}{T}\right), \qquad 0 < x \le T.$$

This window function w(x) is shown in FIG. 5A. T represents the pitch period for voiced sounds and a variable generated by a stochastic process for unvoiced sounds, so it changes with time. The window function w(x) is therefore of variable length and is synchronized with the phase reset pulses in the relative timing shown in FIG. 5B (the start and end of the window function nearly coincide with the phase reset pulses).
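The raised-cosine window described here is simple to evaluate directly. The sketch below assumes the range 0 < x ≤ T (the upper bound is partly illegible in the source) and uses a hypothetical function name.

```python
import math

def window(x, T):
    """w(x) = 0.5 + 0.5 * cos(pi * x / T): near 1 just after a reset, 0 at x = T."""
    if not 0 < x <= T:
        raise ValueError("x must satisfy 0 < x <= T")
    return 0.5 + 0.5 * math.cos(math.pi * x / T)
```

The value falls monotonically from almost 1 to 0 over one reset interval, so a longer interval T simply stretches the same shape.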

The window function thus generated is supplied to the multiplier 210 through a line 2091. The multiplier 210 therefore produces the product of the sum, formed in the summing synthesizer 208, of the n sinusoids that are phase-reset at each phase reset pulse, and the window function w(x) generated in synchronization with each pulse. In the resulting waveform, multiplication by w(x) brings the signal continuously to 0 just before each phase reset, and each sinusoid starts from 0 at the reset, so the waveform is continuous. Multiplication by w(x) thus removes the discontinuity that phase reset would otherwise cause.
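The combined effect of phase reset and windowing can be sketched in discrete time as follows. This is an illustrative software model, not the circuit of the embodiment: the sample index x runs from 0 to T − 1 within each interval, the parameters are held constant across intervals, and the function names are assumptions.

```python
import math

def synthesize_period(m, omega, T):
    """One reset interval of T samples: all phases restart at 0, window decays toward 0."""
    out = []
    for x in range(T):
        s = sum(math.sqrt(mi) * math.sin(wi * x) for mi, wi in zip(m, omega))
        w = 0.5 + 0.5 * math.cos(math.pi * x / T)
        out.append(s * w)
    return out

def synthesize(m, omega, intervals):
    """Concatenate windowed, phase-reset intervals (pitch periods or random intervals)."""
    out = []
    for T in intervals:
        out.extend(synthesize_period(m, omega, T))
    return out
```

Each interval starts at exactly 0 (all sines have zero phase) and ends close to 0 (the window has nearly decayed), so consecutive intervals join without a jump.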

The output of the multiplier 210, now free of discontinuities, is supplied to the multiplier 211, where it is weighted by the power information of each frame sent from the transmitting side 1, and is output on a line 2000 as synthesized speech.

As explained above, the receiving side 2 of this embodiment performs the CSM synthesis needed for speech synthesis, so that the original speech input to the transmitting side 1 is reproduced with relatively good quality despite the compression of information and any transmission errors on the transmission path 1200.

The interpolation of the transmitted data in the interpolator 202 can be applied in various combinations (for example, to ω_i only, or to ω_i and m_i only) depending on how coarsely each item is quantized on the transmitting side 1, and the method may be linear interpolation or interpolation by a higher-order function. For ω_i and m_i, it is advantageous to choose the interpolation points so that interpolated data is obtained at each phase reset pulse; the phase reset pulses are supplied to the interpolator 202 through the line 2090, as described above, so that the values of ω_i and m_i are updated at this timing.
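For the linear case, interpolating a parameter set at a reset instant could look like the sketch below. The function name and the `frac` argument (the position of the reset pulse between two frames, from 0 to 1) are hypothetical; the source does not fix how the interpolation point is expressed.

```python
def interpolate(prev_frame, next_frame, frac):
    """Linearly interpolate each parameter (omega_i or m_i) between two frames."""
    return [p + (q - p) * frac for p, q in zip(prev_frame, next_frame)]
```

Calling this once per phase reset pulse keeps the oscillator settings constant within each windowed interval and changes them only where the waveform passes through 0.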

Because interpolated data can be computed only after the required later data has arrived or been generated, the actual processing, such as resetting the phase and setting the frequency ω_i of the oscillators 206 and setting the intensity m_i of the amplifiers 207, is executed a fixed time behind real time. The interpolator 202 therefore includes a memory that holds the necessary information until it is needed.

FIG. 6 shows an example circuit for the variable-frequency oscillator with phase reset 206. The voltage applied to a frequency control terminal 2061 controls the charge and discharge currents that flow through constant-current sources 2062 and 2063 to a capacitor 2064, which makes the oscillation frequency variable. The oscillation voltage at point v is a triangular wave that rises and falls linearly between the reference voltages +Vr and −Vr. When an impulse is applied to a phase reset terminal 2065, point v is momentarily grounded and forced back to 0 potential, and oscillation restarts from there, which resets the phase. The triangular wave at point v is input to a sine wave converter 2066, converted into a sine wave, and output from a terminal 2067 as the output of the oscillator 206. The sine wave converter 2066 is easily implemented, for example by reading out sine function values stored in a ROM according to the input waveform.

Such a variable-frequency oscillator with phase reset can also easily be implemented as a computer program.
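A minimal software counterpart of such an oscillator, assuming a phase accumulator in place of the triangular-wave circuit, might look like this. The class and method names are hypothetical.

```python
import math

class ResettableOscillator:
    """Sine oscillator with settable angular frequency (radians/sample) and phase reset."""

    def __init__(self, omega):
        self.omega = omega
        self.phase = 0.0

    def reset(self):
        """Phase reset: force the phase back to 0 so the output restarts from 0."""
        self.phase = 0.0

    def step(self):
        """Return the current sample, then advance the phase by omega."""
        value = math.sin(self.phase)
        self.phase = (self.phase + self.omega) % (2.0 * math.pi)
        return value
```

One instance per sinusoid, with `omega` updated from ω_i and `reset()` called on every phase reset pulse, mirrors the role of the oscillators 206-1 to 206-n.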

FIG. 7 shows an example circuit for the variable-gain amplifier 207. The signal to be amplified is applied to a terminal 2071 and a control signal to a terminal 2072; the control signal sets the amount of negative feedback, so that an output with controlled amplitude is obtained at an output terminal 2073.

Alternatively, the amplifier can be implemented with an analog multiplier, or with a D/A converter that takes the analog waveform as its reference voltage and the control information, expressed as a digital quantity, as its digital input.

FIG. 8 shows an example circuit for the random number generator 205. A 15-stage shift register 2051 and one half adder 2052 generate a 15th-order M-sequence of pseudo-random numbers with a period of 2^15 − 1. Applying a shift pulse to a clock terminal 2053 when needed yields the next random value.
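A 15-stage maximal-length shift register can be modeled as below. The source does not give the feedback taps; this sketch assumes the commonly used primitive polynomial x^15 + x^14 + 1, with the exclusive OR standing in for the half adder. The function name is hypothetical.

```python
def m_sequence_states(seed=1):
    """Yield successive 15-bit states of an LFSR with assumed taps x^15 + x^14 + 1."""
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("the all-zero state never leaves zero")
    while True:
        yield state
        # Feedback bit: XOR of stages 15 and 14 (modulo-2 addition).
        bit = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | bit) & 0x7FFF
```

Each call to `next()` on the generator corresponds to one shift pulse. The register visits every nonzero 15-bit value once before repeating, which gives the period of 2^15 − 1.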

FIG. 9A is a block diagram of the period calculator 204. It converts the random numbers output by the random number generator 205, which are uniformly distributed over the range 0 to 2^15 − 1, into a distribution suitable for specifying the intervals between phase reset pulses for unvoiced sounds, and consists of a constant multiplier 2041 and a constant adder 2042. As shown in FIG. 9B, this allows the distribution width D and the lower limit L of the random numbers to be set to suitable values.
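The multiply-then-add mapping can be sketched as follows. The particular scaling by D / (2^15 − 1) is an assumption chosen so that the output spans exactly L to L + D; the source states only that a constant multiplier and a constant adder are used. Names are hypothetical.

```python
def reset_interval(r, width, lower, r_max=(1 << 15) - 1):
    """Map a random number r in [0, r_max] to an interval in [lower, lower + width]."""
    return lower + width * r / r_max
```

For instance, with `width=40` and `lower=20` the unvoiced reset intervals fall between 20 and 60 time units, never shorter than the lower limit.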

FIG. 10 shows an embodiment of the window function generator 209. It includes a register 2091, a presettable down counter 2092, a counter 2093, and a read-only memory (ROM) 2094.

The data T specifying the phase-reset pulse interval, supplied from the switch 203, is stored in the register 2091. The down counter 2092 counts a fixed-period high-speed clock CLK: it is first preset with the content T of the register 2091 and then counts down on CLK. When the content of the counter 2092 reaches 0, a pulse is generated at its output terminal, which presets the counter with the content of the register 2091 again and restarts the countdown. A pulse train with a period proportional to T (for example, T/k) thus appears at the output 2092-1 of the down counter 2092. This pulse train is applied as the clock of the counter 2093. The count output 2093-1 of the counter 2093, which is advanced by this clock, is applied to the ROM 2094 as an address signal, so that the data of the window function $w(x)$ written there are read out in order and output on line 2091. When the content of the counter 2093 reaches k, the last datum of the window function $w(x)$ in the ROM 2094 is read out; at the same time the counter 2093 is reset and outputs a reset pulse on line 2090. This reset pulse serves as the aforementioned phase-reset pulse supplied to the phase-reset terminals of the oscillators 206-1 to 206-n and to the interpolator 202, and is also used to set the next input data in the register 2091. The window function data $w(x)$, stored in advance in the ROM 2094 as k samples, are read out onto line 2091 and supplied to the multiplier 210. In this way, phase-reset pulses whose spacing takes the successively specified values are generated, together with a variable-length window function $w(x)$ synchronized with them as shown in FIG. 5B.
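The counter-and-ROM arrangement above can be imitated with a short simulation. This is only a sketch under stated assumptions: the down counter is assumed to be preset with an integer number of CLK ticks per ROM entry (`preset`, a hypothetical name), each ROM entry is held until the next address step, and the reset pulse is modeled as a flag on the last tick of the window.

```python
def variable_length_window(rom, preset):
    """Return one (window_sample, reset_pulse) pair per CLK tick.

    rom    -- the k stored window samples (role of ROM 2094)
    preset -- CLK ticks counted per ROM entry (role of down counter 2092)
    """
    ticks = []
    k = len(rom)
    for address in range(k):                      # address counter steps through the ROM
        for remaining in range(preset, 0, -1):    # down counter runs preset .. 1
            reset = address == k - 1 and remaining == 1
            ticks.append((rom[address], reset))
    return ticks
```

Doubling `preset` doubles the window length while the stored shape stays the same, which is the sense in which the window is of variable length.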

Next, CSM analysis is described.

As described above, CSM analysis determines, for each analysis frame, the frequency $\omega_i$ and the intensity (power amplitude) $m_i$ of each sine wave to be synthesized, so that the N low-order tap values of the sample autocorrelation coefficients computed directly from the speech waveform to be represented match the N low-order tap values of the synthesized wave (the sum of n sine waves).

Let $\gamma_l$ denote the autocorrelation coefficient of the synthesized wave at tap $l$. As described above,
$$\gamma_l = \sum_{i=1}^{n} m_i \cos l\omega_i .$$

On the other hand, from the samples $X_t$ of the speech waveform to be represented, the sample autocorrelation coefficient $v_l$ at tap $l$ for a given frame is
$$v_l = \frac{1}{M}\sum_{t=l}^{M-1} X_t X_{t-l} \quad \cdots (1)$$
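Equation (1) can be evaluated directly, as in the sketch below. It assumes that M is the number of samples in the frame and that samples outside the frame are not used; the function name is hypothetical.

```python
def sample_autocorrelation(x, max_lag):
    """Return [v_0, ..., v_max_lag] with v_l = (1/M) * sum_{t=l}^{M-1} x[t] * x[t-l]."""
    M = len(x)
    return [
        sum(x[t] * x[t - l] for t in range(l, M)) / M
        for l in range(max_lag + 1)
    ]
```

For the four-sample frame `[1.0, 2.0, 3.0, 4.0]` this gives `v_0 = 7.5`, `v_1 = 5.0` and `v_2 = 2.75`.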

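Setting
$$\gamma_l = v_l \quad \cdots (2)$$
for $l = 0, 1, 2, \dots, N$, where $N = 2n-1$, gives the following matrix expression.
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \cos\omega_1 & \cos\omega_2 & \cdots & \cos\omega_n \\ \cos 2\omega_1 & \cos 2\omega_2 & \cdots & \cos 2\omega_n \\ \vdots & \vdots & & \vdots \\ \cos N\omega_1 & \cos N\omega_2 & \cdots & \cos N\omega_n \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} v_0 \\ v_1 \\ v_2 \\ \vdots \\ v_N \end{pmatrix} \quad \cdots (3)$$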

However, because both $\omega_i$ and $m_i$ are unknown, the above equation cannot be solved by simple matrix operations. Therefore, put
$$\omega_i = \cos^{-1} x_i \quad \cdots (4)$$
and make the substitution
$$\cos l\omega_i = \cos\left(l \cos^{-1} x_i\right) \equiv T_l(x_i) \quad \cdots (5)$$
Here $T_l(x)$ is the Chebyshev (Tchebycheff) polynomial. With this substitution, equation (3) is transformed as follows.
$$\begin{pmatrix} T_0(x_1) & T_0(x_2) & \cdots & T_0(x_n) \\ T_1(x_1) & T_1(x_2) & \cdots & T_1(x_n) \\ \vdots & \vdots & & \vdots \\ T_N(x_1) & T_N(x_2) & \cdots & T_N(x_n) \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} v_0 \\ v_1 \\ \vdots \\ v_N \end{pmatrix} \quad \cdots (6)$$

In general, however, $x^l$ can be expressed as a linear combination of $T_0(x), T_1(x), \dots, T_l(x)$.

That is,
$$x^l = \sum_{j=0}^{l} S_j^{(l)} T_j(x) \quad \cdots (7)$$
where the $S_j^{(l)}$ are the inverse Chebyshev (Tchebycheff) coefficients.
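As a worked illustration of equation (7), the first few expansions follow from the standard identities $T_0 = 1$, $T_1 = x$, $T_2 = 2x^2 - 1$ and $T_3 = 4x^3 - 3x$. The coefficient vectors on the right list $S_0^{(l)}, \dots, S_l^{(l)}$ in order.

```latex
\begin{aligned}
x^0 &= T_0(x),                                   & S^{(0)} &= (1) \\
x^1 &= T_1(x),                                   & S^{(1)} &= (0,\ 1) \\
x^2 &= \tfrac{1}{2}T_0(x) + \tfrac{1}{2}T_2(x),  & S^{(2)} &= (\tfrac{1}{2},\ 0,\ \tfrac{1}{2}) \\
x^3 &= \tfrac{3}{4}T_1(x) + \tfrac{1}{4}T_3(x),  & S^{(3)} &= (0,\ \tfrac{3}{4},\ 0,\ \tfrac{1}{4})
\end{aligned}
```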

Using these $S_j^{(l)}$, a linear combination $A_l$ of the sample autocorrelation coefficients $v_j$ is defined as follows.
$$A_l = \sum_{j=0}^{l} S_j^{(l)} v_j \quad \cdots (8)$$
where $l = 0, 1, 2, \dots, 2n-1$. Applying the relation (7) to the left-hand side and the relation (8) to the right-hand side of equation (6) then yields the following relation.
$$\sum_{i=1}^{n} m_i x_i^{\,l} = A_l, \qquad l = 0, 1, \dots, 2n-1 \quad \cdots (9)$$

Now define the n-th degree polynomial with zeros at $x_1, x_2, \dots, x_n$,
$$P_n(x) \equiv \sum_{k=0}^{n} p_k^{(n)} x^k = \prod_{i=1}^{n} (x - x_i),$$
and use it to form an expression resembling the left-hand side of equation (9):
$$\sum_{i=1}^{n} m_i P_n(x_i)\, x_i^{\,l}.$$
This expression is clearly 0, since $P_n(x_i) = 0$ for every $i$, and it can further be rewritten as follows.

$$0 = \sum_{i=1}^{n} m_i P_n(x_i)\, x_i^{\,l} = \sum_{i=1}^{n} m_i \sum_{k=0}^{n} p_k^{(n)} x_i^{\,k+l} = \sum_{k=0}^{n} p_k^{(n)} \sum_{i=1}^{n} m_i x_i^{\,k+l} = \sum_{k=0}^{n} p_k^{(n)} A_{k+l}$$
Hence, for $l = 0, 1, 2, \dots, n-1$, the following is obtained.
$$\begin{pmatrix} A_0 & A_1 & \cdots & A_n \\ A_1 & A_2 & \cdots & A_{n+1} \\ \vdots & \vdots & & \vdots \\ A_{n-1} & A_n & \cdots & A_{2n-1} \end{pmatrix} \begin{pmatrix} p_0^{(n)} \\ p_1^{(n)} \\ \vdots \\ p_n^{(n)} \end{pmatrix} = 0$$

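Since $p_n^{(n)} = 1$, however,
$$\begin{pmatrix} A_0 & A_1 & \cdots & A_{n-1} \\ A_1 & A_2 & \cdots & A_n \\ \vdots & \vdots & & \vdots \\ A_{n-1} & A_n & \cdots & A_{2n-2} \end{pmatrix} \begin{pmatrix} p_0^{(n)} \\ p_1^{(n)} \\ \vdots \\ p_{n-1}^{(n)} \end{pmatrix} = - \begin{pmatrix} A_n \\ A_{n+1} \\ \vdots \\ A_{2n-1} \end{pmatrix} \quad \cdots (10)$$
holds. The matrix formed from the $A_i$ on the left-hand side is generally called a Hankel matrix. As noted above, each $A_i$ is known, being given by equation (8) from the sample autocorrelation coefficients $v_j$ of the speech waveform to be represented.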

Therefore, solving equation (10) gives the values of $p_0^{(n)}, p_1^{(n)}, \dots, p_{n-1}^{(n)}$.

Once the $p_i^{(n)}$ are known, $\{x_1, x_2, \dots, x_n\}$ are obtained as the roots of the n-th degree equation
$$P_n(x) = x^n + p_{n-1}^{(n)} x^{n-1} + \cdots + p_0^{(n)} = 0.$$

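From these, each CSM frequency $\omega_i$ is obtained from equation (4), $\omega_i = \cos^{-1} x_i$, and the CSM intensities $m_i$ are obtained from the following equation, which is derived from equation (9).
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & & \vdots \\ x_1^{\,n-1} & x_2^{\,n-1} & \cdots & x_n^{\,n-1} \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_n \end{pmatrix} = \begin{pmatrix} A_0 \\ A_1 \\ \vdots \\ A_{n-1} \end{pmatrix}$$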

The matrix on the left-hand side of the above equation is generally called a Vandermonde matrix.

In summary, the CSM analysis algorithm is as follows.

(1) Compute the sample autocorrelation coefficients.
$$v_l = \frac{1}{M}\sum_{t=l}^{M-1} X_t X_{t-l}$$

(2) Define $A_l$ using the inverse Chebyshev coefficients.
$$A_l = \sum_{j=0}^{l} S_j^{(l)} v_j$$

(3) Solve the Hankel matrix equation in the $A_l$ to obtain the $p_i^{(n)}$.

(4) Solve the n-th degree algebraic equation whose coefficients are the $p_i^{(n)}$ to obtain the n values $x_i$.
$$P_n(x) \equiv x^n + p_{n-1}^{(n)} x^{n-1} + p_{n-2}^{(n)} x^{n-2} + \cdots + p_1^{(n)} x + p_0^{(n)} = 0$$

(5) Apply the inverse cosine to obtain the CSM angular frequencies $\{\omega_i\}$.
$$\omega_i = \cos^{-1} x_i$$

(6) Solve the Vandermonde matrix equation to obtain the CSM intensities $\{m_i\}$.

By executing the steps above, the CSM angular frequencies $\{\omega_1, \omega_2, \dots, \omega_n\}$ and the intensities of the individual waves $\{m_1, m_2, \dots, m_n\}$ can be obtained.
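Steps (2) to (6) can be sketched end to end as below. This is an illustrative sketch, not the implementation used in the patent: the linear systems are solved by plain Gaussian elimination rather than the faster methods the text cites, the roots are found by Newton-Raphson iteration started at x = 1 with deflation (which assumes all roots are real and lie in [-1, 1]), and every function name is hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def inverse_chebyshev(max_l):
    """Return S with x**l = sum_j S[l][j] * T_j(x) for l = 0..max_l."""
    S = [[1.0]]
    for l in range(max_l):
        nxt = [0.0] * (l + 2)
        for j, s in enumerate(S[l]):
            if j == 0:
                nxt[1] += s              # x * T_0 = T_1
            else:
                nxt[j + 1] += s / 2      # x * T_j = (T_{j+1} + T_{j-1}) / 2
                nxt[j - 1] += s / 2
        S.append(nxt)
    return S


def real_roots(coeffs):
    """Roots of a polynomial (highest power first) assumed to have only real roots <= 1."""
    coeffs, roots = list(coeffs), []
    while len(coeffs) > 1:
        x = 1.0
        for _ in range(200):
            value, slope = 0.0, 0.0
            for c in coeffs:             # Horner evaluation of P(x) and P'(x)
                slope = slope * x + value
                value = value * x + c
            if slope == 0.0:
                break
            step = value / slope
            x -= step
            if abs(step) < 1e-14:
                break
        roots.append(x)
        deflated = [coeffs[0]]           # synthetic division by (x - root)
        for c in coeffs[1:-1]:
            deflated.append(c + deflated[-1] * x)
        coeffs = deflated
    return roots


def csm_analyze(v, n):
    """Map autocorrelations v[0..2n-1] to (angular frequencies, intensities)."""
    S = inverse_chebyshev(2 * n - 1)
    A = [sum(s * v[j] for j, s in enumerate(S[l])) for l in range(2 * n)]   # step (2)
    hankel = [[A[k + l] for k in range(n)] for l in range(n)]
    p = solve(hankel, [-A[n + l] for l in range(n)])                        # step (3)
    xs = real_roots([1.0] + p[::-1])                                        # step (4)
    omegas = [math.acos(max(-1.0, min(1.0, x))) for x in xs]                # step (5)
    vander = [[x ** l for x in xs] for l in range(n)]
    ms = solve(vander, A[:n])                                               # step (6)
    return omegas, ms
```

A quick consistency check is to build the autocorrelations from known sinusoids, $v_l = \sum_i m_i \cos l\omega_i$, and confirm that `csm_analyze` returns the same $\omega_i$ and $m_i$. Because the largest root is found first, the frequencies come out in ascending order.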

As an efficient way of solving the Hankel matrix equation above, a method is known that starts from given initial conditions and obtains the solution recursively.

In addition, since it has been proven that the n-th degree algebraic equation above has only real roots, its roots can be found by, for example, the Newton-Raphson method.

Furthermore, as an efficient way of solving the Vandermonde matrix equation above, the matrix can be reduced to triangular form and the solution obtained successively. The analysis method described above is explained in detail in Sagayama et al., "Speech Spectrum Analysis by the Composite Sinusoidal Model," Transactions of the Institute of Electronics and Communication Engineers of Japan, February 1981, Vol. J64-A, No. 2, pp. 105-112.

Finally, the CSM quantizer 105 and the power correction quantizer 106, which are the parts that directly constitute the present invention, are described in detail with reference to the drawings. FIG. 11 is a block diagram for describing the CSM quantizer 105 and the power correction quantizer 106 in detail.

The CSM analyzer 104 supplies the set of $m_i$ and $\omega_i$ ($i = 1, 2, \dots, n$), which specify the intensity and angular frequency of each of the n CSM sine waves, to the temporary memory (1) 1051. The temporary memory (1) 1051 outputs the $m_i$ to the normalization coefficient searcher 1052 and to the CSM intensity normalizer 1053. The normalization coefficient searcher 1052 finds the normalization coefficient a and the number I of the frequency component with the maximum intensity by the following procedure.

(1) Set the initial state $a = m_1$, $I = 1$.

(2) Compare a with $m_2$. If $a \ge m_2$, go to (4). If $a < m_2$, go to (3).

(3) Set $a = m_2$, $I = 2$.

(4) Compare a with $m_3$ and process it in the same way as in (2) above.

(5) Thereafter, process $m_4, \dots, m_n$ in the same way as in (4).
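The search procedure in steps (1) to (5) amounts to a single pass that keeps the running maximum and its 1-based position. A minimal sketch, with a hypothetical function name:

```python
def find_normalization(m):
    """Return (a, I): the largest intensity and its 1-based component number."""
    a, index = m[0], 1                 # step (1): a = m_1, I = 1
    for i in range(1, len(m)):         # steps (2)-(5): compare a with m_2, m_3, ...
        if a < m[i]:                   # replace only on strict inequality
            a, index = m[i], i + 1
    return a, index
```

Because the replacement happens only when `a < m[i]`, a tie keeps the lower-numbered component, as in step (2).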

The normalization coefficient searcher 1052 outputs the coefficient a thus found to the power corrector 1061 and the CSM intensity normalizer 1053, and outputs I to the CSM intensity quantizer 1054. Using the normalization coefficient a, the CSM intensity normalizer 1053 computes $m'_i = m_i / a$ ($i = 1, 2, \dots, n$) from the $m_i$ supplied by the temporary memory (1) 1051. It then takes the square roots $\sqrt{m'_i}$ ($i = 1, 2, \dots, n$) and outputs them to the CSM intensity quantizer 1054. Using the number I supplied by the normalization coefficient searcher 1052 and the values $\sqrt{m'_i}$ ($i = 1, 2, \dots, n$), the CSM intensity quantizer 1054 performs linear quantization with a bit allocation of, for example, the form shown in FIG. 12, and outputs the quantized data to the temporary memory (2) 1056.

The quantization format is described next with reference to FIG. 12. FIG. 12a shows the bit allocation for quantizing, with 16 bits, the CSM intensities $m_1, m_2, \dots, m_5$ obtained by 9th-order CSM analysis (corresponding to n = 5). The strongest CSM intensity is designated according to the number I. When I is 1, the leftmost bit in the figure is set to "0", as shown in FIG. 12b-a; when I is 2, 3, 4, or 5, the leftmost bit is set to "1", as shown in FIG. 12b-b through 12b-e.

Examination of the distribution of CSM intensities shows that $m_1$ is very often the strongest. FIG. 13 shows the distribution of the CSM intensities $m_1, m_2, \dots, m_5$ obtained by 9th-order CSM analysis (n = 5) after normalization by the computed coefficient a; the entry labeled "number of frames" in the figure is the number of frames in which that intensity was the strongest. The total number of analysis frames is 6963. The proportion of frames in which $m_1$ is the strongest is thus 5895/6963 = 0.847, and, as shown in FIG. 12b, the format is arranged so that I is designated with the fewest bits when $m_1$ is the strongest (I = 1).
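The benefit of giving I = 1 the shortest code can be estimated with a small calculation. The text only states the leftmost bit for each case; the sketch below additionally assumes that the cases I = 2 to 5 need two further bits to be told apart (three bits in total), which is an assumption about FIG. 12 and not something stated in the text.

```python
def expected_index_bits(p_first, short_bits=1, long_bits=3):
    """Average number of bits spent on I when I = 1 occurs with probability p_first."""
    return p_first * short_bits + (1.0 - p_first) * long_bits


average = expected_index_bits(5895 / 6963)   # share of frames with m_1 strongest
```

Under that assumption the average comes to roughly 1.3 bits per frame, compared with the 3 bits that a fixed-length code for five alternatives would need.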

Since the strongest CSM intensity is normalized by itself, it is always 1.0 and need not be transmitted.
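The normalization, square root, and linear quantization can be sketched as follows. The per-component word length is not recoverable from the text (it is given only in FIG. 12), so `bits=4` is a placeholder assumption, and the function name is hypothetical. The strongest component is skipped because its normalized value is always 1.0.

```python
import math


def quantize_intensities(m, bits=4):
    """Return (a, I, codes) for the intensities m; codes omit the strongest component."""
    a = max(m)
    index = m.index(a) + 1                     # 1-based number I of the strongest component
    levels = 2 ** bits - 1
    codes = []
    for i, value in enumerate(m, start=1):
        if i == index:
            continue                           # normalized to 1.0, so not transmitted
        amplitude = math.sqrt(value / a)       # sqrt(m'_i), in the range 0..1
        codes.append(round(amplitude * levels))    # uniform (linear) quantization
    return a, index, codes
```

For example, `quantize_intensities([9.0, 4.0, 1.0])` normalizes by `a = 9.0`, giving amplitudes of about 0.667 and 0.333 for the two remaining components, which map to codes 10 and 5 on a 4-bit scale.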

Returning to FIG. 11, the CSM intensity parameters thus quantized are output to the temporary memory (2) 1056. The CSM frequency quantizer 1055 receives from the temporary memory (1) 1051 the set of $\omega_i$ ($i = 1, 2, \dots, n$) specifying the angular frequencies of the n CSM sine waves, performs linear quantization that takes into account the previously surveyed distribution range of each $\omega_i$, and outputs the quantized data to the temporary memory (2) 1056. The temporary memory (2) 1056 outputs the quantized CSM intensity data and CSM angular frequency data to the multiplexer 109. The power corrector 1061 multiplies the power data supplied by the autocorrelation coefficient measuring device 103 by the coefficient a supplied by the normalization coefficient searcher, and outputs the result to the power quantizer 1062. The power quantizer 1062 raises this result to the 1/2 power to convert it to amplitude information, applies nonlinear quantization such as that used in μ255 PCM, and outputs the result to the multiplexer 109.
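The power path can be sketched as below. It uses the continuous μ-law compression curve with μ = 255 followed by uniform rounding, not the segmented encoder of any particular PCM standard; the word length `bits`, the `full_scale` amplitude, and the function name are assumptions made for illustration.

```python
import math


def quantize_power(power, a, bits=8, mu=255.0, full_scale=1.0):
    """Correct the frame power by a, convert to amplitude, and compress with mu-law."""
    amplitude = math.sqrt(power * a)                    # power corrector, then 1/2 power
    x = min(amplitude / full_scale, 1.0)                # clip to the assumed full scale
    compressed = math.log1p(mu * x) / math.log1p(mu)    # mu-law curve, 0..1
    return round(compressed * (2 ** bits - 1))
```

The logarithmic curve spends more code values on small amplitudes, which is the usual reason for choosing nonlinear quantization for a quantity with a wide dynamic range.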

Denormalization on the synthesis side is performed automatically by the multiplier 211.

(Effects of the Invention) As described above, the present invention quantizes the CSM intensity parameters with their mutual relationships taken into account, which has the effect of increasing quantization efficiency.

[Brief explanation of drawings]

FIG. 1 shows an example of a speech feature vector pattern based on CSM parameters. FIG. 2 shows an example of the correspondence between the CSM line spectrum and the LPC spectral envelope obtained from the same speech sample. FIG. 3A shows the spread CSM spectral envelope and the pitch fine structure, and FIG. 3B shows the CSM spectrum obtained by simple addition alone. FIG. 4 is a block diagram of an embodiment of an analysis-synthesis system including the present invention. FIG. 5A shows the functional form of the variable-length window function, and FIG. 5B shows the relative timing between the variable-length window function and the phase-reset pulses. FIG. 6 shows a circuit example of the variable-frequency oscillator with phase-reset function. FIG. 7 shows a circuit example of the variable-gain amplifier. FIG. 8 shows a circuit example of the random number generator. FIG. 9A is a block diagram of the period calculator, and FIG. 9B shows the distribution of the random numbers output by the period calculator. FIG. 10 is a block diagram of an example of the variable-length window generator. FIG. 11 is a block diagram for describing in detail the parts that directly constitute the present invention. FIG. 12 shows an example of the quantization format. FIG. 13 shows an example of the distribution of CSM intensities.

In the figures: 1, transmitting side; 2, receiving side; 101, A/D converter; 102, Hamming window processor; 103, autocorrelation coefficient measuring device; 104, CSM analyzer; 105, CSM quantizer; 106, power quantizer; 107, pitch extractor; 108, voiced/unvoiced decision unit; 109, multiplexer; 201, demultiplexer and decoder; 202, interpolator; 203, voiced/unvoiced switch; 204, period calculator; 205, random number generator; 206-1 to 206-n, variable-frequency oscillators with phase-reset function; 207-1 to 207-n, variable-gain amplifiers; 208, summing synthesizer; 209, variable-length window function generator; 210, 211, multipliers; 1051, temporary memory (1); 1052, normalization coefficient searcher; 1053, CSM intensity normalizer; 1054, CSM intensity quantizer; 1055, CSM frequency quantizer; 1056, temporary memory (2); 1061, power corrector; 1062, power quantizer.

Claims

[Claims]

1. A CSM parameter quantization method for quantizing CSM parameters that express the spectral envelope of speech as a set of a predetermined number n of sine waves with free frequencies and amplitudes, characterized by using means for normalizing the set of amplitudes $\{m_1, m_2, \dots, m_n\}$ of the sine waves by a normalization coefficient a expressed as $a = \max\{m_1, m_2, \dots, m_n\}$.

2. A CSM parameter quantizer for expressing, with a small number of bits, CSM parameters that express the spectral envelope of speech as a set of a predetermined number n of sine waves with free frequencies and amplitudes, comprising: means for searching the set of amplitudes $\{m_1, m_2, \dots, m_n\}$ of the sine waves for its maximum value $a = \max\{m_1, m_2, \dots, m_n\}$; and means for normalizing the set of amplitudes by the maximum value a found by said means.