JP2581050B2

JP2581050B2 - Voice analysis and synthesis device

Info

Publication number: JP2581050B2
Application number: JP61284282A
Authority: JP
Inventors: 智安永
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1986-12-01
Filing date: 1986-12-01
Publication date: 1997-02-12
Anticipated expiration: 2012-02-12
Also published as: JPS63138400A

Description

[Detailed description of the invention]

[Industrial field of application]

The present invention relates to a speech analysis-synthesis apparatus that analyzes and encodes speech and then decodes and synthesizes it. In particular, it relates to an apparatus that analyzes and synthesizes speech as phonemic information, such as linear prediction coefficients and PARCOR coefficients, and sound source (excitation) information, such as a residual signal.

[Conventional technology]

Conventionally, when speech is transmitted over a digital transmission system, coding schemes such as multipulse-excited linear predictive coding (MPEC), the residual-excited vocoder (RELP), and adaptive predictive coding (APC) have been used to compress the amount of information or to provide privacy. These schemes analyze the speech at fixed time intervals into phonemic information (spectrum information), such as linear prediction coefficients and PARCOR coefficients, and sound source information, such as a residual signal and multipulse information, and transmit both after quantization.

[Problems to be solved by the invention]

The coding schemes described above allocate quantization bits to the phonemic information and the sound source information so that the analysis results can be quantized within a limited transmission rate (bit rate). To quantize each kind of information more efficiently, the quantization method is determined statistically.

For example, the K parameters (PARCOR coefficients) used as phonemic information basically have an absolute value of 1 or less, but the distribution of the values they actually take is determined by the characteristics of the input speech.
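As an illustration of why the K parameters stay within an absolute value of 1, the following is a minimal sketch of the Levinson-Durbin recursion, a standard way of deriving PARCOR (reflection) coefficients from a frame's autocorrelation. The patent does not say which analysis method the apparatus uses, so the method, the function names `autocorr` and `parcor`, the test signal, and the order of 10 are all assumptions; sign conventions for the coefficients also vary between texts.

```python
import math
import random

def autocorr(x, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def parcor(r, order):
    """Levinson-Durbin recursion; returns k_1..k_order (one sign convention)."""
    a = [1.0] + [0.0] * order      # prediction-error filter coefficients
    err = r[0]                     # prediction error energy
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)       # error energy shrinks at every order
    return ks

# Hypothetical test frame: a sinusoid plus a little deterministic noise.
rng = random.Random(0)
frame = [math.sin(0.3 * n) + 0.1 * rng.uniform(-1.0, 1.0) for n in range(240)]
ks = parcor(autocorr(frame, 10), 10)
print([round(k, 3) for k in ks])
```

Because the error energy is multiplied by (1 - k^2) at each order and must remain positive, every coefficient produced this way has a magnitude below 1, which is the bound the text relies on.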

FIG. 3 shows the K parameter distribution of voiced sounds. FIG. 4 shows the K parameter distribution of unvoiced sounds. Conventionally, quantization was performed after defining a single distribution range that covers both of these distributions. However, as is clear from FIG. 4, the K parameter distribution of unvoiced sounds occupies a narrow range, so quantizing unvoiced frames over the overall distribution range has the drawback of very high redundancy.
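The redundancy can be made concrete with a small calculation. The sketch below counts the bits a uniform quantizer needs to cover a range at a fixed step size; the ranges and the step are hypothetical, since the patent gives no numeric values for FIG. 3 or FIG. 4, and a uniform quantizer is itself an assumption.

```python
import math

def bits_needed(lo, hi, step):
    """Bits required for a uniform quantizer covering [lo, hi] at the given step."""
    levels = math.ceil((hi - lo) / step)
    return max(1, math.ceil(math.log2(levels)))

STEP = 0.0625  # hypothetical step size, i.e. the accuracy to be preserved

# Overall range covering voiced and unvoiced frames alike (hypothetical).
full_bits = bits_needed(-1.0, 1.0, STEP)
# Narrow range that unvoiced frames actually occupy (hypothetical).
narrow_bits = bits_needed(-0.25, 0.25, STEP)

print(full_bits, narrow_bits)
```

With these assumed numbers the narrow range needs two fewer bits per coefficient at the same step size, so codes spent on the unused part of the overall range are wasted on unvoiced frames.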

In view of the above drawback, an object of the present invention is to provide a speech analysis-synthesis apparatus that reduces the number of bits allocated to the K parameters without degrading quantization accuracy, by individually considering the distribution ranges of the K parameters, from the first order to higher orders, for voiced sounds and for unvoiced sounds.

[Means for solving the problem]

According to the present invention, there is provided a speech analysis-synthesis apparatus comprising an analysis unit having means for extracting sound source information of input speech, spectrum extraction means for extracting spectrum information of the input speech, and means for quantizing the sound source information and the spectrum information, and a synthesis unit having means for receiving and inversely quantizing the quantized sound source information and spectrum information. The apparatus is characterized in that the spectrum extraction means of the analysis unit extracts, as the spectrum information, K parameters ranging from the first order to higher orders; that the analysis unit further comprises detection means for detecting whether the input speech is voiced or unvoiced, a quantization range control circuit that selects a quantization range for the K parameters according to the detection result of the detection means, means for quantizing the sound source information and the K parameters according to the selection result of the quantization range control circuit, and means for transmitting the quantized sound source information and K parameters together with the selection result of the quantization range control circuit; and that the synthesis unit comprises means for selecting an inverse quantization range based on the selection result.

[Embodiment]

Next, an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 shows the analysis unit of one embodiment of the present invention.

Speech input from the speech input terminal 1 is supplied to the spectrum information analyzer 2, the voiced/unvoiced detector 3, and the sound source information analyzer 4. The output of the spectrum information analyzer 2 is connected to the spectrum quantizer 6, and the output of the sound source information analyzer 4 is connected to the sound source quantizer 7. The output of the voiced/unvoiced detector 3 is input to the quantization range control circuit 5, whose output is supplied as a quantization range control signal to the sound source quantizer 7, the spectrum quantizer 6, and the multiplexing circuit 8. The multiplexing circuit 8 multiplexes the spectrum information, the sound source information, and the quantization range control signal, and outputs the result to the signal output terminal 9.
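The signal flow of the analysis unit can be sketched as follows. This is only an illustration under stated assumptions: the per-order range tables, the bit counts, the 48-bit frame budget, and the names `quantize` and `encode_frame` are hypothetical, and the voiced/unvoiced decision is taken as an input rather than detected. The tables are chosen so that each order keeps the same step size in both modes, which is one way of reducing bits without degrading accuracy.

```python
# Hypothetical per-order (lo, hi, bits) tables; the patent gives no numbers.
VOICED_TABLE = [(-1.0, 1.0, 6), (-1.0, 1.0, 5), (-1.0, 1.0, 4), (-1.0, 1.0, 4)]
UNVOICED_TABLE = [(-0.5, 0.5, 5), (-0.5, 0.5, 4), (-0.25, 0.25, 2), (-0.25, 0.25, 2)]
FRAME_BITS = 48  # hypothetical total number of bits per frame

def quantize(value, lo, hi, bits):
    """Uniform quantizer: clamp to [lo, hi] and return the cell index."""
    levels = 1 << bits
    step = (hi - lo) / levels
    clamped = min(max(value, lo), hi)
    return min(int((clamped - lo) / step), levels - 1)

def encode_frame(k_params, voiced):
    """Select the range table from the V/UV decision, then quantize and pack."""
    table = VOICED_TABLE if voiced else UNVOICED_TABLE   # range control (circuit 5)
    k_codes = [quantize(k, lo, hi, bits)                 # spectrum quantizer 6
               for k, (lo, hi, bits) in zip(k_params, table)]
    k_bits = sum(bits for _, _, bits in table)
    return {
        "voiced": voiced,        # control signal sent along with the data (1 bit)
        "k_codes": k_codes,
        # Bits left over for the sound source quantizer 7.
        "excitation_bits": FRAME_BITS - 1 - k_bits,
    }

unvoiced_frame = encode_frame([0.1, -0.2, 0.05, 0.0], voiced=False)
voiced_frame = encode_frame([0.8, -0.6, 0.3, -0.1], voiced=True)
print(unvoiced_frame)
print(voiced_frame)
```

In this sketch an unvoiced frame spends fewer bits on the K parameters, and the difference shows up as a larger `excitation_bits` budget for the sound source information.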

FIG. 2 shows the synthesis unit of one embodiment of the present invention.

The information input from the signal input terminal 10 is separated by the demultiplexing circuit 11 and then output to the spectrum inverse quantizer 12, the inverse quantization range control circuit 13, and the sound source inverse quantizer 14. The output of the inverse quantization range control circuit 13 is input to the spectrum inverse quantizer 12 and the sound source inverse quantizer 14. The inversely quantized spectrum information and sound source information are input to the synthesis filter 15, and the output of the synthesis filter 15 is delivered as synthesized speech to the speech output terminal 16.
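The role of the inverse quantization range control on the synthesis side can be sketched in the same spirit. The tables, bit counts, and the names `dequantize` and `decode_k_params` are hypothetical; the sketch covers only the K parameters and does not model the sound source inverse quantizer 14 or the synthesis filter 15.

```python
# Hypothetical per-order (lo, hi, bits) tables, assumed identical to those
# used by the analysis unit; the patent gives no numeric values.
VOICED_TABLE = [(-1.0, 1.0, 6), (-1.0, 1.0, 5), (-1.0, 1.0, 4), (-1.0, 1.0, 4)]
UNVOICED_TABLE = [(-0.5, 0.5, 5), (-0.5, 0.5, 4), (-0.25, 0.25, 2), (-0.25, 0.25, 2)]

def dequantize(code, lo, hi, bits):
    """Map a cell index back to the midpoint of its quantization cell."""
    step = (hi - lo) / (1 << bits)
    return lo + (code + 0.5) * step

def decode_k_params(voiced, k_codes):
    """Pick the inverse quantization range from the received selection result."""
    table = VOICED_TABLE if voiced else UNVOICED_TABLE
    return [dequantize(code, lo, hi, bits)
            for code, (lo, hi, bits) in zip(k_codes, table)]

# Hypothetical received frame: unvoiced flag plus four K parameter codes.
decoded = decode_k_params(False, [19, 4, 2, 2])
print(decoded)
```

The decoder can only interpret the codes correctly because the selection result travels with them; the same code word maps to a different K value depending on which table is selected.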

[Effect of the invention]

As described above, the present invention detects whether the input speech is voiced or unvoiced and restricts the quantization range of the spectrum information when it is unvoiced, so that the number of allocated bits can be reduced without degrading quantization accuracy. The bits saved in this way can therefore be used for the sound source information and the like, which has the effect of improving speech quality.

[Brief description of the drawings]

FIG. 1 is a block diagram of the analysis unit of the speech analysis-synthesis apparatus of the present invention, FIG. 2 is a block diagram of the synthesis unit of the speech analysis-synthesis apparatus of the present invention, FIG. 3 is a distribution diagram showing the K parameter distribution of voiced sounds, and FIG. 4 is a distribution diagram showing the K parameter distribution of unvoiced sounds.

1 ... speech input terminal
2 ... spectrum information analyzer
3 ... voiced/unvoiced detector
4 ... sound source information analyzer
5 ... quantization range control circuit
6 ... spectrum quantizer
7 ... sound source quantizer
8 ... multiplexing circuit
9 ... signal output terminal
10 ... signal input terminal
11 ... demultiplexing circuit
12 ... spectrum inverse quantizer
13 ... inverse quantization range control circuit
14 ... sound source inverse quantizer
15 ... synthesis filter
16 ... speech output terminal

Claims

(57) [Claims]

A speech analysis-synthesis apparatus comprising an analysis unit having means for extracting sound source information of input speech, spectrum extraction means for extracting spectrum information of the input speech, and means for quantizing the sound source information and the spectrum information, and a synthesis unit having means for receiving and inversely quantizing the quantized sound source information and spectrum information, characterized in that: the spectrum extraction means of the analysis unit extracts, as the spectrum information, K parameters ranging from the first order to higher orders; the analysis unit further comprises detection means for detecting whether the input speech is voiced or unvoiced, a quantization range control circuit that selects a quantization range for the K parameters according to the detection result of the detection means, means for quantizing the sound source information and the K parameters according to the selection result of the quantization range control circuit, and means for transmitting the quantized sound source information and K parameters together with the selection result of the quantization range control circuit; and the synthesis unit comprises means for selecting an inverse quantization range based on the selection result.