JPS60113300A

JPS60113300A - Voice synthesization system

Info

Publication number: JPS60113300A
Application number: JP58220931A
Authority: JP
Inventors: 磯崎　智明
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-11-24
Filing date: 1983-11-24
Publication date: 1985-06-19

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a speech synthesis method, and in particular to a waveform synthesis method.

One speech synthesis method is the waveform segment synthesis method.

This method exploits the strong similarity between adjacent speech waveforms. In waveform segment synthesis, when similar speech waveforms appear consecutively, one of them is selected as a representative segment, and this selected waveform is used repeatedly in order to compress the amount of speech signal data. Waveform coding techniques such as the PCM method and the ADPCM (adaptive differential PCM) method can be used to synthesize the waveform of the selected representative segment. In particular, synthesizing a waveform segment with the ADPCM method requires only about 1/2 the data needed with the PCM method, so it is a very effective means of data compression.
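As a rough illustration of the data reduction described above, the following Python sketch stores one representative segment together with a repeat count instead of every similar pitch waveform. The function name `synthesize_from_segments`, the sample values, and the 8-bit PCM and 4-bit ADPCM word sizes are assumptions chosen for illustration; the source states only the approximate 1/2 ratio.

```python
def synthesize_from_segments(segments):
    """Rebuild a waveform from (representative_segment, repeat_count) pairs."""
    out = []
    for samples, repeats in segments:
        out.extend(samples * repeats)  # reuse the same segment `repeats` times
    return out


# Hypothetical data: one 8-sample representative segment reused 4 times.
segment = [0, 30, 50, 30, 0, -30, -50, -30]
waveform = synthesize_from_segments([(segment, 4)])

stored_samples = len(segment)
output_samples = len(waveform)

# Assumed word sizes (not from the source): 8-bit PCM versus 4-bit ADPCM codes.
pcm_bits = stored_samples * 8
adpcm_bits = stored_samples * 4
print(output_samples, stored_samples, adpcm_bits / pcm_bits)
```

Only the representative segment is stored, so the repetition and the shorter ADPCM code words reduce the data amount independently of each other.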

The present invention relates to a waveform segment synthesis method in which representative waveform segments are synthesized using this ADPCM method and speech is synthesized by using these representative waveform segments repeatedly.

The ADPCM method samples speech at the Nyquist frequency, then quantizes and encodes the difference in amplitude of the speech waveform between adjacent sampling points using an appropriate quantization width. The quantization width is changed adaptively according to the magnitude of the difference value at each sampling point. Therefore, when a representative waveform segment is synthesized with the ADPCM method, the quantization width is not constant within one waveform, and the quantization width at each sampling point depends on the immediately preceding ADPCM data. In the ADPCM method, a sampling frequency of fs = 4 kHz to 8 kHz is normally used, chosen from the relationship between sound quality and bit rate.
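The adaptive behaviour described above can be sketched in Python as follows. This is a minimal toy codec, not the coder used in the patent and not any standardized ADPCM variant: the adaptation rule in `adapt_step`, the code range of -3 to 3, and the initial quantization width (step) of 4 are all assumptions made for illustration.

```python
def adapt_step(step, code, min_step=1, max_step=1024):
    """Assumed adaptation rule: widen after large codes, narrow after small ones."""
    magnitude = abs(code)
    if magnitude >= 3:
        step *= 2
    elif magnitude <= 1:
        step //= 2
    return max(min_step, min(max_step, step))


def adpcm_encode(samples, step=4, max_code=3):
    """Quantize the difference between each sample and the running reconstruction."""
    codes = []
    predicted = 0
    for x in samples:
        diff = x - predicted
        code = max(-max_code, min(max_code, round(diff / step)))
        codes.append(code)
        predicted += code * step       # track what the decoder will reconstruct
        step = adapt_step(step, code)  # next width depends on the code just sent
    return codes


def adpcm_decode(codes, step=4):
    """Rebuild amplitudes by adding (code * quantization width) to the previous value."""
    out = []
    value = 0
    for code in codes:
        value += code * step
        out.append(value)
        step = adapt_step(step, code)
    return out


codes = adpcm_encode([4, 10, 6])
print(codes, adpcm_decode(codes))
```

Because the decoder repeats the same adaptation as the encoder, the quantization width never has to be transmitted, but each reconstructed amplitude depends on every code that came before it.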

When a representative waveform segment is synthesized with the ADPCM method described above, the pitch period of the actual original waveform and that of the synthesized waveform cannot be made to match exactly. If the sampling period is Ts, the pitch period Tp' of a waveform that can be synthesized is Tp' = Ts × N (where N is the number of sampling points), so Tp' can take only integer multiples of Ts, which produces an error of up to 1/2 Ts. For example, when a waveform with pitch period Tp as shown in FIG. 1 is sampled with sampling period Ts and synthesized, this pitch period error places the end of the original waveform between sampling points Sn and Sn+1. Therefore, whether Sn or Sn+1 is taken as the last sampling point of the synthesized waveform, the final amplitude of the synthesized waveform cannot be made 0. Moreover, even when the pitch period of the original sound and that of the synthesized waveform match exactly, the amplitude at the last sampling point is the amplitude at the immediately preceding sampling point plus a difference value given by the product of the quantization width and the ADPCM code. Depending on the quantization width at the immediately preceding sampling point, it may therefore be impossible to make the amplitude at the last sampling point of one waveform 0.
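The pitch period mismatch can be checked with a short calculation. The Python sketch below assumes nearest-integer rounding when choosing the number of sampling points N, in which case the error is at most half a sampling period. The function name and the numeric values (an 8 kHz sampling rate and a 7300 µs pitch period) are hypothetical.

```python
def synthesized_pitch_period(pitch_period_us, sampling_period_us):
    """Return (N, Tp') where Tp' = Ts * N is the closest representable pitch period."""
    n = round(pitch_period_us / sampling_period_us)
    return n, n * sampling_period_us


ts_us = 125    # Ts for an assumed sampling frequency of 8 kHz
tp_us = 7300   # hypothetical original pitch period Tp, in microseconds

n, tp_synth_us = synthesized_pitch_period(tp_us, ts_us)
error_us = abs(tp_us - tp_synth_us)
print(n, tp_synth_us, error_us)
```

Here the original waveform ends between the 58th and 59th sampling points, so neither choice of last sample coincides with the true end of the pitch period.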

Therefore, when a representative waveform segment is synthesized with the conventional ADPCM method as is, the amplitude at the last sampling point of the segment cannot be made 0. If such a representative waveform segment is repeated multiple times, the amplitude center of the synthesized waveform shifts, as shown in FIG. 2.

This is because the ADPCM method is basically a differential coding method, in which a difference value is added to the amplitude at each sampling point to obtain the amplitude at the next sampling point. Unless the final amplitude of one waveform is 0, the error (ΔE in FIG. 1) accumulates and the amplitude center of the synthesized waveform shifts.
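The accumulation of the residual ΔE can be reproduced with a few lines of Python. As a simplification, this sketch works on already dequantized differences and ignores quantization width adaptation; the function name and the difference values are hypothetical.

```python
def repeat_differential_segment(diffs, repeats, start=0):
    """Decode the same list of per-sample differences `repeats` times in a row."""
    out = []
    value = start
    for _ in range(repeats):
        for d in diffs:
            value += d  # differential decoding: add each difference to the last value
            out.append(value)
    return out


# Hypothetical segment whose differences sum to +2, i.e. residual ΔE = 2.
diffs = [10, 10, -10, -10, -10, 12]
delta_e = sum(diffs)
wave = repeat_differential_segment(diffs, repeats=3)

# Amplitude at the last sampling point of each repetition.
end_values = wave[len(diffs) - 1 :: len(diffs)]
print(delta_e, end_values)
```

Each repetition starts from the previous segment's final value, so after k repetitions the whole segment is offset by k × ΔE.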

If the amplitude center of the synthesized waveform shifts, the D/A converter that converts the waveform data into sound may overflow by the amount of the shift. In that case noise enters the speech, which is a serious drawback.

The object of the present invention is to provide a speech synthesis method that can synthesize speech without any change in the amplitude center of the synthesized waveform, even when speech is synthesized by repeatedly using a representative waveform segment synthesized with the ADPCM method.

An embodiment of the present invention is described below with reference to FIG. 3 of the drawings. In this synthesis method, the amplitude at the last sampling point is not obtained by adding a difference value to the amplitude at the immediately preceding sampling point. Instead, a means is used that forcibly sets it to 0, regardless of the amplitude at the immediately preceding sampling point. Therefore, even when a waveform segment synthesized by this method is repeated, the amplitude center of the synthesized waveform does not change at all. That is, when the waveform shown in FIG. 3 is synthesized by the conventional method, the amplitude at the last sampling point lies at the position indicated in the figure. With the present method, the amplitude at the last sampling point of one waveform is set to 0 regardless of Sn and Sn+1, so the final sampling point of one waveform becomes the zero-amplitude point shown in the figure. In this case an error ΔT arises between the pitch period of the original waveform and that of the synthesized waveform, but it can be reduced by choosing an appropriate value for the sampling period Ts. The circuit that sets the amplitude at the last sampling point to 0 can be realized with a circuit that detects the final sampling point (for example, a counter) and a circuit that sets "0" as the data at that time.
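A software analogue of the counter and zero-setting circuits might look like the Python sketch below. It is only an illustration under simplifying assumptions: the differences are already dequantized, quantization width adaptation is omitted, and the function name and sample values are hypothetical.

```python
def decode_with_forced_zero(diffs, repeats):
    """Differential decoding in which the last sample of every segment is forced to 0."""
    segment_length = len(diffs) + 1  # the last sample carries no stored difference
    out = []
    value = 0
    for _ in range(repeats):
        for counter in range(segment_length):
            if counter == segment_length - 1:
                value = 0                # final sampling point detected: set "0"
            else:
                value += diffs[counter]  # ordinary differential decoding
            out.append(value)
    return out


# Hypothetical segment that would end at -10 without the forced zero.
diffs = [10, 10, -10, -10, -10]
wave = decode_with_forced_zero(diffs, repeats=3)
print(wave[:6], wave[5::6])
```

Every repetition begins from 0, so the segments are identical and no offset builds up. The final sample also needs no stored difference, which is why `diffs` is one entry shorter than the segment.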

As explained above, according to the present invention, a speech waveform whose amplitude center does not change can be realized easily, even when a representative waveform segment is created with the ADPCM method and repeated to synthesize speech. In addition, because no data is needed to output the last sampling point of the representative waveform segment used for repetition, the bit rate during synthesis is improved by that amount.

[Brief explanation of drawings]

FIG. 1 is a waveform diagram for explaining the conventional speech synthesis method, FIG. 2 is a waveform diagram showing a waveform synthesized by the conventional speech synthesis method, and FIG. 3 is a waveform diagram for explaining the speech synthesis method of the present invention.

Claims

[Claims]

A speech synthesis method in which representative waveform segments are synthesized using an ADPCM method, characterized in that the amplitude value at the last sampling point of the synthesized waveform is forcibly set to 0.