JP2572961B2

JP2572961B2 - Voice synthesis method

Info

Publication number: JP2572961B2
Application number: JP60160702A
Authority: JP
Inventors: 徹北村
Original assignee: Sanyo Denki Co Ltd
Current assignee: Sanyo Denki Co Ltd
Priority date: 1985-07-19
Filing date: 1985-07-19
Publication date: 1997-01-16
Anticipated expiration: 2012-01-16
Also published as: JPS6221200A

Description

[Detailed description of the invention] (A) Field of industrial application. The present invention relates to a speech synthesizer, and in particular to a circuit for generating an unvoiced sound source.

(B) Prior art. Various speech synthesis methods have been proposed, as surveyed in the Nikkei Electronics article "Comparing the various speech synthesis methods that have become familiar" (issue of February 4, 1980). They are broadly divided into two classes: waveform processing methods, which compress the information of the speech waveform while keeping its form, and generation source methods, which extract the fundamental period of the sound source (called the pitch) together with parameters representing the transfer characteristic of the vocal tract, and synthesize speech from a sound source and those parameters. In recent years the generation source method has been becoming mainstream because of its high information compression ratio and the good quality of the reproduced speech. FIG. 3 outlines a speech synthesizer of the generation source type. In the figure, (1) is a voiced sound source generation unit, which generates a sound source signal P corresponding to the pitch amplitude. (2) is an unvoiced sound source generation unit, which generates a sound source signal N corresponding to the breath flow that can be regarded as white noise when the vocal cords are not vibrating. (3) is a switch that selects between the sound source signals P and N from the two sound source generation units (1) and (2) according to the voiced/unvoiced decision information D. (4) is a digital filter unit corresponding to the transfer characteristic of the vocal tract; it receives the sound source signal N or P from the switch (3) and generates and outputs the synthesized speech V.
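The signal path of FIG. 3 described above can be sketched as follows. This is only an illustrative sketch under assumptions that are not in the patent: the function names, the unit pulse train used for P, the library random generator used as a stand-in for N, and the direct-form all-pole filter used for the digital filter unit (4) are all hypothetical choices.

```python
import random


def voiced_source(n, pitch_period):
    """Signal P from unit (1): a unit pulse once per pitch period (assumed shape)."""
    return 1.0 if n % pitch_period == 0 else 0.0


def unvoiced_source(rng):
    """Signal N from unit (2): stand-in white noise in [-1, 1] (assumed source)."""
    return rng.uniform(-1.0, 1.0)


def synthesize(decisions, pitch_period, a_coeffs, seed=0):
    """Produce synthesized speech V, one output sample per entry of `decisions`.

    decisions: per-sample voiced/unvoiced decision D (True means voiced).
    a_coeffs:  feedback coefficients of an assumed all-pole filter, unit (4).
    """
    rng = random.Random(seed)
    history = [0.0] * len(a_coeffs)  # past outputs v[n-1], v[n-2], ...
    out = []
    for n, voiced in enumerate(decisions):
        # Switch (3): choose P or N according to D.
        e = voiced_source(n, pitch_period) if voiced else unvoiced_source(rng)
        # Digital filter (4): v[n] = e[n] + sum_k a_k * v[n-k]
        v = e + sum(a * h for a, h in zip(a_coeffs, history))
        history = [v] + history[:-1]
        out.append(v)
    return out


# Four voiced samples, pitch period 2, one-pole filter with coefficient 0.5.
print(synthesize([True] * 4, 2, [0.5]))  # [1.0, 0.5, 1.25, 0.625]
```

In a real synthesizer the filter coefficients and the decision D would change from frame to frame; they are held fixed here to keep the sketch short.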

As the unvoiced sound generation unit (2) in such a speech synthesizer, an M-sequence (maximum-length sequence) feedback counter has conventionally been used, and this counter usually has 10 bits or more.
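The periodicity of such a counter can be illustrated with a small sketch. The patent does not give the feedback taps, so the polynomial x^10 + x^3 + 1 (a commonly tabulated maximal-length choice for 10 bits), the seed, and the mapping of bits to amplitudes of +1 and -1 are assumptions made here for illustration.

```python
NBITS = 10  # counter width mentioned in the text


def lfsr_step(state):
    """Advance a 10-bit feedback shift register by one step.

    Feedback is bit 0 XOR bit 3 (assumed taps, polynomial x^10 + x^3 + 1).
    """
    fb = (state ^ (state >> 3)) & 1
    return (state >> 1) | (fb << (NBITS - 1))


def lfsr_period(seed=1):
    """Count the steps until the register returns to its starting state."""
    state, count = lfsr_step(seed), 1
    while state != seed:
        state = lfsr_step(state)
        count += 1
    return count


def lfsr_noise(count, seed=1):
    """Map the output bit to an amplitude of +1.0 or -1.0 (assumed mapping)."""
    state, out = seed, []
    for _ in range(count):
        out.append(1.0 if state & 1 else -1.0)
        state = lfsr_step(state)
    return out


print(lfsr_period())  # 1023, that is 2**10 - 1
```

The output repeats exactly every 1023 samples. At a sampling rate of 8 kHz (an assumed figure, not given in the patent) that is about 128 ms, which is shorter than many sustained unvoiced sounds.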

(C) Problems to be solved by the invention. In a conventional speech synthesis method whose unvoiced sound generation unit is built from an M-sequence feedback counter, increasing the number of bits of the counter lengthens the period of its output, that is, the duration of the noise within one period, but perfect white noise information still cannot be obtained. Consequently, when an unvoiced sound of long duration is synthesized, the periodicity becomes apparent and degrades the quality of the synthesized speech.

(D) Means for solving the problems. In the speech synthesis method of the present invention, the unvoiced sound source generation unit does not use the conventional feedback counter. Instead, it obtains the noise information of the unvoiced sound by evaluating the recurrence formula

x_{n+1} = a x_n (1 - x_n),  n = 0, 1, 2, 3, ...;  0 < x_0 < 1;  a: constant.

(E) Operation. The above recurrence formula is a simple difference equation, proposed by May, in which the Feigenbaum bifurcation occurs (described in detail in Mathematical Science (数理科学) No. 207, September 1980). Evaluating it yields white noise information.
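As an editorial aside that is not part of the patent text, a short bound shows why the recurrence x_{n+1} = a x_n (1 - x_n) of the claim, started from 0 < x_0 < 1 with a constant 0 < a < 4, never leaves the open interval (0, 1) and can therefore be used directly as a bounded amplitude:

```latex
\max_{0 \le x \le 1} a\,x\,(1 - x) = \frac{a}{4}
\quad \text{(attained at } x = \tfrac{1}{2}\text{)},
\qquad\text{so}\qquad
0 < x_n < 1,\; 0 < a < 4
\;\Longrightarrow\;
0 < x_{n+1} = a\,x_n\,(1 - x_n) \le \frac{a}{4} < 1 .
```

For example, with a = 3.99 every term after the initial value satisfies 0 < x_n ≤ 0.9975.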

(F) Embodiment. FIG. 1 shows the functional configuration of the unvoiced sound source generation unit used in the speech synthesis method of the present invention. In the figure, (5) is a subtractor that computes (1 - x_n), (6) is a multiplier that computes a x_n, and (7) is a multiplier that computes a x_n (1 - x_n). The resulting value x_{n+1} = a x_n (1 - x_n) is held temporarily in the register (8) and is fed back at the next timing (n + 1) as the input to the subtractor (5).
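The data flow of FIG. 1 can be sketched in software as below. The class name, the default values a = 3.99 and x0 = 0.3, and the use of floating-point arithmetic are assumptions for illustration; the patent describes hardware blocks and does not specify a word length or a particular initial value.

```python
class UnvoicedSource:
    """Software sketch of units (5) to (8) of FIG. 1 (names are hypothetical)."""

    def __init__(self, a=3.99, x0=0.3):
        if not 0.0 < x0 < 1.0:
            raise ValueError("initial value must satisfy 0 < x0 < 1")
        self.a = a
        self.register = x0  # register (8) holds the current x_n

    def step(self):
        """Compute x_{n+1} = a * x_n * (1 - x_n) and store it in the register."""
        x_n = self.register
        difference = 1.0 - x_n      # subtractor (5)
        scaled = self.a * x_n       # multiplier (6)
        x_next = scaled * difference  # multiplier (7)
        self.register = x_next      # fed back at the next timing
        return x_next


source = UnvoicedSource()
noise = [source.step() for _ in range(8)]  # eight successive amplitudes
```

Each call to `step` corresponds to one timing n; the returned values form the sequence used as noise amplitude.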

Thus the unvoiced sound source generation unit of FIG. 1 evaluates the recurrence formula x_{n+1} = a x_n (1 - x_n) step by step to generate the sequence {x_n}, and this sequence is used as the amplitude information of the noise information. The initial value x_0 used in computing the sequence must satisfy 0 < x_0 < 1. As for the specific value of the constant a that makes the sequence {x_n} white noise, it has been found experimentally that values in the range 3.9 ≤ a < 4 are preferable, and the closer a is to 4 the better. FIG. 2 shows the frequency characteristic obtained by Fourier-transforming the 256 samples {x_n}, n = 0, 1, 2, ..., 255, with a = 3.99. For comparison, FIG. 4 shows the frequency characteristic of the output sample sequence of a conventional 10-bit M-sequence feedback counter under the same conditions.
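An experiment of the kind behind FIG. 2 can be approximated with the sketch below: it generates 256 samples with a = 3.99 and computes a magnitude spectrum with a direct discrete Fourier transform. The initial value x0 = 0.3, the removal of the mean before the transform, and the omission of the DC bin are assumptions of this sketch, since the patent does not state how the figure was produced. No particular numerical result is claimed here.

```python
import cmath


def logistic_sequence(a=3.99, x0=0.3, count=256):
    """Return x_0 .. x_{count-1} of x_{n+1} = a * x_n * (1 - x_n)."""
    xs, x = [], x0
    for _ in range(count):
        xs.append(x)
        x = a * x * (1.0 - x)
    return xs


def magnitude_spectrum(samples):
    """Direct DFT magnitudes for bins 1 .. N/2 - 1 after mean removal."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(c * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, c in enumerate(centered))
        spectrum.append(abs(acc))
    return spectrum


samples = logistic_sequence()
spectrum = magnitude_spectrum(samples)
# One simple flatness indicator: ratio of the largest to the smallest bin.
peak_to_floor = max(spectrum) / min(spectrum)
```

Plotting `spectrum` against the bin index, and doing the same for a 10-bit M-sequence, would give a rough software counterpart of the comparison between FIG. 2 and FIG. 4.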

Comparing these figures shows that the noise information of the unvoiced sound source obtained by the method of the present invention has a flatter spectrum than that obtained by the conventional method, and is white noise with less periodicity.

In the above description of FIG. 1, the multipliers (6) and (7) were shown separately for convenience of explanation, but a single multiplier can be used in a time-shared manner, so the present invention can be implemented with a simple configuration. Furthermore, there is no need to provide a multiplier dedicated to the unvoiced sound source generation unit; the multiplier provided in the digital filter unit can be time-shared. For example, in the PARCOR speech synthesis method, the multiplier in the filter unit normally performs about 20 multiplications in the one-sample period in which one sample of the speech signal is synthesized. To generate the unvoiced sound source signal described above, it is only necessary to add the two multiplications described above within this one-sample period.
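The multiplication budget can be checked with a small counting sketch. The filter order of 10, the all-pole lattice form with two multiplications per stage, and all names below are assumptions chosen so that the filter performs the "about 20" multiplications mentioned in the text; the patent does not specify the filter structure.

```python
class CountingMultiplier:
    """Stand-in for one hardware multiplier used in a time-shared manner."""

    def __init__(self):
        self.count = 0

    def mul(self, p, q):
        self.count += 1
        return p * q


def noise_step(mult, a, x):
    """x_{n+1} = a * x_n * (1 - x_n), using two multiplications."""
    ax = mult.mul(a, x)
    return mult.mul(ax, 1.0 - x)


def lattice_step(mult, excitation, k, b):
    """One output sample of an assumed all-pole lattice filter.

    k: reflection (PARCOR) coefficients, one per stage.
    b: delayed backward signals, length len(k) + 1, updated in place.
    """
    f = excitation
    for i in reversed(range(len(k))):
        f = f - mult.mul(k[i], b[i])
        b[i + 1] = b[i] + mult.mul(k[i], f)
    b[0] = f
    return f


mult = CountingMultiplier()
k = [0.0] * 10   # assumed 10-stage filter
b = [0.0] * 11
x = noise_step(mult, 3.99, 0.3)
v = lattice_step(mult, x, k, b)
print(mult.count)  # 22: 2 for the noise source plus 20 for the filter
```

Under these assumptions the noise source raises the per-sample load of the shared multiplier from 20 to 22 multiplications, an increase of 10 percent.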

(G) Effect of the invention. As is clear from the above description, the speech synthesis method of the present invention rests on the finding that the equation proposed by May can be adopted in the unvoiced sound source generation unit for speech synthesis. As a result, the unvoiced sound source signal can be made a white noise signal without periodicity, the quality of unvoiced sounds can be greatly improved, and more natural synthesized speech is obtained. Moreover, using the method of the present invention simplifies the configuration of the unvoiced sound source generation unit, so a reduction in the cost of the speech synthesizer can be expected.

[Brief description of the drawings]

FIG. 1 is a functional block diagram of the unvoiced sound source generation unit when the speech synthesis method of the present invention is used, FIG. 2 is a frequency characteristic diagram of the unvoiced sound source signal when the method of the present invention is adopted, FIG. 3 is a block diagram of a general speech synthesizer, and FIG. 4 is a frequency characteristic diagram of the unvoiced sound source signal obtained by the conventional method.
(2) ... unvoiced sound source generation unit, (5) ... subtractor, (6), (7) ... multipliers, (8) ... register.

Claims

(57) [Claims]

A speech synthesis method in which a sound source signal containing pitch information of a voiced sound from a voiced sound source generation unit and a sound source signal containing noise information of an unvoiced sound from an unvoiced sound source generation unit are input to a vocal tract characteristic filter unit, and a speech signal is reproduced and output by imparting a vocal tract characteristic to the sound source signals, characterized in that the unvoiced sound source generation unit evaluates the recurrence formula

x_{n+1} = a x_n (1 - x_n),  n = 0, 1, 2, 3, ...;  0 < x_0 < 1;  a: constant,

and the sequence {x_n} obtained by the evaluation is used as amplitude information of the noise information.