JPS5993498A

JPS5993498A - Voice synthesizer

Info

Publication number: JPS5993498A
Application number: JP57201959A
Authority: JP
Inventors: 孝一飯田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1982-11-19
Filing date: 1982-11-19
Publication date: 1984-05-29

Abstract

(57) [Abstract] This publication contains application data from before electronic filing was introduced, so no abstract data is recorded.

Description

[Detailed Description of the Invention] The present invention relates to a speech synthesizer.

In speech synthesizers based on the PARCOR method and similar schemes, a digital filter is used to realize the resonance characteristics of the vocal tract.

When this digital filter has certain frequency characteristics, the error of the digital filter is fed back, so an output signal continues to be produced even when there is no input sound source signal. In other words, the filter oscillates during silent periods, which produces an extremely unpleasant sound. The conditions under which this oscillation occurs depend on the filter constants and on the past sound source input, so they are very difficult to predict.

An object of the present invention is to provide a speech synthesizer that prevents oscillation during silent periods by means of a simple operation.

Other objects of the present invention will become apparent from the following description and the drawings.

The present invention is described in detail below with reference to an embodiment.

The drawings show a block diagram of one embodiment of the present invention. Each circuit block enclosed by the broken line in the figure is formed on a semiconductor substrate, such as single-crystal silicon, by known semiconductor integrated circuit manufacturing techniques.

Although not particularly limited, the speech synthesis semiconductor integrated circuit device IC is a device of the known PARCOR type that comprises a sound source section consisting of a pulse train generator 1 and a noise generator 2, a digital filter 4, a D/A converter 5, a parameter decoding circuit 3, and so on. The following circuits are added to it.

The delay circuit 6 starts operating when it receives the silence signal from the decoding circuit 3 and, after a fixed delay described later, generates the switching control signal for the multiplexer 7. The multiplexer 7 receives the feature parameters ki output from the decoding circuit 3 and an internally fixed feature parameter ki = 0, and selectively passes one of them to the digital filter 4.

That is, during voiced periods the feature parameters ki from the decoding circuit 3 are basically supplied to the digital filter 4 through the multiplexer 7, and during silent periods the feature parameters fixed at ki = 0 are passed to the digital filter through the multiplexer 7.

For example, the PARCOR-type digital filter 4 shown in FIG. 2 is a lattice filter, in which the filter error is fed back and causes oscillation. When the feature parameters are replaced with ki = 0 as described above, the feedback paths in the equivalent circuit of the digital filter 4 are cut, as shown in FIG. 3, so no oscillation occurs.
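The effect of setting ki = 0 can be illustrated with a minimal sketch of an all-pole lattice synthesis filter. The function name `lattice_synthesize`, the sign convention, and the floating-point arithmetic are assumptions made for illustration; the patent does not specify the filter at this level of detail, and an actual device would use fixed-point arithmetic, whose rounding error is what the text describes as being fed back.

```python
def lattice_synthesize(excitation, k):
    """All-pole lattice synthesis filter (illustrative sign convention).

    excitation: list of input samples
    k: list of reflection (PARCOR) coefficients k_1..k_p
    """
    p = len(k)
    b = [0.0] * (p + 1)  # backward signals from the previous sample
    out = []
    for e in excitation:
        f = e  # forward signal entering the top stage
        new_b = [0.0] * (p + 1)
        for i in range(p, 0, -1):
            # Feedback term: vanishes entirely when k_i == 0.
            f = f - k[i - 1] * b[i - 1]
            new_b[i] = b[i - 1] + k[i - 1] * f
        new_b[0] = f
        b = new_b
        out.append(f)
    return out


# With a nonzero coefficient the filter keeps ringing after the input stops.
ringing = lattice_synthesize([1.0, 0.0, 0.0, 0.0], [-0.9])
# With all coefficients zero the output is just the input: no feedback path.
silent = lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.0, 0.0])
```

In this sketch, every term that depends on the stored state is multiplied by some ki on its way to the output, so with all ki = 0 nothing left in the filter can reach the output.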

In this embodiment, the delay circuit 6 ensures that the multiplexer 7 is not switched immediately when a silent period begins, but only after a fixed delay. The reason is as follows.

If the feature parameters are replaced with ki = 0 immediately after the sound source input stops (at the end of an utterance), information still remains in the digital filter 4 (the output of the digital filter 4 has not yet reached 0), so an abnormal sound appears in the output speech. The timing of the replacement with ki = 0 therefore needs care. The decay of the filter output is determined almost entirely by the sharpest formant, that is, the one with the smallest bandwidth B.

As an example, consider the case where an utterance ends with the filter characteristic shown in FIG. 4 (female vowel /a/). From the figure, the formant with the sharpest characteristic has |z| = 0.9812, frequency f = 1084 Hz, bandwidth B = 48.33 Hz, damping constant α = 151.87 sec⁻¹, and time constant τ = 1/α = 6.583 ms.
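The relation between these quantities can be sketched as follows, using the standard resonator relation α = πB between the 3 dB bandwidth and the envelope decay rate. The helper name `formant_decay` and the 8 kHz sampling rate are assumptions; the source does not state a sampling rate, but 8 kHz gives a pole radius close to the quoted |z|. The computed values agree with the quoted figures only approximately.

```python
import math


def formant_decay(bandwidth_hz, sample_rate_hz=8000.0):
    """Return (alpha, tau, pole_radius) for a resonance of the given bandwidth.

    alpha: damping constant in 1/s, from alpha = pi * B
    tau: time constant in s, tau = 1 / alpha
    pole_radius: |z| = exp(-alpha / fs); fs is an assumed sampling rate
    """
    alpha = math.pi * bandwidth_hz
    tau = 1.0 / alpha
    pole_radius = math.exp(-alpha / sample_rate_hz)
    return alpha, tau, pole_radius


alpha, tau, radius = formant_decay(48.33)
print(f"alpha = {alpha:.2f} 1/s, tau = {tau * 1e3:.3f} ms, |z| = {radius:.4f}")
```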

FIG. 5 shows the response when an impulse is applied to this filter as the sound source input. The figure shows that the response decays with a time constant nearly equal to the value above. In practice, the minimum bandwidth of a digital filter for speech synthesis is usually set to 30 to 100 Hz or more, to prevent abnormal amplitudes caused by resonance and to improve sound quality. Even with B = 30 Hz, for example, the attenuation reaches 32.7 dB 40 ms after the end of the utterance and 49.1 dB after 60 ms. In this embodiment, therefore, the delay time produced by the delay circuit 6 is set to 40 to 60 ms. This prevents oscillation during silent periods without affecting the synthesized speech.
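The attenuation figures for B = 30 Hz can be checked with a short calculation, assuming the output envelope decays as exp(-πBt) once the input has stopped. The function name `attenuation_db` is hypothetical.

```python
import math


def attenuation_db(bandwidth_hz, elapsed_s):
    """Attenuation in dB of an envelope exp(-pi * B * t) after elapsed_s seconds."""
    alpha = math.pi * bandwidth_hz
    return 20.0 * math.log10(math.e) * alpha * elapsed_s


for ms in (40, 60):
    print(f"{ms} ms: {attenuation_db(30.0, ms / 1000.0):.1f} dB")
```

Under this assumption the calculation gives about 32.7 dB at 40 ms and 49.1 dB at 60 ms, which matches the values quoted for the 40 to 60 ms delay range.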

Although not particularly limited, the delay circuit uses a counter, built from flip-flops or the like, that counts frame periods. It is reset immediately when the next utterance period arrives, switching the multiplexer 7 back to the decoding circuit 3 side.
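The behavior of the delay circuit and multiplexer can be sketched in software as a small frame counter. The class name `SilenceDelay`, the function `select_parameters`, and the choice of two frames are illustrative assumptions; the source gives the delay as 40 to 60 ms but does not state the frame period here.

```python
class SilenceDelay:
    """Frame counter that delays the switch to k_i = 0 after silence begins."""

    def __init__(self, delay_frames):
        self.delay_frames = delay_frames
        self.count = 0

    def step(self, silent):
        """Advance one frame; return True when the fixed k_i = 0 should be selected."""
        if not silent:
            self.count = 0  # reset immediately when speech resumes
            return False
        if self.count < self.delay_frames:
            self.count += 1  # still waiting for the filter output to decay
            return False
        return True


def select_parameters(decoded_k, use_zero):
    """Multiplexer: pass decoded coefficients, or zeros during delayed silence."""
    return [0.0] * len(decoded_k) if use_zero else list(decoded_k)


delay = SilenceDelay(delay_frames=2)
flags = [delay.step(s) for s in (False, True, True, True, True, False)]
```

Here the switch to zero coefficients happens only after two full silent frames have elapsed, and the first voiced frame switches back at once.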

In the PARCOR speech synthesis operation, referring to the block diagram of FIG. 1, a speech data signal from a speech data ROM (read-only memory) or the like is input every frame period and decoded by the parameter decoding circuit 3; in the case of 48 bits/frame, for example, each feature parameter is decoded into 10-bit data. In addition, a parameter interpolation circuit (not shown) performs quasi-linear interpolation with the feature parameter values of the previous frame every 2.5 ms, smoothing the discrete changes in each feature parameter.
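The smoothing step can be illustrated with plain linear interpolation between the previous and current frame values. This is a simplified stand-in for the quasi-linear scheme mentioned in the text, whose details are not given; the function name `interpolate_frame` and the default of 8 steps per frame (2.5 ms steps in an assumed 20 ms frame) are assumptions.

```python
def interpolate_frame(prev, curr, steps=8):
    """Linearly interpolate each parameter from prev to curr in equal steps.

    Returns a list of `steps` parameter vectors; the last one equals curr.
    """
    return [
        [p + (c - p) * (s + 1) / steps for p, c in zip(prev, curr)]
        for s in range(steps)
    ]


# One coefficient moving from 0.0 to 0.8 over four interpolation steps.
trajectory = interpolate_frame([0.0], [0.8], steps=4)
```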

The pitch information P then controls the sound source section, and the amplitude information A and the PARCOR coefficients ki control the digital filter 4. The original speech is thereby reconstructed and synthesized, and is output through the D/A converter 5 and an external speaker SP.

The present invention is not limited to the embodiment described above.

The feature parameters are mutually convertible among the other methods that use a digital filter, such as those using line spectrum coefficients and those using the feature parameters r and θ that represent the formant Q. Therefore, to prevent oscillation during silent periods, a method using line spectrum coefficients may set those constants at nearly equal intervals, and a method using the feature parameters r and θ representing the formant Q may replace r with nearly zero.

In the block diagram of FIG. 1, the multiplexer, delay circuit, and other circuits that replace the feature parameters to prevent oscillation during silent periods may also be provided outside the speech synthesis semiconductor integrated circuit device.

[Brief explanation of drawings]

FIG. 1 is a circuit diagram showing an embodiment of the present invention; FIG. 2 is an equivalent circuit diagram of a PARCOR-type digital filter; FIG. 3 is an equivalent circuit diagram of the digital filter when the parameters are set to ki = 0; FIG. 4 is a filter characteristic diagram; and FIG. 5 is a characteristic diagram showing the filter response.

1: pulse train generation circuit, 2: noise generation circuit, 3: decoding circuit, 4: digital filter, 5: D/A converter, 6: delay circuit, 7: multiplexer

Claims

[Scope of Claims]
1. A speech synthesizer characterized in that oscillation in a digital filter is prevented by detecting a silent period and, after a fixed delay, manipulating the filter constants.
2. The speech synthesizer according to claim 1, wherein the filter constants use partial autocorrelation coefficients as feature parameters, and the constants ki are replaced with nearly zero during the silent period.
3. The speech synthesizer according to claim 1, wherein the filter constants use line spectrum coefficients as feature parameters, and the constants are replaced with equally spaced values during the silent period.
4. The speech synthesizer according to claim 1, wherein the filter constants use formant information as feature parameters, and the constant r representing the formant Q is replaced with nearly zero during the silent period.