JPS603000A

JPS603000A - Voice synthesization system

Info

Publication number: JPS603000A
Application number: JP58110334A
Authority: JP
Inventors: 小山　斉
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-06-20
Filing date: 1983-06-20
Publication date: 1985-01-09

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] The present invention relates to a speech synthesis method, and in particular to a speech synthesis method based on spectral coding.

Speech synthesis by spectral coding uses one of two kinds of input. The first is the set of information needed for synthesis (hereinafter "speech parameters") that is extracted from the sample values of natural speech, sampled within a fixed analysis period (the frame period). The second is a set of speech parameters created artificially to resemble those extracted from natural speech. For such synthesis, a known method updates the speech parameters at the same time period as the frame period. However, updating the speech parameters at a fixed period takes no account of the transient characteristics of the synthesis filter used for speech synthesis. The output of the synthesis filter becomes unstable immediately after the speech parameters are updated, and the synthesized speech sounds unnatural. The loss of naturalness is especially noticeable when the speech parameters change abruptly within a voiced segment.

An object of the present invention is to provide a speech synthesis method that generates synthesized sound of excellent naturalness.

The present invention is configured as follows. In a method that synthesizes the required speech by inputting an excitation signal to a filter and changing the filter coefficients of that filter, at least when an excitation signal having a predetermined characteristic is input to the filter, the filter coefficients are updated in synchronization with the input of the excitation signal.

According to the present invention, the coefficients of the synthesis filter are changed according to the characteristics of the filter excitation signal (for example, whether the filter is driven by the voiced sound source or by the unvoiced sound source) and at the time the excitation signal is input to the synthesis filter, for example in synchronization with the pitch period while the voiced sound source is driving the filter. As a result, the filter coefficients are updated at the moment just before the excitation signal is input (the end of the pitch period), when the filter output has decayed the most and which is therefore the most suitable moment for a change. While the unvoiced sound source is driving the filter, the sound obtained as the filter output takes only small values, as in natural speech, so the conventional method may be used, in which the coefficients are changed once a predetermined length of time, or a time length contained in the speech parameters, has elapsed.
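The reasoning above, that the filter output has decayed the most just before the next excitation pulse, can be illustrated with a small numerical sketch. The two-pole resonator, its pole radius and frequency, and the window positions below are all assumptions chosen for illustration; the patent does not specify the structure of the synthesis filter.

```python
import math


def resonator_response(n_samples, radius=0.95, freq=0.1):
    """Impulse response of a two-pole resonator (hypothetical stand-in
    for the synthesis filter): y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    a1 = 2.0 * radius * math.cos(2.0 * math.pi * freq)
    a2 = -radius * radius
    y1 = y2 = 0.0
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0  # single excitation pulse at n = 0
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def peak_in_window(samples, start, stop):
    """Largest absolute value within samples[start:stop]."""
    return max(abs(v) for v in samples[start:stop])


# Treat 80 samples as one pitch period (an arbitrary choice).
response = resonator_response(80)
early = peak_in_window(response, 0, 20)   # just after the pulse
late = peak_in_window(response, 60, 80)   # just before the next pulse
```

In this sketch `late` is much smaller than `early`, so a coefficient change made at the end of the pitch period disturbs far less stored energy than one made shortly after a pulse.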

In the present invention, the filter coefficients of the synthesis filter are changed according to the characteristics of the filter excitation signal and in synchronization with the input of that excitation signal to the synthesis filter. This avoids the instability of the synthesis filter that otherwise arises immediately after the speech parameters are updated, whether the parameters were extracted from natural speech or created artificially. The invention can therefore provide a speech synthesis method that generates high-quality synthesized sound of excellent naturalness.

The present invention is described in more detail below with reference to the drawing.

FIG. 1 is a block diagram showing one embodiment of the present invention. The speech synthesis device 10 comprises: a speech parameter storage circuit 20, such as a ROM or RAM, that stores the speech parameters used for synthesis; a filter coefficient input circuit 30 that inputs the filter coefficients output from the storage circuit 20; a voiced sound source circuit 40 that generates a voiced excitation signal; an unvoiced sound source circuit 50 that generates an unvoiced excitation signal; a filter excitation signal input circuit 60 that inputs the generated filter excitation signal; a synthesis filter 70 that performs speech synthesis in response to the filter coefficients and the excitation signal input from the filter coefficient input circuit 30 and the filter excitation signal input circuit 60; a speech output terminal 80 that outputs the synthesized sound signal; a control circuit 90 that controls the reading of speech parameters from the storage circuit 20 and the operation of the filter coefficient input circuit 30, the voiced sound source circuit 40, the unvoiced sound source circuit 50, the filter excitation signal input circuit 60, and the synthesis filter 70; and a command input terminal 100 through which commands to start and stop speech synthesis are input to the control circuit 90.

The speech synthesis device 10 activates the control circuit 90 in accordance with a command input from the command input terminal 100 and starts the synthesis process. The control circuit 90 specifies the address of the speech parameters. Of the specified speech parameters, the storage circuit 20 outputs the filter coefficients of the synthesis filter 70 to the filter coefficient input circuit 30, and sends data such as the voiced/unvoiced decision, the pitch period, and the sound source amplitude to the control circuit 90. According to the voiced/unvoiced decision data it receives, the control circuit 90 sends the pitch period and sound source amplitude data to either the voiced sound source circuit 40 or the unvoiced sound source circuit 50. When the voiced sound source circuit 40 is selected, the control circuit 90 closes the filter coefficient input circuit 30 so that the filter coefficients are input to the synthesis filter 70, and activates the voiced sound source circuit 40 to generate a filter excitation signal that follows the sound source amplitude. At the same time it closes the filter excitation signal input circuit 60 so that the excitation signal is input to the synthesis filter 70.
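The speech parameters read from the storage circuit 20, and the choice between the two sound sources, can be sketched as follows. The field names, the impulse-train model for the voiced source, and the uniform-noise model for the unvoiced source are assumptions made for illustration; the description names the data items but does not specify the waveforms the sound source circuits produce.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechFrame:
    """One set of speech parameters (field names are hypothetical)."""
    coefficients: List[float]  # filter coefficients for the synthesis filter
    voiced: bool               # voiced/unvoiced decision
    pitch_period: int          # pitch period in samples (used when voiced)
    amplitude: float           # sound source amplitude


def excitation(frame: SpeechFrame, n_samples: int, rng: random.Random) -> List[float]:
    """Generate an excitation signal for the frame.

    Assumed models: one pulse per pitch period for the voiced source,
    uniform noise scaled by the amplitude for the unvoiced source.
    """
    if frame.voiced:
        return [frame.amplitude if n % frame.pitch_period == 0 else 0.0
                for n in range(n_samples)]
    return [frame.amplitude * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


rng = random.Random(0)
voiced_frame = SpeechFrame([0.5, -0.2], voiced=True, pitch_period=4, amplitude=2.0)
unvoiced_frame = SpeechFrame([0.3, 0.1], voiced=False, pitch_period=0, amplitude=0.5)
voiced_signal = excitation(voiced_frame, 8, rng)
unvoiced_signal = excitation(unvoiced_frame, 8, rng)
```

Here the voiced/unvoiced flag plays the role of the decision data that the control circuit 90 uses to select a sound source circuit.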

The synthesis filter 70 performs synthesis in accordance with the operation signal (clock signal) input from the control circuit 90 and outputs the synthesized sound signal to the speech output terminal 80. Meanwhile, the control circuit 90 repeatedly sends the operation signal to the synthesis filter 70 at a fixed time interval, counts the number of times it has been sent, and detects whether that count is an integer multiple of the pitch period. The control circuit 90 also computes the synthesis time from the send count and the fixed send interval, and when the synthesis time equals the frame period it starts counting the number of sends in excess of the frame period. When the send count equals an integer multiple of the pitch period, the control circuit 90 checks whether the excess send count is 0. If it is 0, the control circuit 90 activates the voiced sound source circuit 40 again and at the same time closes the filter excitation signal input circuit 60 so that the excitation signal is input to the synthesis filter 70. If the excess send count is not 0, the control circuit 90 specifies the address of new speech parameters, takes the excess send count as the new send count, and then resets the excess send count to 0.

The storage circuit 20, in turn, takes the speech parameters at the address newly specified by the control circuit 90, outputs the filter coefficients of the synthesis filter to the filter coefficient input circuit 30, and sends data such as the voiced/unvoiced decision, the pitch period, and the sound source amplitude to the control circuit 90. Speech is synthesized by repeating the operation described above.
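The counting scheme described above can be sketched as a small simulation. This is a simplified reading, not the circuit itself. It assumes every frame is voiced, and it tracks the position within the pitch period with its own counter rather than testing the send count for an integer multiple. It also follows the text literally in that a pitch period ending exactly on the frame boundary gives an excess count of 0 and so triggers no update. All names are hypothetical.

```python
def simulate_updates(pitch_periods, frame_ticks, total_ticks):
    """Return (pulse_ticks, update_ticks) for a run of voiced frames.

    pitch_periods: pitch period in clock ticks for each parameter frame.
    frame_ticks:   frame period in clock ticks.
    """
    frame_index = 0
    pitch = pitch_periods[0]
    since_pulse = 0    # ticks since the last excitation pulse
    frame_elapsed = 0  # synthesis time within the current frame
    excess = 0         # ticks counted beyond the frame period
    pulses = [0]       # first pulse at start-up
    updates = [0]      # first coefficient load at start-up
    for tick in range(1, total_ticks + 1):
        since_pulse += 1
        frame_elapsed += 1
        if frame_elapsed > frame_ticks:
            excess += 1
        if since_pulse == pitch:  # end of a pitch period
            if excess > 0 and frame_index + 1 < len(pitch_periods):
                # Frame period has passed: load the next parameters now,
                # carrying the excess over as the new frame's elapsed time.
                frame_index += 1
                pitch = pitch_periods[frame_index]
                updates.append(tick)
                frame_elapsed = excess
                excess = 0
            pulses.append(tick)  # restart the voiced source
            since_pulse = 0
    return pulses, updates


pulses, updates = simulate_updates([30, 40], frame_ticks=100, total_ticks=250)
```

With a frame period of 100 ticks and a first pitch period of 30 ticks, the second set of coefficients is loaded at tick 120, the first pitch-period end after the frame boundary, rather than at tick 100.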

As described above, in this embodiment the filter coefficients are changed not at every frame period but at the end of the first pitch period that arrives after the frame period has elapsed. Because the speech synthesis filter repeats a damped oscillatory output in every pitch period, this changes the filter coefficients at the most suitable moment, and it makes possible high-quality speech synthesis of better naturalness than the conventional method, which changes the filter coefficients at every frame period. Moreover, it is clear from the drawing that this method can be realized with a relatively small amount of circuitry and can easily be built using well-known semiconductor technology.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing one embodiment of the present invention. 10: speech synthesis device; 30: filter coefficient input circuit; 40: voiced sound source circuit; 50: unvoiced sound source circuit; 60: filter excitation signal input circuit; 80: speech output terminal; 100: command input terminal.

Claims

[Claims]

A speech synthesis method in which an excitation signal is input to a filter and the filter coefficients of the filter are changed to synthesize the required speech, characterized in that, at least when an excitation signal having a predetermined characteristic is input to the filter, the filter coefficients are updated in synchronization with the input of the excitation signal.