JPS59102297A

JPS59102297A - Voice synthesizer

Info

Publication number: JPS59102297A
Application number: JP21354782A
Authority: JP
Inventors: 小山　斉
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1982-12-06
Filing date: 1982-12-06
Publication date: 1984-06-13

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a speech synthesis device, and more particularly to a synthesis device having a circuit that interpolates between two temporally adjacent sets of speech parameters.

Conventionally, devices are known in which the spectral envelope parameters are interpolated uniformly, at a time interval shorter than the analysis period, before being used for synthesis, in order to obtain [illegible in the source scan] natural speech.

However, the above device always interpolates by the same method, whether the spectral envelope parameters change greatly or hardly change at all, so it has the drawback that the synthesized speech sounds unnatural compared with natural speech.
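The conventional scheme described above can be sketched as follows. This is only an illustration: the source says the parameters are interpolated "uniformly" but does not give the formula, so straight-line interpolation is assumed here, and the function name `uniform_interpolate` and the `steps` argument are hypothetical.

```python
def uniform_interpolate(prev_frame, next_frame, steps):
    """Interpolate every parameter between two adjacent frames the same way.

    Assumption: "uniform" is taken to mean straight-line interpolation at
    `steps` equally spaced sub-intervals of the analysis period.
    Returns `steps` parameter sets; the last one equals next_frame.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    out = []
    for k in range(1, steps + 1):
        t = k / steps  # fraction of the way from prev_frame to next_frame
        out.append([p + (n - p) * t for p, n in zip(prev_frame, next_frame)])
    return out
```

Because the same rule is applied to every frame pair, a large jump in the parameters is smoothed in exactly the same way as a small drift, which is the drawback the description points out.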

An object of the present invention is to provide a speech synthesis device that outputs synthesized speech close to natural speech by performing appropriate interpolation.

The present invention is a speech synthesis device having a circuit for interpolating speech parameters between temporally adjacent frames, characterized in that the interpolation circuit includes a plurality of functions, such as a linear interpolation function, a nonlinear interpolation function, and a no-interpolation function, and has means for selecting a predetermined one of these functions according to the difference between the speech parameters of adjacent frames.

According to the present invention, the device can follow the transient changes of speech whose characteristics change rapidly. When the envelope parameters of successive frames do not differ excessively, the difference between them is compared with a predetermined value, and either the linear or the nonlinear interpolation result is selected according to the outcome, so that the waveform can be smoothed. Consequently, synthesis can be performed with excellent sound quality whether the parameters were extracted from natural speech or set artificially.

An embodiment of the present invention is described in more detail below with reference to the drawing. FIG. 1 is a block diagram showing one embodiment of the present invention. The speech synthesis device 1 includes: a synthesis information storage circuit 2, such as a ROM or RAM, which stores the synthesis information required for speech synthesis, such as spectral envelope parameter information, voiced/unvoiced discrimination information, pitch information, and sound source amplitude information; an interpolation circuit 3, which interpolates the spectral envelope parameter information of temporally adjacent frames output from the synthesis information storage circuit 2; a no-interpolation circuit 4, located inside the interpolation circuit 3, which outputs the synthesis information input from the synthesis information storage circuit 2 directly to output terminal 13 without interpolating it; a nonlinear interpolation circuit 5, which applies nonlinear interpolation to the synthesis information input from the synthesis information storage circuit 2 and outputs the result to output terminal 14; a linear interpolation circuit 6, which applies linear interpolation to the synthesis information input from the synthesis information storage circuit 2 and outputs the result to output terminal 15; a detection circuit 7, which detects the amount of change between the spectral envelope parameter information of temporally adjacent frames output from the synthesis information storage circuit 2, detects any change in the voiced/unvoiced discrimination information of temporally adjacent frames, and selects a different output terminal of the interpolation circuit according to that change; a synthesis filter 8, which sets the filter coefficients, sound source amplitude, pitch, and voiced/unvoiced discrimination information according to the synthesis information output from the terminal selected by the detection circuit 7, and synthesizes the speech; a D/A conversion circuit 9, which converts the synthesized output of the synthesis filter 8 into an analog signal; a control circuit 10, which controls the reading of synthesis information from the synthesis information storage circuit 2 and the operation of the interpolation circuit 3, the detection circuit 7, the synthesis filter 8, and the D/A conversion circuit 9; a command input terminal 11, through which commands to start and stop speech synthesis are input to the control circuit 10; and an output terminal 12, which outputs the synthesized speech. These components are connected to one another internally.

The speech synthesis device 1 activates the control circuit 10 in accordance with a command input from the command input terminal 11 and starts synthesis processing. The control circuit 10 addresses the spectral envelope parameter information, voiced/unvoiced discrimination information, pitch information, and sound source amplitude information in the synthesis information storage circuit 2, reads out the synthesis information of temporally adjacent frames, and inputs it to the interpolation circuit 3 and the detection circuit 7. The no-interpolation circuit 4, the nonlinear interpolation circuit 5, and the linear interpolation circuit 6 inside the interpolation circuit 3 each apply their predetermined processing to the synthesis information and output the results to output terminals 13, 14, and 15, respectively.

When the voiced/unvoiced discrimination information of the temporally adjacent frames differs and the change is from unvoiced to voiced, the detection circuit 7 selects output terminal 13, which holds the synthesis information of the temporally preceding frame, and inputs it to the synthesis filter 8. When the voiced/unvoiced discrimination information of the adjacent frames does not change, or changes from voiced to unvoiced, the detection circuit 7 does not select output terminal 13. Instead, it detects the amount of change in the spectral envelope parameters and compares it with a threshold preset in the detection circuit 7. If the amount of change is greater than the threshold, it selects output terminal 14 of the nonlinear interpolation circuit 5; if the amount of change is smaller than the threshold, it selects output terminal 15 of the linear interpolation circuit 6. The interpolated values are then output to the synthesis filter 8.
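The selection rule performed by the detection circuit 7 can be sketched in software as below. This is a minimal sketch under stated assumptions, not the circuit in the patent: the source does not say how the "amount of change" is measured (the largest absolute parameter difference is assumed), does not specify the nonlinear interpolation curve (a raised-cosine weighting is assumed), and does not say what happens when the change equals the threshold (linear interpolation is assumed). All function and constant names are hypothetical.

```python
import math

# Labels for the three outputs of interpolation circuit 3.
NO_INTERPOLATION = "terminal 13"
NONLINEAR = "terminal 14"
LINEAR = "terminal 15"


def select_terminal(prev_voiced, next_voiced, prev_params, next_params, threshold):
    """Choose an output terminal the way detection circuit 7 is described."""
    # Unvoiced -> voiced transition: use the held previous frame as-is.
    if not prev_voiced and next_voiced:
        return NO_INTERPOLATION
    # Assumption: "amount of change" = largest absolute parameter difference.
    change = max(abs(n - p) for p, n in zip(prev_params, next_params))
    # Assumption: a change exactly equal to the threshold is treated as small.
    return NONLINEAR if change > threshold else LINEAR


def interpolate(terminal, prev_params, next_params, t):
    """Return the parameter set at position t (0..1) between two frames."""
    if terminal == NO_INTERPOLATION:
        return list(prev_params)  # previous frame's parameters, unchanged
    if terminal == NONLINEAR:
        # Assumption: raised-cosine weighting stands in for the
        # unspecified nonlinear interpolation.
        w = (1 - math.cos(math.pi * t)) / 2
    else:
        w = t  # straight-line weighting
    return [p + (n - p) * w for p, n in zip(prev_params, next_params)]
```

For example, `select_terminal(False, True, [0.1], [0.9], 0.3)` picks the no-interpolation output because the frames go from unvoiced to voiced, while the same parameters with both frames voiced pick the nonlinear output because the change exceeds the threshold.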

As described above, according to this embodiment, the result of no interpolation, linear interpolation, or nonlinear interpolation can be selected as appropriate according to the difference between the envelope parameters of adjacent frames, and speech close to natural speech can be synthesized. Moreover, it is clear from the drawing that the hardware is extremely simple and can easily be built using well-known semiconductor technology.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing one embodiment of the present invention. 1: speech synthesis device; 2: synthesis information storage circuit; 3: interpolation circuit; 4: no-interpolation circuit section; 5: nonlinear interpolation circuit section; 6: linear interpolation circuit section; 7: detection circuit; 8: synthesis filter; 9: D/A converter; 10: control circuit; 11: command input terminal; 12: output terminal; 13: no-interpolation output terminal; 14: nonlinear interpolation output terminal; 15: linear interpolation output terminal.

Claims

[Claims]

A speech synthesis device having a circuit for interpolating speech parameters between temporally adjacent frames, characterized in that the interpolation circuit has a plurality of different interpolation functions, and any one of those functions can be selected according to the difference between the speech parameters of adjacent frames.