JP2847699B2 - Speech synthesizer

Info

Publication number: JP2847699B2
Application number: JP59138653A
Authority: JP
Inventors: 泰石川
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1984-07-04
Filing date: 1984-07-04
Publication date: 1999-01-20
Anticipated expiration: 2014-01-20
Also published as: JPS6117199A

Description

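[Detailed Description of the Invention]

[Technical Field of the Invention]
This invention relates to a speech synthesizer that converts arbitrary message content, entered from a keyboard or similar device, into speech.

[Prior Art]
Conventional speech synthesizers of this type edit CV-syllable parameters according to the input message content, generate a sound source waveform that matches the accent and intonation of the spoken message, and obtain synthesized speech through a speech synthesis filter. When CV syllables are simply concatenated, however, the spectrum of the synthesized speech changes abruptly at each junction. The result not only sounds highly discontinuous but is also quite often heard as the wrong sound, so the quality is very poor.

Some devices include an interpolation circuit that interpolates the parameters to avoid abrupt changes. Although parameter interpolation can make the spectrum continuous, it often produces spectra that do not occur in natural speech, so such devices can still be misheard and do not deliver high-quality synthesized speech.

A device has also been proposed that stores parameters of VCV syllables and joins them at matching vowels to obtain high-quality synthesized sound. That approach requires at least six times as many syllables, which causes many problems in storage capacity and in the labor of preparing the parameters.

[Summary of the Invention]
This invention was made to remedy the above drawbacks of conventional devices. A storage circuit holds the CV-syllable parameters together with interpolation parameters that are inserted when CV-syllable parameters are joined. The CV-syllable parameters and interpolation parameters are read from the storage circuit according to the content of the message to be uttered, and an interpolation circuit inserts the interpolation parameters into the read CV-syllable parameters as appropriate. The invention thereby provides a speech synthesizer that produces high-quality synthesized speech with natural continuity.

[Embodiment of the Invention]
An embodiment of the invention is described below with reference to the drawings. FIG. 1 is a block diagram of a speech synthesizer according to one embodiment. In the figure, 1 is message content input from a keyboard or the like (not shown); 2 is a sound source waveform generation circuit that generates, from message content 1, a sound source waveform whose pitch corresponds to the accent and intonation of the speech; 3 is the sound source waveform obtained from sound source waveform generation circuit 2; 4 is a storage circuit that stores the CV-syllable parameters and interpolation parameters; 5 is a readout circuit that reads the necessary parameters according to message content 1; 6 is the read CV-syllable parameters and interpolation parameters; 7 is an interpolation circuit; 8 is the edited synthesis parameters; 9 is a speech synthesis filter circuit; and 10 is the synthesized speech. FIG. 2 is an explanatory diagram showing an example of parameter readout and interpolation in the speech synthesizer of FIG. 1.

The operation of the speech synthesizer of FIG. 1 is as follows. Readout circuit 5 reads the parameters needed for synthesis from storage circuit 4 according to message content 1 input from a keyboard or the like (not shown). To synthesize the word "mirai" ("future"; mi ra i) shown in FIG. 2, readout circuit 5 reads the parameters "mi", "ra", and "i", each stored as a CV-syllable parameter. To join "mi" and "ra", readout circuit 5 reads the vowel-onset portion of the parameters stored for the CV syllable "ri", shown hatched in FIG. 2, and inserts it between "mi" and "ra" as indicated by the arrow in the figure.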
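
The readout described for joining "mi" and "ra" can be sketched roughly as follows. This is only an illustration under stated assumptions, not the circuit of the patent: the frame representation, the store `CV_STORE`, the helpers `vowel_onset` and `join_cv`, the position of the onset frames (`ONSET`), and the one-letter consonant and vowel spelling are all hypothetical. In particular, reading the onset frames in reversed time order is an assumption suggested by the rationale given in the description (vowel-to-consonant coarticulation resembles the time-reversed consonant-to-vowel transition); the text does not state how readout circuit 5 orders the frames.

```python
# Hypothetical parameter store: each CV syllable maps to a list of frames.
# Frames are labelled strings so the splice order is easy to follow; a real
# device would hold one set of synthesis-filter parameters per frame.
CV_STORE = {
    "mi": ["mi0", "mi1", "mi2", "mi3"],
    "ra": ["ra0", "ra1", "ra2", "ra3"],
    "ri": ["ri0", "ri1", "ri2", "ri3"],
}

# Assumed position of the vowel-onset frames inside every stored syllable.
ONSET = slice(1, 3)


def vowel_onset(syllable, reverse=True):
    """Return the vowel-onset frames of a stored CV syllable.

    With reverse=True the frames come back in reversed time order
    (an assumption of this sketch, not something the source specifies).
    """
    frames = CV_STORE[syllable][ONSET]
    return frames[::-1] if reverse else list(frames)


def join_cv(left, right):
    """Join two CV syllables with an inserted bridge segment.

    The bridge is taken from the stored syllable formed by the consonant
    of `right` and the vowel of `left`, e.g. "mi" + "ra" uses "ri".
    Assumes one-letter consonants and one-letter vowels.
    """
    bridge = right[0] + left[-1]
    return CV_STORE[left] + vowel_onset(bridge) + CV_STORE[right]


print(join_cv("mi", "ra"))
```

The point of the sketch is that no extra storage is needed for this junction: the bridge frames are cut out of a CV syllable that is already in the store.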
Also, to join "ra" and "i", readout circuit 5 reads the "a-i" parameters, which are stored as interpolation parameters cut out of the natural utterance "ai", and inserts them as indicated by the arrow in the figure. Interpolation circuit 7 then interpolates over the interpolation sections shown with horizontal stripes in FIG. 2, that is, immediately before and after each inserted parameter. Speech synthesis filter circuit 9 produces synthesized speech 10 from the resulting synthesis parameters 8 and the sound source waveform 3 generated by sound source waveform generation circuit 2.

This method relies on the fact that, in natural speech, the spectrum of a vowel changes markedly as it approaches the following consonant, and that this change appears in exactly reversed form when the same consonant and vowel are joined in the opposite order. In other words, the invention exploits the observation that coarticulation from a vowel into a consonant closely resembles coarticulation from that consonant into the vowel viewed with the time axis reversed. Inserting the interpolation parameters and interpolating over the sections before and after them therefore greatly improves the quality of the synthesized sound.

Because the inserted parameters are cut out of CV-syllable parameters that are already stored, very few new parameters need to be stored: only those that the CV-syllable parameters lack, such as parameters for vowel-to-vowel transitions and vowel-to-semivowel transitions. The invention thus needs interpolation parameters prepared only for transitions from one vowel to another, which do not appear in the transition sections of the CV syllables; everything else is read out of the CV syllables and interpolated.

[Effects of the Invention]
As described above, the speech synthesizer of this invention has a storage circuit that stores the CV-syllable parameters together with the interpolation parameters inserted when CV-syllable parameters are joined. The CV-syllable parameters and interpolation parameters are read from the storage circuit according to the content of the message to be uttered, and the interpolation circuit inserts the interpolation parameters into the read CV-syllable parameters as appropriate. This eliminates the loss of quality at the junctions between CV-syllable parameters and yields high-quality synthesized speech with natural continuity.
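
The smoothing performed by interpolation circuit 7 around an inserted segment might look like the sketch below. It rests on several assumptions that the source does not confirm: frames are short lists of numbers, the interpolation is linear, each interpolation section straddles one boundary of the inserted segment, and the section width is two frames on each side. The function names `lerp`, `interpolate_section`, and `splice_and_smooth` are hypothetical.

```python
def lerp(a, b, t):
    """Linear blend of two parameter frames; t=0 gives a, t=1 gives b."""
    return [x + (y - x) * t for x, y in zip(a, b)]


def interpolate_section(frames, start, end):
    """Return a copy of frames in which the frames strictly between
    start and end lie on a straight line from frames[start] to frames[end]."""
    out = [list(f) for f in frames]
    span = end - start
    for k in range(start + 1, end):
        out[k] = lerp(frames[start], frames[end], (k - start) / span)
    return out


def splice_and_smooth(left, inserted, right, width=2):
    """Concatenate left + inserted + right, then smooth one interpolation
    section around each boundary of the inserted segment."""
    frames = [list(f) for f in left + inserted + right]
    boundaries = (len(left), len(left) + len(inserted))
    for j in boundaries:
        lo = max(j - width, 0)
        hi = min(j + width - 1, len(frames) - 1)
        frames = interpolate_section(frames, lo, hi)
    return frames


# One scalar parameter per frame keeps the effect easy to see.
left = [[0.0]] * 4
inserted = [[3.0]] * 4
right = [[6.0]] * 4
smoothed = splice_and_smooth(left, inserted, right)
print([round(f[0], 3) for f in smoothed])
```

With these inputs the step from one segment to the next is replaced by a short ramp, while frames outside the interpolation sections, including the middle of the inserted segment, are left as read from storage.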

[Brief Description of the Drawings]
FIG. 1 is a block diagram of a speech synthesizer according to one embodiment of the invention. FIG. 2 is an explanatory diagram showing an example of parameter readout and interpolation in the speech synthesizer of FIG. 1.

In the figures: 1, message content; 2, sound source waveform generation circuit; 3, sound source waveform; 4, storage circuit; 5, readout circuit; 6, CV-syllable parameters and interpolation parameters; 7, interpolation circuit; 8, synthesis parameters; 9, speech synthesis filter circuit; 10, synthesized speech.

Continued from the front page: (58) Fields searched (Int. Cl.⁶, DB name): G10L 5/00 - 5/04

Claims

(57) [Claims] A speech synthesizer that stores feature parameters of CV syllables, edits the feature parameters according to the content of a message to be uttered, generates a sound source waveform, and synthesizes speech with a speech synthesis filter, the speech synthesizer comprising:
a storage circuit that stores the feature parameters together with interpolation parameters that do not appear in the CV syllables;
a readout circuit that reads the feature parameters of the CV syllables from the storage circuit according to the content of the message to be uttered, and that reads feature parameters of a CV syllable or the interpolation parameters according to the transition section of the CV syllables;
an interpolation circuit that interpolates by inserting, into the read feature parameters of the CV syllables, interpolation parameters consisting either of parameters generated from the CV-syllable feature parameters read according to the transition section or of the stored interpolation parameters;
a sound source waveform generation circuit that generates a sound source waveform according to the content of the message to be uttered; and
a speech synthesis filter circuit that synthesizes speech from the parameters obtained by the interpolation circuit and the sound source waveform obtained by the sound source waveform generation circuit.