JPH0315760B2


Info

Publication number: JPH0315760B2
Application number: JP56098539A
Authority: JP
Inventors: Yukio Mitome; Katsunobu Fushikida
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1981-06-25
Filing date: 1981-06-25
Publication date: 1991-03-01
Also published as: JPS57212500A

Description

[Detailed description of the invention]

The present invention relates to a waveform-editing speech synthesis device that edits and synthesizes unit speech waveforms such as monosyllables.

Conventionally, in speech synthesis methods that synthesize arbitrary words by editing and synthesizing unit speech waveforms such as monosyllables (for example, CV and VC waveforms extracted from natural speech waveforms with the word-initial and word-final portions removed, where C denotes a consonant and V a vowel), a known approach interpolates between two unit speech waveforms as follows: about one pitch period of the speech waveform at the rear joint of the temporally preceding unit speech waveform and about one pitch period of the speech waveform at the front joint of the following unit speech waveform are weighted and added together.
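The conventional weighted addition described above can be sketched as follows. This is only an illustration under stated assumptions: the text says the two one-pitch segments are weighted and added but does not give the weights, so the linear ramp, the function name `crossfade_one_pitch`, and the requirement that both segments have equal length are all hypothetical.

```python
def crossfade_one_pitch(tail, head):
    """Weight and add two one-pitch-period segments of equal length.

    tail: samples at the rear joint of the preceding unit waveform.
    head: samples at the front joint of the following unit waveform.
    The linear weights are an assumption; the source only says "weighted".
    """
    if len(tail) != len(head):
        raise ValueError("segments must have the same length")
    n = len(tail)
    out = []
    for i in range(n):
        # Weight of the following unit rises linearly across the segment.
        w = (i + 1) / (n + 1)
        # Two multiplications per output sample: this is the per-channel
        # arithmetic that the source identifies as the drawback.
        out.append((1.0 - w) * tail[i] + w * head[i])
    return out
```

Every output sample costs two multiplications and one addition at synthesis time, which is the load that has to be repeated for each channel served at once.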

However, although the conventional method joins the two unit speech waveforms smoothly through interpolation, the interpolation requires arithmetic such as weighting. This makes so-called multi-channel response, in which different response speech is output on several channels at the same time, relatively difficult to achieve.

An object of the present invention is to provide a waveform-editing speech synthesis device of the type that generates the speech waveform of an arbitrary word by editing and synthesizing unit speech waveforms such as monosyllables, in which interpolation between the unit speech waveforms is relatively simple and simultaneous multi-channel response is easy.

The present invention comprises: means for storing unit speech waveforms such as monosyllables; means for storing interpolation speech waveforms used to interpolate between waveforms whose spectra differ even though they belong to the same phoneme; and means for editing and synthesizing by interpolating between a preceding unit speech waveform and the following unit speech waveform using the interpolation speech waveforms.

Next, the interpolation method and the interpolation waveforms of the present invention are described for an example in which dyad waveforms (CV and VC waveforms provided at several pitch levels) are used as the unit speech waveforms.

Even for the same phoneme, the pitch or spectrum at the rear joint of the temporally preceding dyad waveform may differ from the pitch or spectrum at the front joint of the following dyad waveform. One case in which the spectrum differs for the same phoneme is a single vowel that is nasalized in one waveform as a result of coarticulation and non-nasalized in the other.

To join two dyad waveforms whose pitch or spectrum differs in this way, interpolation waveforms in which the pitch or spectrum changes gradually are prepared in advance and used to interpolate between the two dyad waveforms.

The interpolation waveforms are obtained, for example, as follows. Speech segment waveforms of about one pitch period each are extracted from natural speech waveforms that cover several pitch levels, including pitches intermediate between those of the dyad waveforms, and that differ in spectrum, such as nasalized and non-nasalized vowels. These speech segment waveforms are then weighted and added together to generate speech segment waveforms with intermediate spectra.
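The preparation step described above might look like the sketch below. It assumes the two extracted segments have the same length and that the intermediate segments are spaced at equal weight steps; neither detail is stated in the text, and `blend_segments` is a hypothetical name. The point of the sketch is that this weighting runs once, when the interpolation waveforms are prepared, and not while speech is being synthesized.

```python
def blend_segments(seg_a, seg_b, steps):
    """Precompute `steps` segments whose spectra lie between seg_a and seg_b.

    seg_a, seg_b: one-pitch-period segments (for example a nasalized and a
    non-nasalized vowel at the same pitch), assumed here to be equal length.
    Returns the intermediate segments ordered from nearest seg_a to nearest
    seg_b; the results would be stored as interpolation waveforms.
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must have the same length")
    blended = []
    for k in range(1, steps + 1):
        w = k / (steps + 1)  # assumed equal spacing of the weights
        blended.append([(1.0 - w) * a + w * b for a, b in zip(seg_a, seg_b)])
    return blended
```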

The two dyad waveforms are interpolated by selecting suitable speech segment waveforms from among these and joining them. The selection is made by storing interpolation information in advance for each combination of pitches and each combination of spectra of the two dyad waveforms.
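A minimal sketch of this selection follows, assuming the interpolation information is a table keyed by the pitch and spectrum of the two dyad waveforms. The segment names, the spectrum labels `"nasal"` and `"oral"`, the sample values, and the behaviour of returning an empty bridge for an unknown pair are all illustrative assumptions, not taken from the source.

```python
# Hypothetical store of one-pitch-period segments, keyed by an invented name.
SEGMENTS = {
    "p2_half_nasal": [0.0, 0.3, 0.1, -0.2],
    "p2_oral": [0.0, 0.4, 0.0, -0.3],
}

# Hypothetical interpolation information, prepared in advance:
# (preceding pitch, preceding spectrum, following pitch, following spectrum)
#   -> names of the segments to join, in order.
INTERPOLATION_INFO = {
    (2, "nasal", 2, "oral"): ["p2_half_nasal", "p2_oral"],
}


def build_bridge(prev_pitch, prev_spectrum, next_pitch, next_spectrum):
    """Join the pre-selected segments for this pair of dyad waveforms.

    Only a table lookup and copying are needed; no sample is multiplied.
    An unknown pair yields an empty bridge (an assumption of this sketch).
    """
    key = (prev_pitch, prev_spectrum, next_pitch, next_spectrum)
    bridge = []
    for name in INTERPOLATION_INFO.get(key, []):
        bridge.extend(SEGMENTS[name])
    return bridge
```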

Instead of the speech segment waveforms of about one pitch period, the interpolation waveform may also be a speech waveform of about one pitch period or longer, prepared in advance so that the pitch and spectrum change smoothly between the two unit speech waveforms, and used to interpolate between the two unit speech waveforms.

As described above, the present invention interpolates between the two unit speech waveforms with simple processing that requires no complicated arithmetic such as multiplication. Synthesized speech of relatively high quality is therefore obtained, and because the interpolation is simple, simultaneous multi-channel response is clearly easy to achieve.

Next, embodiments of the present invention are described with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the present invention.

The character string temporary storage circuit 102 stores the character string, input from the character string input terminal 101, that represents the speech to be synthesized, such as unit speech waveform names and pitch levels. When a character string has been input, the circuit sends information requesting the start of synthesis to the control circuit 121 via the start/end information transmission line 103. When control information instructing it to send the character string arrives from the control circuit 121 via the control information transmission line A104, the circuit sends, one after another, each unit speech waveform name via the unit speech waveform name transmission line A105 or B107 and each pitch level via the pitch level transmission line A106 or B108 to the waveform address table search circuit A109 or B110. After sending the last unit speech waveform name and pitch level, it sends information indicating the end to the control circuit 121.

On receiving control information from the control circuit 121 via the control information transmission line B112 instructing it to input data, the waveform address table search circuit A109 inputs the unit speech waveform name and pitch level sent from the character string temporary storage circuit 102 and searches its internal table for the start address and the number of samples of the stored unit speech waveform corresponding to that name and pitch level. Once it has obtained the start address and number of samples, it sends information to that effect to the control circuit 121 via the response information transmission line A111. On receiving control information from the control circuit 121 instructing it to send data, it sends the start address and the number of samples to the address generation circuit A113 via the start address transmission line A114 and the sample count transmission line A115, respectively.

On receiving control information from the control circuit 121 via the control information transmission line C117 instructing it to input data, the waveform address table search circuit B110 inputs the unit speech waveform name and pitch level sent from the character string temporary storage circuit 102. If this is the first data after initialization, the circuit stores it internally as the waveform name and pitch level of the preceding unit speech waveform. If it is the second or later data after initialization, the circuit treats it as the waveform name and pitch level of the following unit speech waveform and searches its internal table for the start address and number of samples of the interpolation waveform that corresponds to the stored preceding name and pitch level together with the following name and pitch level. It then stores the waveform name and pitch level of the following unit speech waveform as those of the new preceding unit speech waveform, and sends information indicating that it has obtained the start address and number of samples to the control circuit 121 via the response information transmission line B116. On receiving control information from the control circuit 121 instructing it to send data, it sends the start address and the number of samples to the address generation circuit B118 via the start address transmission line B119 and the sample count transmission line B120, respectively.
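The behaviour of search circuit B110 can be modelled in software roughly as below. This is a sketch, not the circuit itself: the class name, the method names, the use of a dictionary for the internal table, and returning `None` for the first item are assumptions made for illustration.

```python
class InterpolationTableSearch:
    """Software model of a search that remembers the preceding unit."""

    def __init__(self, table):
        # table maps ((prev_name, prev_pitch), (next_name, next_pitch))
        # to (start_address, number_of_samples) of an interpolation waveform.
        self._table = table
        self._previous = None

    def reset(self):
        """Initialization: forget the preceding unit."""
        self._previous = None

    def feed(self, name, pitch_level):
        """Accept one (name, pitch level) item.

        Returns None for the first item after initialization, otherwise the
        (start_address, number_of_samples) entry for the pair formed with
        the stored preceding unit. Raises KeyError if the pair has no entry.
        """
        current = (name, pitch_level)
        previous, self._previous = self._previous, current
        if previous is None:
            return None
        return self._table[(previous, current)]
```

Each call after the first both answers the lookup and shifts the current unit into the "preceding" slot, so the caller only ever passes the newest item.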

When control information instructing data input is sent from the control circuit 121 via the control information transmission line D122 or E124, the address generation circuit A113 or B118 inputs, respectively, the start address and number of samples of the unit speech waveform or of the interpolation waveform. When control information instructing it to send address data is sent from the control circuit 121, the circuit sends, one after another, as many addresses as there are samples, beginning at the start address of the unit speech waveform or interpolation waveform, to the unit speech waveform data storage circuit 128 or the interpolation waveform data storage circuit 129 via the address transmission line A126 or B127, respectively. When it has sent addresses for all the samples, it sends information to that effect to the control circuit 121 via the response information transmission line C123 or D125.
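As a rough software analogue of an address generation circuit feeding a waveform memory, consider the sketch below. It assumes the memory is a flat list of samples; the function names are hypothetical.

```python
def generate_addresses(start_address, num_samples):
    """Yield num_samples consecutive addresses beginning at start_address."""
    for offset in range(num_samples):
        yield start_address + offset
    # Exhausting the generator plays the role of the "finished" response
    # that the circuit sends to the control circuit.


def read_waveform(memory, start_address, num_samples):
    """Fetch the samples stored at the generated addresses, in order."""
    return [memory[a] for a in generate_addresses(start_address, num_samples)]
```

Reading a waveform is then pure addressing and copying, with no arithmetic on the sample values.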

The unit speech waveform data storage circuit 128 and the interpolation waveform data storage circuit 129 send the waveform data at the addresses received from the address generation circuit A113 and the address generation circuit B118, respectively, to the editing/synthesis circuit 133 via the waveform data transmission line A131 and the waveform data transmission line B132, respectively.

In accordance with control information sent from the control circuit 121 via the control information transmission line F130, the editing/synthesis circuit 133 inputs the waveform data sent from the unit speech waveform data storage circuit 128 or the interpolation waveform data storage circuit 129 and outputs it to the synthesized speech output terminal 134.

The control circuit 121 controls each of the circuits described above. First, when information requesting the start of synthesis is sent from the character string temporary storage circuit 102, it instructs every circuit to initialize, and then instructs the character string temporary storage circuit 102 and the waveform address table search circuits A109 and B110 to transfer a unit speech waveform name and pitch level. When response information indicating that the table search has finished is sent from the waveform address table search circuit A109 or B110, the control circuit confirms that the address generation circuit A113 or B118 is not sending address data, instructs the waveform address table search circuit A109 and the address generation circuit A113, or the waveform address table search circuit B110 and the address generation circuit B118, to transfer the start address and number of samples, and then instructs the character string temporary storage circuit 102 and the waveform address table search circuit A109 or B110 to transfer the next data.

Immediately after initialization, the start address and number of samples that have been found relate to the first waveform data of the speech to be synthesized, so the control circuit immediately instructs the address generation circuit A113 to send addresses and at the same time instructs the editing/synthesis circuit 133 to input waveform data from the unit speech waveform data storage circuit 128. Thereafter, whenever it receives information from one of the address generation circuits A113 and B118 indicating that address output has finished, it instructs the other to send addresses and at the same time instructs the editing/synthesis circuit 133 to switch between the unit speech waveform data storage circuit 128 and the interpolation waveform data storage circuit 129 for the input and output of waveform data.
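The switching order that results from this control can be illustrated with a deliberately simplified sketch. It ignores the handshaking and the parallel table searches and only lists which memory the editing/synthesis circuit reads from in turn; the labels `"unit"` and `"interp"` and the function name are hypothetical.

```python
def playback_order(num_units):
    """List the sources read in turn for an utterance of num_units units.

    Each entry is (source, index): "unit" stands for the unit speech
    waveform memory and "interp" for the interpolation waveform memory.
    Interpolation waveform i sits between unit i and unit i + 1.
    """
    order = []
    for i in range(num_units):
        order.append(("unit", i))
        if i < num_units - 1:
            # When one address generator finishes, the other one starts.
            order.append(("interp", i))
    return order
```

Output therefore begins with a unit waveform and, for a non-empty utterance, also ends with one, with exactly one interpolation waveform between each adjacent pair.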

In this embodiment, because the circuits operate in parallel, synthesized speech begins to be output a short delay after a character string is input to the character string input terminal 101.

FIG. 2 shows a second embodiment of the present invention: a flowchart of the program used when the device is implemented under program control with a microprocessor or the like. In this example, the synthesized speech is output after editing and synthesis have finished.

First, in the preparation procedure 201, the output buffer (OUTB) is emptied, the counter (COUNT) that holds the number of waveform data samples in the output buffer is set to zero, and the pointer (POINT) that indicates a unit speech waveform name and pitch level in the input buffer (INB) is set to 1.

Next, in the character string input procedure 202, the character string representing the speech to be synthesized, such as unit speech waveform names and pitch levels, is read into the input buffer (INB).

Next, in the unit speech waveform address table search procedure 203, the unit speech waveform name and pitch level indicated by the pointer (POINT) of the input buffer (INB) are looked up in the unit speech waveform address table (TADR), and the corresponding start address (SADR1) and number of samples (NSPL1) are obtained.

Next, in the unit speech waveform data transfer procedure 204, waveform data (THAKEI) for that number of samples, beginning at the start address of the unit speech waveform obtained in the unit speech waveform address table search procedure 203, is transferred to the output buffer (OUTB), starting at the address following the one indicated by its counter (COUNT).

Next, in the counter update procedure 205, the counter (COUNT) is advanced by the number of samples transferred in the unit speech waveform data transfer procedure 204.

Next, in the end determination procedure 206, it is determined whether any character string representing a unit speech waveform name and pitch level remains in the input buffer (INB). If one remains, control passes to the interpolation waveform address table search procedure 207; if none remains, control passes to the synthesized speech output procedure 210.

In the interpolation waveform address table search procedure 207, the preceding unit speech waveform name and pitch level indicated by the pointer (POINT) of the input buffer (INB), together with the following unit speech waveform name and pitch level that come next after the position indicated by the pointer (POINT), are looked up in the interpolation waveform address table (HADR), and the corresponding start address (SADR2) and number of samples (NSPL2) are obtained.

Next, in the interpolation waveform data transfer procedure 208, waveform data (HHAKE1) for that number of samples, beginning at the start address of the interpolation waveform obtained in the interpolation waveform address table search procedure 207, is transferred to the output buffer (OUTB), starting at the address following the one indicated by its counter (COUNT).

Next, in the counter and pointer update procedure 209, the output buffer counter (COUNT) is advanced by the number of samples transferred, the pointer (POINT) in the input buffer is advanced to the beginning of the next unit speech waveform name and pitch level, and control returns to the unit speech waveform address table search procedure 203.

The synthesized speech output procedure 210 outputs the data stored in the output buffer (OUTB), up to the amount indicated by the output buffer counter (COUNT), as synthesized speech waveform data.
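Procedures 201 through 210 can be sketched as one function, as below. Several details are assumptions made for illustration: the input buffer is taken to be an already parsed list of (name, pitch level) pairs, the tables TADR and HADR are dictionaries, the two waveform memories are flat lists, the pointer is 0-based where the text sets POINT to 1, and an empty input returns an empty result, a case the flowchart does not cover.

```python
def synthesize(inb, tadr, hadr, unit_memory, interp_memory):
    """Concatenate unit and interpolation waveforms for the items in inb.

    inb:  list of (unit waveform name, pitch level) pairs (procedure 202).
    tadr: {(name, pitch): (start address, sample count)} for unit waveforms.
    hadr: {((name, pitch), (name, pitch)): (start address, sample count)}
          for interpolation waveforms.
    """
    outb = []    # 201: empty the output buffer
    count = 0    # 201: number of samples in the output buffer
    point = 0    # 201: pointer into inb (0-based in this sketch)
    if not inb:  # guard added for this sketch only
        return []
    while True:
        sadr1, nspl1 = tadr[inb[point]]                      # 203
        outb.extend(unit_memory[sadr1:sadr1 + nspl1])        # 204
        count += nspl1                                       # 205
        if point + 1 >= len(inb):                            # 206
            break
        sadr2, nspl2 = hadr[(inb[point], inb[point + 1])]    # 207
        outb.extend(interp_memory[sadr2:sadr2 + nspl2])      # 208
        count += nspl2                                       # 209
        point += 1                                           # 209
    return outb[:count]                                      # 210
```

The loop body contains only table lookups, block copies, and counter updates, which matches the claim that no multiplication is needed at synthesis time.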

In this embodiment, the memory storing the unit speech waveforms corresponds to the means for storing unit speech waveforms, the memory storing the interpolation waveforms corresponds to the means for storing interpolation speech waveforms, and the processing device that executes the series of procedures above, together with the output buffer, corresponds to the means for editing and synthesizing.

The speech synthesis device of the present invention can synthesize the speech waveform of an arbitrary word by processing similar to that of conventionally known voice response devices, which output the speech waveforms of fixed phrases, such as a number of specific sentences, by random access. It is therefore particularly effective when applied to voice response devices that output both fixed-phrase and arbitrary-word speech.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing one embodiment of the present invention, and FIG. 2 is a diagram showing the program flow when a microprocessor or the like is used as a second embodiment of the present invention. In the figures, 101 is the character string input terminal, 102 the character string temporary storage circuit, 109 the waveform address table search circuit A, 110 the waveform address table search circuit B, 113 the address generation circuit A, 118 the address generation circuit B, 121 the control circuit, 128 the unit speech waveform data storage circuit, 129 the interpolation waveform data storage circuit, 133 the editing/synthesis circuit, 134 the synthesized speech output terminal, 201 the preparation procedure, 202 the character string input procedure, 203 the unit speech waveform address table search procedure, 204 the unit speech waveform data transfer procedure, 205 the counter update procedure, 206 the end determination procedure, 207 the interpolation waveform address table search procedure, 208 the interpolation waveform data transfer procedure, 209 the counter and pointer update procedure, and 210 the synthesized speech output procedure.

Claims

[Claims]

1. A waveform-editing speech synthesis device that edits and synthesizes unit speech waveforms such as monosyllables, comprising: means for storing unit speech waveforms such as monosyllables; means for storing interpolation speech waveforms for interpolating between waveforms whose spectra differ even within the same phoneme; and means for editing and synthesizing the speech waveform of an arbitrary word by interpolating between a preceding unit speech waveform and the following unit speech waveform using the interpolation speech waveforms.