JP2844588B2

JP2844588B2 - Waveform editing type speech synthesizer

Info

Publication number: JP2844588B2
Application number: JP59257831A
Authority: JP
Inventors: 慶子高島; 勝信伏木田
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1984-12-06
Filing date: 1984-12-06
Publication date: 1999-01-06
Anticipated expiration: 2014-01-06
Also published as: JPS61134800A

Description

[Detailed Description of the Invention]

(Industrial Field of Application) The present invention relates to a waveform editing type speech synthesizer, that is, a synthesizer that edits and synthesizes unit speech waveforms such as single syllables.

(Problems of the Prior Art) A method is known that generates words, sentences, and the like by editing and synthesizing speech waveforms such as CV and VC waveforms (C is a consonant, V is a vowel) according to a character string given as input (Acoustical Society Spring Meeting Proceedings, 1-7-2 (1981)). In that method, however, the time length of each unit speech waveform is fixed, so the utterance duration of the synthesized words, sentences, and the like cannot be changed.

(Object of the Invention) An object of the present invention is to provide a waveform editing type speech synthesizer that can change the utterance duration of the synthesized words, sentences, and the like, and can apply time length rules, by changing the positions at which the unit speech waveforms taken out for editing are extracted.

(Configuration of the Invention) The present invention is characterized by having a plurality of unit speech waveforms; means for generating, from an input character string, time length data for the unit speech waveforms; means for storing, for each unit speech waveform, a plurality of sets of position data for extracting the portion of the waveform necessary for editing; and means for extracting the necessary portion of the waveform from the unit speech waveform in accordance with the position data selected according to the time length data, and for editing and synthesizing it.

(Detailed Description of the Configuration) By adopting the above configuration, the present invention solves the problems of the prior art. That is, changing the position at which the waveform is extracted changes the duration of the unit speech waveform. As a result, the time length of the edited and synthesized words, sentences, and the like can be changed, and time length control becomes possible in a waveform editing type synthesizer.

The principle of the present invention is explained with reference to FIG. 1. In unit speech waveform 101, the dash-dot lines a, b, c, d, e, and f indicate cutout positions of the waveform. This is only one example; the number of cutout positions and so on is arbitrary. The waveform from cutout position a to f has time length T₁, and the waveform extracted between cutout positions a and f is waveform 102. Similarly, the waveform extracted from cutout position b to f is waveform 103 with time length T₂, the waveform from c to f is waveform 104 with time length T₃, the waveform from d to f is waveform 105 with time length T₄, and the waveform from e to f is waveform 106 with time length T₅.

In this way, waveforms 102, 103, 104, 105, and 106 of different time lengths can be cut out from a single unit waveform. In other words, preparing several cutout positions (a, b, c, d, e, f in this example) for each unit waveform makes it possible to change the time length.

(Embodiment) FIG. 2 is a block diagram showing one embodiment of the present invention. The character string supplied to the character string input terminal 201 is output to the unit speech number generation circuit 202. The unit speech number generation circuit 202 generates unit speech numbers from the character string and outputs them to the address generation circuit 203. From each unit speech number, the address generation circuit 203 generates N cutout positions c₀, c₁, ..., c_N (where N is an arbitrary number).

The time length control circuit 213 outputs an address generation circuit control signal to the address generation circuit 203 via the address generation circuit control signal transmission line 210. The address generation circuit 203 then outputs the cutout positions c₀, c₁, ..., c_N to the waveform cutout address storage circuit 205.

When the time length control circuit 213 outputs a waveform cutout address storage circuit control signal to circuit 205 via the waveform cutout address storage circuit control signal transmission line 211, the cutout position corresponding to the time length specified by the time length control circuit 213 is selected. The selected cutout position is output to the unit speech waveform storage circuit 206.

The time length control circuit 213 outputs a unit speech waveform storage circuit control signal to the unit speech waveform storage circuit 206 via the unit speech waveform storage circuit control signal transmission line 212. The unit speech waveform storage circuit 206 outputs the unit speech waveform cut out at the specified cutout position to the editing and synthesis circuit 207. When the pitch waveform interpolation synthesis circuit 209 outputs an interpolated waveform to the editing and synthesis circuit 207, circuit 207 edits and synthesizes the synthesized waveform (for the interpolation, see Acoustical Society Spring Meeting Proceedings, 1-7-2 (1981)). The synthesized waveform is output from circuit 207 to the synthesized waveform output terminal 208.

In the pitch waveform interpolation synthesis circuit 209, changing the number of interpolations changes the time length of the interpolated waveform, so control of the interpolation time length can be used in combination with the cutout position control.

(Effects of the Invention) According to the present invention, storing, for each unit speech waveform, the addresses of waveform cutout positions that yield different time lengths allows a conventional waveform editing type speech synthesizer to control the time length and to generate relatively high-quality speech.
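The duration control described above can be illustrated with a minimal Python sketch. It is not the circuit of FIG. 2. It assumes that cutout positions are sample indices, that each unit shares one end position (f in FIG. 1) with several stored start positions (a to e), and that "the cutout position corresponding to the specified time length" means the stored position whose segment length is nearest to the requested length. The names `UnitWaveform`, `select_start`, `cut`, and `synthesize` are hypothetical, and plain concatenation stands in for the editing and synthesis circuit 207; pitch waveform interpolation is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnitWaveform:
    """One stored unit speech waveform with its stored cutout positions (assumed layout)."""
    samples: List[float]   # the stored unit waveform (101 in FIG. 1)
    starts: List[int]      # candidate start positions (a..e in FIG. 1), as sample indices
    end: int               # shared end position (f in FIG. 1), as a sample index


def select_start(unit: UnitWaveform, target_len: int) -> int:
    """Return the stored start position whose segment length is nearest to target_len.

    Nearest-length matching is an assumption; the source only says the position
    corresponding to the specified time length is selected.
    """
    return min(unit.starts, key=lambda s: abs((unit.end - s) - target_len))


def cut(unit: UnitWaveform, target_len: int) -> List[float]:
    """Cut the unit waveform from the selected start position to the end position."""
    start = select_start(unit, target_len)
    return unit.samples[start:unit.end]


def synthesize(units: List[UnitWaveform], target_lens: List[int]) -> List[float]:
    """Concatenate the cut segments (a stand-in for editing and synthesis)."""
    out: List[float] = []
    for unit, target_len in zip(units, target_lens):
        out.extend(cut(unit, target_len))
    return out


# Example: a 100-sample unit with five start positions, giving lengths 100, 80, 60, 40, 20.
unit = UnitWaveform(samples=[float(i) for i in range(100)],
                    starts=[0, 20, 40, 60, 80],
                    end=100)
segment = cut(unit, target_len=55)          # nearest stored length is 60 (start 40)
speech = synthesize([unit, unit], [100, 20])
```

Storing only a handful of positions per unit keeps the table small; the trade-off is that the achievable durations are quantized to the stored segment lengths, which is why the source also mentions adjusting the number of interpolations as a complementary control.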

[Brief Description of the Drawings] FIG. 1 is a diagram explaining the principle of the present invention, and FIG. 2 is a block diagram showing one embodiment of the present invention. In the figures, 201 is a character string input terminal, 202 is a unit speech number generation circuit, 203 is an address generation circuit, 205 is a waveform cutout address storage circuit, 206 is a unit speech waveform storage circuit, 207 is an editing and synthesis circuit, 208 is a synthesized waveform output terminal, 209 is a pitch waveform interpolation synthesis circuit, and 213 is a time length control circuit.

Continued from the front page: (58) Fields searched (Int. Cl.⁶, DB name): G10L 3/00, 5/00 - 5/04

Claims

(57) [Claims] A waveform editing type speech synthesizer characterized by comprising: a plurality of unit speech waveforms; means for generating, from an input character string, time length data for the unit speech waveforms; means for storing, for each unit speech waveform, a plurality of sets of position data for extracting the portion of the waveform necessary for editing; and means for extracting the necessary portion of the waveform from the unit speech waveform in accordance with the position data selected according to the time length data, and for editing and synthesizing it.