JPH01284898A - Voice synthesizing device - Google Patents

Voice synthesizing device

Info

Publication number
JPH01284898A
JPH01284898A (application JP63115721A)
Authority
JP
Japan
Prior art keywords
waveform
information
dictionary
unit
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP63115721A
Other languages
Japanese (ja)
Other versions
JP2761552B2 (en)
Inventor
Tomohisa Hirokawa
広川 智久
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP63115721A priority Critical patent/JP2761552B2/en
Publication of JPH01284898A publication Critical patent/JPH01284898A/en
Application granted granted Critical
Publication of JP2761552B2 publication Critical patent/JP2761552B2/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Links

Abstract

PURPOSE: To output synthesized speech that is both natural and clear by accumulating a large number of waveforms in advance as a dictionary and selecting and connecting the waveforms best suited to the input text. CONSTITUTION: An input text from a terminal 1 is analyzed by a text analysis part 2, and based on its output, prosody information for synthesizing speech is generated by a prosody information generating part 3. A waveform dictionary 9 is provided that stores a large quantity of waveform information for every unit appropriate for assembling the output speech, such as a phoneme. Using the information from the analysis part 2 and the generating part 3, a waveform selecting part 8 selects an appropriate waveform from the dictionary 9. When the desired waveform does not exist, the waveform closest to the selection conditions is modified by a waveform deformation processing part 10 to suit the intended use; when no suitable waveform exists at all, a new waveform is generated by a waveform generating part 11. The waveforms from the selecting part 8, the processing part 10, and the generating part 11 are connected by a waveform connecting part 12.

Description

[Detailed Description of the Invention]

"Industrial Field of Application"

This invention relates to a speech synthesis device that takes text as input and outputs arbitrary speech corresponding to that text, and in particular to a rule-based synthesis device that synthesizes speech mainly from a phoneme sequence and prosody information.

"Prior Art"

Conventional rule-based synthesis devices that output arbitrary speech have mostly used the LPC (Linear Predictive Coding) method for synthesis, with concatenation units such as CV, VCV, or CVC chosen to reflect phoneme correspondence and coarticulation, while prosodic information such as the fundamental-frequency pattern is generated independently of the phonemic information, from the accent type, the number of morae in each breath group, and so on. With these methods, however, the filter is necessarily driven at synthesis time by an excitation whose fundamental frequency differs from that at analysis time. The resulting mismatch between the vocal-tract spectrum represented by the LPC parameters and the excitation spectrum produces abnormal amplitudes and a reduction in spectral Q, degrading the quality of the synthesized speech. Although LPC is an analysis-synthesis method that assumes the vocal-tract and excitation parameters to be independent, in reality the two are subtly interrelated rather than independent; the degradation this causes can be regarded as a fundamental problem of applying LPC analysis-synthesis to rule-based synthesis.

Another approach describes the features of speech as formants and obtains rule-synthesized speech by specifying formant trajectories; however, automatic formant extraction is difficult and formant transitions cannot yet be described adequately, so at present the quality is inferior to that of LPC-based methods.

On the other hand, several methods have been proposed that avoid these problems by working directly with the original waveforms, which retain high clarity.

All of them, however, prepare at most a few waveforms per phoneme or syllable and adjust fundamental frequency and duration by truncating, repeating, or thinning out those waveforms. Fine control of the synthesized speech is therefore impossible, and the output suffers from sounding like short stretches of speech spliced together, or like a mechanical, buzzer-like tone.

The object of this invention is to provide a speech synthesis device capable of outputting synthesized speech that is both natural and clear in the rule-based synthesis required for text-to-speech conversion.

"Means for Solving the Problem"

According to this invention, the input text is analyzed by a text analysis unit, and prosody information for speech synthesis is generated by a prosody generation unit from the output of the text analysis unit. A waveform dictionary is provided that stores, for each unit appropriate for assembling the output speech (such as a phoneme), a large quantity of waveform information: the original waveform, the phonemic context in which it was uttered, the shape of its fundamental-frequency pattern, duration information, amplitude information, and so on. Using the information from the text analysis unit and the prosody generation unit, a waveform selection unit selects an appropriate waveform from the dictionary. If the desired waveform does not exist, the waveform closest to the selection conditions is modified by a waveform deformation unit to suit the intended use; if no suitable waveform exists at all, a new waveform is produced by a waveform generation unit. The waveforms from the selection, deformation, and generation units are then joined by a waveform connection unit.

Thus, according to this invention, a large number of waveforms are accumulated in advance as a dictionary and the output speech is synthesized by selecting and connecting the waveforms best suited to the input text, so speech that is both highly intelligible and natural is obtained.

"Embodiment"

FIG. 1 is a block diagram showing one embodiment of this invention.

When text to be converted into speech is input at terminal 1, the text analysis unit 2 performs morphological analysis (dependency and part-of-speech analysis), kanji-to-kana conversion, and accent processing, and sends the necessary information to the phoneme-sequence buffer 7 and the prosody generation unit 3: for buffer 7, a symbol string identifying the phonemes; for unit 3, the number of morae per breath group, the accent type, the speaking rate, and so on. From this information the prosody generation unit 3 generates, by rule, a pitch pattern, a duration pattern for each phoneme, and an amplitude pattern, and writes them into buffers 4, 5, and 6 respectively.

The waveform selection unit 8 consults the phoneme-sequence buffer 7, the pitch-pattern buffer 4, the duration buffer 5, and the amplitude buffer 6, and selects the optimal waveform from the waveform dictionary 9. As one example, the dictionary 9 is organized as shown in FIG. 2, storing each waveform together with various information about its utterance: the phoneme type; the phonemic context, about seven phonemes before and after; the average pitch within the phoneme; the slope of a first-order straight-line fit describing the pitch shape; the phoneme duration; duration-adjustment information giving the start and end points of a few pitch periods at the center of the waveform; the RMS value (amplitude) of the normalized phoneme waveform; and the waveform data itself. The dictionary 9 is built in advance by off-line processing from a large body of recorded speech. For example, roughly several hours of words and sentences uttered by one male announcer are A/D-converted at 12 kHz and phonemically labeled by inspecting digital spectrograms. Each entry can then be created by displaying the waveform 20-30 ms before and after a labeled phoneme boundary and cutting it out with a cursor. As a rule, the cut point is placed at a negative-to-positive zero crossing of the waveform, with further per-phoneme rules fixed in advance (for example, cutting at the zero crossing just before a positive peak). This avoids discontinuities at the joins and yields a smooth connected waveform. If, in addition, the speech data are analyzed by LPC or the like to extract pitch, and phonemes that are similar in pitch shape, duration, and so on are merged, the number of excised waveforms can be reduced and the dictionary built efficiently.
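The dictionary entry of FIG. 2 and the zero-crossing cut rule can be illustrated as below. The field names and the `DictEntry`/`cut_point` helpers are hypothetical; the patent specifies the stored information but not a data layout.

```python
from dataclasses import dataclass
from typing import List, Optional

# One waveform-dictionary entry holding the fields listed above (FIG. 2).
# Field names are illustrative assumptions, not the patent's layout.
@dataclass
class DictEntry:
    phoneme: str            # phoneme type
    context: str            # surrounding phonemes, about 7 on each side
    avg_pitch: float        # average pitch within the phoneme (Hz)
    pitch_slope: float      # slope of a first-order fit to the pitch contour
    duration_ms: float      # phoneme duration
    pitch_marks: List[int]  # start/end of a few pitch periods near the center
    rms: float              # RMS of the normalized phoneme waveform
    samples: List[float]    # the waveform data itself

def cut_point(samples: List[float]) -> Optional[int]:
    """Index of the first negative-to-positive zero crossing, the cut
    position the text recommends so that joins stay smooth."""
    for i in range(1, len(samples)):
        if samples[i - 1] < 0 <= samples[i]:
            return i
    return None
```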

The operation of the waveform selection unit 8 in more detail is, as one example, as shown in FIG. 3. First, a search phoneme sequence is set: the target phoneme is placed at the center, and the input phoneme sequence is windowed to the number of context phonemes held in the dictionary. If no waveform candidate is found when the dictionary 9 is searched, the search sequence is shortened from both ends and the search repeated. If no candidate is found even when the search sequence has been reduced to the target phoneme alone, the waveform generation unit 11 generates a waveform with the desired pitch. Next, a pitch condition for the phoneme to be selected is set, since the pitch pattern is considered to have the greatest influence on the naturalness of the synthesized speech; it is determined from the average pitch and pitch shape by consulting the pitch-pattern buffer 4. The tolerance should be fixed experimentally, but naturalness is thought to be preserved if the pitch is within roughly 5% of the desired value. If waveform candidates are found, they are further filtered by a duration condition, set from the duration in buffer 5 together with an experimentally determined tolerance, as for pitch. If no candidate satisfies the duration condition, the candidate closest to it is selected and the waveform deformation unit 10 applies duration adjustment. If candidates remain, selection by the amplitude condition follows.

Here too, as for duration, the condition is set from the amplitude in buffer 6 and a tolerance. If no candidate satisfies it, the closest candidate is selected, as for duration, and the waveform deformation unit 10 applies amplitude adjustment. The waveforms thus selected by unit 8 according to the phonemic context and prosodic conditions, those produced by the generation unit 11, and those adjusted by the deformation unit 10 are sent to the waveform connection unit 12, joined in sequence, and output as a speech waveform at output terminal 13.

The waveform generation unit 11 generates a waveform of arbitrary pitch using, for example, LPC techniques: LPC parameters representing the spectrum of each phoneme are stored, and a waveform is generated by driving the filter with pulses or a residual signal at the specified pitch. Although the use of LPC here runs counter to the aim of the invention, unit 11 is only a rescue measure for the case in which no waveform exists at all, and is expected to be used rarely.
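The fallback generator amounts to exciting an all-pole (LPC) filter with a pulse train at the requested pitch period. A pure-Python sketch follows; the coefficient in the example is an arbitrary illustrative value, not real phoneme LPC data.

```python
# Minimal all-pole (LPC) synthesis: y[n] = gain*x[n] - sum_k a[k]*y[n-1-k],
# where x is a pulse train at the requested pitch period. The coefficients
# passed in are assumed to come from the stored per-phoneme LPC analysis.

def lpc_synthesize(a, pitch_period, n_samples, gain=1.0):
    y = []
    for n in range(n_samples):
        x = gain if n % pitch_period == 0 else 0.0  # pitch-synchronous pulse
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y
```

With a single coefficient of -0.5 the recursion is y[n] = x[n] + 0.5*y[n-1], a simple decaying impulse response restarted at each pitch pulse.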

The waveform deformation unit 10 performs the duration adjustment and the amplitude adjustment; these are described below.

The duration adjustment differs according to whether the phoneme is unvoiced or voiced. For an unvoiced plosive it is handled by stretching or shrinking the silent interval; for a fricative, the waveform is cut or reused repeatedly, working outward from the center, until the desired duration is reached. For a voiced sound, the positions of about three pitch periods at the center of the waveform are stored in the dictionary; if the waveform is too long these periods are thinned out, and if it is too short they are used repeatedly.
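The voiced-sound case above can be sketched as repeating or dropping whole pitch periods near the waveform center. Representing the waveform as a list of pitch-period segments, and the function name, are illustrative assumptions.

```python
# Voiced duration adjustment: repeat center pitch periods to lengthen,
# thin them out to shorten. 'periods' stands in for the few pitch periods
# whose boundaries the dictionary stores near the waveform center.

def adjust_duration(periods, target_n):
    out = list(periods)
    mid = len(out) // 2
    while len(out) < target_n:       # too short: repeat a center period
        out.insert(mid, out[mid])
    while len(out) > target_n:       # too long: thin out a center period
        out.pop(len(out) // 2)
    return out
```

Working at the center rather than the edges preserves the onset and offset transitions, which carry most of the coarticulation information.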

The amplitude adjustment looks up the amplitude value determined for each phoneme in the amplitude buffer and linearly scales the selected or generated waveform by the ratio of that value to the waveform's RMS value.
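This linear scaling by RMS ratio is a one-liner; the sketch below assumes the target RMS has already been read from the amplitude buffer.

```python
import math

# Amplitude adjustment: scale the waveform so its RMS matches the target
# value taken from the amplitude buffer.

def adjust_amplitude(samples, target_rms):
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    gain = target_rms / rms
    return [s * gain for s in samples]
```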

"Effects of the Invention"

As described above, according to this invention a large number of waveforms are accumulated as a dictionary and the output speech is synthesized by selecting and connecting the waveforms best suited to the input text, so speech that is both highly intelligible and natural can be provided.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment of the speech synthesis device according to this invention, FIG. 2 is a diagram showing an example of the organization of the waveform dictionary 9, and FIG. 3 is a flowchart showing the method of selecting the most appropriate waveform from the waveform dictionary.

Patent applicant: Nippon Telegraph and Telephone Corporation. Representative: Takashi Kusano

Claims (1)

[Claims]

(1) A speech synthesis device that outputs speech according to an input text, comprising: a text analysis unit that analyzes the input text; a prosody generation unit that generates prosody information for speech synthesis from the output of the text analysis unit; a waveform dictionary that stores, for each unit appropriate for assembling the output speech (such as a phoneme), a large quantity of waveform information describing the original waveform, the phonemic context in which it was uttered, the shape of its fundamental-frequency pattern, duration information, amplitude information, and so on; a waveform selection unit that selects an appropriate waveform from the waveform dictionary using the information from the text analysis unit and the prosody generation unit; a waveform deformation unit that, when the desired waveform does not exist, modifies the waveform closest to the selection conditions to suit the intended use; a waveform generation unit that generates a new waveform when no suitable waveform exists at all; and a waveform connection unit that joins the waveforms from the waveform selection unit, the waveform deformation unit, and the waveform generation unit.
JP63115721A 1988-05-11 1988-05-11 Voice synthesis method Expired - Lifetime JP2761552B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63115721A JP2761552B2 (en) 1988-05-11 1988-05-11 Voice synthesis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63115721A JP2761552B2 (en) 1988-05-11 1988-05-11 Voice synthesis method

Publications (2)

Publication Number Publication Date
JPH01284898A true JPH01284898A (en) 1989-11-16
JP2761552B2 JP2761552B2 (en) 1998-06-04

Family

ID=14669490

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63115721A Expired - Lifetime JP2761552B2 (en) 1988-05-11 1988-05-11 Voice synthesis method

Country Status (1)

Country Link
JP (1) JP2761552B2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5950152A (en) * 1996-09-20 1999-09-07 Matsushita Electric Industrial Co., Ltd. Method of changing a pitch of a VCV phoneme-chain waveform and apparatus of synthesizing a sound from a series of VCV phoneme-chain waveforms
US6035272A (en) * 1996-07-25 2000-03-07 Matsushita Electric Industrial Co., Ltd. Method and apparatus for synthesizing speech
US6125346A (en) * 1996-12-10 2000-09-26 Matsushita Electric Industrial Co., Ltd Speech synthesizing system and redundancy-reduced waveform database therefor
WO2004109660A1 (en) * 2003-06-04 2004-12-16 Kabushiki Kaisha Kenwood Device, method, and program for selecting voice data
WO2004109659A1 (en) * 2003-06-05 2004-12-16 Kabushiki Kaisha Kenwood Speech synthesis device, speech synthesis method, and program
JP2006145848A (en) * 2004-11-19 2006-06-08 Kenwood Corp Speech synthesizer, speech segment storage device, apparatus for manufacturing speech segment storage device, method for speech synthesis, method for manufacturing speech segment storage device, and program
JP2006195207A (en) * 2005-01-14 2006-07-27 Kenwood Corp Device and method for synthesizing voice, and program therefor
WO2006129814A1 (en) * 2005-05-31 2006-12-07 Canon Kabushiki Kaisha Speech synthesis method and apparatus
JP2008139631A (en) * 2006-12-04 2008-06-19 Nippon Telegr & Teleph Corp <Ntt> Voice synthesis method, device and program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS56153400A (en) * 1980-04-30 1981-11-27 Nippon Telegraph & Telephone Voice responding device
JPS6295595A (en) * 1985-10-23 1987-05-02 株式会社日立製作所 Voice response system
JPS62296198A (en) * 1986-06-16 1987-12-23 日本電気株式会社 Voice synthesization system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6035272A (en) * 1996-07-25 2000-03-07 Matsushita Electric Industrial Co., Ltd. Method and apparatus for synthesizing speech
US5950152A (en) * 1996-09-20 1999-09-07 Matsushita Electric Industrial Co., Ltd. Method of changing a pitch of a VCV phoneme-chain waveform and apparatus of synthesizing a sound from a series of VCV phoneme-chain waveforms
US6125346A (en) * 1996-12-10 2000-09-26 Matsushita Electric Industrial Co., Ltd Speech synthesizing system and redundancy-reduced waveform database therefor
WO2004109660A1 (en) * 2003-06-04 2004-12-16 Kabushiki Kaisha Kenwood Device, method, and program for selecting voice data
WO2004109659A1 (en) * 2003-06-05 2004-12-16 Kabushiki Kaisha Kenwood Speech synthesis device, speech synthesis method, and program
US8214216B2 (en) 2003-06-05 2012-07-03 Kabushiki Kaisha Kenwood Speech synthesis for synthesizing missing parts
JP2006145848A (en) * 2004-11-19 2006-06-08 Kenwood Corp Speech synthesizer, speech segment storage device, apparatus for manufacturing speech segment storage device, method for speech synthesis, method for manufacturing speech segment storage device, and program
JP2006195207A (en) * 2005-01-14 2006-07-27 Kenwood Corp Device and method for synthesizing voice, and program therefor
WO2006129814A1 (en) * 2005-05-31 2006-12-07 Canon Kabushiki Kaisha Speech synthesis method and apparatus
JP2008139631A (en) * 2006-12-04 2008-06-19 Nippon Telegr & Teleph Corp <Ntt> Voice synthesis method, device and program

Also Published As

Publication number Publication date
JP2761552B2 (en) 1998-06-04

Similar Documents

Publication Publication Date Title
US7565291B2 (en) Synthesis-based pre-selection of suitable units for concatenative speech
US8224645B2 (en) Method and system for preselection of suitable units for concatenative speech
US6470316B1 (en) Speech synthesis apparatus having prosody generator with user-set speech-rate- or adjusted phoneme-duration-dependent selective vowel devoicing
US8195464B2 (en) Speech processing apparatus and program
US20040030555A1 (en) System and method for concatenating acoustic contours for speech synthesis
JPH031200A (en) Regulation type voice synthesizing device
Inanoglu et al. A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality.
JPH01284898A (en) Voice synthesizing device
Thomas et al. Natural sounding TTS based on syllable-like units
Kayte et al. A Corpus-Based Concatenative Speech Synthesis System for Marathi
US6829577B1 (en) Generating non-stationary additive noise for addition to synthesized speech
JPH08335096A (en) Text voice synthesizer
Furtado et al. Synthesis of unlimited speech in Indian languages using formant-based rules
JPH0580791A (en) Device and method for speech rule synthesis
JP3081300B2 (en) Residual driven speech synthesizer
Niimi et al. Synthesis of emotional speech using prosodically balanced VCV segments
Ng Survey of data-driven approaches to Speech Synthesis
EP1640968A1 (en) Method and device for speech synthesis
Vine et al. Synthesising emotional speech by concatenating multiple pitch recorded speech units
JPH09292897A (en) Voice synthesizing device
Juergen Text-to-Speech (TTS) Synthesis
JPH1097268A (en) Speech synthesizing device
JPH09146576A (en) Synthesizer for meter based on artificial neuronetwork of text to voice
JPH08160990A (en) Speech synthesizing device
JPH06138894A (en) Device and method for voice synthesis

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090327

Year of fee payment: 11

EXPY Cancellation because of completion of term