JPH08328575A

JPH08328575A - Voice synthesizer

Info

Publication number: JPH08328575A
Application number: JP7130771A
Authority: JP
Inventors: Hiroki Onishi (大西宏樹); Takeshi Yumura (湯村武); Masanori Miyatake (宮武正典); Masashi Ochiiwa (落岩正士); Takatsugu Izumi (泉貴次); Terushige Sawada (澤田暉重)
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1995-05-29
Filing date: 1995-05-29
Publication date: 1996-12-13

Abstract

PURPOSE: To provide a voice synthesizer that can convert its synthesized voice to a chosen voice quality, for example that of an animation character or of a blood relative of the person using the synthesizer. CONSTITUTION: The synthesizer comprises a voice data storage section 8 that stores voice data to be reproduced as synthesized voice, a text data storage section 9 that stores text data, a voice synthesizing section 11 that reproduces the data stored in these sections as synthesized voice, a voice quality feature extracting section 3 that generates voice quality data by extracting the voice quality features of an input voice signal, and an utterance style setting section 7 that sets the utterance style used when synthesized voice is produced. The data stored in section 8 or 9 are synthesized by section 11 using the voice quality data extracted by section 3 and are reproduced in accordance with the utterance style specified by section 7.

Description

Detailed Description of the Invention

[0001]

[Field of industrial application] The present invention relates to a voice synthesizer, and in particular to a voice synthesizer well suited to being built into various devices that need to utter synthesized voice for children, such as a personal computer for children.

[0002]

[Prior art] With the development and spread of personal computers in recent years, personal computers intended mainly for learning have been commercialized for children around school-entry age. In many such personal computers, the operation of the computer itself, learning procedures, questions, their answers, how to arrive at the answers, and even reading stories aloud or singing songs are carried out by synthesized voice.

[0003]

[Problems to be solved by the invention] As described above, conventional personal computers for children give the child various instructions by synthesized voice. However, the synthesized voice used so far has unnatural intonation and accent, a constant speaking rate, and only one voice quality or a few at most. Moreover, those voice qualities are preset in the personal computer and are often unfamiliar to the child. As a result, children using such personal computers have found it hard to engage with their learning.

[0004] The present invention has been made in view of these circumstances, and its object is to provide a voice synthesizer that can convert the voice uttered as synthesized voice into various voice qualities, for example that of an animation character or of a close relative of the person using the device.

[0005]

[Means for solving the problems] A voice synthesizer according to the present invention comprises a data storage unit that stores voice data and text data to be reproduced as synthesized voice, a voice synthesis unit that reproduces the data stored in the data storage unit as synthesized voice, and a voice quality feature extraction unit that generates voice quality data by extracting the voice quality features of an input voice signal.

[0006] A voice synthesizer according to the present invention also comprises a data storage unit that stores voice data and text data to be reproduced as synthesized voice, a voice synthesis unit that reproduces the data stored in the data storage unit as synthesized voice, and an utterance style setting unit that specifies the utterance style used when the voice synthesis unit utters synthesized voice.

[0007] Further, a voice synthesizer according to the present invention comprises a data storage unit that stores voice data and text data to be reproduced as synthesized voice, a voice synthesis unit that reproduces the data stored in the data storage unit as synthesized voice, a voice quality feature extraction unit that generates voice quality data by extracting the voice quality features of an input voice signal, and an utterance style setting unit that specifies the utterance style used when the voice synthesis unit utters synthesized voice.

[0008]

[Operation] In the voice synthesizer according to the present invention, the data stored in the data storage unit is converted into synthesized voice using the voice quality data extracted by the voice quality feature extraction unit.

[0009] Also, in the voice synthesizer according to the present invention, the data stored in the data storage unit is reproduced in accordance with the utterance style specified by the utterance style setting unit.

[0010] Further, in the voice synthesizer according to the present invention, the data stored in the data storage unit is converted into synthesized voice using the voice quality data extracted by the voice quality feature extraction unit and is reproduced in accordance with the utterance style specified by the utterance style setting unit.

[0011]

[Embodiment] The present invention is described in detail below with reference to the drawings showing an embodiment. FIG. 1 is a block diagram showing the basic configuration of a voice synthesizer according to the present invention.

[0012] In FIG. 1, reference numeral 3 denotes a voice quality feature extraction unit, which is provided with a microphone 1 for inputting live voice and a line input terminal 2 for inputting an analog voice signal either directly from a video tape recorder, television, radio, or the like, or from a recording already made on a recorder. The voice quality feature extraction unit 3 extracts the voice quality features of the voice signal input from microphone 1 or line input terminal 2, for example by analyzing its frequency components, and generates voice quality data. Hereinafter, a voice signal whose voice quality features are extracted by the voice quality feature extraction unit 3 to generate voice quality data is referred to as a sample voice.

[0013] Reference numeral 4 denotes a voice quality data storage unit, which stores the voice quality feature data of the sample voice extracted by the voice quality feature extraction unit 3, that is, the voice quality data. Any of the multiple sets of voice quality data stored in the voice quality data storage unit 4 can be selected by operating a voice quality selection unit 5, which uses, for example, a keyboard.
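The relationship between storage unit 4 and selection unit 5 can be pictured as a small keyed store. The following Python sketch is illustrative only: the class name `VoiceQualityStore`, the method names, and the representation of voice quality data as a list of numbers are all assumptions, not details given in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceQualityStore:
    """Hypothetical stand-in for voice quality data storage unit 4."""

    entries: dict = field(default_factory=dict)

    def store(self, name, features):
        # Keep a copy so later changes to the caller's list do not alter stored data.
        self.entries[name] = list(features)

    def select(self, name):
        # Plays the role of voice quality selection unit 5: pick one stored entry by name.
        if name not in self.entries:
            raise KeyError(f"no voice quality data registered under {name!r}")
        return list(self.entries[name])
```

A caller would register one entry per sample voice (for example `store.store("mother", [...])`) and later retrieve it with `store.select("mother")`.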

[0014] Reference numeral 6 denotes a voice quality data processing unit, which processes the voice quality data of the sample voice selected with the voice quality selection unit 5 and output from the voice quality data storage unit 4. Possible processing includes, for example, combining several sets of voice quality data, making a male voice sound female or a female voice sound male, and making a child's voice sound adult or an adult's voice sound childlike. The processing applied by the voice quality data processing unit 6 can also be selected with the voice quality selection unit 5. Such processing is straightforward: for example, if the general tendency of the differences between voice quality data for male voices and for female voices is known, one can readily be converted into the other.
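Two of the processing operations named above, combining several sets of voice quality data and converting between male and female voices using a known difference tendency, can be sketched as simple vector arithmetic. This is a minimal sketch under stated assumptions: the patent does not specify the data representation, so voice quality data is assumed here to be a fixed-length list of numbers, and the function names and the average vectors are hypothetical illustration values.

```python
def blend(a, b, weight=0.5):
    """Combine two voice quality vectors; weight is the share taken from b."""
    if len(a) != len(b):
        raise ValueError("voice quality vectors must have the same length")
    return [(1.0 - weight) * x + weight * y for x, y in zip(a, b)]


def shift(features, offset, amount=1.0):
    """Move a voice quality vector along a known difference vector."""
    if len(features) != len(offset):
        raise ValueError("feature and offset vectors must have the same length")
    return [x + amount * d for x, d in zip(features, offset)]


# Hypothetical average feature vectors; real values would come from measured data.
MALE_AVG = [0.6, 0.3, 0.1]
FEMALE_AVG = [0.2, 0.4, 0.4]

# The "general tendency of the differences" expressed as one offset vector.
MALE_TO_FEMALE = [f - m for m, f in zip(MALE_AVG, FEMALE_AVG)]
```

Under these assumptions, `shift(voice, MALE_TO_FEMALE)` moves a male voice toward the female average, and `shift(voice, MALE_TO_FEMALE, amount=-1.0)` performs the reverse conversion.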

[0015] Reference numeral 7 denotes an utterance style setting unit, which can supply various prosodic information according to the purpose and situation in which the synthesized voice is uttered, the content of the text, and so on. For example, a person speaks differently when asking a question than when reading a story aloud, but conventional synthesized voice cannot make such a distinction. In the voice synthesizer of the present invention, by contrast, the utterance style setting unit 7 can vary the prosodic information according to the purpose and situation of the utterance, changing the intonation, accent, or speaking rate of the synthesized voice. The utterance style setting unit 7 can also be implemented with, for example, a keyboard, and it can of course share a single keyboard with the voice quality selection unit 5 described above.
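One way to picture the prosodic information handled by the utterance style setting unit 7 is as a small set of named presets covering intonation, accent, and speaking rate. The sketch below is an illustration only: the field names, the preset names, and every numeric value are assumptions chosen for the example, since the patent names the three prosodic properties but gives no parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtteranceStyle:
    pitch_range: float  # relative intonation width; 1.0 is neutral
    accent_gain: float  # relative accent emphasis; 1.0 is neutral
    rate: float         # relative speaking rate; 1.0 is neutral


# Hypothetical presets for the two situations contrasted in the text.
STYLES = {
    "question": UtteranceStyle(pitch_range=1.3, accent_gain=1.1, rate=1.0),
    "story": UtteranceStyle(pitch_range=1.5, accent_gain=1.2, rate=0.85),
}


def duration_for(base_seconds, style):
    """Scale a phrase duration by the style's speaking rate."""
    return base_seconds / style.rate
```

With these illustrative values, a phrase read in the "story" style takes longer than the same phrase in the "question" style, because its rate is below 1.0.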

[0016] Reference numeral 8 denotes a voice data storage unit, which stores various information that already exists as voice data. Reference numeral 9 denotes a text data storage unit, in which various text data is stored. The voice data storage unit 8 and the text data storage unit 9 may share a single storage device.

[0017] Reference numeral 10 denotes an output data selection unit, which selects one item from the voice data and text data stored in the voice data storage unit 8 and the text data storage unit 9 and has it output to the voice synthesis unit 11. The output data selection unit 10 can of course also share a single keyboard with the voice quality selection unit 5 and the utterance style setting unit 7 described above.

[0018] Reference numeral 11 denotes a voice synthesis unit. When data stored in the voice data storage unit 8 or the text data storage unit 9 is selected by the output data selection unit 10, the voice synthesis unit 11 converts it into synthesized voice using the voice quality data of the sample voice supplied from the voice quality data processing unit 6 and reproduces it from a speaker 12. Besides the speaker 12, earphones, headphones, or the like can of course be used, and the output can also be sent from a line output terminal to various recorders for recording.

[0019] The operation of the voice synthesizer of the present invention described above is explained below with reference to the flowchart of FIG. 2.

[0020] The voice quality (timbre) of a human voice is one of the three elements of a voice (sound), namely loudness, pitch, and timbre, and is determined mainly by the frequencies of the partials that make up the sound. It is therefore possible to extract feature coefficients of the voice quality from an analog voice signal by analyzing its frequency components. Using such a technique, the voice quality feature extraction unit 3 receives a sample voice from microphone 1 or line input terminal 2 (step S1), performs frequency analysis on the analog voice signal of the received sample voice, and extracts the feature coefficients of the voice quality, that is, the voice quality data (step S2).
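Steps S1 and S2 can be illustrated with a small frequency-analysis routine. The patent says only that the signal is frequency-analyzed to obtain feature coefficients, so the concrete choices below are assumptions made for illustration: the input is taken to be an already digitized frame of samples, the analysis is a plain discrete Fourier transform, and the "feature coefficients" are the normalized energies in a few frequency bands. The function name and the band edges are hypothetical.

```python
import cmath
import math


def band_energies(samples, sample_rate, band_edges_hz):
    """Return the normalized spectral energy in each frequency band."""
    n = len(samples)
    energies = [0.0] * (len(band_edges_hz) - 1)
    # Naive DFT over the non-negative frequency bins; O(n^2), acceptable for a short frame.
    for k in range(n // 2 + 1):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples)
        )
        freq = k * sample_rate / n
        # Add this bin's energy to the band whose range contains its frequency.
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += abs(coeff) ** 2
                break
    total = sum(energies)
    # Normalize so the result describes spectral shape rather than loudness.
    return [e / total for e in energies] if total > 0 else energies
```

Normalizing by the total energy separates timbre from loudness, which matches the text's treatment of the two as distinct elements of a sound. A practical implementation would use a fast Fourier transform and a richer spectral representation.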

[0021] The voice quality data of the sample voice extracted in this way by the voice quality feature extraction unit 3 is stored in the voice quality data storage unit 4 (step S3). Accordingly, by inputting the voices of various people as sample voices, for example from microphone 1, the voice quality data of those sample voices can be extracted by the voice quality feature extraction unit 3 and accumulated in advance. For example, when the device is used mainly by a child, it is desirable to use as sample voices the voices of the child's parents and grandparents, or the voices of teachers at the child's nursery school or kindergarten. Through line input terminal 2, voice can be input either directly from a video tape recorder, television, radio, or the like, or from a recording already made on a recorder, and used as sample voices so that various voice quality data is accumulated in advance. In this case it becomes possible to accumulate, for example, voice quality data for the voices of animation characters and of television personalities.

[0022] Text data, meanwhile, can be accumulated in advance in the text data storage unit 9. For example, if a flexible disk drive is used as the text data storage unit 9, a flexible disk on which text data is already stored can simply be loaded. Alternatively, if the device is given a word processor function, text data can be entered directly by operating a keyboard or the like.

[0023] Likewise, various voice data can be accumulated in advance in the voice data storage unit 8. For example, as with the text data storage unit 9, if a flexible disk drive is used as the voice data storage unit 8, a flexible disk on which voice data is already stored can simply be loaded. Alternatively, voice data can be input directly using a microphone or line input terminal (not shown). Of course, as mentioned above, the voice data storage unit 8 and the text data storage unit 9 can also share a single storage device such as a flexible disk drive or hard disk drive.

[0024] Next, the procedure by which the user actually has synthesized voice uttered is described. First, by operating the voice quality selection unit 5, the user selects the voice quality in which the synthesized voice is to be uttered, and also sets, again through the voice quality selection unit 5, whether the voice quality is to be processed and, if so, what kind of processing is to be applied (step S4).

[0025] Next, by operating the utterance style setting unit 7, the user sets the utterance style by specifying prosodic information. This is specified according to, for example, the content of the text about to be output as synthesized voice and who will be listening to it. The user then operates the output data selection unit 10 to select which of the text data and voice data stored in advance in the voice data storage unit 8 and the text data storage unit 9 is to be output as synthesized voice (step S6).

[0026] As a result, the voice synthesis unit 11 is given the voice quality data, the utterance style data (prosodic data), and the data of the text to be output (text data or voice data). When voice data stored in the voice data storage unit 8 has been selected, the voice synthesis unit 11 converts the voice quality data of that voice data into the voice quality data of the sample voice selected with the voice quality selection unit 5 and then, in accordance with the utterance style data selected with the utterance style setting unit 7, utters it from the speaker 12 as synthesized voice (step S7). When text data stored in the text data storage unit 9 has been selected, the voice synthesis unit 11 adds the voice quality data of the sample voice selected with the voice quality selection unit 5 to the text data and then, in accordance with the utterance style data set with the utterance style setting unit 7, utters it from the speaker 12 as synthesized voice (step S7).
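The two branches of step S7 differ only in their first stage: stored voice data has its voice quality converted, while text data has voice quality data attached to it. The control flow can be sketched as below. This is a structural illustration, not a synthesizer: the names `SynthesisRequest` and `plan_synthesis` and the stage descriptions are hypothetical, and the actual signal processing is left out.

```python
from dataclasses import dataclass


@dataclass
class SynthesisRequest:
    kind: str            # "voice" (from storage unit 8) or "text" (from storage unit 9)
    payload: object      # the stored voice data or the text itself
    voice_quality: list  # voice quality data supplied by processing unit 6
    style: str           # name of the utterance style chosen with setting unit 7


def plan_synthesis(request):
    """Return the ordered processing stages for one request (step S7)."""
    if request.kind == "voice":
        first = "convert voice quality of stored voice data"
    elif request.kind == "text":
        first = "attach voice quality data to text"
    else:
        raise ValueError(f"unknown data kind: {request.kind!r}")
    # Both branches then apply the utterance style and send the result to the speaker.
    return [first, f"apply utterance style '{request.style}'", "output to speaker"]
```

For instance, a request built with `kind="text"` yields a plan whose first stage attaches voice quality data, while `kind="voice"` yields one whose first stage converts the stored voice.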

[0027] Thus, in this embodiment of the voice synthesizer of the present invention, text data or voice data that has already been accumulated can be uttered as synthesized voice in the voice quality of a sample voice registered in advance. When the invention is applied to, for example, a personal computer for children, the operation of the computer itself, learning procedures, the setting of problems, answers, how to arrive at the answers, and even reading stories aloud or singing songs can therefore be carried out in synthesized voice with the voice of the child's parents or grandparents or of an animation character, voices that the child finds approachable. Compared with the conventional dry, characterless synthesized voice, this is more approachable for the child, who can then use the personal computer with interest.

[0028] Needless to say, the voice synthesizer of the present invention can be applied not only to the personal computer for children described above but also to various devices that utter synthesized voice, such as karaoke machines, answering machines, and voice guidance devices.

[0029]

[Effects of the invention] As described in detail above, according to the voice synthesizer of the present invention, synthesized voice can be uttered not in the unnatural, uniform synthesized voice used until now but in various voice qualities, for example that of an animation character, of a close relative of the person using the device, or of an acquaintance, so that a device the user finds approachable is realized. It is also possible to change the utterance style according to the content uttered in synthesized voice, the person listening to it, and so on.

[Brief description of drawings]

FIG. 1 is a block diagram showing a configuration example of the voice synthesizer of the present invention.

FIG. 2 is a flowchart showing the operating procedure of the voice synthesizer of the present invention.

[Explanation of symbols]

3 Voice quality feature extraction unit
7 Utterance style setting unit
8 Voice data storage unit
9 Text data storage unit
11 Voice synthesis unit

Continuation of front page
(72) Inventor: Masashi Ochiiwa, c/o Sanyo Electric Co., Ltd., 2-5-5 Keihan-Hondori, Moriguchi, Osaka
(72) Inventor: Takatsugu Izumi, c/o Sanyo Electric Co., Ltd., 2-5-5 Keihan-Hondori, Moriguchi, Osaka
(72) Inventor: Terushige Sawada, c/o Sanyo Electric Co., Ltd., 2-5-5 Keihan-Hondori, Moriguchi, Osaka

Claims

1. A voice synthesizer comprising a data storage unit that stores voice data and text data to be reproduced as synthesized voice, and a voice synthesis unit that reproduces the data stored in the data storage unit as synthesized voice, characterized in that the voice synthesizer further comprises a voice quality feature extraction unit that generates voice quality data by extracting the voice quality features of an input voice signal, and in that the voice synthesis unit converts the data stored in the data storage unit into synthesized voice using the voice quality data extracted by the voice quality feature extraction unit and reproduces it.

2. A voice synthesizer comprising a data storage unit that stores voice data and text data to be reproduced as synthesized voice, and a voice synthesis unit that reproduces the data stored in the data storage unit as synthesized voice, characterized in that the voice synthesizer further comprises an utterance style setting unit that specifies the utterance style used when the voice synthesis unit utters synthesized voice, and in that the voice synthesis unit reproduces the data stored in the data storage unit in accordance with the utterance style specified by the utterance style setting unit.

3. A voice synthesizer comprising a data storage unit that stores voice data and text data to be reproduced as synthesized voice, and a voice synthesis unit that reproduces the data stored in the data storage unit as synthesized voice, characterized in that the voice synthesizer further comprises a voice quality feature extraction unit that generates voice quality data by extracting the voice quality features of an input voice signal, and an utterance style setting unit that specifies the utterance style used when the voice synthesis unit utters synthesized voice, and in that the voice synthesis unit converts the data stored in the data storage unit into synthesized voice using the voice quality data extracted by the voice quality feature extraction unit and reproduces it in accordance with the utterance style specified by the utterance style setting unit.