JP2002366175A - Device and method for supporting voice communication

Info

Publication number: JP2002366175A
Application number: JP2001174632A
Authority: JP
Inventors: Toshiyasu Masakawa; 俊康政川; Yasushi Ishikawa; 泰石川; Tadashi Suzuki; 鈴木　　忠
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2001-06-08
Filing date: 2001-06-08
Publication date: 2002-12-20

Abstract

PROBLEM TO BE SOLVED: To provide a voice communication support device and method that apparently shorten silent periods and reduce the stress on the other party by sequentially outputting synthesized voice according to the user's sentence input operation and automatically outputting a redundant word voice when silence continues. SOLUTION: The device is equipped with a character string output part that temporarily stores the character string of a sentence entered through a sentence input operation and outputs it in response to an output operation, a voice synthesis part that outputs synthesized voice data from the output character string, a redundant word storage part that stores a suitable number of pieces of redundant word voice data in advance, and an output control part that extracts the output voice from the synthesized voice data and outputs selected redundant word voice data when no synthesized voice data has been input for a certain time.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice communication support device that synthesizes and outputs speech from an input sentence so that a person with a speech impairment can convey his or her intentions. In particular, it relates to a voice communication support device and method that can reduce the stress placed on the communication partner by the input operations of the user (that is, the person with a speech impairment).

[0002]

[Prior Art] For a person with a speech impairment to communicate with others, there are methods that do not use voice: written notes when the other party is nearby, and fax or e-mail when the other party is far away. Unlike voice, however, none of these methods can attract attention or convey a message to many people at once. Moreover, communicating by fax or e-mail requires the other party to have a fax machine or mail terminal, and these are less widespread than telephones. The situations in which, and the people with whom, one could communicate by these methods were therefore limited.

[0003] As a means of solving these problems, voice communication support devices that combine an input device such as a keyboard with a voice synthesizer have been developed. One example is a conversation aid with which a deaf and mute person enters characters and outputs synthesized speech by key operation; such a device is also described in Japanese Unexamined Patent Publication No. H05-289608.

[0004] FIG. 18 is a block diagram showing the configuration of a conventional voice communication support device, covering the parts related to the user's input operation and to voice synthesis. In the figure, 1 is a keyboard, 2 denotes character keys, 3 denotes function keys, 4 is a text input processing circuit, and 5 is a voice synthesis circuit. The keyboard 1 consists of character keys 2 for entering characters and function keys 3 for issuing commands such as an utterance command, and inputs the conversational sentence character string corresponding to the keys operated by the user. The text input processing circuit 4 generates the character code string corresponding to the character keys 2 operated on the keyboard 1, and outputs it to the voice synthesis circuit 5 when the function key 3 that corresponds to the utterance command (hereinafter, the utterance key) is operated. The voice synthesis circuit 5 synthesizes and outputs speech based on the input character code string.

[0005]

[Problems to be Solved by the Invention] Because the conventional voice communication support device is configured as described above, silence continues until the user finishes entering a conversational sentence and operates the utterance key. When entering the sentence takes time, the silence grows long and places considerable stress on the other party. In particular, when such a conventional device is combined with a telephone to communicate with a distant party, the other party cannot see the user's situation, so this effect on the other party becomes serious.

[0006] The present invention was made to solve the above problems. Its object is to obtain a voice communication support device and method that can apparently improve the silent state and reduce the other party's stress by outputting synthesized speech sequentially according to the user's sentence input operation and automatically outputting a redundant word voice when silence continues.

[0007]

[Means for Solving the Problems] The voice communication support device according to the present invention converts a sentence entered by a user into synthesized speech and outputs it, and comprises: a sentence input unit that inputs a sentence through the user's sentence input operation; a character string output unit that temporarily stores the character string of the input sentence and outputs it when the user performs an output operation; a voice synthesis unit that synthesizes speech from the character string output by the character string output unit and outputs synthesized voice data; a redundant word storage unit that stores in advance a suitable number of predetermined pieces of redundant word voice data; a redundant word selection unit that selects and outputs one piece of redundant word voice data from the redundant word storage unit; and an output control unit that reads the synthesized voice data and extracts the output voice while monitoring the input state of the synthesized voice data and, on sensing that no synthesized voice data has been input for a certain time, outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit.

[0008] The voice communication support device according to the present invention comprises, in place of the character string output unit, a synthesis unit extraction unit that analyzes the character string of the sentence input from the sentence input unit, extracts synthesis units, and outputs the extracted synthesis units sequentially.

[0009] The voice communication support device according to the present invention converts a sentence entered by a user into synthesized speech and outputs it, and comprises: a sentence input unit that inputs a sentence through the user's sentence input operation; a synthesis unit analysis unit that analyzes the character string of the sentence input from the sentence input unit, extracts synthesis units, determines the language category of each synthesis unit, and sequentially outputs the synthesis units and their language categories; a voice synthesis unit that synthesizes speech from the character string of each synthesis unit output by the synthesis unit analysis unit and outputs synthesized voice data; a redundant word storage unit that stores in advance a suitable number of predetermined pieces of redundant word voice data; a redundant word selection unit that selects and outputs one piece of redundant word voice data from the redundant word storage unit; a timing determination unit that outputs a timing signal determining the output timing of the redundant word voice data according to the input language category; and a timing output control unit that reads the synthesized voice data and extracts the output voice while monitoring the input state of the synthesized voice data and, when the timing signal is input during the sentence input operation, outputs a redundant word voice based on the redundant word voice data selected by the redundant word selection unit.

[0010] The voice communication support device according to the present invention comprises, in place of the timing determination unit, a time-length-based timing determination unit that outputs a time-length-based timing signal. This signal is generated according to the time length of the immediately preceding synthesized voice data and determines the output timing of the redundant word voice data. The timing output control unit outputs the redundant word voice data in response to the time-length-based timing signal.

[0011] The voice communication support device according to the present invention comprises, in place of the output control unit, a redundant word prefix output control unit. This unit reads the synthesized voice data and extracts the output voice while monitoring the input state of the synthesized voice data. If, during the sentence input operation, the next synthesized voice data is not input within a predetermined time after the previous synthesized voice data was input, the unit prepares to output redundant word voice data; when the next synthesized voice data is input, it reads the redundant word voice data selected by the redundant word selection unit and joins it in front of the next synthesized voice data as the output voice.
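
The prefix behaviour described in this paragraph can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class name `PrefixOutputController`, the callback `select_filler`, and the choice to emit no prefix before the very first utterance are assumptions made for illustration, and the time limit is passed in because this paragraph gives no value for it.

```python
from typing import Callable, List, Optional


class PrefixOutputController:
    """Hypothetical sketch of a redundant word prefix output controller.

    If the next synthesized speech arrives later than `limit_s` seconds
    after the previous one, a redundant word is joined in front of it.
    """

    def __init__(self, select_filler: Callable[[], str], limit_s: float) -> None:
        self._select_filler = select_filler   # stands in for the redundant word selection unit
        self._limit_s = limit_s               # predetermined time (value is an assumption)
        self._last_input_at: Optional[float] = None

    def on_synthesized(self, now_s: float, speech: str) -> List[str]:
        """Return the pieces to play, in order, for speech arriving at now_s."""
        late = (
            self._last_input_at is not None
            and now_s - self._last_input_at > self._limit_s
        )
        self._last_input_at = now_s
        if late:
            # The gap exceeded the limit: prepend the selected redundant word.
            return [self._select_filler(), speech]
        return [speech]
```

Unlike the basic output control unit, which speaks the redundant word as soon as the silence limit is reached, this variant waits and attaches the redundant word directly to the speech that follows.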

[0012] The voice communication support device according to the present invention converts a sentence entered by a user into synthesized speech and outputs it, and comprises: a sentence input unit that inputs a sentence through the user's sentence input operation; a synthesis unit analysis unit that analyzes the character string of the sentence input from the sentence input unit, extracts synthesis units, determines the language category of each synthesis unit, and sequentially outputs the synthesis units and their language categories; a voice synthesis unit that synthesizes speech from the character string of each synthesis unit output by the synthesis unit analysis unit and outputs synthesized voice data; a category-based redundant word storage unit that stores redundant word voice data in advance for each language category of synthesis unit; a category-based redundant word selection unit that selects and outputs one corresponding piece of redundant word voice data from the category-based redundant word storage unit according to the language category received from the synthesis unit analysis unit; and a redundant word prefix output control unit. The redundant word prefix output control unit reads the synthesized voice data and extracts the output voice while monitoring the input state of the synthesized voice data. If, during the sentence input operation, the next synthesized voice data is not input within a predetermined time after the previous synthesized voice data was input, it prepares to output a redundant word voice; as the output voice when the next synthesized voice data is input, it reads the redundant word voice data that the category-based redundant word selection unit selected for the next synthesized speech and joins it in front of the next synthesized voice data.

[0013] The voice communication support device according to the present invention comprises, in place of the redundant word storage unit, a final-vowel-based redundant word storage unit that stores in advance the redundant word voice data that can be output for each final vowel of a synthesis unit, and, in place of the redundant word selection unit, a final-vowel-based redundant word selection unit that identifies the type of the final vowel of the input synthesis unit and selects from the final-vowel-based redundant word storage unit one piece of redundant word voice data suited to that vowel type.

[0014] The voice communication support device according to the present invention comprises, in place of the redundant word storage unit, a time-length-based redundant word storage unit that stores the redundant word voice data that can be output for each time length of a synthesis unit, and, in place of the redundant word selection unit, a time-length-based redundant word selection unit that selects from the time-length-based redundant word storage unit one piece of redundant word voice data suited to the time length of the input synthesized voice data.

[0015] The voice communication support device according to the present invention comprises, in place of the output control unit, a repeat output control unit. This unit reads the synthesized voice data, extracts the output voice, and temporarily stores the synthesized voice data it has output. It monitors the input state of the synthesized voice data and, on sensing that no synthesized voice data has been input for a certain time, either outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit or outputs the stored synthesized voice data again.

[0016] The voice communication support method according to the present invention is used in a voice communication support device that converts a sentence entered by a user into synthesized speech and outputs it. The method comprises: inputting a sentence through the user's sentence input operation; synthesizing speech from the input sentence to generate synthesized voice data; storing in advance a suitable number of predetermined pieces of redundant word voice data; extracting the output voice from the generated synthesized voice data while monitoring the input state of the synthesized voice data; and, on sensing during the input operation that no synthesized voice data has been input for a certain time, selecting stored redundant word voice data and outputting a redundant word voice.

[0017]

[Embodiments of the Invention] Embodiments of the present invention are described below. Embodiment 1. FIG. 1 is a block diagram showing the configuration of a voice communication support device according to Embodiment 1 of the present invention. In the figure, 11 is a sentence input unit, 12 is a character string output unit, 13 is a voice synthesis unit, 14 is a redundant word storage unit, 15 is a redundant word selection unit, and 16 is an output control unit.

The operation is as follows. The sentence input unit 11 inputs a sentence through the user's sentence input operation. Here the sentence input operation is, for example, kana character entry from a keyboard followed by kana-kanji conversion. If the user performs kana-kanji conversion phrase by phrase, for example, the output of the sentence input unit 11 is a phrase character string of mixed kana and kanji. The dictionary needed for kana-kanji conversion is assumed to be included in the sentence input unit 11. The character string output unit 12 temporarily stores the character string input from the sentence input unit 11 and outputs it when the user performs the output operation that converts it to speech. The voice synthesis unit 13 synthesizes speech from the character string input from the character string output unit 12, for example by text-to-speech conversion, and outputs the result as synthesized voice data. The redundant word storage unit 14 stores in advance multiple pieces of redundant word voice data for redundant words (fillers) such as "えーと" ("um") and "あのー" ("er"). The redundant word selection unit 15 selects one piece of voice data from the redundant word storage unit 14, for example at random. The output control unit 16 reads the synthesized voice data and extracts the output voice while monitoring the input state of the synthesized voice data and, on sensing that no synthesized voice data has been input for a certain time, outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit 15.

[0018] FIG. 10 is an explanatory diagram showing a concrete example of synthesized speech and redundant word voice output in Embodiment 1. It shows the timing of the user's operations, the voice output, and the operation of the output control unit 16, with the horizontal axis representing time. Suppose the sentence to be entered is "カレー１つ、ミックスフライ定食１つ、炒飯１つお願いします。" ("One curry, one mixed fry set meal, and one fried rice, please."). The user enters kana characters from the keyboard and performs kana-kanji conversion. When entry and conversion of the character string to be spoken at one time are finished, the user performs an output operation, for example by pressing the return key. In the sentence input operation row of the figure, white portions represent periods of kana character entry and hatched portions represent periods in which kana-kanji conversion is performed. The arrows in the output operation row indicate when the return key is pressed.

[0019] The output control unit 16 begins monitoring the synthesized voice data input from the voice synthesis unit 13 as soon as the user starts the sentence input operation. The output control unit 16 continuously measures the time during which no synthesized voice data is input; when this reaches a fixed time (hereinafter, the allowable silence time), it reads the redundant word voice data selected by the redundant word selection unit 15 and outputs that redundant word voice. In this example the allowable silence time is 4 seconds. When the output operation is performed about 3.5 seconds after the start of the sentence input operation, the character string "one curry" is input to the voice synthesis unit 13 and synthesized voice data is output. Because the time from the start of the sentence input operation to the output operation did not exceed the allowable silence time (4 seconds), no redundant word voice data is read, and the output control unit 16 outputs the synthesized speech "one curry".

[0020] While the above speech is being output, the user goes on to enter the next character string, "one mixed fry set meal". Suppose that entering this string takes time, and that no output operation has been performed even 4 seconds after the end of the preceding speech output. The output control unit 16 then has the redundant word selection unit 15 select one piece of redundant word voice data at random from the redundant word storage unit 14, reads the voice data of the selected redundant word "えーと" ("um") from the redundant word selection unit 15, and outputs it as speech. When the output operation is subsequently performed, it outputs the synthesized speech "one mixed fry set meal".

[0021] Likewise, while that speech is being output, the user goes on to enter the next character string, "one fried rice, please." If the time from the preceding synthesized speech output to the output operation for "one fried rice, please." does not exceed the 4-second allowable silence time, no redundant word is read, and the output control unit outputs the synthesized speech "one fried rice, please". As a result of these operations, the overall speech output is, as shown in FIG. 10, "One curry, um, one mixed fry set meal, one fried rice, please", and silence no longer continues for 4 seconds or more. Had no redundant word been output, about 6 seconds of silence would have occurred between "one curry" and "one mixed fry set meal".
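
The timeline of this worked example can be reproduced with a short Python simulation. This is only a sketch under stated assumptions: the names `Utterance` and `play_with_fillers` are hypothetical, and the playback durations, the redundant word length, and the arrival times of the second and third phrases are invented so that the second gap is about 6 seconds. Only the 3.5-second first output operation, the 4-second allowable silence time, and the roughly 6-second gap come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    ready_at: float   # time (s) at which the output operation delivers the speech
    duration: float   # playback length (s) of the synthesized speech (assumed)
    text: str


def play_with_fillers(
    utterances: List[Utterance],
    allowable_silence: float = 4.0,   # value used in the worked example
    filler: str = "um",
    filler_duration: float = 0.5,     # assumed length of the redundant word
) -> List[Tuple[float, str]]:
    """Return (start_time, text) events as heard by the other party."""
    events: List[Tuple[float, str]] = []
    silence_start = 0.0   # monitoring begins when sentence input begins
    for u in utterances:
        # Speak a redundant word whenever silence would exceed the limit
        # before the next synthesized speech is available.
        while u.ready_at - silence_start > allowable_silence:
            t = silence_start + allowable_silence
            events.append((t, filler))
            silence_start = t + filler_duration
        start = max(u.ready_at, silence_start)
        events.append((start, u.text))
        silence_start = start + u.duration
    return events


if __name__ == "__main__":
    order = [
        Utterance(3.5, 1.5, "one curry"),
        Utterance(11.0, 2.0, "one mixed fry set meal"),   # 6 s after the first ends
        Utterance(16.0, 2.0, "one fried rice, please"),
    ]
    for start, text in play_with_fillers(order):
        print(f"{start:5.1f} s  {text}")
```

With these assumed timings the redundant word is spoken once, 4 seconds after "one curry" ends, and the remaining silence before "one mixed fry set meal" falls below the limit.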

[0022] As described above, according to Embodiment 1, synthesized speech can be output according to the user's sentence input operation, and when silence continues for a fixed time during the sentence input operation, a redundant word voice is output automatically to fill the gap. This apparently shortens the silence that accompanies the user's sentence input operation and thereby reduces the stress felt by the party communicating with the user.

[0023] Embodiment 2. FIG. 2 is a block diagram showing the configuration of a voice communication support device according to Embodiment 2 of the present invention. Elements bearing the same reference numerals as in FIG. 1 are equivalent to those of Embodiment 1, and their description is omitted. The difference is the synthesis unit extraction unit 21, which replaces the character string output unit 12 of FIG. 1. The operation is as follows. The synthesis unit extraction unit 21 analyzes the character string of the sentence input from the sentence input unit 11, extracts synthesis units, that is, character strings that are output together in one voice synthesis, and outputs the extracted synthesis units sequentially to the voice synthesis unit 13. Synthesis units are extracted by, for example, treating punctuation marks, question marks, exclamation marks, and the like as synthesis unit boundaries.

[0024] FIG. 11 is an explanatory diagram showing a concrete example of synthesized speech and redundant word voice output in Embodiment 2, showing the timing of the user's operations, the voice output, and the operation of the output control unit 16. The example sentence is the same as in Embodiment 1. The user enters kana characters from the keyboard and performs kana-kanji conversion. The confirmed kana-kanji character strings are input one after another to the synthesis unit extraction unit 21. The synthesis unit extraction unit 21 analyzes the input character string and, on detecting for example a punctuation mark, question mark, or exclamation mark in it, treats that mark as a synthesis unit boundary and sequentially outputs the character string input up to that point to the voice synthesis unit. Here, the kana-kanji conversion operations input three character strings in turn: "カレー１つ、" ("one curry,"), "ミックスフライ定食１つ、" ("one mixed fry set meal,"), and "炒飯１つお願いします。" ("one fried rice, please."). The synthesis unit extraction unit 21 treats the trailing comma of the first two strings and the trailing period of the third as synthesis unit boundaries, and sequentially outputs the same three strings to the voice synthesis unit 13, each as one synthesis unit. Because each synthesis unit is output automatically as soon as it is extracted, the user can skip the output operation that was required in Embodiment 1. The output control unit 16 now monitors the input state of the synthesis units, but its subsequent operation, namely inserting a redundant word voice into silence, is the same as in Embodiment 1, and its description is omitted.
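
The boundary rule of Embodiment 2 can be sketched in a few lines of Python. The class name `SynthesisUnitExtractor` and the exact set of boundary characters are assumptions for illustration; the text only says that punctuation marks, question marks, exclamation marks, and the like are treated as boundaries.

```python
from typing import List

# Assumed boundary set: Japanese comma and period plus question and
# exclamation marks in full-width and ASCII forms.
BOUNDARY_CHARS = set("、。？！?!")


class SynthesisUnitExtractor:
    """Hypothetical sketch of the synthesis unit extraction unit 21."""

    def __init__(self) -> None:
        self._buffer = ""   # text received since the last boundary

    def feed(self, text: str) -> List[str]:
        """Accept newly confirmed text and return any completed synthesis units."""
        units: List[str] = []
        for ch in text:
            self._buffer += ch
            if ch in BOUNDARY_CHARS:
                # A boundary was found: release everything up to and including it.
                units.append(self._buffer)
                self._buffer = ""
        return units
```

Because completed units are returned as soon as a boundary character arrives, no separate output operation is needed; text that has not yet reached a boundary simply stays in the buffer until later input completes it.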

[0025] As described above, according to Embodiment 2, outputting the synthesized speech sequentially, one synthesis unit at a time, according to the user's sentence input operation reduces the operations required of the user. In addition, when silence continues for a fixed time, a redundant word voice is output automatically to fill the gap, which apparently shortens the silence that accompanies the user's sentence input operation and thereby reduces the stress felt by the party communicating with the user.

[0026] Embodiment 3. FIG. 3 is a block diagram showing the configuration of a voice communication support device according to Embodiment 3 of the present invention. Elements bearing the same reference numerals as in FIG. 1 are equivalent to those of Embodiment 1, and their description is omitted. The parts that differ from FIG. 1 are a synthesis unit analysis unit 31, a timing determination unit 32, and a timing output control unit 33.

[0027] The operation is as follows. The synthesis unit analysis unit 31 analyzes the character string input from the sentence input unit 11, sequentially extracts and outputs synthesis units, and determines and outputs the language category of each synthesis unit. As in Embodiment 2, a synthesis unit is determined by, for example, treating punctuation marks, question marks, exclamation marks, and the like as synthesis unit boundaries. The language category is, for example, one of five types: adnominal modification, adverbial modification, parallel, sentence end, and other. It is determined by applying rules based on the particle or the conjugated form of the inflecting word at the end of the synthesis unit. For example, if the synthesis unit ends in the adnominal particle "の" or in the adnominal form of an inflecting word, the language category is adnominal modification.
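
A deliberately incomplete Python sketch of such rule-based categorization follows. Only the rule for the particle "の" comes from the text. Treating a unit that ends in sentence-final punctuation as "sentence end" is an assumption, and the remaining categories (adverbial modification, parallel, and adnominal forms of inflecting words) are left out because recognizing them would need morphological analysis that the text does not specify. The function name and category labels are hypothetical.

```python
SENTENCE_FINAL = "。？！?!"   # assumed markers of a sentence end
SEPARATORS = "、，,"          # trailing separators ignored before checking the particle


def classify_synthesis_unit(unit: str) -> str:
    """Return a language category label for one synthesis unit (partial sketch)."""
    text = unit.strip()
    if not text:
        return "other"
    if text[-1] in SENTENCE_FINAL:
        return "sentence_end"   # assumption, not a rule stated in the text
    core = text.rstrip(SEPARATORS)
    if core.endswith("の"):
        return "adnominal"      # rule given in the text: unit ends in the particle "の"
    return "other"              # rules for the remaining categories are not sketched
```

For instance, `classify_synthesis_unit("駅前の、")` yields `"adnominal"`, while a unit such as "カレー１つ、" falls through to `"other"` in this sketch.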

【００２８】タイミング決定部３２は、入力された言語
カテゴリに応じて生成されるタイミング信号を出力す
る。このタイミング信号は冗長語音声データの出力タイ
ミングを決定するものである。出力タイミングは、直前
の合成音声出力終了からの時間長、すなわち無音許容時
間であり、その値は入力された言語カテゴリに応じて、
例えば図１２に示すようなテーブルを参照して決定する
ものとする。タイミング決定部３２は、直前の合成音声
の出力が行われてから出力タイミングで指定された時間
が経過するたびに、タイミング信号をタイミング出力制
御部３３に出力する。なお、言語カテゴリが入力されて
いない状態での出力タイミングは、図１２のテーブルで
言語カテゴリ「文末」に対応する７秒とする。直前の合
成単位の言語カテゴリが連体修飾や並列の場合は、後続
の合成単位があることが推測できるため、無音状態が長
く続くと相手に大きなストレスを与えると考えられる。
このため連体修飾や並列に対応する無音許容時間は短く
設定されている。The timing determination unit 32 outputs a timing signal generated according to the input language category. This timing signal determines the output timing of the redundant word voice data. The output timing is the length of time from the end of the immediately preceding synthesized speech output, that is, the allowable silence time, and its value is determined according to the input language category by referring to, for example, a table such as the one shown in FIG. 12. The timing determination unit 32 outputs a timing signal to the timing output control unit 33 each time the time specified by the output timing elapses after the immediately preceding synthesized speech is output. When no language category has been input, the output timing is 7 seconds, the value corresponding to the language category "end of sentence" in the table of FIG. 12. When the language category of the immediately preceding synthesis unit is adnominal modification or parallel, a subsequent synthesis unit can be expected, so a long silence is considered to place considerable stress on the other party. For this reason, the allowable silence time for adnominal modification and parallel is set short.
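
The table lookup described above might look like the sketch below. Only the 7-second value for "end of sentence" and the statement that adnominal modification and parallel are "short" come from the description; every other number is a placeholder, since the contents of FIG. 12 are not reproduced in this text. The function names are hypothetical, and the sketch ignores the time taken by the redundant word voice itself.

```python
# Allowable silence time in seconds per language category.
ALLOWABLE_SILENCE = {
    "adnominal modification": 3.0,  # placeholder; described only as "short"
    "parallel": 3.0,                # placeholder; described only as "short"
    "adverbial modification": 5.0,  # placeholder
    "other": 5.0,                   # placeholder
    "end of sentence": 7.0,         # value given in the description
}
DEFAULT_CATEGORY = "end of sentence"  # used when no category has been input


def allowable_silence(category=None):
    """Return the allowable silence time for a category (default: 7 s)."""
    if category is None:
        category = DEFAULT_CATEGORY
    return ALLOWABLE_SILENCE[category]


def timing_signal_times(category, silence_length):
    """Times (seconds after the speech output ended) at which a timing
    signal fires while the silence lasts for `silence_length` seconds."""
    interval = allowable_silence(category)
    count = int(silence_length // interval)
    return [interval * k for k in range(1, count + 1)]


if __name__ == "__main__":
    print(timing_signal_times("end of sentence", 15.0))
    print(timing_signal_times("parallel", 7.0))
```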

【００２９】タイミング出力制御部３３は、合成音声デ
ータを読み込み出力音声を取り出すと共に、文の入力操
作中にタイミング決定部３２からタイミング信号が入力
されるまでの間に合成音声データの入力がない場合に
は、冗長語選択部１５が選択した冗長語音声データから
冗長語音声を出力する。例えば上記の例で直前の合成単
位の言語カテゴリが「文末」の場合、合成音声出力終了
以降７秒が経過しても音声入力が行われなければ冗長語
データを読み出し音声出力する。The timing output control unit 33 reads the synthesized speech data and extracts the output speech. If, during the sentence input operation, no synthesized speech data is input before a timing signal arrives from the timing determination unit 32, it outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit 15. For example, if the language category of the immediately preceding synthesis unit in the above example is "end of sentence", the redundant word data is read out and output as speech when no further input occurs within 7 seconds after the synthesized speech output ends.

【００３０】ここで、冗長語出力タイミング決定の具体
例を以下に説明する。入力文は「ロースカツ定食１つ
と、ざるそば１つお願いします。いくらですか？」であ
るとする。合成単位は「ロースカツ定食１つと」、「ざ
るそば１つお願いします」、「いくらですか」の３つと
なり、それぞれの言語カテゴリは「並列」、「文末」、
「文末」となる。タイミング決定部３２は、図１２のテ
ーブルに従って音声合成終了から冗長語音声出力までの
カテゴリ別の無音許容時間をそれぞれ図１３の説明図に
示すように決定する。したがって、タイミング出力制御
部３３は、この無音許容時間に応じて合成音声データの
入力状態を監視し、各無音許容時間を超える合成音声デ
ータの未入力状態に対して冗長語選択部１５が選択した
冗長語音声データを読み出し冗長語音声として出力する
ことができる。A specific example of determining the redundant word output timing is described below. Assume that the input sentence is "ロースカツ定食１つと、ざるそば１つお願いします。いくらですか？" ("One pork loin cutlet set meal and one zaru soba, please. How much is it?"). The synthesis units are the three units "ロースカツ定食１つと" ("one pork loin cutlet set meal and"), "ざるそば１つお願いします" ("one zaru soba, please"), and "いくらですか" ("how much is it"), and their language categories are "parallel", "end of sentence", and "end of sentence", respectively. The timing determination unit 32 determines the allowable silence time for each category, from the end of speech synthesis to the output of the redundant word voice, according to the table of FIG. 12, as shown in the explanatory diagram of FIG. 13. The timing output control unit 33 therefore monitors the input state of the synthesized speech data against this allowable silence time and, whenever no synthesized speech data arrives for longer than the allowable silence time, reads the redundant word voice data selected by the redundant word selection unit 15 and outputs it as a redundant word voice.

【００３１】ここで、冗長語音声の出力タイミングは、
言語カテゴリのみから決定することについて述べたが、
代わりに、言語カテゴリと合成音声時間長の２つによっ
て決定するものとすることもできる。例えば図１２に示
した言語カテゴリごとの無音許容時間長を、さらに合成
音声の時間長をパラメータとする式によって変化させる
ようにすることにより決める。The description above determines the output timing of the redundant word voice from the language category alone, but it may instead be determined from both the language category and the duration of the synthesized speech. For example, the allowable silence time for each language category shown in FIG. 12 is further varied by an expression that takes the duration of the synthesized speech as a parameter.

【００３２】以上のように、実施の形態３によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が継続すると、その直前に入
力していた合成単位の言語カテゴリに応じた出力タイミ
ングで冗長語音声を自動的に出力して補間し、ユーザの
文入力操作に伴う無音時間を見かけ上短縮することによ
りユーザとコミュニケーションを行う相手が感じるスト
レスを軽減できる効果が得られる。As described above, according to the third embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues, a redundant word voice is automatically output to fill the gap at an output timing that depends on the language category of the synthesis unit input immediately before, so the silence accompanying the user's sentence input operation is apparently shortened and the stress felt by the other party communicating with the user is reduced.

【００３３】実施の形態４．図４はこの発明の実施の形
態４による音声コミュニケーション支援装置の構成を示
すブロック図で、図１乃至図３と同じ符号を付した要素
は、それぞれの実施の形態と同等のものであり、その説
明については省略する。異なる部分で、４１は時間長別
タイミング決定部である。Embodiment 4. FIG. 4 is a block diagram showing the configuration of a voice communication support device according to a fourth embodiment of the present invention. Elements denoted by the same reference numerals as in FIGS. 1 to 3 are equivalent to those of the respective embodiments, and their description is omitted. The part that differs is a time length-based timing determination unit 41.

【００３４】次に動作について説明する。時間長別タイ
ミング決定部４１は、合成音声データの時間長に応じて
生成される時間長別タイミング信号を出力する。この時
間長別タイミング信号は、例えば音声合成部１３の出力
である合成単位の合成音声データに続く無音時間が、直
前の合成音声データの時間長をパラメータとして算出し
た無音許容時間に達すると発生される。すなわち、時間
長別タイミング信号は、冗長語音声データの出力タイミ
ングを決定するものである。したがって、タイミング出
力制御部３３は、文入力操作中において時間長別タイミ
ング信号が入力された時、すなわち次の合成音声データ
の入力が行われない無音許容時間を超えた場合には冗長
語選択部１５が選択した冗長語音声データを読み出し冗
長語音声として無音許容時間を超えた無音部分を補間す
る。なお、時間長別タイミング決定部４１は、冗長語音
声の出力タイミングを算出した無音許容時間によって決
定したが、代わりに、合成音声データの時間長の範囲と
直前の音声出力終了からの時間長との対応を予め記憶し
たテーブルを参照して決定してもよい。Next, the operation will be described. The time length-based timing determination unit 41 outputs a time length-based timing signal generated according to the duration of the synthesized speech data. This signal is generated, for example, when the silence following the synthesized speech data of a synthesis unit output by the speech synthesis unit 13 reaches an allowable silence time calculated with the duration of the immediately preceding synthesized speech data as a parameter. In other words, the time length-based timing signal determines the output timing of the redundant word voice data. Accordingly, when the time length-based timing signal is input during the sentence input operation, that is, when the allowable silence time is exceeded without the next synthesized speech data being input, the timing output control unit 33 reads the redundant word voice data selected by the redundant word selection unit 15 and fills the silent portion beyond the allowable silence time with a redundant word voice. Although the time length-based timing determination unit 41 here determines the output timing of the redundant word voice from a calculated allowable silence time, it may instead determine it by referring to a table that stores in advance the correspondence between ranges of synthesized speech duration and the length of time from the end of the immediately preceding speech output.
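
Both alternatives mentioned above (a calculated value and a range table) can be sketched as follows. The description does not give the expression or the table contents, so the linear formula, its coefficients, the range boundaries, and the assumption that longer speech permits a longer silence are all hypothetical choices made for illustration.

```python
from bisect import bisect_right


def allowable_silence_by_formula(speech_duration, base=2.0, factor=0.5, upper=7.0):
    """Hypothetical expression: the allowable silence grows linearly with
    the duration of the preceding synthesized speech, capped at `upper`."""
    return min(upper, base + factor * speech_duration)


# Hypothetical range table: speech shorter than 1 s, 1 to 3 s, 3 s or longer.
DURATION_BOUNDS = [1.0, 3.0]
SILENCE_FOR_RANGE = [3.0, 5.0, 7.0]


def allowable_silence_by_table(speech_duration):
    """Look up the allowable silence for the range containing the duration."""
    return SILENCE_FOR_RANGE[bisect_right(DURATION_BOUNDS, speech_duration)]


if __name__ == "__main__":
    for d in (0.5, 2.0, 20.0):
        print(d, allowable_silence_by_formula(d), allowable_silence_by_table(d))
```

The table form trades precision for easy tuning: each range can be adjusted independently without changing a formula.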

【００３５】以上のように、実施の形態４によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が継続すると、その直前に入
力していた合成単位の時間長に応じて決定される出力タ
イミングで冗長語音声を自動的に出力して補間し、ユー
ザの文入力操作に伴う無音時間を見かけ上短縮すること
によりユーザとコミュニケーションを行う相手方が感じ
るストレスを軽減できる効果が得られる。As described above, according to the fourth embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues, a redundant word voice is automatically output to fill the gap at an output timing determined by the duration of the synthesis unit input immediately before, so the silence accompanying the user's sentence input operation is apparently shortened and the stress felt by the other party communicating with the user is reduced.

【００３６】実施の形態５．図５は、この発明の実施の
形態５による音声コミュニケーション支援装置の構成を
示すブロック図で、図において、図１および図２と同じ
符号を付した要素は、それぞれの実施の形態と同等のも
のであり、その説明については省略する。異なる部分
で、５１は冗長語前置出力制御部である。次に動作につ
いて説明する。冗長語前置出力制御部５１は、合成音声
データを読み込み出力音声を取り出すと共に合成音声デ
ータの入力状態を監視し、文入力操作中に前の合成音声
データの入力後の予め決めた一定時間内に次の合成音声
データの入力が行われない場合には、冗長語音声データ
を出力するための準備を行い、次の合成音声データが入
力されたときには出力音声として次の合成音声データの
前に冗長語選択部１５が選択した冗長語音声データを読
み出して接続する。Embodiment 5. FIG. 5 is a block diagram showing the configuration of a voice communication support device according to a fifth embodiment of the present invention. In the figure, elements denoted by the same reference numerals as in FIGS. 1 and 2 are equivalent to those of the respective embodiments, and their description is omitted. The part that differs is a redundant word prefix output control unit 51. Next, the operation will be described. The redundant word prefix output control unit 51 reads the synthesized speech data, extracts the output speech, and monitors the input state of the synthesized speech data. If, during the sentence input operation, the next synthesized speech data is not input within a predetermined fixed time after the previous synthesized speech data was input, it prepares to output redundant word voice data; when the next synthesized speech data is then input, it reads the redundant word voice data selected by the redundant word selection unit 15 and joins it in front of the next synthesized speech data as the output speech.

【００３７】図１４は実施の形態５による合成音声およ
び冗長語音声出力の具体例を示す説明図で、ユーザの操
作、音声の出力、冗長語前置出力制御部５１の動作のタ
イミングを示したものである。入力する文は実施の形態
１と同じ「カレー１つ、ミックスフライ定食１つ、炒飯
１つお願いします。」とする。「カレー１つ」の音声合
成出力が終了して一定時間、例えば４秒経過すると、次
の合成音声出力の前に冗長語出力を行うことを決定し、
例えば冗長語出力のフラグを立てる。この時点では、ま
だ冗長語音声の出力は行わない。次の合成音声データ
「ミックスフライ定食１つ」が入力されると、冗長語出
力のフラグが立っているので、冗長語前置出力制御部５
１は冗長語選択部１５からの冗長語音声データを読み出
す。この結果、例えば冗長語「えーと」が読み出され
る。冗長語前置出力制御部５１は次の合成音声の前に冗
長語音声を接続し、「えーと、ミックスフライ定食１
つ」を音声出力する。この動作の結果、「ミックスフラ
イ定食１つ」と「炒飯１つお願いします」の間の時間が
短くなる。FIG. 14 is an explanatory diagram showing a specific example of synthesized speech and redundant word voice output according to the fifth embodiment; it shows the timing of the user's operations, the speech output, and the operation of the redundant word prefix output control unit 51. The input sentence is the same as in the first embodiment: "カレー１つ、ミックスフライ定食１つ、炒飯１つお願いします。" ("One curry, one mixed fry set meal, and one fried rice, please."). When a fixed time, for example 4 seconds, has elapsed after the synthesized speech output of "カレー１つ" ("one curry") ends, the unit decides to output a redundant word before the next synthesized speech output and, for example, sets a redundant word output flag. At this point the redundant word voice is not yet output. When the next synthesized speech data, "ミックスフライ定食１つ" ("one mixed fry set meal"), is input, the redundant word output flag is set, so the redundant word prefix output control unit 51 reads the redundant word voice data from the redundant word selection unit 15. As a result, the redundant word "えーと" ("um"), for example, is read out. The redundant word prefix output control unit 51 joins the redundant word voice in front of the next synthesized speech and outputs "えーと、ミックスフライ定食１つ" ("um, one mixed fry set meal") as speech. As a result of this operation, the time between "ミックスフライ定食１つ" and "炒飯１つお願いします" ("one fried rice, please") becomes shorter.
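
The flag-and-prefix behaviour in this example can be sketched as below. The class and method names are hypothetical, text strings stand in for voice data, and the flag is evaluated when the next unit arrives rather than by a running timer, which gives the same decision. The 4-second threshold is the example value from the description; the sketch ignores the time taken by the redundant word itself.

```python
class PrefixOutputController:
    """Prepends a redundant word to the next unit after a long silence."""

    def __init__(self, threshold=4.0, pick_filler=lambda: "えーと"):
        self.threshold = threshold      # seconds of silence before flagging
        self.pick_filler = pick_filler  # stands in for the selection unit
        self.last_output_end = None     # when the previous speech ended

    def on_unit(self, text, now, duration):
        """Handle a synthesis unit arriving at time `now` (seconds) whose
        synthesized speech lasts `duration` seconds; return the output."""
        flagged = (
            self.last_output_end is not None
            and now - self.last_output_end >= self.threshold
        )
        spoken = f"{self.pick_filler()}、{text}" if flagged else text
        self.last_output_end = now + duration
        return spoken


if __name__ == "__main__":
    ctrl = PrefixOutputController()
    print(ctrl.on_unit("カレー１つ", now=0.0, duration=1.0))
    print(ctrl.on_unit("ミックスフライ定食１つ", now=6.0, duration=2.0))
    print(ctrl.on_unit("炒飯１つお願いします", now=9.0, duration=2.0))
```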

【００３８】以上のように、実施の形態５によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が一定時間継続すると、冗長
語音声を次の合成音声の直前に自動的に出力して補間
し、冗長語が意味のある発話単位の直前に置かれるとい
う自然音声の特徴を実現し、ユーザの文入力操作に伴う
無音時間を平均的に短縮することによりユーザとコミュ
ニケーションを行う相手が感じるストレスを軽減できる
効果が得られる。As described above, according to the fifth embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues for a certain time, a redundant word voice is automatically output immediately before the next synthesized speech to fill the gap. This reproduces the characteristic of natural speech that a redundant word is placed immediately before a meaningful utterance unit, and it shortens on average the silence accompanying the user's sentence input operation, which reduces the stress felt by the other party communicating with the user.

【００３９】実施の形態６．図６はこの発明の実施の形
態６による音声コミュニケーション支援装置の構成を示
すブロック図で、図１、図３および図５と同じ符号を付
した要素は、それぞれの実施の形態と同等のものであ
り、その説明については省略する。異なる部分で、６１
はカテゴリ別冗長語記憶部、６２はカテゴリ別冗長語選
択部である。Embodiment 6. FIG. 6 is a block diagram showing the configuration of a voice communication support device according to a sixth embodiment of the present invention. Elements denoted by the same reference numerals as in FIGS. 1, 3, and 5 are equivalent to those of the respective embodiments, and their description is omitted. The parts that differ are a category-based redundant word storage unit 61 and a category-based redundant word selection unit 62.

【００４０】次に動作について説明する。なお、この実
施の形態６においては、合成単位解析部３１が決定する
合成単位の言語カテゴリは、疑問文文末、疑問文以外の
文末、文末以外の３種類からなるものとする。カテゴリ
別冗長語記憶部６１は、合成単位の言語カテゴリごとに
接続し得る冗長語音声データを、例えば図１５のテーブ
ルに示すように予め記憶する。カテゴリ別冗長語選択部
６２は、合成単位解析部３１からの合成単位の言語カテ
ゴリを識別してカテゴリ別冗長語記憶部６１から対応す
る冗長語音声データを１つ選択する。この場合、冗長語
前置出力制御部５１は、文入力操作中に前の合成音声デ
ータの入力後の予め決めた一定時間内に次の合成音声デ
ータの入力が行われない場合には冗長語音声を出力する
ための準備を行い、そして、次の合成音声データが入力
されたときの出力音声として、次の合成音声データの前
にカテゴリ別冗長語選択部が選択した次の合成音声に対
応する冗長語音声データを読み出して接続するよう動作
する。Next, the operation will be described. In the sixth embodiment, the language category of a synthesis unit determined by the synthesis unit analysis unit 31 is one of three types: end of an interrogative sentence, end of a non-interrogative sentence, and other than end of sentence. The category-based redundant word storage unit 61 stores in advance, for each language category of a synthesis unit, the redundant word voice data that can be joined to it, for example as shown in the table of FIG. 15. The category-based redundant word selection unit 62 identifies the language category of the synthesis unit received from the synthesis unit analysis unit 31 and selects one corresponding item of redundant word voice data from the category-based redundant word storage unit 61. In this case, if the next synthesized speech data is not input within a predetermined fixed time after the previous synthesized speech data was input during the sentence input operation, the redundant word prefix output control unit 51 prepares to output a redundant word voice; when the next synthesized speech data is then input, it reads the redundant word voice data that the category-based redundant word selection unit 62 selected for that next synthesized speech and joins it in front of the next synthesized speech data as the output speech.

【００４１】ここで、冗長語決定の具体例を以下に説明
する。入力文は実施の形態３と同様に「ロースカツ定食
１つと、ざるそば１つお願いします。いくらですか？」
であるとする。カテゴリ別冗長語選択部６２に合成単位
解析部３１から言語カテゴリが入力されたときに選択さ
れ得るカテゴリ別冗長語は、図１５のテーブルを参照し
て、図１６に示すようになる。したがって、例えば冗長
語「うーん」が「いくらですか」の直前に出力されるこ
とはない。カテゴリ別冗長語選択部６２は選択され得る
図１６に示されるカテゴリ別冗長語のうちから１つを、
例えばランダムに選択することになる。Here, a specific example of determining the redundant word is described. As in the third embodiment, assume that the input sentence is "ロースカツ定食１つと、ざるそば１つお願いします。いくらですか？" ("One pork loin cutlet set meal and one zaru soba, please. How much is it?"). The category-based redundant words that can be selected when a language category is input from the synthesis unit analysis unit 31 to the category-based redundant word selection unit 62 are, by reference to the table of FIG. 15, as shown in FIG. 16. Thus, for example, the redundant word "うーん" ("hmm") is never output immediately before "いくらですか" ("how much is it"). The category-based redundant word selection unit 62 selects one of the selectable category-based redundant words shown in FIG. 16, for example at random.
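
A category-keyed selection of this kind could be sketched as follows. The table contents are hypothetical, because FIG. 15 and FIG. 16 are not reproduced in this text; the only constraint taken from the description is that "うーん" must not be selectable before an interrogative sentence end. The filler "あのー" and the function name are illustrative assumptions.

```python
import random

# Redundant words allowed before a synthesis unit of each category.
# Hypothetical contents, except that "うーん" is excluded for questions.
FILLERS_BY_CATEGORY = {
    "interrogative end": ["えーと", "あのー"],
    "non-interrogative end": ["えーと", "うーん"],
    "not sentence end": ["えーと", "うーん", "あのー"],
}


def select_filler(next_unit_category, rng=random):
    """Pick one allowed redundant word at random for the following unit."""
    return rng.choice(FILLERS_BY_CATEGORY[next_unit_category])


if __name__ == "__main__":
    rng = random.Random(0)
    print(select_filler("interrogative end", rng))
    print(select_filler("not sentence end", rng))
```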

【００４２】以上のように、実施の形態６によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が一定時間継続すると、後続
する合成単位の言語カテゴリから適切と判断される冗長
語音声を次の合成音声の前に接続して自動的に出力して
補間し、冗長語が意味のある発話単位の直前に置かれる
という自然音声の特徴を実現し、ユーザの文入力操作に
伴う無音時間を見かけ上平均的して短縮することにより
ユーザとコミュニケーションを行う相手が感じるストレ
スを軽減できる効果が得られる。As described above, according to the sixth embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues for a certain time, a redundant word voice judged appropriate from the language category of the following synthesis unit is joined in front of the next synthesized speech and output automatically to fill the gap. This reproduces the characteristic of natural speech that a redundant word is placed immediately before a meaningful utterance unit, and it apparently shortens, on average, the silence accompanying the user's sentence input operation, which reduces the stress felt by the other party communicating with the user.

【００４３】実施の形態７．図７は、この発明の実施の
形態７による音声コミュニケーション支援装置の構成を
示すブロック図で、図において、図２と同じ符号を付し
た要素は、その実施の形態２と同等のものであり、その
説明については省略する。異なる部分で、７１は末尾母
音別冗長語記憶部、７２は末尾母音別冗長語選択部であ
る。Embodiment 7. FIG. 7 is a block diagram showing the configuration of a voice communication support device according to a seventh embodiment of the present invention. In the figure, elements denoted by the same reference numerals as in FIG. 2 are equivalent to those of the second embodiment, and their description is omitted. The parts that differ are a tail vowel-based redundant word storage unit 71 and a tail vowel-based redundant word selection unit 72.

【００４４】次に動作について説明する。末尾母音別冗
長語記憶部７１は、合成単位の末尾母音ごとに出力可能
な末尾母音別冗長語を予め記憶する。この合成単位の末
尾母音と冗長語の対応の例を図１７のテーブルに示す。
例えば末尾母音が／ｕ／の場合、冗長語「うーん」、
「うーんと」などが出力され得る。ここでは、冗長語先
頭の母音と合成単位末尾の母音がなるべく同じになるよ
うに出力可能な冗長語を設定している。末尾母音別冗長
語選択部７２は、合成単位抽出部２１から合成単位が入
力されると、その末尾の母音の種類を調べ、母音の種類
に適合する冗長語を末尾母音別冗長語記憶部７１から１
つ選択する。例えば合成単位が「カレー１つ」であれ
ば、末尾母音は／ｕ／であるので、末尾母音別冗長語選
択部７２は冗長語「うーん」、「うーんと」などから１
つを、例えばランダムに選択する。したがって、出力制
御部１６は、無音許容時間に応じて合成音声データの入
力状態を監視し、合成音声データの未入力状態が無音許
容時間を超えて継続した場合、末尾母音別冗長語選択部
７２が選択した冗長語音声データを自動的に出力し、流
れの良い文の音声出力を形成する。Next, the operation will be described. The tail vowel-based redundant word storage unit 71 stores in advance, for each tail vowel of a synthesis unit, the redundant words that can be output. An example of the correspondence between the tail vowel of a synthesis unit and redundant words is shown in the table of FIG. 17. For example, when the tail vowel is /u/, redundant words such as "うーん" ("uun") and "うーんと" ("uunto") can be output. Here, the outputtable redundant words are set so that the vowel at the beginning of the redundant word matches the vowel at the end of the synthesis unit as closely as possible. When a synthesis unit is input from the synthesis unit extraction unit 21, the tail vowel-based redundant word selection unit 72 examines the type of its final vowel and selects from the tail vowel-based redundant word storage unit 71 one redundant word that matches that vowel type. For example, if the synthesis unit is "カレー１つ" ("one curry"), the tail vowel is /u/, so the tail vowel-based redundant word selection unit 72 selects one of "うーん", "うーんと", and so on, for example at random. The output control unit 16 therefore monitors the input state of the synthesized speech data against the allowable silence time and, when no synthesized speech data is input for longer than the allowable silence time, automatically outputs the redundant word voice data selected by the tail vowel-based redundant word selection unit 72, forming a speech output of a sentence that flows well.
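
The tail vowel lookup can be sketched as below, assuming the kana reading of the synthesis unit is already available (for "カレー１つ" the reading "かれーひとつ" is supplied by hand here). The vowel is read from the Unicode character name of the last kana, which is only a convenient shortcut for this illustration. Only the /u/ entry comes from the description; the fallback list and the function names are assumptions, since FIG. 17 is not reproduced in this text.

```python
import random
import unicodedata

FILLERS_BY_TAIL_VOWEL = {
    "u": ["うーん", "うーんと"],  # entry given in the description
}
DEFAULT_FILLERS = ["えーと"]  # assumed fallback for vowels not listed above


def tail_vowel(reading):
    """Return the last vowel (a/i/u/e/o) of a kana reading, or None.

    The long vowel mark, the moraic nasal, and small tsu carry no vowel
    of their own in this sketch, so they are skipped.
    """
    for ch in reversed(reading):
        if ch in "っッ":
            continue
        name = unicodedata.name(ch, "")
        if name.startswith(("HIRAGANA LETTER", "KATAKANA LETTER")):
            last = name[-1].lower()
            if last in "aiueo":
                return last
    return None


def select_filler(reading, rng=random):
    """Pick a redundant word whose first vowel matches the tail vowel."""
    candidates = FILLERS_BY_TAIL_VOWEL.get(tail_vowel(reading), DEFAULT_FILLERS)
    return rng.choice(candidates)


if __name__ == "__main__":
    print(tail_vowel("かれーひとつ"))
    print(select_filler("かれーひとつ", random.Random(0)))
```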

【００４５】以上のように、実施の形態７によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が一定時間継続すると、合成
単位の末尾母音に適合する冗長語音声データを自動的に
出力して補間し、ユーザの文入力操作に伴う無音時間を
見かけ上短縮することによりユーザとコミュニケーショ
ンを行う相手が感じるストレスを軽減できる効果が得ら
れる。なお、末尾母音別冗長語記憶部７１と末尾母音別
冗長語選択部７２は、図４および図５の配置においても
置き換えて適用できるものである。As described above, according to the seventh embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues for a certain time, redundant word voice data that matches the tail vowel of the synthesis unit is automatically output to fill the gap, so the silence accompanying the user's sentence input operation is apparently shortened and the stress felt by the other party communicating with the user is reduced. The tail vowel-based redundant word storage unit 71 and the tail vowel-based redundant word selection unit 72 can also be applied as replacements in the arrangements of FIGS. 4 and 5.

【００４６】実施の形態８．図８は、この発明の実施の
形態８による音声コミュニケーション支援装置の構成を
示すブロック図で、図において、図２と同じ符号を付し
た要素は、その実施の形態２と同等のものであり、その
説明については省略する。異なる部分で、８１は時間長
別冗長語記憶部、８２は時間長別冗長語選択部である。Embodiment 8. FIG. 8 is a block diagram showing the configuration of a voice communication support device according to an eighth embodiment of the present invention. In the figure, elements denoted by the same reference numerals as in FIG. 2 are equivalent to those of the second embodiment, and their description is omitted. The parts that differ are a time length-based redundant word storage unit 81 and a time length-based redundant word selection unit 82.

【００４７】次に動作について説明する。時間長別冗長
語記憶部８１は、合成単位の時間長ごとに出力可能な冗
長語音声データを予め記憶する。例えば、冗長語「えー
と」の長音の時間長を数種類に変えた冗長語音声データ
を用意し、長い合成単位の時間長には、長音の時間長が
長い冗長語音声データを対応付ける。時間長別冗長語選
択部８２は、合成音声データが入力されると、その時間
長を調べ、合成単位の時間長に適合する冗長語を時間長
別冗長語記憶部８１から１つ選択する。直前の合成単位
の時間長が長ければ、長音の時間長が長い冗長語音声デ
ータが選択される。したがって、出力制御部１６は、無
音許容時間に応じて合成音声データの入力状態を監視
し、合成音声データの未入力状態が無音許容時間を超え
て継続した場合、時間長別冗長語選択部８２が選択した
冗長語音声データを自動的に出力し、流れの良い文の出
力音声を形成する。Next, the operation will be described. The time length-based redundant word storage unit 81 stores in advance the redundant word voice data that can be output for each synthesis unit duration. For example, redundant word voice data in which the long vowel of the redundant word "えーと" ("eeto") is given several different durations is prepared, and redundant word voice data with a longer long vowel is associated with longer synthesis unit durations. When synthesized speech data is input, the time length-based redundant word selection unit 82 examines its duration and selects from the time length-based redundant word storage unit 81 one redundant word that matches the duration of the synthesis unit. If the immediately preceding synthesis unit is long, redundant word voice data with a long long vowel is selected. The output control unit 16 therefore monitors the input state of the synthesized speech data against the allowable silence time and, when no synthesized speech data is input for longer than the allowable silence time, automatically outputs the redundant word voice data selected by the time length-based redundant word selection unit 82, forming an output speech of a sentence that flows well.
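
The duration-matched selection can be sketched as follows. Text strings with a stretched long vowel stand in for the recorded voice data, and the duration thresholds are hypothetical, since the description gives no concrete values.

```python
# (upper bound of the synthesis unit duration in seconds, redundant word).
# Longer units map to a variant of "えーと" with a longer long vowel.
FILLER_VARIANTS = [
    (1.5, "えーと"),
    (3.0, "えーーと"),
    (float("inf"), "えーーーと"),
]


def select_filler_by_duration(unit_duration):
    """Return the variant for the first range containing the duration."""
    for upper, filler in FILLER_VARIANTS:
        if unit_duration < upper:
            return filler
    raise ValueError("duration must be a finite number")


if __name__ == "__main__":
    for d in (1.0, 2.0, 10.0):
        print(d, select_filler_by_duration(d))
```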

【００４８】以上のように、実施の形態８によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が一定時間継続すると、合成
単位の時間長に適合する冗長語音声を自動的に出力して
補間し、ユーザの文入力操作に伴う無音時間を見かけ上
短縮することによりユーザとコミュニケーションを行う
相手が感じるストレスを軽減できる効果が得られる。な
お、時間長別冗長語記憶部８１と時間長別冗長語選択部
８２は、図１乃至図５のそれぞれの配置においても置き
換えて適用できるものである。As described above, according to the eighth embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues for a certain time, a redundant word voice that matches the duration of the synthesis unit is automatically output to fill the gap, so the silence accompanying the user's sentence input operation is apparently shortened and the stress felt by the other party communicating with the user is reduced. The time length-based redundant word storage unit 81 and the time length-based redundant word selection unit 82 can also be applied as replacements in each of the arrangements of FIGS. 1 to 5.

【００４９】実施の形態９．図９はこの発明の実施の形
態９による音声コミュニケーション支援装置の構成を示
すブロック図で、図において、図１および図２と同じ符
号を付した要素は、それぞれの実施の形態と同等のもの
であり、その説明については省略する。異なる部分で、
９１は復唱出力制御部である。Embodiment 9. FIG. 9 is a block diagram showing the configuration of a voice communication support device according to a ninth embodiment of the present invention. In the figure, elements denoted by the same reference numerals as in FIGS. 1 and 2 are equivalent to those of the respective embodiments, and their description is omitted. The part that differs is a repetition output control unit 91.

【００５０】次に動作について説明する。復唱出力制御
部９１は、合成音声データを読み込み出力音声を取り出
すと共に出力した前記出力合成音声データを一時的に記
憶し、合成音声データの入力状態を監視して合成音声デ
ータが一定時間入力されないことを感知した場合には、
冗長語選択部１５から冗長語音声データを読み出し音声
出力するか、または記憶されている合成音声データを再
出力する。冗長語音声データと記憶されている合成音声
データのいずれを音声出力するかは、例えば直前の音声
合成出力以降に冗長語音声データまたは記憶されている
合成音声データが出力された回数によって決定する。例
えば、合成音声の出力の直後には、記憶されている合成
音声を復唱し、それ以降無音状態が続く場合には冗長語
音声を出力する。したがって、全体として、流れの良い
文の音声出力を形成する。Next, the operation will be described. The repetition output control unit 91 reads the synthesized speech data, extracts the output speech, and temporarily stores the synthesized speech data it has output. It monitors the input state of the synthesized speech data and, when it detects that no synthesized speech data has been input for a certain time, either reads redundant word voice data from the redundant word selection unit 15 and outputs it as speech or outputs the stored synthesized speech data again. Which of the two is output is determined, for example, by the number of times redundant word voice data or stored synthesized speech data has been output since the immediately preceding synthesized speech output. For example, immediately after a synthesized speech output, the stored synthesized speech is repeated, and if the silence continues after that, a redundant word voice is output. As a whole, this forms a speech output of a sentence that flows well.

【００５１】ここで、冗長語音声データと記憶されてい
る合成音声のいずれを音声出力するかを決める場合に、
直前の音声合成出力以降に、冗長語音声データまたは記
憶されている合成音声が出力された回数によって決定す
ることについて述べたが、これは、代わりに、合成音声
の時間長によって決定するようにしてもよい。例えば合
成音声の時間長が一定値以下の場合のみ復唱し、それ以
外は冗長語を出力するようにする。また、これは、合成
単位の言語カテゴリによって決定することもできる。例
えば合成単位に番号や固有名詞が含まれる場合には復唱
し、それ以外は冗長語音声データを出力するようにする
ことである。Here, the choice between outputting the redundant word voice data and the stored synthesized speech was described as being determined by the number of times the redundant word voice data or the stored synthesized speech has been output since the immediately preceding synthesized speech output. It may instead be determined by the duration of the synthesized speech: for example, the speech is repeated only when its duration is at or below a certain value, and a redundant word is output otherwise. It can also be determined by the language category of the synthesis unit: for example, the unit is repeated when it contains a number or a proper noun, and redundant word voice data is output otherwise.
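
Two of the decision rules described for the repetition output control unit can be sketched as below. The function names, the default redundant word, and the 2-second duration limit are hypothetical; text strings stand in for the stored synthesized speech data.

```python
def choose_by_count(fill_outputs_so_far, last_unit, filler="えーと"):
    """Count-based rule: on the first gap after a synthesized speech
    output, repeat the stored unit; on later gaps, output a redundant word.

    `fill_outputs_so_far` counts repetitions and redundant words already
    output since the last synthesized speech output.
    """
    return last_unit if fill_outputs_so_far == 0 else filler


def choose_by_duration(last_unit, last_unit_duration,
                       max_repeat_duration=2.0, filler="えーと"):
    """Duration-based rule: repeat only short units (limit is assumed)."""
    return last_unit if last_unit_duration <= max_repeat_duration else filler


if __name__ == "__main__":
    unit = "カレー１つ"
    print([choose_by_count(n, unit) for n in range(3)])
    print(choose_by_duration(unit, 1.0), choose_by_duration(unit, 5.0))
```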

【００５２】以上のように、実施の形態９によれば、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が一定時間継続すると、直前
に出力された合成音声を自動的に再出力して補間し、ユ
ーザの文入力操作に伴う無音時間を見かけ上短縮するこ
とによりユーザとコミュニケーションを行う相手が感じ
るストレスを軽減できる効果が得られる。なお、復唱出
力制御部９１は図１乃至図８のそれぞれの配置において
置き換えることにより適応できるものである。As described above, according to the ninth embodiment, synthesized speech is output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operating burden. In addition, when a silent state continues for a certain time, the synthesized speech output immediately before is automatically output again to fill the gap, so the silence accompanying the user's sentence input operation is apparently shortened and the stress felt by the other party communicating with the user is reduced. The repetition output control unit 91 can also be applied as a replacement in each of the arrangements of FIGS. 1 to 8.

【００５３】以上、この発明の各実施の形態について述
べてきたが、これらは以下に示すようないくつかのバリ
エーションに対応できるものである。実施の形態１から
実施の形態９において、文入力操作について、キーボー
ドによる文字入力およびかな漢字変換操作について述べ
たが、これは、ペンによる手書き文字の入力、タッチパ
ネルへのタッチや視線入力による文字入力とかな漢字変
換操作、予め記憶されている文や句の選択、単語の穴埋
めが可能な文例である文テンプレートの選択と単語入力
操作等によっても行うことができる。The embodiments of the present invention have been described above, and they can accommodate several variations such as those described below. In the first to ninth embodiments, the sentence input operation was described as character input with a keyboard and kana-kanji conversion, but it can also be performed by handwritten character input with a pen, character input by touching a touch panel or by gaze input together with kana-kanji conversion, selection of sentences or phrases stored in advance, selection of a sentence template (an example sentence in which words can be filled in) together with word input, and the like.

【００５４】実施の形態２から実施の形態９において、
合成単位の決定は、特に文入力操作が文テンプレートの
選択と単語入力操作である場合、文テンプレートに予め
合成単位境界を記述しておくことによって行うこともで
きる。In the second to ninth embodiments, the synthesis units can also be determined by describing the synthesis unit boundaries in the sentence template in advance, particularly when the sentence input operation consists of sentence template selection and word input.

【００５５】実施の形態１から実施の形態９において、
冗長語データの選択について、言語カテゴリ、時間長、
末尾母音などの条件に適合する冗長語の中から１つをラ
ンダムに選択するよう述べたが、これは、ユーザが使用
する冗長語を予め登録しておき、条件に反しない限りそ
れらの中から選択するようにすることもできる。In the first to ninth embodiments, the redundant word data was described as being selected at random from among the redundant words that satisfy conditions such as language category, duration, and tail vowel. Alternatively, the redundant words that the user uses may be registered in advance, and the selection made from among them as long as the conditions are not violated.

【００５６】実施の形態３および実施の形態６におい
て、合成単位の言語カテゴリは、例えば合成単位末尾あ
るいは先頭の品詞、助詞の種類、用言の活用形、肯定・
疑問・否定などの文の種類などによって細分化したカテ
ゴリとすることもできる。また、言語カテゴリは、特に
文入力操作が文テンプレートの選択と単語入力操作であ
り、かつ合成単位境界が文テンプレートに記述されてい
る場合、文テンプレート中の合成単位に予め記述してお
くこともできる。この場合、合成単位の抽出・解析を行
う必要がないため、合成単位解析部３１に代えて、実施
の形態１に記載の文字列出力部１２相当を備える構成と
することもできる。In the third and sixth embodiments, the language category of a synthesis unit may also be a finer-grained category based on, for example, the part of speech at the end or beginning of the synthesis unit, the type of particle, the inflected form of a declinable word, or the sentence type (affirmative, interrogative, negative, and so on). The language category may also be written in advance on each synthesis unit in the sentence template, particularly when the sentence input operation consists of sentence template selection and word input and the synthesis unit boundaries are described in the sentence template. In this case, since there is no need to extract and analyze synthesis units, the configuration may include a unit equivalent to the character string output unit 12 described in the first embodiment in place of the synthesis unit analysis unit 31.

【００５７】実施の形態３および実施の形態４におい
て、タイミング決定部３２または時間長別タイミング決
定部４１は、冗長語音声出力の出力タイミングである直
前の音声出力終了からの時間長を、言語カテゴリまたは
直前の合成音声時間長によって決まる一定値としたが、
代わりに、その一定値を基準として一定の時間幅を持っ
たランダムな時間長として出力するようにしてもよい。In the third and fourth embodiments, the timing determination unit 32 or the time length-based timing determination unit 41 sets the time length from the end of the immediately preceding voice output, which defines the output timing of the redundant word voice, to a fixed value determined by the language category or by the time length of the immediately preceding synthesized voice. Instead, the unit may output a random time length that falls within a certain time width around that fixed value.
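
The randomized timing described above might be sketched as follows. The uniform distribution, the symmetric window, and the function name jittered_wait are assumptions; the specification states only that the time length is random within a certain width around the fixed value.

```python
import random

def jittered_wait(base_seconds, width_seconds, rng=random):
    """Return a random waiting time within a window of total width
    `width_seconds` centred on `base_seconds` (never below zero)."""
    low = max(0.0, base_seconds - width_seconds / 2)
    high = base_seconds + width_seconds / 2
    return rng.uniform(low, high)
```

For example, jittered_wait(1.0, 0.4) yields a value between 0.8 and 1.2 seconds, so that successive redundant words are not output at a mechanically constant interval.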

【００５８】実施の形態７から実施の形態９において、
合成単位抽出部２１に代えて実施の形態３に記載した合
成単位解析部３１とタイミング決定部３２相当のものを
備えることにより、冗長語音声データまたは直前に出力
した合成音声を、一定時間おきでなく、直前に出力され
た合成単位の言語カテゴリに応じて異なるタイミングで
出力することもできる。また、実施の形態４に記載した
時間長別タイミング決定部４１相当を備えるようにする
ことで、冗長語音声データもしくは直前に出力した合成
音声を、一定時間おきでなく直前に出力された合成単位
の時間長に応じて異なるタイミングで出力するものとす
ることもできる。In the seventh to ninth embodiments, by providing equivalents of the synthesis unit analysis unit 31 and the timing determination unit 32 described in the third embodiment in place of the synthesis unit extraction unit 21, the redundant word voice data or the synthesized voice output immediately before can be output not at fixed intervals but at timings that differ according to the language category of the synthesis unit output immediately before. Likewise, by providing an equivalent of the time length-based timing determination unit 41 described in the fourth embodiment, the redundant word voice data or the synthesized voice output immediately before can be output not at fixed intervals but at timings that differ according to the time length of the synthesis unit output immediately before.

【００５９】[0059]

【発明の効果】以上のように、この発明によれば、ユー
ザが入力した文を合成音声に変換して出力する音声コミ
ュニケーション支援装置において、ユーザの文入力操作
により文を入力する文入力部と、入力された文の文字列
を一時記憶し、ユーザにより出力操作が行われたときに
出力する文字列出力部と、この文字列出力部から出力さ
れた文字列を音声合成して合成音声データを出力する音
声合成部と、適当数の所定の冗長語音声データを予め記
憶しておく冗長語記憶部と、この冗長語記憶部から冗長
語音声データを１つ選択して出力する冗長語選択部と、
合成音声データを読み込み出力音声を取り出すと共に合
成音声データの入力状態を監視し、合成音声データが一
定時間入力されないことを感知した場合には、冗長語選
択部が選択した冗長語音声データから冗長語音声を出力
する出力制御部とを備えるように構成したので、ユーザ
の文入力操作に従って合成音声を出力することができ、
文入力操作中に無音状態が一定時間継続すると冗長語音
声を自動的に出力して補間し、ユーザの文入力操作に伴
い発生する無音時間を見かけ上短縮することによりユー
ザとコミュニケーションを行う相手方が感じるストレス
を軽減できる効果がある。As described above, according to the present invention, a voice communication support device that converts a sentence input by a user into synthesized voice and outputs it comprises: a sentence input unit for inputting a sentence through the user's sentence input operation; a character string output unit that temporarily stores the character string of the input sentence and outputs it when the user performs an output operation; a speech synthesis unit that performs speech synthesis on the character string output from the character string output unit and outputs synthesized voice data; a redundant word storage unit that stores in advance an appropriate number of predetermined redundant word voice data; a redundant word selection unit that selects and outputs one redundant word voice data from the redundant word storage unit; and an output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, and, when it detects that no synthesized voice data has been input for a certain time, outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit. Consequently, synthesized voice can be output in accordance with the user's sentence input operation, and when a silent state continues for a certain time during the sentence input operation, a redundant word voice is automatically output to fill the gap. The silent time arising from the user's sentence input operation is thereby apparently shortened, which reduces the stress felt by the party communicating with the user.
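
The behaviour of the output control unit summarized above might be sketched as follows. The sketch works on a simulated timeline, ignores the playback duration of each voice, and assumes that the silence timer restarts after every redundant word; these simplifications, the romanized redundant word, and the function name fill_silence are assumptions and are not taken from the specification.

```python
import random

def fill_silence(arrivals, timeout, fillers, rng=random):
    """arrivals: list of (time, text) pairs, sorted by time, giving the moments
    at which synthesized voice data arrives.
    Returns the output timeline, with one redundant word inserted each time
    `timeout` seconds pass without new synthesized voice data."""
    events = []
    last = None  # time of the most recent output
    for t, text in arrivals:
        if last is not None:
            next_filler = last + timeout
            while next_filler < t:  # silence lasted long enough: fill it
                events.append((next_filler, rng.choice(fillers)))
                next_filler += timeout
        events.append((t, text))
        last = t
    return events
```

With arrivals at 0 s and 5 s and a timeout of 2 s, this sketch places redundant words at 2 s and 4 s, so the partner never hears more than 2 s of continuous silence.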

【００６０】この発明によれば、文字列出力部に代え
て、文入力部から入力される文の文字列を解析して合成
単位を抽出し、抽出した合成単位を逐次的に出力する合
成単位抽出部を備えるように構成したので、ユーザの文
入力操作に従って合成音声を合成単位ごとに逐次的に出
力してユーザの操作を軽減できると共に、無音状態が一
定時間継続すると冗長語音声を自動的に出力して補間
し、ユーザの文入力操作に伴う無音時間を見かけ上短縮
することによりユーザとコミュニケーションを行う相手
方が感じるストレスを軽減できる効果がある。According to the present invention, the device comprises, in place of the character string output unit, a synthesis unit extraction unit that analyzes the character string of the sentence input from the sentence input unit to extract synthesis units and sequentially outputs the extracted synthesis units. Consequently, synthesized voice can be output sequentially, one synthesis unit at a time, in accordance with the user's sentence input operation, which reduces the user's operations. In addition, when a silent state continues for a certain time, a redundant word voice is automatically output to fill the gap, and the silent time accompanying the user's sentence input operation is apparently shortened, which reduces the stress felt by the party communicating with the user.

【００６１】この発明によれば、ユーザが入力した文を
合成音声に変換して出力する音声コミュニケーション支
援装置において、ユーザの文入力操作により文を入力す
る文入力部と、この文入力部から入力される文の文字列
を解析して合成単位を抽出すると共にこの合成単位の言
語カテゴリを決定し、合成単位と言語カテゴリを逐次的
に出力する合成単位解析部と、合成単位解析部から出力
された合成単位の文字列を音声合成して合成音声データ
を出力する音声合成部と、適当数の所定の冗長語音声デ
ータを予め記憶しておく冗長語記憶部と、この冗長語記
憶部から冗長語音声データを１つ選択して出力する冗長
語選択部と、入力される言語カテゴリに応じて冗長語音
声データの出力タイミングを決定するタイミング信号を
出力するタイミング決定部と、合成音声データを読み込
み出力音声を取り出すと共に合成音声データの入力状態
を監視し、文入力操作中においてタイミング信号が入力
された場合には冗長語選択部が選択した冗長語音声デー
タによる冗長語音声を出力するタイミング出力制御部と
を備えるように構成したので、ユーザの文入力操作に従
って合成単位ごとに逐次的に合成音声を出力することに
よりユーザの操作を軽減することができると共に、無音
状態が継続すると、その直前に入力していた合成単位の
言語カテゴリに応じた出力タイミングで冗長語音声を自
動的に出力して補間し、ユーザの文入力操作に伴う無音
時間を見かけ上短縮することによりユーザとコミュニケ
ーションを行う相手が感じるストレスを軽減できる効果
がある。According to the present invention, a voice communication support device that converts a sentence input by a user into synthesized voice and outputs it comprises: a sentence input unit for inputting a sentence through the user's sentence input operation; a synthesis unit analysis unit that analyzes the character string of the sentence input from the sentence input unit to extract synthesis units, determines the language category of each synthesis unit, and sequentially outputs the synthesis unit and the language category; a speech synthesis unit that performs speech synthesis on the character string of the synthesis unit output from the synthesis unit analysis unit and outputs synthesized voice data; a redundant word storage unit that stores in advance an appropriate number of predetermined redundant word voice data; a redundant word selection unit that selects and outputs one redundant word voice data from the redundant word storage unit; a timing determination unit that outputs a timing signal determining the output timing of the redundant word voice data according to the input language category; and a timing output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, and, when the timing signal is input during the sentence input operation, outputs a redundant word voice based on the redundant word voice data selected by the redundant word selection unit. Consequently, synthesized voice can be output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operations. In addition, when a silent state continues, a redundant word voice is automatically output to fill the gap at an output timing that depends on the language category of the synthesis unit input immediately before, and the silent time accompanying the user's sentence input operation is apparently shortened, which reduces the stress felt by the party communicating with the user.
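
The category-dependent timing might be sketched as a simple lookup. The category names, the allowable silence values, and the function name filler_deadline are hypothetical; the specification gives the actual relationship only in FIG. 12, whose values are not reproduced here.

```python
# Hypothetical allowable silence (seconds) per language category of the
# synthesis unit output immediately before. All values are assumptions.
ALLOWED_SILENCE = {
    "sentence_end": 3.0,
    "clause_end": 1.5,
    "mid_phrase": 0.8,
}
DEFAULT_SILENCE = 1.5  # assumed fallback for categories not in the table

def filler_deadline(unit_end_time, category):
    """Return the time at which a redundant word should be output if no new
    synthesized voice data has arrived since `unit_end_time`."""
    return unit_end_time + ALLOWED_SILENCE.get(category, DEFAULT_SILENCE)
```

In this sketch a longer pause is tolerated after a completed sentence than in the middle of a phrase, which is one plausible reading of a category-dependent timing signal.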

【００６２】この発明によれば、タイミング決定部の代
わりに、直前の合成音声データの時間長に応じて生成さ
れる冗長語音声データの出力タイミングを決定する時間
長別タイミング信号を出力する時間長別タイミング決定
部を備え、タイミング出力制御部が時間長別タイミング
信号により冗長語音声データを出力するように構成した
ので、ユーザの文入力操作に従って合成単位ごとに逐次
的に合成音声を出力することによりユーザの操作を軽減
することができると共に、無音状態が継続すると、その
直前に入力していた合成単位の時間長に応じて決定され
る出力タイミングで冗長語音声を自動的に出力して補間
し、ユーザの文入力操作に伴う無音時間を見かけ上短縮
することによりユーザとコミュニケーションを行う相手
方が感じるストレスを軽減できる効果がある。According to the present invention, the device comprises, in place of the timing determination unit, a time length-based timing determination unit that outputs a time length-based timing signal determining the output timing of the redundant word voice data according to the time length of the immediately preceding synthesized voice data, and the timing output control unit outputs the redundant word voice data in response to the time length-based timing signal. Consequently, synthesized voice can be output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operations. In addition, when a silent state continues, a redundant word voice is automatically output to fill the gap at an output timing determined according to the time length of the synthesis unit input immediately before, and the silent time accompanying the user's sentence input operation is apparently shortened, which reduces the stress felt by the party communicating with the user.
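
A time length-based timing rule might be sketched as follows. The proportional rule, its direction (a longer preceding voice tolerating a longer pause), the clamping limits, and the function name wait_for_duration are all assumptions; the specification states only that the timing depends on the time length of the immediately preceding synthesized voice.

```python
def wait_for_duration(prev_duration, ratio=0.5, min_wait=0.5, max_wait=3.0):
    """Return the allowable silence (seconds) after a synthesized voice that
    lasted `prev_duration` seconds, clamped to [min_wait, max_wait]."""
    return min(max_wait, max(min_wait, prev_duration * ratio))
```

Under these assumed parameters, a 2-second voice allows 1 second of silence before a redundant word is output, while very short or very long voices are clamped to 0.5 and 3 seconds respectively.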

【００６３】この発明によれば、出力制御部に代えて、
合成音声データを読み込み出力音声を取り出すと共に合
成音声データの入力状態を監視し、文入力操作中に前の
合成音声データの入力後の予め決めた一定時間内に次の
合成音声データの入力が行われない場合には冗長語音声
データを出力するための準備を行い、次の合成音声デー
タが入力されたときには出力音声として次の合成音声デ
ータの前に冗長語選択部が選択した冗長語音声データを
読み出して接続する冗長語前置出力制御部を備えるよう
に構成したので、ユーザの文入力操作に従って文字列の
合成音声を出力することができ、無音状態が一定時間継
続すると、冗長語音声を次の合成音声の直前に自動的に
出力して補間し、冗長語が意味のある発話単位の直前に
置かれるという自然音声の特徴を実現し、ユーザの文入
力操作に伴う無音時間を平均的に短縮することによりユ
ーザがコミュニケーションを行う相手が感じるストレス
を軽減できる効果がある。According to the present invention, the device comprises, in place of the output control unit, a redundant word prefix output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, prepares to output redundant word voice data when the next synthesized voice data is not input within a predetermined fixed time after the input of the previous synthesized voice data during the sentence input operation, and, when the next synthesized voice data is input, reads the redundant word voice data selected by the redundant word selection unit and concatenates it in front of the next synthesized voice data as the output voice. Consequently, the synthesized voice of the character string can be output in accordance with the user's sentence input operation, and when a silent state continues for a certain time, a redundant word voice is automatically output immediately before the next synthesized voice to fill the gap. This realizes the characteristic of natural speech that a redundant word is placed immediately before a meaningful utterance unit, and shortens on average the silent time accompanying the user's sentence input operation, which reduces the stress felt by the party with whom the user communicates.
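
The prefixing behaviour might be sketched as follows, again on a simulated timeline. Text concatenation stands in for the concatenation of voice data, and the function name prefix_fillers and the romanized redundant word are assumptions for illustration.

```python
def prefix_fillers(arrivals, timeout, filler):
    """arrivals: list of (time, text) pairs sorted by time.
    When more than `timeout` seconds separate two consecutive arrivals,
    the redundant word is placed in front of the later one."""
    out = []
    last = None  # arrival time of the previous synthesized voice data
    for t, text in arrivals:
        if last is not None and t - last > timeout:
            out.append((t, filler + " " + text))  # redundant word comes first
        else:
            out.append((t, text))
        last = t
    return out
```

Unlike output during the silence itself, the redundant word here is emitted only once the next synthesized voice is available, so it always leads directly into a meaningful utterance.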

【００６４】この発明によれば、ユーザが入力した文を
合成音声に変換して出力する音声コミュニケーション支
援装置において、ユーザの文入力操作により文を入力す
る文入力部と、この文入力部から入力される文の文字列
を解析して合成単位を抽出すると共にこの合成単位の言
語カテゴリを決定し、合成単位と言語カテゴリを逐次的
に出力する合成単位解析部と、合成単位解析部から出力
された合成単位の文字列を音声合成して合成音声データ
を出力する音声合成部と、合成単位の言語カテゴリごと
に冗長語音声データを予め記憶するカテゴリ別冗長語記
憶部と、合成単位解析部からの言語カテゴリに従ってカ
テゴリ別冗長語記憶部から対応する冗長語音声データを
１つ選択して出力するカテゴリ別冗長語選択部と、合成
音声データを読み込み出力音声を取り出すと共に合成音
声データの入力状態を監視し、文入力操作中に前の合成
音声データの入力後の予め決めた一定時間内に次の合成
音声データの入力が行われない場合には冗長語音声を出
力するための準備を行い、次の合成音声データが入力さ
れたときの出力音声として、次の合成音声データの前に
カテゴリ別冗長語選択部が選択した次の合成音声に対応
する冗長語音声データを読み出して接続する冗長語前置
出力制御部とを備えるように構成したので、ユーザの文
入力操作に従って合成単位ごとに逐次的に合成音声を出
力することによりユーザの操作を軽減することができる
と共に、無音状態が一定時間継続すると、後続する合成
単位の言語カテゴリから適切と判断される冗長語音声を
次の合成音声の前に接続して自動的に出力して補間し、
冗長語が意味のある発話単位の直前に置かれるという自
然音声の特徴を実現し、ユーザの文入力操作に伴う無音
時間を見かけ上平均的して短縮することによりユーザが
コミュニケーションを行う相手が感じるストレスを軽減
できる効果がある。According to the present invention, a voice communication support device that converts a sentence input by a user into synthesized voice and outputs it comprises: a sentence input unit for inputting a sentence through the user's sentence input operation; a synthesis unit analysis unit that analyzes the character string of the sentence input from the sentence input unit to extract synthesis units, determines the language category of each synthesis unit, and sequentially outputs the synthesis unit and the language category; a speech synthesis unit that performs speech synthesis on the character string of the synthesis unit output from the synthesis unit analysis unit and outputs synthesized voice data; a category-based redundant word storage unit that stores redundant word voice data in advance for each language category of the synthesis unit; a category-based redundant word selection unit that selects and outputs one corresponding redundant word voice data from the category-based redundant word storage unit according to the language category supplied by the synthesis unit analysis unit; and a redundant word prefix output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, prepares to output a redundant word voice when the next synthesized voice data is not input within a predetermined fixed time after the input of the previous synthesized voice data during the sentence input operation, and, as the output voice when the next synthesized voice data is input, reads the redundant word voice data selected by the category-based redundant word selection unit for the next synthesized voice and concatenates it in front of the next synthesized voice data. Consequently, synthesized voice can be output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operations. In addition, when a silent state continues for a certain time, a redundant word voice judged appropriate from the language category of the following synthesis unit is concatenated in front of the next synthesized voice and automatically output to fill the gap. This realizes the characteristic of natural speech that a redundant word is placed immediately before a meaningful utterance unit, and apparently shortens, on average, the silent time accompanying the user's sentence input operation, which reduces the stress felt by the party with whom the user communicates.

【００６５】この発明によれば、冗長語記憶部に代え
て、合成単位の末尾母音ごとに出力可能な冗長語音声デ
ータを予め記憶する末尾母音別冗長語記憶部を備え、冗
長語選択部に代えて、入力される合成単位の末尾母音の
種類を識別し、この末尾母音の種類に適合する冗長語音
声データを末尾母音別冗長語記憶部から１つ選択する末
尾音声別冗長語選択部を備えるように構成したので、ユ
ーザの文入力操作に従って合成単位ごとに逐次的に合成
音声を出力することによりユーザの操作を軽減すること
ができると共に、無音状態が一定時間継続すると、合成
単位の末尾母音に適合する冗長語音声データを自動的に
出力して補間し、ユーザの文入力操作に伴う無音時間を
見かけ上短縮することによりユーザがコミュニケーショ
ンを行う相手が感じるストレスを軽減できる効果があ
る。According to the present invention, the device comprises, in place of the redundant word storage unit, a tail vowel-based redundant word storage unit that stores in advance, for each tail vowel of a synthesis unit, the redundant word voice data that can be output, and, in place of the redundant word selection unit, a tail vowel-based redundant word selection unit that identifies the type of the tail vowel of the input synthesis unit and selects, from the tail vowel-based redundant word storage unit, one redundant word voice data matching that type of tail vowel. Consequently, synthesized voice can be output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operations. In addition, when a silent state continues for a certain time, redundant word voice data matching the tail vowel of the synthesis unit is automatically output to fill the gap, and the silent time accompanying the user's sentence input operation is apparently shortened, which reduces the stress felt by the party with whom the user communicates.

【００６６】この発明によれば、冗長語記憶部に代え
て、合成単位の時間長ごとに出力可能な冗長語音声デー
タを記憶する時間長別冗長語記憶部を備え、冗長語選択
部に代えて、入力された合成音声データの時間長に適合
する冗長語音声データを時間長別冗長語記憶部から１つ
選択する時間長別冗長語選択部を備えるように構成した
ので、ユーザの文入力操作に従って合成単位ごとに逐次
的に合成音声を出力することによりユーザの操作を軽減
することができると共に、無音状態が一定時間継続する
と、合成単位の時間長に適合する冗長語音声を自動的に
出力して補間し、ユーザの文入力操作に伴う無音時間を
見かけ上短縮することによりユーザがコミュニケーショ
ンを行う相手が感じるストレスを軽減できるの効果があ
る。According to the present invention, the device comprises, in place of the redundant word storage unit, a time length-based redundant word storage unit that stores, for each time length of a synthesis unit, the redundant word voice data that can be output, and, in place of the redundant word selection unit, a time length-based redundant word selection unit that selects, from the time length-based redundant word storage unit, one redundant word voice data matching the time length of the input synthesized voice data. Consequently, synthesized voice can be output sequentially for each synthesis unit in accordance with the user's sentence input operation, which reduces the user's operations. In addition, when a silent state continues for a certain time, a redundant word voice matching the time length of the synthesis unit is automatically output to fill the gap, and the silent time accompanying the user's sentence input operation is apparently shortened, which reduces the stress felt by the party with whom the user communicates.
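
A time length-based redundant word storage and selection might be sketched with duration buckets. The bucket boundaries, the romanized redundant words, the pairing of longer voices with longer redundant words, and the function name fillers_for_duration are assumptions made for illustration.

```python
import bisect

# Hypothetical bucket boundaries (seconds) for the time length of the
# synthesized voice, and the redundant words stored for each bucket.
BOUNDS = [1.0, 3.0]
FILLERS_BY_BUCKET = [
    ["e"],            # shorter than 1 s
    ["eeto"],         # 1 s up to (not including) 3 s
    ["eeto desune"],  # 3 s or longer
]

def fillers_for_duration(duration):
    """Return the redundant words stored for the bucket containing `duration`."""
    return FILLERS_BY_BUCKET[bisect.bisect_right(BOUNDS, duration)]
```

The selection unit would then pick one word from the returned list, for example at random.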

【００６７】この発明によれば、出力制御部に代えて、
合成音声データを読み込み出力音声を取り出すと共に出
力した合成音声データを一時的に記憶し、合成音声デー
タの入力状態を監視して合成音声データが一定時間入力
されないことを感知した場合には、冗長語選択部が選択
した冗長語音声データから冗長語音声を出力するか、ま
たは記憶されている合成音声データを再出力する復唱出
力制御部を備えるように構成したので、ユーザの文入力
操作に従って合成音声を出力することができ、さらに無
音状態が一定時間継続すると、直前に出力された合成音
声を自動的に再出力して補間し、ユーザの文入力操作に
伴う無音時間を見かけ上短縮することによりユーザがコ
ミュニケーションを行う相手が感じるストレスを軽減で
きる効果がある。According to the present invention, the device comprises, in place of the output control unit, a repetition output control unit that reads the synthesized voice data and extracts the output voice, temporarily stores the synthesized voice data that has been output, monitors the input state of the synthesized voice data, and, when it detects that no synthesized voice data has been input for a certain time, either outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit or outputs the stored synthesized voice data again. Consequently, synthesized voice can be output in accordance with the user's sentence input operation, and when a silent state continues for a certain time, the synthesized voice output immediately before is automatically output again to fill the gap, so that the silent time accompanying the user's sentence input operation is apparently shortened, which reduces the stress felt by the party with whom the user communicates.
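
The choice made by the repetition output control unit on a silence timeout might be sketched as follows. The boolean repeat flag is a hypothetical stand-in for whatever policy decides between the two outputs, since this passage does not state how that choice is made; the function name on_silence is likewise an assumption.

```python
import random

def on_silence(last_speech, fillers, repeat, rng=random):
    """Decide what to output when no synthesized voice data has arrived
    for the allowed time.
    last_speech: the stored synthesized voice output immediately before,
                 or None if nothing has been output yet.
    repeat: assumed policy flag; True means re-output the stored voice."""
    if repeat and last_speech is not None:
        return last_speech         # repeat the previous synthesized voice
    return rng.choice(fillers)     # otherwise output a redundant word
```

When nothing has been stored yet, the sketch falls back to a redundant word even if repetition is requested.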

【００６８】この発明によれば、ユーザが入力した文を
合成音声に変換して出力する音声コミュニケーション支
援装置において、ユーザの文入力操作により文を入力
し、入力された文を音声合成して合成音声データを生成
し、適当数の所定の冗長語音声データを予め記憶してお
き、生成された合成音声データによる出力音声を取り出
すと共に合成音声データの入力状態を監視し、入力操作
中で合成音声データが一定時間入力されないことを感知
した場合には記憶された冗長語音声データを選択して冗
長語音声を出力するように構成したので、ユーザの文入
力操作に従って合成音声を出力することができ、文入力
操作中に無音状態が一定時間継続すると冗長語音声を自
動的に出力して補間し、ユーザの文入力操作に伴い発生
する無音時間を見かけ上短縮することによりユーザとコ
ミュニケーションを行う相手方が感じるストレスを軽減
できる効果がある。According to the present invention, in a voice communication support device that converts a sentence input by a user into synthesized voice and outputs it, a sentence is input through the user's sentence input operation, speech synthesis is performed on the input sentence to generate synthesized voice data, an appropriate number of predetermined redundant word voice data are stored in advance, the output voice based on the generated synthesized voice data is extracted while the input state of the synthesized voice data is monitored, and, when it is detected during the input operation that no synthesized voice data has been input for a certain time, stored redundant word voice data is selected and a redundant word voice is output. Consequently, synthesized voice can be output in accordance with the user's sentence input operation, and when a silent state continues for a certain time during the sentence input operation, a redundant word voice is automatically output to fill the gap. The silent time arising from the user's sentence input operation is thereby apparently shortened, which reduces the stress felt by the party communicating with the user.

[Brief description of the drawings]

【図１】この発明の実施の形態１による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of a voice communication support device according to a first embodiment of the present invention.

【図２】この発明の実施の形態２による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 2 is a block diagram showing a configuration of a voice communication support device according to a second embodiment of the present invention.

【図３】この発明の実施の形態３による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 3 is a block diagram showing a configuration of a voice communication support device according to a third embodiment of the present invention.

【図４】この発明の実施の形態４による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 4 is a block diagram showing a configuration of a voice communication support device according to a fourth embodiment of the present invention.

【図５】この発明の実施の形態５による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 5 is a block diagram showing a configuration of a voice communication support device according to a fifth embodiment of the present invention.

【図６】この発明の実施の形態６による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 6 is a block diagram showing a configuration of a voice communication support device according to a sixth embodiment of the present invention.

【図７】この発明の実施の形態７による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 7 is a block diagram showing a configuration of a voice communication support device according to a seventh embodiment of the present invention.

【図８】この発明の実施の形態８による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 8 is a block diagram showing a configuration of a voice communication support device according to an eighth embodiment of the present invention.

【図９】この発明の実施の形態９による音声コミュニ
ケーション支援装置の構成を示すブロック図である。FIG. 9 is a block diagram showing a configuration of a voice communication support device according to a ninth embodiment of the present invention.

【図１０】この発明の実施の形態１による合成音声お
よび冗長語音声出力の具体例を示す説明図である。FIG. 10 is an explanatory diagram showing a specific example of synthesized voice and redundant word voice output according to the first embodiment of the present invention.

【図１１】この発明の実施の形態２による合成音声お
よび冗長語音声出力の具体例を示す説明図である。FIG. 11 is an explanatory diagram showing a specific example of synthesized voice and redundant word voice output according to the second embodiment of the present invention.

【図１２】この発明の実施の形態３による言語カテゴ
リと無音許容時間の関係を示す説明図である。FIG. 12 is an explanatory diagram showing a relationship between a language category and a silence allowable time according to Embodiment 3 of the present invention.

【図１３】この発明の実施の形態３によるタイミング
決定部の動作を説明するための説明図である。FIG. 13 is an explanatory diagram illustrating an operation of a timing determining unit according to the third embodiment of the present invention.

【図１４】この発明の実施の形態５による合成音声お
よび冗長語音声出力の具体例を示す説明図である。FIG. 14 is an explanatory diagram showing a specific example of synthesized speech and redundant word speech output according to Embodiment 5 of the present invention.

【図１５】この発明の実施の形態６による言語カテゴ
リと冗長語の対応の例を示す説明図である。FIG. 15 is an explanatory diagram showing an example of the correspondence between language categories and redundant words according to the sixth embodiment of the present invention.

【図１６】この発明の実施の形態６による冗長語の選
択動作を説明するための説明図である。FIG. 16 is an explanatory diagram illustrating an operation of selecting a redundant word according to a sixth embodiment of the present invention.

【図１７】この発明の実施の形態７による合成単位末
尾母音と冗長語の対応の例を示す説明図である。FIG. 17 is an explanatory diagram showing an example of the correspondence between the tail vowel of a synthesis unit and redundant words according to the seventh embodiment of the present invention.

【図１８】従来の音声コミュニケーション支援装置の
構成を示すブロック図である。FIG. 18 is a block diagram showing a configuration of a conventional voice communication support device.

[Explanation of symbols]

１１文入力部、１２文字列出力部、１３音声合成
部、１４冗長語記憶部、１５冗長語選択部、１６
出力制御部、２１合成単位抽出部、３１合成単位解
析部、３２タイミング決定部、３３タイミング出力
制御部、４１時間長別タイミング決定部、５１冗長語
前置出力制御部、６１カテゴリ別冗長語記憶部、６２
カテゴリ別冗長語選択部、７１末尾母音別冗長語記
憶部、７２末尾母音別冗長語選択部、８１時間長別
冗長語記憶部、８２時間長別冗長語選択部、９１復
唱出力制御部。11 sentence input unit, 12 character string output unit, 13 speech synthesis unit, 14 redundant word storage unit, 15 redundant word selection unit, 16 output control unit, 21 synthesis unit extraction unit, 31 synthesis unit analysis unit, 32 timing determination unit, 33 timing output control unit, 41 time length-based timing determination unit, 51 redundant word prefix output control unit, 61 category-based redundant word storage unit, 62 category-based redundant word selection unit, 71 tail vowel-based redundant word storage unit, 72 tail vowel-based redundant word selection unit, 81 time length-based redundant word storage unit, 82 time length-based redundant word selection unit, 91 repetition output control unit.

フロントページの続き (72)発明者鈴木忠東京都千代田区丸の内二丁目２番３号三菱電機株式会社内Ｆターム(参考） 5D045 AA11 DA11 Continued from the front page: (72) Inventor: Tadashi Suzuki, c/o Mitsubishi Electric Corporation, 2-2-3 Marunouchi, Chiyoda-ku, Tokyo. F-term (reference): 5D045 AA11 DA11

Claims

[Claims]

1. A voice communication support device for converting a sentence input by a user into synthesized voice and outputting the synthesized voice, comprising: a sentence input unit for inputting a sentence through the user's sentence input operation; a character string output unit that temporarily stores the character string of the input sentence and outputs it when the user performs an output operation; a speech synthesis unit that performs speech synthesis on the character string output from the character string output unit and outputs synthesized voice data; a redundant word storage unit that stores in advance an appropriate number of predetermined redundant word voice data; a redundant word selection unit that selects and outputs one redundant word voice data from the redundant word storage unit; and an output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, and, when it detects that the synthesized voice data has not been input for a certain time, outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit.

2. The voice communication support device according to claim 1, comprising, in place of the character string output unit, a synthesis unit extraction unit that analyzes the character string of the sentence input from the sentence input unit to extract synthesis units and sequentially outputs the extracted synthesis units.

3. A voice communication support device for converting a sentence input by a user into synthesized voice and outputting the synthesized voice, comprising: a sentence input unit for inputting a sentence through the user's sentence input operation; a synthesis unit analysis unit that analyzes the character string of the sentence input from the sentence input unit to extract synthesis units, determines the language category of each synthesis unit, and sequentially outputs the synthesis unit and the language category; a speech synthesis unit that performs speech synthesis on the character string of the synthesis unit output from the synthesis unit analysis unit and outputs synthesized voice data; a redundant word storage unit that stores in advance an appropriate number of predetermined redundant word voice data; a redundant word selection unit that selects and outputs one redundant word voice data from the redundant word storage unit; a timing determination unit that outputs a timing signal determining the output timing of the redundant word voice data according to the input language category; and a timing output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, and, when the timing signal is input during the sentence input operation, outputs a redundant word voice based on the redundant word voice data selected by the redundant word selection unit.

4. The voice communication support device according to claim 3, comprising, in place of the timing determination unit, a time length-based timing determination unit that outputs a time length-based timing signal determining the output timing of the redundant word voice data according to the time length of the immediately preceding synthesized voice, wherein the timing output control unit outputs the redundant word voice data in response to the time length-based timing signal.

5. The voice communication support device according to claim 1, comprising, in place of the output control unit, a redundant word prefix output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, prepares to output the redundant word voice data when the next synthesized voice data is not input within a predetermined fixed time after the input of the previous synthesized voice data during the sentence input operation, and, when the next synthesized voice data is input, reads the redundant word voice data selected by the redundant word selection unit and concatenates it in front of the next synthesized voice data as the output voice.

6. A voice communication support device for converting a sentence input by a user into synthesized voice and outputting the synthesized voice, comprising: a sentence input unit for inputting a sentence through the user's sentence input operation; a synthesis unit analysis unit that analyzes the character string of the sentence input from the sentence input unit to extract synthesis units, determines the language category of each synthesis unit, and sequentially outputs the synthesis unit and the language category; a speech synthesis unit that performs speech synthesis on the character string of the synthesis unit output from the synthesis unit analysis unit and outputs synthesized voice data; a category-based redundant word storage unit that stores redundant word voice data in advance for each language category of the synthesis unit; a category-based redundant word selection unit that selects and outputs one corresponding redundant word voice data from the category-based redundant word storage unit according to the language category supplied by the synthesis unit analysis unit; and a redundant word prefix output control unit that reads the synthesized voice data and extracts the output voice, monitors the input state of the synthesized voice data, prepares to output a redundant word voice when the next synthesized voice data is not input within a predetermined fixed time after the input of the previous synthesized voice data during the sentence input operation, and, as the output voice when the next synthesized voice data is input, reads the redundant word voice data selected by the category-based redundant word selection unit for the next synthesized voice and concatenates it in front of the next synthesized voice data.

7. The voice communication support device according to any one of claims 2 to 5, comprising: in place of the redundant word storage unit, a tail vowel-based redundant word storage unit that stores in advance, for each tail vowel of a synthesis unit, the redundant word voice data that can be output; and, in place of the redundant word selection unit, a tail vowel-based redundant word selection unit that identifies the type of the tail vowel of the input synthesis unit and selects, from the tail vowel-based redundant word storage unit, one redundant word voice data matching that type of tail vowel.

8. A redundant word storage unit for storing a redundant word voice data which can be output for each time length of a synthesis unit in place of the redundant word storage unit. 6. A redundant word selecting unit for each time length which selects one redundant word voice data matching the time length of the synthesized voice data from the redundant word storing unit for each time length. Any one of them
The voice communication support device according to the item.

9. The voice communication support device according to any one of claims 1 to 8, comprising, in place of the output control unit, a repetition output control unit that reads the synthesized voice data and extracts the output voice, temporarily stores the synthesized voice data that has been output, monitors the input state of the synthesized voice data, and, when it detects that the synthesized voice data has not been input for a certain time, either outputs a redundant word voice from the redundant word voice data selected by the redundant word selection unit or outputs the stored synthesized voice data again.

10. A voice communication support method for use in a voice communication support device that converts a sentence input by a user into synthesized voice and outputs the synthesized voice, the method comprising: storing in advance an appropriate number of predetermined redundant word voice data; inputting a sentence through the user's sentence input operation; performing speech synthesis on the input sentence to generate synthesized voice data; extracting an output voice based on the generated synthesized voice data while monitoring the input state of the synthesized voice data; and, when it is detected during the input operation that the synthesized voice data has not been input for a certain time, selecting stored redundant word voice data and outputting a redundant word voice.