JP2006011002A

JP2006011002A - Unit, method and program for audio response

Info

Publication number: JP2006011002A
Application number: JP2004187358A
Authority: JP
Inventors: Hiroyuki Fujii (藤井洋之); Michio Okada (岡田美智男)
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2004-06-25
Filing date: 2004-06-25
Publication date: 2006-01-12

Abstract

PROBLEM TO BE SOLVED: To provide a unit, method and program for audio response that guide an inexperienced user in an easy-to-understand way and guide an experienced user with less annoyance.

SOLUTION: The CPU of the audio response unit gives vocal guidance for operating the equipment (step S1). The CPU then decides whether a backward operation is needed (step S2). When a backward operation is not needed, the CPU decides whether the time lag until the user's next operation is long (step S3). When the time until the user's next operation is shorter than a specified time, the CPU advances the stage of simplification of the voice text (step S4). When a backward operation is needed in step S2, or when the time until the next operation exceeds the specified time in step S3, the CPU moves the stage of simplification of the voice text back.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to a voice response device, a voice response method, and a voice response program.

Voice-based navigation systems are widely used in various devices such as car navigation systems (hereinafter, "car navi") and video tape recorders (see, for example, Patent Document 1).

When using such a device, the user is guided through the operating procedure of the device by the voice of the voice navigation system.

Patent Document 1: JP 2000-231697 A

In general, even when a device using such a voice navigation system is used for a long period, the spoken guidance on the operating procedure does not change.

The user, however, becomes skilled at operating the device over a long period of use. For example, while the user is a beginner unfamiliar with the operating procedure, polite spoken expressions using phrases such as "Understood" or "Please say that again" are helpful, but as the user becomes accustomed to the procedure, the user comes to find this voice annoying. A skilled user may then mute the voice.

Nevertheless, some form of guidance by sound or voice is important for the user to grasp how far the voice navigation system has progressed. In car navigation systems in particular, voice guidance is indispensable so as not to disturb the driver's concentration on driving.

An object of the present invention is to provide a voice response device, a voice response method, and a voice response program that give easy-to-understand guidance when the user is a beginner and give guidance that reduces annoyance to the user when the user is an expert.

A voice response device according to a first invention is a voice response device that guides operation of a device by voice, and comprises voice output means for outputting by voice an explanation of the operation of the device, determination means for determining a proficiency level of operation of the device, and simplification means for simplifying the voice output by the voice output means based on the proficiency level determined by the determination means.

In the voice response device according to the present invention, the explanation of the operation of the device is output by voice by the voice output means, and the user's proficiency level in operating the device is determined by the determination means. The voice output by the voice output means is then simplified by the simplification means based on the proficiency level determined by the determination means.

Because the simplification means simplifies the voice output continuously while suitably preserving prosodic information such as intonation, an easy-to-understand spoken explanation is given when the user is a beginner, and a spoken explanation that reduces annoyance to the user is given when the user is an expert.

In addition, because the explanation of the operation of the device changes with the user's proficiency, the user can be expected to become attached to the voice response device and to feel greater affinity for it.

The simplification means may simplify the voice output by desegmenting the voice. In this case, based on the user's proficiency level, the simplification means changes the spoken explanation continuously from ordinary linguistic speech to non-segmental speech while suitably preserving prosodic information such as intonation, which reduces annoyance to the user. The user can thus use the voice response device comfortably.

The simplification means may desegment the voice by reducing the frequency band of the voice and increasing the playback speed of the voice. In this case, the reduced frequency band and increased playback speed change the spoken explanation continuously from ordinary linguistic speech to non-segmental speech while suitably preserving prosodic information such as intonation, which reduces annoyance to the user. The user can thus use the voice response device comfortably.

The simplification means may simplify the voice output by omitting some of the words of the explanation. In this case, the simplification means omits some of the words of the spoken explanation based on the user's proficiency level, which reduces annoyance to the user. The user can thus use the voice response device comfortably.

The determination means may determine the proficiency level of the operation based on redoing of an operation of the device or on the operation time. In this case, situations such as the user failing to perform the desired operation or the user hesitating over an operation are evaluated by the determination means as indicators of the user's proficiency level. This allows operation guided by the voice response device to proceed smoothly.

The simplification means may simplify the voice output step by step when no operation of the device has been redone and the operation time does not exceed a predetermined time, and may move the stage of simplification of the voice output back when an operation of the device has been redone or the operation time exceeds the predetermined time.

In this case, the voice output is simplified step by step, and the stage of simplification is moved back, as appropriate for the user based on the user's proficiency level in operating the device.

The voice response device may further comprise recognition means for recognizing the user, and the determination means may determine the proficiency level of operation of the device for each user recognized by the recognition means.

In this case, each of a plurality of users can receive a spoken explanation based on that user's own proficiency level.

A voice response method according to a second invention is a voice response method for guiding operation of a device by voice, and comprises the steps of outputting by voice an explanation of the operation of the device, determining a proficiency level of operation of the device, and simplifying the voice output based on the determined proficiency level.

In the voice response method according to the present invention, the explanation of the operation of the device is output by voice, and the user's proficiency level in operating the device is determined. The voice output is then simplified based on the determined proficiency level.

Because the voice output is thereby simplified continuously while suitably preserving prosodic information such as intonation, an easy-to-understand spoken explanation is given when the user is a beginner, and a spoken explanation that reduces annoyance to the user is given when the user is an expert.

In addition, because the explanation of the operation of the device changes with the user's proficiency, the user can be expected to become attached to the voice response and to feel greater affinity for it.

A voice response program according to a third invention is a computer-executable voice response program for guiding operation of a device by voice, and causes a computer to execute a process of outputting by voice an explanation of the operation of the device, a process of determining a proficiency level of operation of the device, and a process of simplifying the voice output based on the determined proficiency level.

In the voice response program according to the present invention, the explanation of the operation of the device is output by voice, and the user's proficiency level in operating the device is determined. The voice output is then simplified based on the determined proficiency level.

In addition, because the explanation of the operation of the device changes with the user's proficiency, the user can be expected to become attached to the voice response and to feel greater affinity for it.

According to the present invention, the voice output is simplified continuously while suitably preserving prosodic information such as intonation, so that an easy-to-understand spoken explanation is given when the user is a beginner, and a spoken explanation that reduces annoyance to the user is given when the user is an expert.

A voice response device according to the present embodiment (hereinafter simply called the response device) is described below with reference to the drawings.

The response device according to the present embodiment can be used in various devices (hereinafter simply called devices) such as a car navigation system or the recording reservation system of a video tape recorder.

FIG. 1 is a block diagram showing the configuration of the response device according to the present embodiment.

As shown in FIG. 1, the response device 10 includes a CPU (central processing unit) 1, a ROM (read-only memory) 2, a RAM (random access memory) 3, an input device 4, a display device 5, an external storage device 6, a recording medium driving device 7, and a speaker 8.

The input device 4 consists of a keyboard, a mouse, a scanner, a digital camera, and the like, and is used to input various commands, data, and images.

The ROM 2 stores a system program. The recording medium driving device 7 consists of a CD-ROM drive, a floppy (registered trademark) disk drive, or the like, and reads data from and writes data to a recording medium 9 such as a CD-ROM or a floppy disk.

The recording medium 9 stores the voice response program. The external storage device 6 consists of a hard disk device or the like, and stores the voice response program and various data read from the recording medium 9 via the recording medium driving device 7. The CPU 1 executes the voice response program stored in the external storage device 6 on the RAM 3.

The display device 5 consists of a liquid crystal display panel, a CRT (cathode ray tube), or the like, and displays various images. The speaker 8 outputs the guidance voice based on the voice response program.

Various recording media, such as a semiconductor memory (for example, a ROM) or a hard disk, can be used as the recording medium 9 that stores the voice response program. The voice response program may also be downloaded to the external storage device 6 via a communication medium such as a communication line and executed on the RAM 3.

FIG. 2 is a block diagram showing an example of the configuration of the voice response program stored in the external storage device 6 of the response device 10 according to the present embodiment.

As shown in FIG. 2, the voice response program stored in the external storage device 6 consists of an operation smoothness determination module 6a, a speech synthesis module 6b, and a speech synthesis parameter setting module 6c.

The operation smoothness determination module 6a determines whether the user is operating the device smoothly.

The speech synthesis module 6b synthesizes the explanation text as speech. The speech synthesis parameter setting module 6c sets parameters of the synthesized speech such as its frequency and playback speed. Specifically, the speech synthesis parameter setting module 6c passes the speech through a low-pass filter while narrowing the filter's pass band and, at the same time, increases the playback speed of the speech step by step. The synthesized speech is thereby desegmented.

Synthesized speech desegmented in this way is less intelligible and of lower quality as speech than the original synthesized speech. A user who is used to hearing the original synthesized speech can nevertheless understand well enough what the explanation means, and comes to take the voice less as a spoken announcement and more as a simple cue.
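The low-pass filtering and speed-up described for module 6c can be sketched as follows. This is only an illustration under stated assumptions: the patent names neither a filter design nor a resampling method, so the one-pole filter, the linear-interpolation resampler (which also raises pitch), the `STAGES` table with its cutoff and speed values, and all function names here are hypothetical.

```python
import math

# Hypothetical per-stage parameters: (low-pass cutoff in Hz, speed factor).
# Stage 0 leaves the speech untouched. None of these values come from the patent.
STAGES = [None, (4000.0, 1.25), (2000.0, 1.5), (1000.0, 2.0)]


def low_pass(samples, cutoff_hz, sample_rate):
    """One-pole low-pass filter: attenuates content above cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev += alpha * (x - prev)
        out.append(prev)
    return out


def speed_up(samples, factor):
    """Shorten the signal by `factor` using linear interpolation."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    out = []
    for i in range(int(len(samples) / factor)):
        pos = i * factor
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def desegment(samples, stage, sample_rate=16000):
    """Apply the (assumed) band reduction and speed-up for a given stage."""
    stage = max(0, min(stage, len(STAGES) - 1))
    params = STAGES[stage]
    if params is None:
        return list(samples)
    cutoff_hz, factor = params
    return speed_up(low_pass(samples, cutoff_hz, sample_rate), factor)
```

Calling `desegment(samples, stage)` with a larger `stage` yields a shorter, more muffled signal, which is the direction of change the text describes; a real implementation would more likely adjust the synthesizer's own rate and bandwidth parameters.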

FIG. 3 is a block diagram showing another example of the configuration of the voice response program stored in the external storage device 6 of the response device 10 according to the present embodiment.

The voice response program of FIG. 3 differs from that of FIG. 2 in that it includes an operation guidance text generation module 6d in place of the speech synthesis parameter setting module 6c.

The operation guidance text generation module 6d simplifies the voice output by omitting words from the explanation text. Details are given below.

FIG. 4 is an explanatory diagram showing the stages of simplification of the explanation text by the operation guidance text generation module 6d of the external storage device 6 of FIG. 3. The texts shown in FIGS. 4(a), (b), and (c) are output as voice from the speaker 8 in this order. The explanation texts are stored in the external storage device 6 in advance.

As shown in FIGS. 4(a), (b), and (c), the texts are, for example, 「目的地を設定するときは設定のボタンを押してください」 ("To set a destination, please press the Set button"), 「エリアを選択するときは選択のボタンを押してください」 ("To select an area, please press the Select button"), and 「やり直しは※のボタンを押してください」 ("To redo, please press the ※ button"), respectively.

When the simplification advances by one stage, the segments marked <1> in FIG. 4(a), 「を設定するときは」 ("when setting") and 「のボタン」 ("button"), are omitted. The simplified text is 「目的地は設定を押してください」 ("For the destination, please press Set").

Similarly, the segments marked <1> in FIG. 4(b), 「を選択するときは」 ("when selecting") and 「のボタン」 ("button"), are omitted. The simplified text is 「エリアは選択を押してください」 ("For the area, please press Select").

Similarly, the segment marked <1> in FIG. 4(c), 「のボタン」 ("button"), is omitted. The simplified text is 「やり直しは※を押してください」 ("To redo, please press ※").

When the simplification advances by one more stage, the segment marked <2> in FIG. 4(a), 「押してください」 ("please press"), is omitted. The simplified text is 「目的地は設定を」 ("For the destination, Set").

Similarly, the segment marked <2> in FIG. 4(b), 「押してください」 ("please press"), is omitted. The simplified text is 「エリアは選択を」 ("For the area, Select").

When the simplification advances by one more stage, the segment marked <3> in FIG. 4(a), the object particle 「を」, is omitted. The simplified text is 「目的地は設定」 ("Destination, Set").

Similarly, the segment marked <3> in FIG. 4(b), the particle 「を」, is omitted. The simplified text is 「エリアは選択」 ("Area, Select").

When the simplification advances by one more stage, the segments marked <4> in FIG. 4(c), 「を」 and 「押してください」 ("please press"), are omitted. The simplified text is 「やり直しは※」 ("Redo, ※").

When the simplification advances further, the segments marked <5> in FIG. 4(c), 「やり直しは」 ("to redo") and 「※」, are omitted. That is, the whole text of FIG. 4(c), 「やり直しは※のボタンを押してください」, is omitted.

Thus, in this example, words of the explanation text are omitted step by step as the user's proficiency in the operation increases. The speech is thereby desegmented.

The order of omission shown in FIG. 4 is only an example and is not limiting; the order in which text is omitted can be changed as appropriate.
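The staged omission of FIG. 4 can be modelled by tagging each segment of a guidance text with the stage at which it is dropped. The sketch below is a hypothetical reading of the examples for FIGS. 4(a) and 4(c): the data layout and the names `GUIDANCE` and `simplify` are assumptions, and the particle 「は」 in text (a) is kept as a separate, never-omitted segment so that the results match the simplified texts quoted in the description.

```python
# Each text is a list of (segment, omit_from_stage) pairs.
# omit_from_stage is None for segments that are never omitted.
# The stage tags follow the <1>..<5> marks described for FIG. 4.
GUIDANCE = {
    "a": [
        ("目的地", None),
        ("を設定するとき", 1),
        ("は", None),       # assumed to remain, as in the quoted results
        ("設定", None),
        ("のボタン", 1),
        ("を", 3),
        ("押してください", 2),
    ],
    "c": [
        ("やり直しは", 5),
        ("※", 5),
        ("のボタン", 1),
        ("を", 4),
        ("押してください", 4),
    ],
}


def simplify(segments, stage):
    """Return the text with every segment tagged at or below `stage` omitted."""
    return "".join(
        text for text, omit_from in segments
        if omit_from is None or stage < omit_from
    )


if __name__ == "__main__":
    for stage in range(6):
        print(stage, simplify(GUIDANCE["a"], stage), "/", simplify(GUIDANCE["c"], stage))
```

Keeping the stage tags in data rather than in code makes it easy to change the order of omission, which the description explicitly allows.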

Next, simplification of the operation guidance by the response device 10 according to the present embodiment is described. In the response device 10, before the operation guidance is simplified, it is first determined whether the user is operating the device smoothly.

FIG. 5 is a flowchart showing simplification of the voice based on the voice response program in the response device 10 according to the present embodiment.

As shown in FIG. 5, the CPU 1 of the response device 10 first gives voice guidance on operating the device (step S1). The guidance voice is output from the speaker 8.

Next, the CPU 1 determines whether a backward operation is needed (step S2). For example, the CPU 1 determines whether the user has pressed a redo switch provided on the device. If the redo switch has been pressed, a backward operation is determined to be needed; if it has not been pressed, a backward operation is determined not to be needed.

If a backward operation is not needed, the CPU 1 determines whether the time lag until the user's next operation is long (step S3). Specifically, it determines whether the time until the user's next operation exceeds a predetermined time.

If the time until the user's next operation does not exceed the predetermined time, the CPU 1 advances the stage of simplification of the voice text (step S4). The process then returns to step S1.

Here, the stage of simplification means either the stage of simplification of the voice text by the operation guidance text generation module 6d, following the order shown in FIG. 4, or the stage of simplification of the voice text by the speech synthesis parameter setting module 6c.

If a backward operation is determined to be needed in step S2, or if the time until the user's next operation is determined to exceed the predetermined time in step S3, the CPU 1 moves the stage of simplification of the voice text back (step S5). The process then returns to step S1, and operation guidance is given based on the text at the restored stage of simplification.
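The decision logic of steps S2 to S5 can be written as a small state holder. This is a minimal sketch under assumptions: the class and parameter names are hypothetical, the 5-second limit stands in for the unspecified "predetermined time", and moving the stage back is taken to mean one step back, which the description does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class GuidanceController:
    """Tracks the simplification stage across guidance steps (FIG. 5)."""

    max_stage: int = 5          # assumed number of stages
    time_limit_s: float = 5.0   # assumed "predetermined time"
    stage: int = 0              # 0 = full, unsimplified guidance

    def update(self, redo_pressed: bool, response_time_s: float) -> int:
        """Apply steps S2 to S5 to one user operation and return the new stage."""
        if redo_pressed:                              # S2: backward operation needed
            self.stage = max(0, self.stage - 1)       # S5: move stage back
        elif response_time_s > self.time_limit_s:     # S3: time lag too long
            self.stage = max(0, self.stage - 1)       # S5: move stage back
        else:
            self.stage = min(self.max_stage, self.stage + 1)  # S4: advance
        return self.stage
```

After each `update`, the caller would give the next guidance (step S1) at the returned stage, using either the text-omission or the synthesis-parameter form of simplification.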

In the present embodiment, the speech synthesis parameter setting module 6c and the operation guidance text generation module 6d simplify the voice output continuously while suitably preserving prosodic information such as intonation. As a result, an easy-to-understand spoken explanation is given when the user is a beginner, and a spoken explanation that reduces annoyance to the user is given when the user is an expert.

In addition, because the explanation of the operation of the device changes with the user's proficiency, the user can be expected to become attached to the response device 10 and to feel greater affinity for it.

The response device 10 according to the above embodiment can be provided directly in the device, but it can also be connected to the device via a network.

The response device 10 according to the above embodiment can also be provided in a pet robot. In this case, the simplification of the voice output described above advances according to the user's proficiency in operating the pet robot. Here, the proficiency level is determined based on factors such as the frequency and smoothness of conversation between the pet robot and the user.

The response device 10 may also include a recognition module for recognizing the user. The recognition module recognizes the user by, for example, the user's face, fingerprint, or voice, or a password entered by the user. In this case, each of a plurality of users can receive a spoken explanation based on that user's own proficiency level.
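Per-user proficiency can be kept by storing one simplification stage per recognized user. The sketch below assumes that the recognition module yields some hashable user identifier; how the user is recognized (face, fingerprint, voice, or password) is outside its scope, and the class and method names are hypothetical.

```python
class PerUserStages:
    """Keeps a separate simplification stage for each recognized user."""

    def __init__(self, max_stage=5):
        self.max_stage = max_stage   # assumed number of stages
        self._stages = {}            # user identifier -> current stage

    def stage_for(self, user_id):
        """Unknown users start at stage 0, i.e. full guidance."""
        return self._stages.get(user_id, 0)

    def record(self, user_id, smooth):
        """Advance the user's stage after a smooth operation, else step back."""
        current = self.stage_for(user_id)
        if smooth:
            new_stage = min(self.max_stage, current + 1)
        else:
            new_stage = max(0, current - 1)
        self._stages[user_id] = new_stage
        return new_stage
```

With this arrangement an experienced user hears terse guidance while a newcomer to the same device still hears the full explanation.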

In the present embodiment, the response device 10 corresponds to the voice response device, the speaker 8 corresponds to the voice output means, the operation smoothness determination module 6a corresponds to the determination means, the speech synthesis parameter setting module 6c and the operation guidance text generation module 6d correspond to the simplification means, and the recognition module corresponds to the recognition means.

In the above embodiment, the determination means, the simplification means, and the recognition means are implemented by the CPU 1 and the voice response program, but they may instead be implemented in hardware such as electronic circuits.

The present invention can be used in various devices such as car navigation systems, recording reservation systems of video tape recorders, pet robots, and ticket reservation systems.

FIG. 1 is a block diagram showing the configuration of the response device according to the present embodiment.
FIG. 2 is a block diagram showing an example of the configuration of the voice response program stored in the external storage device of the response device according to the present embodiment.
FIG. 3 is a block diagram showing another example of the configuration of the voice response program stored in the external storage device of the response device according to the present embodiment.
FIG. 4 is an explanatory diagram showing the stages of simplification of the explanation text by the operation guidance text generation module of the external storage device of FIG. 3.
FIG. 5 is a flowchart showing simplification of the voice based on the voice response program in the response device according to the present embodiment.

Explanation of symbols

1 CPU
2 ROM
3 RAM
4 Input device
5 Display device
6 External storage device
6a Operation smoothness determination module
6b Speech synthesis module
6c Speech synthesis parameter setting module
6d Operation guidance text generation module
7 Recording medium driving device
8 Speaker
9 Recording medium
10 Response device

Claims

1. A voice response device for guiding operation of a device by voice, comprising:
voice output means for outputting by voice an explanation of the operation of the device;
determination means for determining a proficiency level of operation of the device; and
simplification means for simplifying the voice output by the voice output means based on the proficiency level determined by the determination means.

2. The voice response device according to claim 1, wherein the simplification means simplifies the voice output by desegmenting the voice.

3. The voice response device according to claim 2, wherein the simplification means desegments the voice by reducing the frequency band of the voice and increasing the playback speed of the voice.

4. The voice response device according to claim 1, wherein the simplification means simplifies the voice output by omitting some of the words of the explanation.

5. The voice response device according to claim 1, wherein the determination means determines the proficiency level of the operation based on redoing of an operation of the device or on the operation time.

6. The voice response device according to claim 1, wherein the simplification means simplifies the voice output step by step when no operation of the device has been redone and the operation time does not exceed a predetermined time, and moves the stage of simplification of the voice output back when an operation of the device has been redone or the operation time exceeds the predetermined time.

7. The voice response device according to claim 1, further comprising recognition means for recognizing a user,
wherein the determination means determines the proficiency level of operation of the device for each user recognized by the recognition means.

8. A voice response method for guiding operation of a device by voice, comprising the steps of:
outputting by voice an explanation of the operation of the device;
determining a proficiency level of operation of the device; and
simplifying the voice output based on the determined proficiency level.

9. A voice response program executable by a computer for guiding operation of a device by voice, the program causing the computer to execute:
a process of outputting by voice an explanation of the operation of the device;
a process of determining a proficiency level of operation of the device; and
a process of simplifying the voice output based on the determined proficiency level.