JP2004117905A

JP2004117905A - Method and device for information access using voice

Info

Publication number: JP2004117905A
Application number: JP2002281841A
Authority: JP
Inventors: Naoki Sashita; 指田　直毅; Ichiji Ishigaki; 石垣　一司
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2002-09-26
Filing date: 2002-09-26
Publication date: 2004-04-15

Abstract

PROBLEM TO BE SOLVED: To provide a method and a device for information access using voice that allow simple, highly responsive information access operations even when the user does not or cannot gaze at a display screen for operation.

SOLUTION: The device accepts input from the user, determines the information content to be presented to the user based on that input, and conveys the determined content to the user by voice output. It is provided with a voice response behavior control part that allows the user to change the content of the voice output by using a connected input device. Using the input device, the user adjusts the reading speed of the voice output or changes the level of detail of the information content to be output by voice.

COPYRIGHT: (C)2004, JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、合成音声等による音声出力を用いた応答機能を利用することにより、ユーザが操作用表示画面を注視することなく、ネットワークやデータベース上に存在する様々な情報を取得することができる環境を提供する音声を用いた情報アクセス装置及び方法に関する。
【０００２】
【従来の技術】
昨今のコンピュータ技術の急速な進展に伴って、ネットワーク上、あるいはデータベース上に存在する様々な情報に対して自由にアクセスし、必要な情報を取得あるいは閲覧することができる環境が、様々な形態において提供されている。今後、かかる環境下における情報に対するアクセス機器の小型化、モバイル化、あるいはウェラブル化が進展するのに伴い、ユーザが使用するに当たって、情報に対するアクセス機器の操作用表示画面を常時注視しなければならないという制約条件を緩和したい場面が増大する傾向にあるものと予測されている。係る場面として現時点において想定されるものとしては、例えば携帯電話を介した情報アクセスや、自動車内端末を介した情報アクセス等が考えられている。
【０００３】
例えば、携帯電話による情報アクセスの例としては、ＮＴＴドコモ社が提供する「ｉ−ｍｏｄｅ」（Ｒ）サービス等の情報提供サービスが挙げられる。このような情報提供サービスにおいては、基本的に画面を見ながらアクセス操作を行うことになる。
【０００４】
しかし、例えば歩きながら当該サービスを利用する場合や、音声出力による案内を用いた情報の抽出サービスを利用する場合等を想定した場合には、画面による操作よりも音声応答によるデータ読上げ操作の方が高い利便性を発揮できる可能性があり、このような音声操作インタフェースに対するニーズが今後増大するものと考えられる。
【０００５】
一方、自動車向けの情報アクセス装置に関しては、近年特に日本を中心として、カーナビゲーションシステム（以下、「カーナビ」という。）の利用環境が一般的な乗用車の世界にも深く浸透しており、車の中において詳細な電子地図や交通情報、さらには現在地付近の施設情報等の様々な情報を取得あるいは閲覧できる環境が実現されている。
【０００６】
今後、カーナビのさらなる発展に伴って、自動車は「走る情報ステーション」に変貌し、車載用情報ブラウザ装置に対するニーズも急速に拡大すると予想されおり、このような情報アクセス装置の出現によって、これまで移動手段としての価値がメインであった自動車の利便性あるいは快適性が飛躍的に増大するものと期待されている。
【０００７】
しかしながら、このような情報アクセス装置が車内に持ち込まれると、利便性が向上する反面、情報を閲覧するための複雑な機器操作や、情報そのものを注視するというようなサブタスクが発生することから、走行中のドライバが専念すべき「運転」というメインタスクへの悪影響や負荷が生じることが懸念されている。このような運転以外のサブタスクへの負荷が高まることによって、脇見行為や注意散漫等が頻発し、結果的に追突事故や衝突事故等を引き起こす可能性が高くなるという問題点も指摘されている。
【０００８】
そこで、かかる問題点を解決するべく、（特許文献１）において開示されているような、操作の対象となる情報アクセス装置との対話を全て音声入出力により実現する音声対話システムや、（特許文献２）において開示されているような、音声入出力以外にボタン等の物理的デバイスを用いる音声対話システム等が多々開発されている。
【０００９】
【特許文献１】
特開２０００−３０５７４９号公報
【００１０】
【特許文献２】
特開平８−２９３９２３号公報
【００１１】
【発明が解決しようとする課題】
しかしながら、（特許文献１）に示すようなユーザの音声入力に対応した情報アクセス装置においては、特に入力系を構成する音声認識エンジンに対して、ロードノイズ等に代表される実環境下での騒音による認識エラーの問題や、未知のメニュー項目に対する音声入力内容の曖昧性の解消等、技術的に克服するべき検討課題が数多く残されており、音声入力機能を導入することでかえって利便性が低下し、結果的に心理的負荷が増大してしまうという問題が生じるおそれが残されている。
【００１２】
また、固定／携帯電話を利用したＩＶＲ（Ｉｎｔｅｒａｃｔｉｖｅ　Ｖｏｉｃｅ　Ｒｅｓｐｏｎｓｅ）に代表されるような、音声（認識）入力に加えて、電話上のプッシュボタン等の物理デバイスを介してメニュー項目番号等を指定する情報アクセス装置も考えられるが（特許文献２）、この場合においても、音声読上げされるメニュー項目数が多すぎると頭の中で記憶しきれない、聴き返しをしたいと思っても音声案内が流れているため頭の中が混乱する、項目名が短すぎるとその内容が理解できず戸惑う、等の音声データストリームに特有の問題点が生じ、ユーザにとって操作に起因する心理的負荷が増大するおそれも残されている。その結果、例えば自動車を運転中のドライバに対しては、注意力散漫で危険な状況に陥ってしまう可能性があるという問題点もあった。
【００１３】
本発明は、上記問題点を解決するために、ユーザが操作用の表示画面を注視しない、もしくはできない場合であっても、簡便かつ即応性の高い情報アクセス操作が可能な音声を用いた情報アクセス装置及び方法を提供することを目的とする。
【００１４】
【課題を解決するための手段】
上記目的を達成するために本発明にかかる音声を用いた情報アクセス装置は、ユーザによる入力を受けつけるユーザ入力受入部と、ユーザによる入力に基づいて、ユーザに対して提示すべき情報内容を決定する提示情報管理部と、決定されたユーザに対して提示すべき情報内容を、音声出力を介してユーザに伝達する音声応答処理部と、接続された入力デバイスを用いて、ユーザに提示される音声出力の内容をユーザが変更することができる音声応答挙動制御部とを備えることを特徴とする。
【００１５】
かかる構成により、音声応答の挙動をユーザが能動的かつ確実にコントロールできる機能を提供することによって、従来方法では解決できなかった機器操作に起因する心理的負荷、例えば音声認識エラーによる繰返し入力や復帰操作に関わるストレス、音声読上げ内容の記憶負荷、聞き逃し回避のための読上げ聴取への意識集中等を軽減することができ、簡便かつ即応性の高い快適な情報アクセス操作を行うことが可能となる。
【００１６】
また、本発明にかかる音声を用いた情報アクセス装置は、音声応答挙動制御部に接続された入力デバイスを介して、ユーザが音声出力を行う際の読上げ速度を調整することができることが好ましい。
【００１７】
あるいは、本発明にかかる音声を用いた情報アクセス装置は、情報内容が複数の詳細レベルを有し、音声応答挙動制御部に接続された入力デバイスを介して、ユーザが音声出力の対象となる情報内容の詳細レベルを変更することができることが好ましい。
【００１８】
また、本発明にかかる音声を用いた情報アクセス装置は、情報内容が複数の情報内容を有し、音声応答挙動制御部に接続された入力デバイスを介して、音声出力の対象となる情報内容自体を変更することができることが好ましい。
【００１９】
また、本発明は、上記のような音声を用いた情報アクセス装置の機能をコンピュータの処理ステップとして実行するソフトウェアを特徴とするものであり、具体的には、ユーザによる入力を受けつける工程と、ユーザによる入力に基づいて、ユーザに対して提示すべき情報内容を決定する工程と、決定されたユーザに対して提示すべき情報内容を、音声出力を介してユーザに伝達する工程と、接続された入力デバイスを用いて、ユーザに提示される音声出力の内容をユーザが変更することができる工程とを備える音声を用いた情報アクセス方法並びにそのような工程を具現化するコンピュータ実行可能なプログラムであることを特徴とする。
【００２０】
かかる構成により、コンピュータ上へ当該プログラムをロードさせ実行することで、音声応答の挙動をユーザが能動的かつ確実にコントロールできる機能を提供することができ、従来方法では解決できなかった機器操作に起因する心理的負荷、例えば音声認識エラーによる繰返し入力や復帰操作に関わるストレス、音声読上げ内容の記憶負荷、聞き逃し回避のための読上げ聴取への意識集中等を軽減することができ、簡便かつ即応性の高い快適な情報アクセス操作を行うことができる音声を用いた情報アクセス装置を提供することが可能となる。
【００２１】
【発明の実施の形態】
（実施の形態１）
以下、本発明の実施の形態１にかかる音声を用いた情報アクセス装置について、図面を参照しながら説明する。図１は本発明の実施の形態１にかかる音声を用いた情報アクセス装置の構成図である。
【００２２】
図１において、１１はユーザによる入力操作を受け付けるユーザ入力受入部を、１２はユーザ入力に応じて提示すべき情報内容を決定する提示情報管理部を、１３は決定されたユーザに対して提示すべき情報内容を、音声出力を介してユーザに伝達する音声応答処理部を、１４はユーザに提示される音声出力を介した情報内容の順序や流れ、詳細さのレベル等をユーザ自らが能動的に変更可能な音声応答挙動制御部を、それぞれ示している。また、１５はユーザが任意の指示を入力することができる物理的なデバイスを示している。
【００２３】
ユーザ入力受入部１１は、ユーザによる音声入力を受け付け、ユーザによる音声入力を提示するべき情報内容の検索キー情報として用いるために少なくとも音声認識エンジンを有しており、当該音声認識エンジンによる音声認識結果をキー情報として提示情報管理部１２へ渡すことになる。
【００２４】
提示情報管理部１２においては、ユーザに表示するべきメニュー構成あるいはコンテンツを、事前に表示情報データベース１６に保存しておき、ユーザ入力受入部１１から渡されたキー情報に基づいて当該表示情報データベース１６を照会して、ユーザに表示するべき情報内容を決定することになる。
【００２５】
音声応答処理部１３では、提示情報管理部１２において決定されたユーザに提示すべき情報内容を、合成音声等を用いて音声出力するものである。ユーザは出力されてきた音声の内容を聞くことによって、決定された情報内容が自分の希望する情報であるか否かについて判断することができる。
【００２６】
そして、音声応答挙動制御部１４においては、ユーザの判断によって外部に設けられているデバイス１５から指示を入力することによって、提示情報管理部１２へ渡すキー情報に更なる条件を付加したり、音声応答処理部１３における再生出力速度等の設定条件を変更する指示を出すことになる。
【００２７】
図２及び図３に、本発明の実施の形態１にかかる音声を用いた情報アクセス装置における音声応答挙動制御部１４の構成図を示す。図２及び図３においては、デバイス２２の操作モードを切り替えるためのボタンであるデバイス２１と、ダイヤルリモコン装置であるデバイス２２を用いる場合を想定して説明する。もちろん、本実施の形態に限定されるものではない。
【００２８】
まず図２は、読上げ速度を調整するための音声応答挙動制御部１４の構成図である。図２において、操作モード決定部１４１では、デバイス２１からの入力がオン信号であるかオフ信号であるかに基づいて、デバイス２２からの入力信号が、提示するべき情報内容を変更するための信号であるのか、あるいは読上げ速度を変更するための信号であるのか、その操作モードを決定する。
【００２９】
そして、定まった操作モードに基づいて、指示コマンド生成部１４２では、それぞれのモードに対応した指示コマンドを生成することになる。すなわち、提示するべき情報内容を変更するモードである場合には、表示情報データベース１６から情報内容を抽出するためのキー情報を生成し、読上げ速度を変更するモードである場合には、読上げ速度パラメタ調整部１４３において読上げ速度を調整するパラメタを変更することになる。
【００３０】
次に、図３は、読上げ項目名の詳細レベル（項目に含まれる情報内容を伝えるテキストの長さ）を調整するための音声応答挙動制御部１４の構成図である。図３に示すように、基本的な構成は図２と同様であるが、操作モード決定部１４１においては、デバイス２１からの入力がオン信号であるかオフ信号であるかに基づいて、デバイス２２からの入力信号が、提示するべき情報内容を変更するための信号であるのか、あるいは読上げ項目名の詳細レベルを変更するための信号であるのか、その操作モードを決定する点において相違する。
【００３１】
そして、定まった操作モードに基づいて、指示コマンド生成部１４２では、それぞれのモードに対応した指示コマンドを生成することになる。すなわち、提示するべき情報内容を変更するモードである場合には、表示情報データベース１６から情報内容を抽出するためのキー情報を生成し、読上げ項目名の詳細レベルを変更するモードである場合には、読上げ詳細レベル調整部１４４において読上げ詳細レベルを調整するパラメタを変更することになる。なお、この場合には表示情報データベース１６内に、詳細レベルに応じた情報内容を事前に保存しておくことが必要となることは言うまでもない。
【００３２】
以上のような構成とすることによって、単純に電話プッシュボタンで音声メニューの項目番号を入力指定する通常のＩＶＲシステムに比べて、音声応答（読上げ）中のメニュー項目がユーザの希望している情報とは関係ない項目であると判断された時点において、ユーザの意思に基づいて素早く次のメニュー項目へ移動することができたり、あるいは音声応答（読上げ）中において情報内容を聞き逃した場合等、以前のメニュー項目について再度参照する必要が生じた場合において、音声応答終了を待つことなく希望するメニュー項目へ移動することも可能となる。
【００３３】
したがって、自分が希望する情報内容に対応するメニュー項目が出現するまで、継続して音声応答に聴き入る必要もなく、あるいは候補となるメニュー項目数が多い場合であっても、その全ての内容を各々の項目番号と対応付けて一時的に記憶しておく必要もなくなる。
【００３４】
また、音声応答（読上げ）中、メニュー項目の内容が把握できなかった場合等においても、項目名称の詳細レベル（項目に含まれる情報内容を伝えるテキストの長さ）を上げることが可能となる。
【００３５】
このように、情報アクセス装置を操作するタスクによって費やされていた意識的あるいは心理的な負担を極力抑えることができることから、より安全かつ安心な状態を維持しながら目的とする情報へのアクセスを行うことが可能となる。
【００３６】
一方、読上げ確認の途中で音声認識結果エラーに起因する経由メニュー項目の間違いが発生し、ユーザがそれを確認した場合には、音声応答挙動制御部１４に接続されたデバイスを介して、メニュー選択項目を変更して意図した正しいメニュー項目への復帰操作を行うことも考えられる。
【００３７】
例えば、「グルメ」→「フランス料理」→「○○店」というメニュー階層構成に対して、音声入力によって下位階層に存在するメニュー項目「○○店」への直接移動（ショートカット）を試みようとする際には、音声応答処理部１３において、表示させたいメニュー項目へ至るまでに経由するメニュー項目名を各階層ごとに順次読上げを行い、ユーザによる確認操作を受けながら階層移動を行っていくことになる。
【００３８】
この時、音声入力エラー等の発生によって、対象項目が「グルメ」→「イタリア料理」→「△△店」と誤認識されてしまった場合、第２階層目における「イタリア料理」の項目を確認する段階においてユーザが誤認識が生じていることに気づいたら、音声応答挙動制御部１４に接続されたデバイス１５を操作することによって、意図している正しい項目である「フランス料理」へ即座に移動できるようにする。具体的には、デバイス２２であるダイヤルリモコンを回転させることによって、同一階層に位置する項目を音声出力させることになる。
【００３９】
このような構成にすることによって、音声認識による入力エラーが発生した場合であっても、ユーザが意図しているメニュー項目への復帰操作を比較的少ないステップで行うことが可能となる。
【００４０】
なお、音声応答挙動制御部１４に接続されているデバイス１５としては、ＩＶＲシステムで一般に採用されているように、プッシュボタンに挙動操作のための各機能を割り当てたものを利用することが考えられる。ただし、特にこれに限定されるものではなく、例えば運転中のドライバがハンドルから手を離すことなく操作可能なダイヤル式リモコンに挙動操作のための各機能を割り当てたものを利用することも考えられる。後者の場合、ダイヤルの送り／戻り操作の回転速度を読み取り、その速度の大きさに応じてメニュー項目の移動ステップ量の大きさを変化させるようにすることも考えられ、係る構成とすることによって、ある程度既知のメニュー構成に対して指定すべき選択項目位置（順番）が予め推測できている場合においては、余計な音声応答を素早くスキップさせることができることから、より短時間でメニュー操作を実現することが可能となる。
【００４１】
具体的な操作方法としては、図４に示すように、ダイヤルが順方向（▲１▼）へ１ステップ分回転された時には、音声出力の対象となる情報メニュー項目を１つ先へ送るようにし、またダイヤルが逆方向（▲２▼）へ回転された時には、音声出力の対象となる情報メニュー項目を１つ後へ戻るようにする。また、ダイヤル径方向（▲３▼）へ押された時には、情報内容を示している詳細レベルよりも詳細レベルを挙げるようにし、逆のダイヤル径方向（▲４▼）へ押された時には、情報内容を示している詳細レベルよりも詳細レベルを下げるようにすることもできる。
【００４２】
また、モードボタンの使い方としては、モードボタンがＯＦＦの場合には、叙述したように▲１▼〜▲４▼の各操作に対してそれぞれの機能を割り当てる。一方、モードボタンがＯＮの場合には、ダイヤルが順方向（▲１▼）へ１ステップ分回転された時には、提示される情報内容の詳細レベルを上げ（例えばより長い項目名テキストを音声出力する）、またダイヤルが逆方向（▲２▼）へ回転された時には、提示される情報内容の詳細レベルを下げる（例えばより短い項目名テキストを音声出力する）ようにすることも考えられる。
【００４３】
一方、音声応答挙動制御部１４に接続されているデバイス１５として、ダイヤル式リモコンの代わりに携帯電話のプッシュボタンもしくはカーソルを用いることも考えられる。この場合、携帯電話の番号ボタン以外のファンクションボタン等に、上記で説明したメニュー項目の音声出力に対する送り処理や戻り処理、あるいは情報内容の詳細レベル変更等の各種機能を割り当てることになる。
【００４４】
また、本実施の形態１においては、メニュー選択ベースの操作インタフェースについて説明しているが、特にこれに限定されるものではなく、例えばリンクを辿りながらウェブ情報を順次参照していくウェブブラウザの操作インタフェースに対しても適用可能である。
【００４５】
次に、本発明の実施の形態１にかかる音声を用いた情報アクセス装置を実現するプログラムの処理の流れについて説明する。図５に本発明の実施の形態１にかかる音声を用いた情報アクセス装置を実現するプログラムの処理の流れ図を示す。
【００４６】
図５において、まずユーザによる音声入力を受け付け（ステップＳ５０１）、音声認識エンジンを用いて入力音声に対する音声認識を行う（ステップＳ５０２）。
【００４７】
次に、ユーザに表示するべきメニュー構成あるいはコンテンツが保存されている表示情報データベース１６を、認識結果をキー情報として照会する（ステップＳ５０３）。そして、ユーザに表示するべき情報内容を表示情報データベース１６の中から抽出することになる（ステップＳ５０４）。
【００４８】
次に、抽出されたユーザに提示すべき情報内容を、合成音声等を用いて音声出力する（ステップＳ５０５）。そして、出力されてきた音声を聞いたユーザによる外部デバイスを用いた操作信号を受信する（ステップＳ５０６）。
【００４９】
そして、操作信号がキー情報の更新指示であった場合には、キー情報を再作成し（ステップＳ５０７）、ステップＳ５０３へ戻る。操作信号が音声応答における再生出力速度の変更指示である場合には、再生出力における速度パラメタを更新し（ステップＳ５０８）、ステップＳ５０５へ戻る。
【００５０】
以上のように本実施の形態１によれば、外部デバイスを用いて、音声応答処理内容を容易にかつ確実に更新することができることから、音声データストリームを用いる場合に冗長さや検索の困難さを克服することが可能となる。
【００５１】
（実施の形態２）
以下、本発明の実施の形態２にかかる音声を用いた情報アクセス装置について、図面を参照しながら説明する。図６に、本発明の実施の形態２にかかる音声を用いた情報アクセス装置の構成図を示す。
【００５２】
図６に示すように、構成自体は実施の形態１とは大きく相違しないが、例えば運転中のドライバにおける運転操作に費やす視覚的あるいは意識的な負荷状態を推定し、その負荷レベルを出力するユーザ負荷状況推定部６１を備えている点に特徴を有する。
【００５３】
すなわち、車線変更をしようとしている場合には、後続車に対する連続的な注視が必要になる。あるいは、交差点内を右折しようとしている場合には、歩行者や直進車への注意が必要になる。また、不案内なルート上を迷いながら走行している場合等、ユーザであるドライバにおいて運転操作に対する負荷レベルが比較的高い状態であると考えられる場合をユーザ負荷状況推定部６１において事前に推定し、メニュー項目あるいは提示された情報内容に対して、読上げ速度を低めに設定する等の制御を行うものである。
【００５４】
あるいは、情報内容を提示する際におけるユーザへの負荷を考慮して、情報内容を読上げる順序自体を変更することも考えられる。例えば、暗算処理を誘発する等、閲覧時における負荷が高い情報は読み上げ順序を後ろに回し、運転負荷が低くなったと判断された場合に提示するようにするものである。
【００５５】
また、運転負荷が高い状態が継続していると推定される場合には、情報内容の提示詳細レベルを下げ、より簡潔な情報内容として読上げるようにすることも考えられる。
【００５６】
このように、ユーザの負荷状況を推定することで、推定された状況に応じて音声応答挙動を変化させることにより、何らかの理由でドライバが音声応答挙動を能動的に操作できない、あるいは操作しない場合においても、情報へアクセスするための操作に費やされる意識的注意を抑制することができ、より安全な運転状態を維持しながら目的の情報アクセス操作を実行することが可能となる。
【００５７】
このようにユーザの負荷状況を推定するためには、各種のセンサが必要となる。例えば運転中のドライバが操作する車載用情報アクセス装置を想定すると、ユーザの負荷状況を推定するために、車内（バックミラーやダッシュボード上等）に設置された小型カメラを用いてドライバ頭部の動作を画像処理によりトラッキングする機能を付加することが考えられる。
【００５８】
かかる小型カメラ等により観測された頭部動作の平均移動量や単位時間あたりの動作回数が、通常状態よりも大きくなった場合には、ドライバの運転負荷状態が高いものと判断し、音声合成機能をコントロールする際の指標とすることができる。
【００５９】
また、ユーザ負荷状況推定部６１の別構成としては、ハンドル、アクセル、ブレーキ等の車両状態をコントロールするための各種操作デバイスの操作量を計測し、計測されたこれらの値に基づいて、カーブが連続する山道を走行している等の負荷状態を推定することも考えられる。
【００６０】
また、小型カメラ等のセンサデバイスを設けることなく、地理的要因や渋滞状況等にの外部状況に基づいてユーザの負荷状況を推定する構成も考えられる。例として、運転中のドライバが操作する車載用情報アクセス装置を想定すると、現行カーナビのようにＧＰＳあるいは地図情報に基づいて、車両の現在位置や走行道路に関する情報を受信し、当該車両位置や走行道路に関する情報に基づいて、例えばカーブの多い山道や主要駅前、あるいは都市部の混雑度の高い道を走行している場合には、ユーザの運転負荷が大きい状況であると推定して、聴取負荷を軽減するように音声応答の挙動を変化させるものである。
【００６１】
しかし、この場合には、ユーザ自らの意思や希望に反した音声応答制御が実行される可能性が残されていることから、そのような場合には、ユーザ自らがデバイス１５を介して音声応答挙動を変更することで、システムの挙動とユーザ感覚との不整合を簡単かつ即座に是正することも可能となる。
【００６２】
次に、本発明の実施の形態にかかる音声を用いた情報アクセス装置を実現するプログラムの処理の流れについて説明する。図７に本発明の実施の形態にかかる音声を用いた情報アクセス装置を実現するプログラムの処理の流れ図を示す。
【００６３】
図７において、まずユーザによる音声入力を受け付け（ステップＳ７０１）、音声認識エンジンを用いて入力音声に対する音声認識を行って、表示情報データベース１６のキー情報を生成する（ステップＳ７０２）。
【００６４】
一方、各種のセンサからの検知信号を受信して、ユーザの負荷状況を推定する（ステップＳ７０３）。そして、推定された負荷状況に応じて、生成されたキー情報を更新し（ステップＳ７０４）、更新されたキー情報に基づいて、表示情報データベース１６を照会して（ステップＳ７０５）、ユーザに表示するべき情報内容を抽出することになる（ステップＳ７０６）。
【００６５】
そして、抽出されたユーザに提示すべき情報内容を、合成音声等を用いて音声出力し（ステップＳ７０７）、出力されてきた音声を聞いたユーザによる外部デバイスを用いた操作信号を受信する（ステップＳ７０８）ことによって、ユーザの希望する情報内容へより直接的にアクセスできるようにしている。以下の処理は実施の形態１と同様である。
【００６６】
以上のように本実施の形態２によれば、あらかじめユーザの負荷状況を推定してから、それに応じた情報内容を提示することができ、またユーザの意図に沿わない情報内容が提示された場合であっても、容易に提示内容を更新することが可能となる。
【００６７】
なお、本発明の実施の形態にかかる音声を用いた情報アクセス装置を実現するプログラムは、図８に示すように、ＣＤ−ＲＯＭ８２−１やフレキシブルディスク８２−２等の可搬型記録媒体９２だけでなく、通信回線の先に備えられた他の記憶装置８１や、コンピュータ８３のハードディスクやＲＡＭ等の記録媒体８４のいずれに記憶されるものであっても良く、プログラム実行時には、プログラムはローディングされ、主メモリ上で実行される。
【００６８】
また、本発明の実施の形態にかかる音声を用いた情報アクセス装置により用いられる表示情報データベース等についても、図８に示すように、ＣＤ−ＲＯＭ８２−１やフレキシブルディスク８２−２等の可搬型記録媒体８２だけでなく、通信回線の先に備えられた他の記憶装置８１や、コンピュータ８３のハードディスクやＲＡＭ等の記録媒体８４のいずれに記憶されるものであっても良く、例えば本発明にかかる音声を用いた情報アクセス装置を利用する際にコンピュータ８３により読み取られる。
【００６９】
（付記１）　ユーザによる入力を受けつけるユーザ入力受入部と、
前記ユーザによる入力に基づいて、前記ユーザに対して提示すべき情報内容を決定する提示情報管理部と、
決定された前記ユーザに対して提示すべき情報内容を、音声出力を介して前記ユーザに伝達する音声応答処理部と、
接続された入力デバイスを用いて、前記ユーザに提示される音声出力の内容を前記ユーザが変更することができる音声応答挙動制御部とを備えることを特徴とする音声を用いた情報アクセス装置。
【００７０】
（付記２）　前記音声応答挙動制御部に接続された前記入力デバイスを介して、前記ユーザが音声出力を行う際の読上げ速度を調整することができる付記１に記載の音声を用いた情報アクセス装置。
【００７１】
（付記３）　前記情報内容が複数の詳細レベルを有し、前記音声応答挙動制御部に接続された前記入力デバイスを介して、前記ユーザが音声出力の対象となる情報内容の詳細レベルを変更することができる付記１に記載の音声を用いた情報アクセス装置。
【００７２】
（付記４）　前記情報内容が複数の情報内容を有し、前記音声応答挙動制御部に接続された前記入力デバイスを介して、音声出力の対象となる情報内容自体を変更することができる付記１に記載の音声を用いた情報アクセス装置。
【００７３】
（付記５）　ユーザの操作時点における操作負荷状況を推定するユーザ負荷状況推定部を備え、
前記ユーザ負荷状況推定部により得られた前記ユーザの操作負荷状況に応じて、音声出力の内容を変化させる付記１から４のいずれか一項に記載の音声を用いた情報アクセス装置。
【００７４】
（付記６）　前記ユーザ周辺の外部状況を計測する外部状況計測部をさらに備え、
前記ユーザ負荷状況推定部において、前記外部状況計測部により得られた現在位置及び周辺環境状況に応じてユーザの操作時点における操作負荷状況を推定する付記５に記載の音声を用いた情報アクセス装置。
【００７５】
（付記７）　前記入力デバイスが、前記音声応答挙動制御部の内部に設けられたダイヤル式リモコンであり、ダイヤル操作時の回転速度に応じて、前記ユーザに提示される音声出力の内容を変更できる付記１から６のいずれか一項に記載の音声を用いた情報アクセス装置。
【００７６】
（付記８）　ユーザによる入力を受けつける工程と、
前記ユーザによる入力に基づいて、前記ユーザに対して提示すべき情報内容を決定する工程と、
決定された前記ユーザに対して提示すべき情報内容を、音声出力を介して前記ユーザに伝達する工程と、
接続された入力デバイスを用いて、前記ユーザに提示される音声出力の内容を前記ユーザが変更することができる工程とを備えることを特徴とする音声を用いた情報アクセス方法。
【００７７】
（付記９）　ユーザによる入力を受けつけるステップと、
前記ユーザによる入力に基づいて、前記ユーザに対して提示すべき情報内容を決定するステップと、
決定された前記ユーザに対して提示すべき情報内容を、音声出力を介して前記ユーザに伝達するステップと、
接続された入力デバイスを用いて、前記ユーザに提示される音声出力の内容を前記ユーザが変更することができるステップとを備えることを特徴とする音声を用いた情報アクセス方法を具現化するコンピュータ実行可能なプログラム。
【００７８】
【発明の効果】
以上のように本発明にかかる音声を用いた情報アクセス装置によれば、ユーザが操作用の表示画面を注視しない、もしくはできない場合であっても、音声応答機能及び音声応答挙動制御が可能な外部デバイスを組合せて利用することによって、簡便かつ即応性の高い情報アクセス操作が可能な音声応答型情報アクセス装置を提供することが可能となる。
【図面の簡単な説明】
【図１】本発明の実施の形態１にかかる音声を用いた情報アクセス装置の構成図
【図２】本発明の実施の形態１にかかる音声を用いた情報アクセス装置における音声応答挙動制御部の構成図
【図３】本発明の実施の形態１にかかる音声を用いた情報アクセス装置における音声応答挙動制御部の構成図
【図４】本発明の実施の形態１にかかる音声を用いた情報アクセス装置におけるデバイスの構成例示図
【図５】本発明の実施の形態１にかかる音声を用いた情報アクセス装置における処理の流れ図
【図６】本発明の実施の形態２にかかる音声を用いた情報アクセス装置の構成図
【図７】本発明の実施の形態２にかかる音声を用いた情報アクセス装置における処理の流れ図
【図８】コンピュータ環境の例示図
【符号の説明】
１１　ユーザ入力受入部
１２　提示情報管理部
１３　音声応答処理部
１４　音声応答挙動制御部
１５、２１、２２　デバイス
１６　表示情報データベース
６１　ユーザ負荷状況推定部
８１　回線先の記憶装置
８２　ＣＤ−ＲＯＭやフレキシブルディスク等の可搬型記録媒体
８２−１　ＣＤ−ＲＯＭ
８２−２　フレキシブルディスク
８３　コンピュータ
８４　コンピュータ上のＲＡＭ／ハードディスク等の記録媒体
１４１　操作モード決定部
１４２　指示コマンド生成部
１４３　読上げパラメタ調整部
１４４　読上げ詳細レベル調整部

[0001]
[Technical field of the invention]
The present invention relates to an information access apparatus and method using voice that provide an environment in which a user can obtain various information existing on a network or in a database without gazing at an operation display screen, by using a response function based on voice output such as synthesized speech.
[0002]
[Prior art]
With the rapid progress of computer technology in recent years, environments in which users can freely access various information existing on a network or in a database, and obtain or browse the information they need, are being provided in various forms. As the devices used to access information in such environments become smaller, more mobile, or wearable, it is predicted that there will be more and more situations in which one wants to relax the constraint that the user must constantly watch the operation display screen of the access device. Situations currently envisaged include, for example, information access via a mobile phone and information access via an in-vehicle terminal.
[0003]
One example of information access by mobile phone is an information providing service such as the “i-mode” (R) service provided by NTT DOCOMO. In such a service, access operations are basically performed while looking at the screen.
[0004]
However, when the service is used while walking, for example, or when an information retrieval service guided by voice output is used, reading data aloud via voice response may be more convenient than operating on a screen, and the need for such voice operation interfaces is expected to grow.
[0005]
On the other hand, with regard to information access devices for automobiles, the use of car navigation systems (hereinafter referred to as “car navigation systems”) has in recent years spread widely among ordinary passenger cars, particularly in Japan, and an environment has been realized in which various information, such as detailed electronic maps, traffic information, and information on facilities near the current location, can be obtained or browsed inside the car.
[0006]
In the future, with the further development of car navigation systems, automobiles are expected to be transformed into “running information stations”, and the need for in-vehicle information browser devices is expected to expand rapidly. With the advent of such information access devices, the convenience and comfort of automobiles, whose main value has so far been as a means of transportation, are expected to increase dramatically.
[0007]
However, while bringing such an information access device into a vehicle improves convenience, it also gives rise to subtasks such as complicated device operations for browsing information and gazing at the information itself. There is concern that these subtasks will adversely affect or burden the main task of "driving", to which the driver should be devoted while the vehicle is moving. It has also been pointed out that an increased load from such non-driving subtasks leads to frequent looking away and distraction, which in turn raises the likelihood of rear-end collisions and other collisions.
[0008]
To solve this problem, many voice dialogue systems have been developed, such as the system disclosed in (Patent Document 1), which realizes all interaction with the information access device to be operated through voice input and output, and the system disclosed in (Patent Document 2), which uses physical devices such as buttons in addition to voice input and output.
[0009]
[Patent Document 1]
JP-A-2000-305749
[0010]
[Patent Document 2]
JP-A-8-293923
[0011]
[Problems to be solved by the invention]
However, in an information access device that accepts the user's voice input as shown in (Patent Document 1), many technical issues remain to be overcome, particularly for the speech recognition engine that makes up the input system. These include recognition errors caused by noise in real environments, typified by road noise, and resolving the ambiguity of voice input for unknown menu items. There remains a risk that introducing a voice input function will actually reduce convenience and, as a result, increase the psychological load on the user.
[0012]
An information access device is also conceivable in which, as typified by IVR (Interactive Voice Response) over fixed or mobile telephones, menu item numbers and the like are specified via a physical device such as the push buttons on the telephone in addition to voice (recognition) input (Patent Document 2). Even in this case, however, problems peculiar to audio data streams arise: if too many menu items are read aloud, the user cannot remember them all; even if the user wants to listen again, the voice guidance keeps playing and causes confusion; and if an item name is too short, the user cannot understand what it contains and is puzzled. There thus remains a risk that the psychological load caused by operation will increase for the user. As a result, a driver who is driving a car, for example, may become distracted and fall into a dangerous situation.
[0013]
To solve the above problems, an object of the present invention is to provide an information access apparatus and method using voice that enable simple and highly responsive information access operations even when the user does not, or cannot, gaze at the operation display screen.
[0014]
[Means for Solving the Problems]
To achieve the above object, an information access apparatus using voice according to the present invention comprises: a user input receiving unit that accepts input from a user; a presentation information management unit that determines, based on the user's input, the information content to be presented to the user; a voice response processing unit that conveys the determined information content to the user via voice output; and a voice response behavior control unit that allows the user to change the content of the voice output presented to the user by using a connected input device.
[0015]
With this configuration, providing a function that lets the user actively and reliably control the behavior of the voice response reduces psychological loads arising from device operation that conventional methods could not resolve. Examples are the stress of repeated input and recovery operations caused by speech recognition errors, the burden of memorizing what was read aloud, and the need to concentrate on listening so as not to miss anything. Simple, highly responsive, and comfortable information access operations thus become possible.
[0016]
Further, in the information access apparatus using voice according to the present invention, it is preferable that the user can adjust the reading speed of the voice output via the input device connected to the voice response behavior control unit.
[0017]
Alternatively, in the information access apparatus using voice according to the present invention, it is preferable that the information content has a plurality of detail levels and that the user can change the detail level of the information content to be output by voice via the input device connected to the voice response behavior control unit.
[0018]
Further, in the information access apparatus using voice according to the present invention, it is preferable that the information content comprises a plurality of information contents and that the information content to be output by voice can itself be changed via the input device connected to the voice response behavior control unit.
[0019]
The present invention is also characterized by software that executes the functions of the above information access apparatus using voice as processing steps of a computer. Specifically, it is an information access method using voice comprising: a step of accepting input from a user; a step of determining, based on the user's input, the information content to be presented to the user; a step of conveying the determined information content to the user via voice output; and a step that allows the user to change the content of the voice output presented to the user by using a connected input device; as well as a computer-executable program embodying such steps.
[0020]
With this configuration, loading and executing the program on a computer provides a function that lets the user actively and reliably control the behavior of the voice response. This reduces psychological loads arising from device operation that conventional methods could not resolve, for example the stress of repeated input and recovery operations caused by speech recognition errors, the burden of memorizing what was read aloud, and the need to concentrate on listening so as not to miss anything. It thus becomes possible to provide an information access apparatus using voice that allows simple, highly responsive, and comfortable information access operations.
[0021]
[Embodiments of the invention]
(Embodiment 1)
Hereinafter, an information access device using voice according to the first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a configuration diagram of an information access device using voice according to the first embodiment of the present invention.
[0022]
In FIG. 1, reference numeral 11 denotes a user input receiving unit that accepts input operations by the user; 12 denotes a presentation information management unit that determines the information content to be presented in response to the user input; 13 denotes a voice response processing unit that conveys the determined information content to the user via voice output; and 14 denotes a voice response behavior control unit that allows the user to actively change the order, flow, level of detail, and so on of the information content presented via voice output. Reference numeral 15 denotes a physical device through which the user can enter arbitrary instructions.
[0023]
The user input receiving unit 11 accepts voice input from the user and has at least a speech recognition engine so that the voice input can be used as search key information for the information content to be presented. The recognition result produced by the speech recognition engine is passed to the presentation information management unit 12 as key information.
[0024]
In the presentation information management unit 12, the menu structure or content to be displayed to the user is stored in advance in the display information database 16. Based on the key information passed from the user input receiving unit 11, the unit queries the display information database 16 and determines the information content to be displayed to the user.
[0025]
The voice response processing unit 13 outputs, using synthesized speech or the like, the information content to be presented to the user as determined by the presentation information management unit 12. By listening to the output voice, the user can judge whether the determined information content is the information he or she wants.
[0026]
In the voice response behavior control unit 14, when the user, at his or her own discretion, enters an instruction from the externally provided device 15, the unit adds further conditions to the key information passed to the presentation information management unit 12, or issues an instruction to change settings such as the playback output speed in the voice response processing unit 13.
[0027]
FIGS. 2 and 3 are configuration diagrams of the voice response behavior control unit 14 in the information access device using voice according to the first embodiment of the present invention. For FIGS. 2 and 3, the description assumes the use of a device 21, which is a button for switching the operation mode of the device 22, and a device 22, which is a dial remote controller. Of course, the invention is not limited to this embodiment.
[0028]
First, FIG. 2 is a configuration diagram of the voice response behavior control unit 14 for adjusting the reading speed. In FIG. 2, the operation mode determination unit 141 determines the operation mode, that is, whether an input signal from the device 22 is a signal for changing the information content to be presented or a signal for changing the reading speed, based on whether the input from the device 21 is an ON signal or an OFF signal.
[0029]
Then, based on the determined operation mode, the instruction command generation unit 142 generates an instruction command corresponding to that mode. In the mode for changing the information content to be presented, it generates key information for extracting information content from the display information database 16; in the mode for changing the reading speed, the reading speed parameter adjustment unit 143 changes the parameter that adjusts the reading speed.
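The mode determination and command generation described for FIG. 2 can be illustrated with a minimal Python sketch. It is not taken from the specification: the names `Mode`, `Command`, `determine_mode`, and `generate_command` are hypothetical, and both the assignment of ON to speed adjustment and the 0.1x-per-step rule with its 0.5x to 2.0x range are assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CHANGE_CONTENT = "change_content"  # dial changes the item to be presented
    CHANGE_SPEED = "change_speed"      # dial changes the reading speed


@dataclass
class Command:
    kind: str     # "key_info" (database lookup) or "speed" (reading rate)
    value: float  # item offset for key info, new rate for speed


def determine_mode(button_on: bool) -> Mode:
    # Assumption: ON selects speed adjustment, OFF selects content change.
    return Mode.CHANGE_SPEED if button_on else Mode.CHANGE_CONTENT


def generate_command(mode: Mode, dial_steps: int, rate: float) -> Command:
    if mode is Mode.CHANGE_CONTENT:
        # The dial offset becomes part of the key used to query the database.
        return Command("key_info", dial_steps)
    # Hypothetical rule: 0.1x per dial step, clamped to 0.5x..2.0x.
    new_rate = min(2.0, max(0.5, rate + 0.1 * dial_steps))
    return Command("speed", round(new_rate, 2))
```

Keeping mode determination separate from command generation mirrors the split between units 141 and 142, so a different button-to-mode assignment would only touch `determine_mode`.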
[0030]
Next, FIG. 3 is a configuration diagram of the voice response behavior control unit 14 for adjusting the detail level of the read-out item name (the length of the text that conveys the information contained in the item). As shown in FIG. 3, the basic configuration is the same as in FIG. 2. It differs in that the operation mode determination unit 141 determines, based on whether the input from the device 21 is an ON signal or an OFF signal, whether an input signal from the device 22 is a signal for changing the information content to be presented or a signal for changing the detail level of the read-out item name.
[0031]
Then, based on the determined operation mode, the instruction command generation unit 142 generates an instruction command corresponding to that mode. In the mode for changing the information content to be presented, it generates key information for extracting information content from the display information database 16; in the mode for changing the detail level of the read-out item name, the reading detail level adjustment unit 144 changes the parameter that adjusts the reading detail level. Needless to say, in this case the information content for each detail level must be stored in the display information database 16 in advance.
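The requirement that content for each detail level be stored in advance can be sketched as follows. This is an illustrative Python fragment, not the patent's data model: `DISPLAY_DB`, its entries, and the two helper functions are hypothetical, and clamping to the stored levels is an assumed policy.

```python
# Hypothetical contents of the display information database: each menu item
# stores one text per detail level, shortest first.
DISPLAY_DB = {
    "french": ["French", "French cuisine", "French cuisine restaurants near you"],
    "italian": ["Italian", "Italian cuisine"],
}


def adjust_detail_level(item_id: str, level: int, delta: int) -> int:
    """Return the new detail level, clamped to the levels stored for the item."""
    max_level = len(DISPLAY_DB[item_id]) - 1
    return min(max_level, max(0, level + delta))


def text_to_read(item_id: str, level: int) -> str:
    """Look up the text to read aloud for the item at the given detail level."""
    return DISPLAY_DB[item_id][level]
```

For example, raising the level of "french" from 0 by one step selects the text "French cuisine", while raising "italian" beyond its last stored level leaves it unchanged.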
[0032]
With the above configuration, compared with an ordinary IVR system in which the item number of a voice menu is simply entered with telephone push buttons, the user can move quickly to the next menu item at will as soon as he or she judges that the menu item being read aloud is unrelated to the desired information. Likewise, when an earlier menu item needs to be consulted again, for example because the user missed some content during the voice response (read-out), the user can move to the desired menu item without waiting for the voice response to finish.
[0033]
Therefore, the user need not keep listening to the voice response until the menu item corresponding to the desired information appears, and, even when there are many candidate menu items, need not temporarily memorize all of them in association with their item numbers.
[0034]
In addition, even when the user cannot grasp what a menu item contains during the voice response (read-out), the detail level of the item name (the length of the text that conveys the information contained in the item) can be raised.
[0035]
Because the conscious and psychological burden consumed by the task of operating the information access device can thus be kept to a minimum, the target information can be accessed while maintaining a safer and more reassuring state.
[0036]
On the other hand, if a mistake in an intermediate menu item caused by a speech recognition error arises during read-out confirmation and the user notices it, the user may change the selected menu item via the device connected to the voice response behavior control unit 14 and perform a recovery operation to the intended, correct menu item.
[0037]
For example, given the menu hierarchy “Gourmet” → “French cuisine” → “XX restaurant”, when the user tries by voice input to move directly (shortcut) to the menu item “XX restaurant” in a lower level, the voice response processing unit 13 reads aloud, level by level, the names of the menu items passed through on the way to the menu item to be displayed, and moves through the levels while receiving confirmation operations from the user.
[0038]
Suppose that, because of a voice input error or the like, the target is misrecognized as “Gourmet” → “Italian cuisine” → “△△ restaurant”. If the user notices the misrecognition at the stage of confirming the second-level item “Italian cuisine”, operating the device 15 connected to the voice response behavior control unit 14 lets the user move immediately to the intended, correct item “French cuisine”. Specifically, rotating the dial remote controller, device 22, causes the items located at the same level to be output by voice.
[0039]
With such a configuration, even when an input error due to voice recognition occurs, it is possible to perform a return operation to the menu item intended by the user in relatively few steps.
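The return operation described above can be illustrated with a minimal sketch, which assumes that the items of one hierarchy level are held in an ordered list and that each dial step moves one position in that list. The function name `move_within_level` and the third item “Chinese food” are hypothetical and do not appear in the specification.

```python
def move_within_level(siblings, current, steps):
    """Return the item `steps` positions from `current` among `siblings`.

    The index is clamped, so turning past either end stays on the end item.
    """
    index = siblings.index(current) + steps
    index = max(0, min(index, len(siblings) - 1))
    return siblings[index]


# Second-level items under "gourmet"; "Chinese food" is a hypothetical extra.
second_level = ["French cuisine", "Italian food", "Chinese food"]

# Misrecognized as "Italian food": one dial step back reaches the intended item.
corrected = move_within_level(second_level, "Italian food", -1)
print(corrected)
```

In this sketch a single step back suffices, which corresponds to the "relatively few steps" mentioned above.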
[0040]
As the device 15 connected to the voice response behavior control unit 14, it is conceivable to use a device in which each behavior-control function is assigned to a push button, as is common in IVR systems. However, the present invention is not limited to this. For example, it is also conceivable to use a dial-type remote controller, operable without taking the hands off the steering wheel, to which the various behavior-control functions are assigned. In the latter case, it is conceivable to read the rotation speed of the dial forward/return operation and to change the size of the menu item movement step according to that speed. When the position (order) of the item to be selected in a known menu configuration can be estimated in advance, unnecessary voice responses can thus be skipped quickly, so that menu operations can be performed in a shorter time.
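The speed-dependent movement step can be sketched as follows. The specification states only that the step size changes with the rotation speed; the unit (dial clicks per second), the two thresholds, and the step sizes 1, 3, and 5 below are assumptions chosen purely for illustration.

```python
def step_amount(clicks_per_second: float) -> int:
    """Map dial rotation speed to the number of menu items moved per click.

    Thresholds and step sizes are hypothetical illustration values.
    """
    if clicks_per_second < 2.0:
        return 1  # slow rotation: move one item at a time
    if clicks_per_second < 5.0:
        return 3  # moderate rotation: skip ahead a few items
    return 5      # fast rotation: skip many items
```

A user who knows that the desired item is near the end of a long menu could rotate the dial quickly to skip the intermediate read-outs.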
[0041]
As a specific operation method, as shown in FIG. 4, when the dial is rotated by one step in the forward direction (1), the information menu item to be output by voice advances to the next one. When the dial is rotated in the opposite direction (2), the information menu item to be output by voice returns to the previous one. When the dial is pressed in a dial radial direction (3), the detail level of the presented information content is raised, and when the dial is pressed in the opposite radial direction (4), the detail level of the presented information content may be lowered.
[0042]
As for the usage of the mode button, when the mode button is OFF, the respective functions are assigned to the operations (1) to (4) as described above. On the other hand, when the mode button is ON, when the dial is rotated by one step in the forward direction (1), the detail level of the information content to be presented is increased (for example, a longer item name text is output by voice). Also, when the dial is turned in the opposite direction (2), the level of detail of the information content to be presented may be lowered (for example, a shorter item name text is output as voice).
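The assignment of operations (1) to (4) and the effect of the mode button can be sketched as a small state update. The class `ReadingState`, the function `apply_dial`, and the operation names are hypothetical. The specification does not say what the press operations do while the mode button is ON, so this sketch assumes they keep the same meaning in both modes.

```python
from dataclasses import dataclass


@dataclass
class ReadingState:
    item_index: int = 0    # which menu item is being read out
    detail_level: int = 0  # 0 = shortest item name text


def apply_dial(state, op, mode_on, n_items, n_levels):
    """Apply one dial operation: 'forward', 'backward', 'press', 'press_opposite'."""
    def clamp(value, count):
        return max(0, min(value, count - 1))

    if op in ("forward", "backward"):
        delta = 1 if op == "forward" else -1
        if mode_on:
            # Mode ON: rotation changes the detail level.
            state.detail_level = clamp(state.detail_level + delta, n_levels)
        else:
            # Mode OFF: rotation moves between menu items.
            state.item_index = clamp(state.item_index + delta, n_items)
    elif op == "press":
        state.detail_level = clamp(state.detail_level + 1, n_levels)
    elif op == "press_opposite":
        state.detail_level = clamp(state.detail_level - 1, n_levels)
    else:
        raise ValueError(f"unknown operation: {op}")
    return state
```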
[0043]
On the other hand, as the device 15 connected to the voice response behavior control unit 14, the push buttons or cursor keys of a mobile phone may be used instead of the dial-type remote controller. In this case, the various functions described above, such as advancing and returning the voice output of menu items or changing the detail level of the information content, are assigned to the function buttons of the mobile phone other than the number buttons.
[0044]
In the first embodiment, a menu-selection-based operation interface has been described. However, the present invention is not limited to this. For example, it is also applicable to the operation interface of a web browser that refers to web information sequentially while following links.
[0045]
Next, a description will be given of a processing flow of a program for realizing the information access device using voice according to the first embodiment of the present invention. FIG. 5 shows a flow chart of a process of a program for realizing the information access device using voice according to the first embodiment of the present invention.
[0046]
In FIG. 5, first, a voice input by a user is received (step S501), and voice recognition for the input voice is performed using a voice recognition engine (step S502).
[0047]
Next, the display information database 16 in which the menu configuration or the content to be displayed to the user is stored is queried using the recognition result as key information (step S503). Then, the information content to be displayed to the user is extracted from the display information database 16 (step S504).
[0048]
Next, the extracted information content to be presented to the user is output as a voice using a synthesized voice or the like (step S505). Then, an operation signal using the external device by the user who has heard the output voice is received (step S506).
[0049]
If the operation signal is a key information update instruction, the key information is re-created (step S507), and the process returns to step S503. If the operation signal is an instruction to change the playback output speed in the voice response, the speed parameter in the playback output is updated (step S508), and the process returns to step S505.
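The flow of steps S501 to S508 can be sketched as a loop. This is only an illustration under several assumptions: the recognizer, the display information database, and the speech output are passed in as plain Python objects, each operation signal is a hypothetical `(kind, value)` pair, and a key information update is assumed to carry the new key directly.

```python
def run_session(recognize, database, speak, signals, speed=1.0):
    """Sketch of FIG. 5: recognize, query, read out, then react to operation signals."""
    key = recognize()            # S501-S502: receive and recognize voice input
    content = database[key]      # S503-S504: query database, extract content
    speak(content, speed)        # S505: voice output
    for kind, value in signals:  # S506: operation signal from the external device
        if kind == "update_key":     # S507: re-create key, return to S503
            key = value
            content = database[key]
        elif kind == "set_speed":    # S508: update speed, return to S505
            speed = value
        else:
            break
        speak(content, speed)
    return key, speed


spoken = []
database = {"gourmet": "Gourmet menu", "news": "News menu"}
result = run_session(
    recognize=lambda: "gourmet",
    database=database,
    speak=lambda text, speed: spoken.append((text, speed)),
    signals=[("set_speed", 1.5), ("update_key", "news")],
)
```

Here the speed change causes the same content to be read out again at the new speed, while the key update causes a new query before the next read-out.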
[0050]
As described above, according to the first embodiment, the content of the voice response processing can be updated easily and reliably using the external device. It is therefore possible to overcome the redundancy and the difficulty of searching that arise when a voice data stream is used.
[0051]
(Embodiment 2)
Hereinafter, an information access device using voice according to a second embodiment of the present invention will be described with reference to the drawings. FIG. 6 shows a configuration diagram of an information access device using voice according to the second embodiment of the present invention.
[0052]
As shown in FIG. 6, the configuration itself does not differ significantly from that of the first embodiment, but the second embodiment is characterized in that it includes a user load status estimating unit 61 that estimates, for example, the visual or attentional load that a driver spends on the driving operation while driving, and outputs the load level.
[0053]
That is, when the driver is about to change lanes, the following vehicle must be watched continuously. Likewise, when making a right turn inside an intersection, attention must be paid to pedestrians and to vehicles going straight. The user load status estimating unit 61 estimates in advance cases in which the load level of the driving operation on the driver, who is the user, is considered to be relatively high, such as these cases or the case where the driver is lost and traveling on a route without guidance. The behavior of the voice response for the menu items or the presented information content is then changed accordingly, for example by setting the reading speed lower.
[0054]
Alternatively, it is conceivable to change the reading order of the information content itself in consideration of the load on the user when the information content is presented. For example, information that imposes a high load when browsed, such as information that induces mental arithmetic, is moved later in the reading order and presented once the driving load is determined to have become low.
[0055]
Further, when it is estimated that the state where the driving load is high continues, it is conceivable to lower the presentation detail level of the information content and read it as a more concise information content.
[0056]
As described above, by estimating the load status of the user and changing the voice response behavior according to the estimated status, the attention spent on the information access operation can be kept low, even when the driver cannot actively operate the voice response behavior for some reason or simply does not do so, and the target information access operation can be executed while maintaining a safer driving state.
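The three adjustments described for the second embodiment (slower reading, deferring demanding information, and lowering the detail level under sustained load) can be sketched together as follows. The function `plan_reading`, the slowdown factor 0.8, and the `(text, demanding)` item representation are assumptions. The sketch only reorders items, whereas the specification describes presenting deferred items once the load is judged to have become low.

```python
def plan_reading(items, load_high, sustained_high, speed=1.0, detail=2):
    """Return (ordered texts, reading speed, detail level) for the next read-out.

    items: list of (text, demanding) pairs, where `demanding` marks information
    that imposes a high load when heard (for example, mental arithmetic).
    """
    if load_high:
        speed *= 0.8  # hypothetical slowdown factor
        # Stable sort: demanding items move to the end, relative order kept.
        items = sorted(items, key=lambda item: item[1])
    if sustained_high:
        detail = max(0, detail - 1)  # read a more concise version
    return [text for text, _ in items], speed, detail


plan = plan_reading(
    [("fare comparison", True), ("shop name", False)],
    load_high=True,
    sustained_high=False,
)
```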
[0057]
Various sensors are required in order to estimate the load status of the user in this way. For example, assuming an in-vehicle information access device operated by a driver while driving, it is conceivable, in order to estimate the user's load status, to add a function that tracks the movement of the driver's head by image processing using a small camera installed inside the vehicle (for example, on the rearview mirror or the dashboard).
[0058]
If the average amount of head movement or the number of head movements per unit time observed by such a small camera or the like becomes larger than in the normal state, it is determined that the driving load on the driver is high, and this determination can be used as an index for controlling the voice synthesis function.
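The comparison against the normal state can be sketched as a rate check over a recent time window. The specification does not define how much larger than normal the value must be; the `margin` factor of 1.5, the window length, and the function name `head_motion_load` are assumptions.

```python
def head_motion_load(event_times, window_s, now, baseline_rate, margin=1.5):
    """Return True when head movements per second in the recent window
    exceed the normal-state rate by the (hypothetical) margin factor."""
    recent = [t for t in event_times if now - window_s <= t <= now]
    rate = len(recent) / window_s
    return rate > baseline_rate * margin
```

For example, eight detected head movements in the last 10 seconds against a normal rate of 0.3 per second would be judged as a high-load state under these assumed values.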
[0059]
Further, as another configuration of the user load status estimating unit 61, it is also conceivable to measure the operation amounts of the various operation devices that control the vehicle state, such as the steering wheel, the accelerator, and the brake, and to estimate from the measured values a load state such as driving on a mountain road with continuous curves.
[0060]
In addition, a configuration is also conceivable in which the user's load status is estimated from external conditions such as geographical factors or traffic congestion, without providing a sensor device such as a small camera. As an example, assuming an in-vehicle information access device operated by a driver while driving, information on the current position of the vehicle and on the road being traveled is obtained from GPS or map information, as in current car navigation systems. Based on this road information, when the vehicle is traveling, for example, on a mountain road with many curves, in front of a major station, or on a heavily congested urban road, the driving load on the user is estimated to be high, and the behavior of the voice response is changed so as to reduce the listening load.
[0061]
However, in this case, the voice response control may be executed contrary to the user's own intention or wishes. In such a case, the user can change the voice response behavior himself or herself via the device 15, so that the mismatch between the behavior of the system and the user's own perception can be corrected easily and immediately.
[0062]
Next, a description will be given of a processing flow of a program for realizing the information access device using voice according to the second embodiment of the present invention. FIG. 7 shows a flowchart of the processing of a program for realizing the information access device using voice according to the second embodiment of the present invention.
[0063]
In FIG. 7, first, a voice input by a user is received (step S701), and voice recognition is performed on the input voice using a voice recognition engine to generate key information for the display information database 16 (step S702).
[0064]
Meanwhile, detection signals from the various sensors are received, and the load status of the user is estimated (step S703). Then, the generated key information is updated according to the estimated load status (step S704), the display information database 16 is queried based on the updated key information (step S705), and the information content to be presented to the user is extracted (step S706).
[0065]
Then, the extracted information content to be presented to the user is output as voice using synthesized voice or the like (step S707), and an operation signal that the user who has heard the output voice issues using the external device is received (step S708), which makes it possible to access the information desired by the user more directly. The subsequent processing is the same as in the first embodiment.
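The specification does not state how the key information is updated from the estimated load in step S704. One possible reading, shown here purely as an assumption, is that the key carries a requested detail level that is lowered by one under high load before the database is queried. The names `update_key`, `respond`, and `display_info`, and the database contents, are hypothetical.

```python
def update_key(item, detail_level, load_high):
    """Step S704 sketch: lower the requested detail level by one under high load."""
    if load_high:
        detail_level = max(0, detail_level - 1)
    return (item, detail_level)


# Hypothetical display information database keyed by (item name, detail level).
display_info = {
    ("gourmet", 0): "Gourmet",
    ("gourmet", 1): "Gourmet: restaurant guide",
}


def respond(item, detail_level, load_high):
    key = update_key(item, detail_level, load_high)  # S704
    return display_info[key]                         # S705-S706
```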
[0066]
As described above, according to the second embodiment, the load status of the user is estimated in advance and information content suited to that status is presented, and, even when information content that does not match the user's intention is presented, the presented content can easily be updated.
[0067]
As shown in FIG. 8, the program for realizing the information access device using voice according to the embodiment of the present invention may be stored not only in a portable recording medium 82 such as a CD-ROM 82-1 or a flexible disk 82-2, but also in another storage device 81 provided at the end of a communication line, or in a recording medium 84 such as a hard disk or RAM of a computer 83. When executed, the program is loaded and runs in main memory.
[0068]
Also, as shown in FIG. 8, the display information database and other data used by the information access device using voice according to the embodiment of the present invention may be stored not only in a portable recording medium 82 such as a CD-ROM 82-1 or a flexible disk 82-2, but also in another storage device 81 provided at the end of a communication line, or in a recording medium 84 such as a hard disk or RAM of a computer 83. Such data is read by the computer 83 when the information access device using voice is used.
[0069]
(Supplementary Note 1) A user input receiving unit that receives an input by a user;
Based on an input by the user, a presentation information management unit that determines information content to be presented to the user,
A voice response processing unit that transmits the determined information content to be presented to the user to the user via voice output,
An information access apparatus using voice, comprising: a voice response behavior control unit that allows the user to change the content of voice output presented to the user using a connected input device.
[0070]
(Supplementary Note 2) The information access device using voice according to Supplementary Note 1, wherein the user can adjust the reading speed of the voice output via the input device connected to the voice response behavior control unit.
[0071]
(Supplementary Note 3) The information access device using voice according to Supplementary Note 1, wherein the information content has a plurality of detail levels, and the user can change the detail level of the information content to be output as voice via the input device connected to the voice response behavior control unit.
[0072]
(Supplementary Note 4) The information access device using voice according to Supplementary Note 1, wherein there are a plurality of pieces of information content, and the information content itself to be output as voice can be changed via the input device connected to the voice response behavior control unit.
[0073]
(Supplementary Note 5) A user load status estimating unit that estimates an operation load status at the time of the user's operation is provided,
The information access apparatus using sound according to any one of Supplementary notes 1 to 4, wherein the content of the sound output is changed according to the operation load state of the user obtained by the user load state estimation unit.
[0074]
(Supplementary Note 6) The information access device using voice according to Supplementary Note 5, further comprising an external situation measuring unit that measures the external situation around the user,
wherein the user load status estimating unit estimates the operation load status at the time of the user's operation according to the current position and the surrounding environmental conditions obtained by the external situation measuring unit.
[0075]
(Supplementary Note 7) The information access device using voice according to any one of Supplementary Notes 1 to 6, wherein the input device is a dial-type remote controller provided in the voice response behavior control unit, and the content of the voice output presented to the user can be changed according to the rotation speed of the dial operation.
[0076]
(Supplementary Note 8) An information access method using voice, comprising:
a step of receiving an input by a user;
a step of determining information content to be presented to the user based on the input by the user;
a step of transmitting the determined information content to be presented to the user to the user via voice output; and
a step of allowing the user to change the content of the voice output presented to the user by using a connected input device.
[0077]
(Supplementary Note 9) A computer-executable program for implementing an information access method using voice, the method comprising:
a step of receiving an input by a user;
a step of determining information content to be presented to the user based on the input by the user;
a step of transmitting the determined information content to be presented to the user to the user via voice output; and
a step of allowing the user to change the content of the voice output presented to the user by using a connected input device.
[0078]
[Effect of the invention]
As described above, according to the information access apparatus using voice according to the present invention, a voice response function is combined with a device for controlling the voice response behavior, so that, even when the user does not or cannot gaze at the operation display screen, it is possible to provide a voice-response information access apparatus that allows simple and highly responsive information access operations.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of an information access device using voice according to a first embodiment of the present invention;
FIG. 2 is a configuration diagram of a voice response behavior control unit in the information access device using voice according to the first embodiment of the present invention;
FIG. 3 is a configuration diagram of a voice response behavior control unit in the information access device using voice according to the first embodiment of the present invention;
FIG. 4 is an exemplary configuration diagram of a device in the information access apparatus using voice according to the first embodiment of the present invention;
FIG. 5 is a flowchart of a process in the information access device using voice according to the first embodiment of the present invention;
FIG. 6 is a configuration diagram of an information access device using voice according to a second embodiment of the present invention;
FIG. 7 is a flowchart of a process in the information access device using voice according to the second embodiment of the present invention;
FIG. 8 is an exemplary diagram of a computer environment.
[Explanation of symbols]
11 User input receiving unit
12 Presentation information management unit
13 Voice response processing unit
14 Voice response behavior control unit
15, 21, 22 Devices
16 Display information database
61 User load status estimating unit
81 Storage device at the end of a communication line
82 Portable recording media such as CD-ROM and flexible disk
82-1 CD-ROM
82-2 Flexible disk
83 Computer
84 Recording media such as RAM / hard disk on computer
141 Operation mode determination unit
142 Instruction command generation unit
143 Reading parameter adjustment unit
144 Reading detail level adjustment unit

Claims

A user input receiving unit for receiving an input by a user,
Based on an input by the user, a presentation information management unit that determines information content to be presented to the user,
A voice response processing unit that transmits the determined information content to be presented to the user to the user via voice output,
An information access apparatus using voice, comprising: a voice response behavior control unit that allows the user to change the content of voice output presented to the user using a connected input device.

The information access apparatus using voice according to claim 1, wherein the user can adjust the reading speed of the voice output via the input device connected to the voice response behavior control unit.

The information access apparatus using voice according to claim 1, wherein the information content has a plurality of detail levels, and the user can change the detail level of the information content to be output as voice via the input device connected to the voice response behavior control unit.

The information access apparatus using voice according to claim 1, wherein there are a plurality of pieces of information content, and the information content itself to be output as voice can be changed via the input device connected to the voice response behavior control unit.

An information access method using voice, comprising:
a step of receiving an input by a user;
a step of determining information content to be presented to the user based on the input by the user;
a step of transmitting the determined information content to be presented to the user to the user via voice output; and
a step of allowing the user to change the content of the voice output presented to the user by using a connected input device.