JP2006317573A

JP2006317573A - Information terminal

Info

Publication number: JP2006317573A
Application number: JP2005138252A
Authority: JP
Inventors: Toshihiro Kujirai; 俊宏鯨井; Takahisa Tomota; 孝久友田; Tetsuo Shinagawa; 哲夫品川
Original assignee: Xanavi Informatics Corp
Current assignee: Faurecia Clarion Electronics Co Ltd
Priority date: 2005-05-11
Filing date: 2005-05-11
Publication date: 2006-11-24

Abstract

PROBLEM TO BE SOLVED: To provide an information terminal that can acquire information for a user through voice interaction with the user.

SOLUTION: The information terminal, which recognizes a user's voice as a command and provides information based on the recognized command, comprises: a talk signal part for outputting a talk signal according to a user's instruction; a voice recognition part for recognizing the voice uttered by the user as the command; an environmental information acquiring part for acquiring environmental information about the surroundings of the information terminal; an inferring part for inferring the information intended by the user from the recognized command and the acquired environmental information; and a control part for controlling the processing of the information terminal. On receiving the talk signal, the control part decides whether the user's command, as recognized by the voice recognition part, has been obtained and, when the command has not been obtained, provides the user with the information inferred by the inferring part using the environmental information acquired by the environmental information acquiring part.

COPYRIGHT: (C)2007,JPO&INPIT

Description

本発明は、ユーザとの音声対話によってユーザに情報を取得する情報端末に関するものである。 The present invention relates to an information terminal that obtains information for a user through voice interaction with the user.

現在のカーナビゲーションシステムなどの車載情報端末は、ＤＶＤやＨＤＤなどの大容量記憶媒体に、店舗情報や施設情報などの大量の情報を保持している。さらに、無線通信を用いてサーバからこれらの情報を取得することができる。 In-vehicle information terminals such as current car navigation systems hold a large amount of information such as store information and facility information in a large-capacity storage medium such as a DVD or HDD. Furthermore, these pieces of information can be acquired from the server using wireless communication.

一般的な情報検索装置においては、ユーザがキーワードや分類名などを入力・選択することで、大量の情報の中から必要な情報を絞り込む操作を行う方法が用いられる。車載情報端末においても、停車中はこの方法は有効であるため、ユーザがソフトウェアキーボードを利用したキーワード入力や、分類をメニュー画面で選択する方法を用いた情報検索インタフェースが採用されている。一方、走行時は、音声認識システムを利用し、ユーザがキーワードや分類名を音声入力することにより、ハンズフリー／アイズフリーで必要な情報を取り出すシステムが開発されている。 A general information retrieval apparatus uses a method in which the user narrows down the necessary information from a large amount of information by entering or selecting keywords, classification names, and the like. This method is also effective in an in-vehicle information terminal while the vehicle is stopped, so information search interfaces have been adopted in which the user enters keywords with a software keyboard or selects a classification on a menu screen. For use while the vehicle is traveling, on the other hand, systems have been developed that use voice recognition so that the user can retrieve the necessary information hands-free and eyes-free by speaking a keyword or classification name.

また、一般的なウェブの検索エンジンと異なり、ユーザが必要とする情報は、ユーザがどこを走っているか、渋滞が発生しているか、などの周囲状況によって大きく影響を受けることが考えられる。そのため、ユーザの車両が走行している付近や、これから走行する予定の経路上の店舗情報、施設情報に限定して情報を提供することで、ユーザの利便性を高める方法も知られている（特許文献１参照。）。 In addition, unlike a general web search engine, the information a user needs is likely to be strongly influenced by surrounding conditions, such as where the user is driving and whether a traffic jam has occurred. For this reason, a method is also known that improves user convenience by limiting the information provided to store and facility information near where the user's vehicle is traveling or along the route it is scheduled to travel (see Patent Document 1).

また、ユーザの音声コマンドによる情報の推定方法には、ベイジアンネットワークを用いることができる。ベイジアンネットワークとはグラフ後続を持つ確率モデルの一つであり、不確実性を含む事象の予測や合理的な意志決定、観測結果から原因を探る障害診断などに利用することができる（非特許文献１参照。）。 Further, a Bayesian network can be used as a method for estimating information based on a user's voice command. A Bayesian network is a probabilistic model with a graph structure, and can be used for predicting events that involve uncertainty, for rational decision making, and for fault diagnosis that traces a cause from observed results (see Non-Patent Document 1).

Patent Document 1: 特開平１１−５１６６６号公報 (JP-A-11-51666)

Non-Patent Document 1: 本村陽一、“ベイジアンネットワーク：入門からヒューマンモデリングへの応用まで”、［online］、産業技術総合研究所、［平成１７年４月１日検索］、インターネット＜URL：http://staff.aist.go.jp/y.motomura/paper/BSJ0403.pdf＞ (Yoichi Motomura, “Bayesian Network: From Introduction to Application to Human Modeling”, [online], National Institute of Advanced Industrial Science and Technology, [retrieved April 1, 2005], Internet <URL: http://staff.aist.go.jp/y.motomura/paper/BSJ0403.pdf>)
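
As a rough illustration of how a Bayesian network could support this kind of estimation, the following Python sketch computes a posterior over a two-node network by Bayes' rule. The variable names (`PRIOR_NEED`, `LIKELIHOOD_CONGESTED`) and every probability value are assumptions invented for the example; the source does not specify a network structure or any numbers.

```python
# Hypothetical two-node network: "need" -> "congestion observed".
# All probabilities are illustrative assumptions, not values from the source.

PRIOR_NEED = {"traffic_info": 0.2, "other": 0.8}

# P(congestion observed | need)
LIKELIHOOD_CONGESTED = {"traffic_info": 0.9, "other": 0.3}


def posterior_need(congested: bool) -> dict:
    """Return P(need | congestion observation) using Bayes' rule."""
    unnormalized = {}
    for need, prior in PRIOR_NEED.items():
        p_congested = LIKELIHOOD_CONGESTED[need]
        p_observation = p_congested if congested else 1.0 - p_congested
        unnormalized[need] = prior * p_observation
    total = sum(unnormalized.values())
    return {need: value / total for need, value in unnormalized.items()}
```

With these made-up numbers, observing congestion raises the probability that traffic information is needed above its prior, which is the kind of cause-from-observation reasoning the paragraph describes.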

しかし、ユーザが必要とする情報を推定する方式の改良によって推定の精度が高くなったとしても、ユーザがその情報を知りたいと思うタイミングを正確に推定することは困難である。 However, even if the accuracy of estimation is improved by improving the method for estimating information required by the user, it is difficult to accurately estimate the timing when the user wants to know the information.

例えば、渋滞が発生した場合、渋滞情報は、ユーザがその状況で必要とする情報の種別の推定値として適切であると言える。しかし、ユーザが既に別の手段で渋滞情報を知っている場合や、はじめから渋滞を覚悟している場合など、あらためて渋滞の距離や時間を知らされることはユーザにとって嬉しいことではない。 For example, when a traffic jam occurs, it can be said that the traffic jam information is appropriate as an estimated value of the type of information that the user needs in the situation. However, if the user already knows the traffic jam information by another means, or if the user is prepared for the traffic jam from the beginning, it is not pleasant for the user to be informed of the traffic jam distance and time.

また、車両の位置情報やユーザの購買履歴に基づいて付近の店舗情報を提供するサービスを行うこともできる。この場合、情報を提供する店舗側はできる限り高い頻度で情報を提供したいが、ユーザはこのような情報が頻繁に提供されることをわずらわしいと感じる可能性がある。 In addition, it is possible to provide a service for providing nearby store information based on vehicle position information and user purchase history. In this case, the store providing information wants to provide information as frequently as possible, but the user may feel troublesome that such information is frequently provided.

さらに、定期的にユーザに提供すべき情報や実行すべき操作を推定してユーザに提供する方法もあるが、ある状況である情報が必要とされる可能性が非常に高いと推定された場合でも、実際にはユーザがそのタイミングでは情報を欲していないというミスマッチが起こり得るので、これも同様にユーザは情報が提供されることをわずらわしいと感じる可能性がある。 Furthermore, there is a method that periodically estimates the information to be provided to the user or the operation to be performed and offers it to the user. However, even when it is estimated that certain information is very likely to be needed in a given situation, a mismatch can occur in which the user does not actually want the information at that moment, so here too the user may find the provision of information annoying.

このような問題から、現在では情報端末からの押し付け型の情報提供は車載情報端末向けには普及しておらず、従来技術でも述べたようにユーザの操作によって提供する情報を特定する方式が一般的である。 Because of these problems, push-type information provision by the information terminal has not become widespread in in-vehicle information terminals, and, as described in the prior art, the common approach is for the information to be provided to be specified by user operation.

ここで問題となるのは、音声認識を入力手段として用いる場合、ユーザが適切な音声コマンドを思いつけず何も発声しなかった場合、間違った音声コマンドを発声した場合、又は、周囲雑音をシステムがユーザ発声と間違えて認識処理した場合など、認識結果が全く得られない。このような状況では、従来のシステムでは単にユーザに再発声を促すだけであり、ユーザは何度も音声を発するなど、利便性が低くなっていた。 The problem here is that, when voice recognition is used as the input means, no recognition result is obtained at all in cases such as when the user cannot think of an appropriate voice command and says nothing, when the user utters an incorrect voice command, or when the system mistakes ambient noise for a user utterance and runs recognition on it. In such situations, a conventional system merely prompts the user to speak again, and the user has to speak repeatedly, which reduces convenience.

本発明は、ユーザの音声をコマンドとして認識し、認識したコマンドに基づいた情報を提供する情報端末において、ユーザの指示によってトーク信号を出力するトーク信号部と、ユーザの発声した音声をコマンドとして認識する音声認識部と、情報端末の周囲の環境に関する環境情報を取得する環境情報取得部と、認識したコマンド及び取得した環境情報からユーザが意図する情報を推定する推定部と、情報端末の処理を制御する制御部と、を備え、制御部は、トーク信号の受信を契機として、音声認識部が認識したユーザのコマンドが得られたか否かを判定し、当該コマンドが得られない場合に、環境情報取得部が取得した環境情報を用いて推定部が推定した情報をユーザに提供することを特徴とする。 The present invention is an information terminal that recognizes a user's voice as a command and provides information based on the recognized command, the information terminal comprising: a talk signal unit that outputs a talk signal in response to a user's instruction; a voice recognition unit that recognizes the voice uttered by the user as a command; an environment information acquisition unit that acquires environment information about the surroundings of the information terminal; an estimation unit that estimates the information intended by the user from the recognized command and the acquired environment information; and a control unit that controls the processing of the information terminal, wherein, triggered by reception of the talk signal, the control unit determines whether the user's command recognized by the voice recognition unit has been obtained and, when the command has not been obtained, provides the user with the information estimated by the estimation unit using the environment information acquired by the environment information acquisition unit.

本発明によると、ユーザが、適切な音声コマンドを思いつけず何も発声しなかった場合や、システムが想定していない発声をしてしまった場合でも、ユーザの操作を無駄にせず、状況に応じてユーザが必要としている可能性の高い情報を提供できる。これによって、利便性の高いユーザインタフェースを提供することができる。 According to the present invention, even when the user cannot think of an appropriate voice command and says nothing, or utters something the system does not expect, the user's operation is not wasted, and information that the user is likely to need can be provided according to the situation. A highly convenient user interface can thereby be provided.

さらに、ユーザ意図を推定するのは、ユーザがトークボタンを押した場合に限られるため、ユーザは、不必要な情報を押し付けられることがないという利点がある。 Furthermore, since the user's intention is estimated only when the user presses the talk button, there is the advantage that unnecessary information is not pushed onto the user.

以下に本発明の実施例について、図面を参照して説明する。 Embodiments of the present invention will be described below with reference to the drawings.

図１は、本発明の第１の実施例の情報端末の構成を表したブロック図である。 FIG. 1 is a block diagram showing the configuration of an information terminal according to the first embodiment of the present invention.

情報端末１００は、自動車等の車両に設置される。そして、ユーザの音声による音声コマンドによる指示を受け付け、その音声コマンドを認識し、認識結果に対応する情報を合成音声又はディスプレイへの表示によって応答する。 The information terminal 100 is installed in a vehicle such as an automobile. Then, an instruction by a voice command by the user's voice is received, the voice command is recognized, and information corresponding to the recognition result is responded by a synthesized voice or display on a display.

この情報端末１００は、システム制御部１０１、ＧＰＳ受信機１０２、自律航法センサ１０３、交通情報受信機１０４、ディスプレイ１０５、音声出力装置１０６、ハードディスク（ＨＤＤ）１０９、マイク１１０、トークボタン１１１、ＧＵＩ操作部１１２、ＡＶ装置１１３、路面センサ１１６、発汗センサ１１７、メモリ１１８、インタフェース１１９等によって構成される。 The information terminal 100 includes a system control unit 101, a GPS receiver 102, an autonomous navigation sensor 103, a traffic information receiver 104, a display 105, an audio output device 106, a hard disk (HDD) 109, a microphone 110, a talk button 111, a GUI operation unit 112, an AV device 113, a road surface sensor 116, a sweat sensor 117, a memory 118, an interface 119, and the like.

システム制御部１０１は、情報端末１００の処理を実行する。システム制御部１０１は、ＣＰＵ、ＲＯＭ、ＲＡＭ、データバスなどから構成される。 The system control unit 101 executes processing of the information terminal 100. The system control unit 101 includes a CPU, a ROM, a RAM, a data bus, and the like.

ＧＰＳ受信機１０２は、図示しない複数のＧＰＳ衛星からの信号を受信して、車両の現在位置を算出する装置である。 The GPS receiver 102 is a device that receives signals from a plurality of GPS satellites (not shown) and calculates the current position of the vehicle.

自律航法センサ１０３は、ジャイロを備えており、このジャイロからの情報及び車両からインタフェース１１９を介して取得した車速パルスを利用して、車両の現在位置と車両の向きとを推定する装置である。 The autonomous navigation sensor 103 includes a gyro and is a device that estimates the current position of the vehicle and the direction of the vehicle using information from the gyro and vehicle speed pulses acquired from the vehicle via the interface 119.
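
The estimation described here can be pictured as simple dead reckoning. The sketch below is only one possible reading, assuming the gyro supplies a yaw rate and that each vehicle speed pulse corresponds to a fixed travel distance; the function name `dead_reckon` and its parameters are hypothetical and do not appear in the source.

```python
import math


def dead_reckon(x, y, heading_rad, pulses, meters_per_pulse, yaw_rate_rad_s, dt_s):
    """Advance a 2D pose by one time step.

    pulses * meters_per_pulse gives the distance travelled (assumed fixed
    distance per speed pulse); yaw_rate_rad_s * dt_s gives the heading change
    reported by the gyro. The midpoint heading is used for the displacement.
    """
    distance = pulses * meters_per_pulse
    new_heading = heading_rad + yaw_rate_rad_s * dt_s
    mid_heading = (heading_rad + new_heading) / 2.0
    new_x = x + distance * math.cos(mid_heading)
    new_y = y + distance * math.sin(mid_heading)
    return new_x, new_y, new_heading
```

Calling `dead_reckon` repeatedly with successive sensor readings accumulates a position and heading estimate; errors also accumulate, which is one reason such an estimate is usually combined with other position sources.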

交通情報受信機１０４は、ビーコンやＦＭ放送波によって搬送される交通情報信号を受信する装置である。 The traffic information receiver 104 is a device that receives a traffic information signal carried by a beacon or FM broadcast wave.

ディスプレイ１０５は、地図情報、車両の位置情報、交通情報等をユーザに表示する装置である。ディスプレイ１０５は、例えば液晶ディスプレイ装置によって構成される。 The display 105 is a device that displays map information, vehicle position information, traffic information, and the like to the user. The display 105 is configured by a liquid crystal display device, for example.

音声出力装置１０６は、システム制御部１０１の指示によって、合成音声をユーザに応答する。例えば、走行中のルート案内を合成音声によってユーザに伝える。 The voice output device 106 responds to the user with the synthesized voice in response to an instruction from the system control unit 101. For example, route guidance while traveling is transmitted to the user by synthetic voice.

ハードディスク装置（ＨＤＤ）１０９はデジタルデータを格納する記憶媒体である。このＨＤＤ１０９は、地図情報を保存する。 A hard disk device (HDD) 109 is a storage medium for storing digital data. The HDD 109 stores map information.

マイク１１０は、ユーザの発声する音声コマンドを集音するための装置である。このマイク１１０は、車両に複数備えられ、複数のマイクが取得した音声の到達時間の差から、音声を発したのが運転席のユーザであるか助手席のユーザであるかを判断できる。 The microphone 110 is a device for collecting voice commands uttered by the user. A plurality of the microphones 110 are provided in the vehicle, and it can be determined from the difference in the arrival times of the voices acquired by the plurality of microphones whether the voice is produced by the driver seat user or the passenger seat user.
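
A minimal sketch of the arrival-time comparison, assuming one microphone is mounted nearer the driver's seat and another nearer the passenger seat (the source only says that several microphones are provided). The function name `locate_speaker` and the tolerance value are illustrative assumptions.

```python
def locate_speaker(arrival_driver_mic_s: float,
                   arrival_passenger_mic_s: float,
                   tolerance_s: float = 1e-4) -> str:
    """Guess which seat a voice came from using the arrival-time difference.

    The sound reaches the nearer microphone first, so a clearly earlier
    arrival at the driver-side microphone suggests the driver spoke.
    Differences within tolerance_s are treated as undecidable.
    """
    delta = arrival_driver_mic_s - arrival_passenger_mic_s
    if delta < -tolerance_s:
        return "driver"
    if delta > tolerance_s:
        return "passenger"
    return "unknown"
```

In practice the arrival times would have to be extracted from the audio signals first, for example by cross-correlation; that step is outside this sketch.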

トークボタン１１１は、音声認識部１１４に音声認識処理の実行を指示するトーク信号を発生させる。具体的には、ユーザがトークボタン１１１を押した後に、音声コマンドを発声する。システム制御部１０１は、トークボタン押下によるトーク信号の受信を契機として、音声認識部１１４に音声コマンドの認識処理を実行させる。これによって、ユーザが音声コマンドを発するタイミングを知ることができ、例えば走行中の騒音環境の中でも雑音による誤認識を防ぎ、良好な音声認識性能を得ることができる。 The talk button 111 generates a talk signal that instructs the voice recognition unit 114 to execute voice recognition processing. Specifically, after the user presses the talk button 111, the voice command is uttered. The system control unit 101 causes the voice recognition unit 114 to execute voice command recognition processing when a talk signal is received by pressing the talk button. Accordingly, it is possible to know the timing at which the user issues a voice command. For example, it is possible to prevent erroneous recognition due to noise even in a running noise environment and to obtain good voice recognition performance.

このトークボタン１１１は、例えば車両のハンドルに取り付けたスイッチによって構成される。なお、ユーザが発話することをシステム制御部１０１に通知するトーク信号を発生するための手段であればどのようなものでもよい。 The talk button 111 is, for example, a switch attached to the steering wheel of the vehicle. Any means may be used as long as it generates a talk signal that notifies the system control unit 101 that the user is about to speak.

例えば、音声コマンドを発声する前に、特定のキーワードを発声することによって、トークボタンの代わりとしてもよい。この場合、音声認識部１１４が常時、特定のキーワードのみを待ち受けており（特定語彙ワードスポット）、音声認識部１１４が特定のキーワードを検出したことによって、システム制御部１０１にトーク信号を送信する。すなわち、ユーザが特定のキーワードを発声することが、トークボタン１１１を押すことに相当する。 For example, uttering a specific keyword before the voice command may take the place of the talk button. In this case, the voice recognition unit 114 constantly listens for only the specific keyword (specific-vocabulary word spotting) and, on detecting the keyword, transmits a talk signal to the system control unit 101. That is, the user uttering the specific keyword is equivalent to pressing the talk button 111.

ＧＵＩ操作部１１２は、ユーザが音声コマンド以外の方法で情報端末を操作するための装置であり、リモコンや、タッチパネル、ボタンなどで構成される。 The GUI operation unit 112 is a device for a user to operate the information terminal by a method other than a voice command, and includes a remote controller, a touch panel, buttons, and the like.

ＡＶ装置１１３は、ＣＤプレイヤ、ＭＤプレイヤ、ＤＶＤプレイヤやこれらの複合型プレイヤ、ラジオチューナ、テレビチューナ、アンプなどから構成される。これらの再生結果として、音声は音声出力装置１０６から出力され、映像はディスプレイ１０５に表示される。 The AV device 113 includes a CD player, an MD player, a DVD player, a composite player of these, a radio tuner, a television tuner, an amplifier, and the like. As these playback results, audio is output from the audio output device 106, and video is displayed on the display 105.

路面センサ１１６は、走行中の路面の状態（乾燥、湿潤、水膜、積雪、凍結）を検出するセンサであり、路面センサ１１６は、例えば、可視画像式センサ、レーザレーダ式センサ、光ファイバ式センサなどを用いる。 The road surface sensor 116 detects the state of the road surface being traveled on (dry, wet, water film, snow-covered, or frozen). For example, a visible-image sensor, a laser radar sensor, or an optical fiber sensor is used as the road surface sensor 116.

発汗センサ１１７は、ハンドルに取り付けられ、運転中のユーザの発汗を検出して、ユーザの緊張度を測定する。 The sweat sensor 117 is attached to the steering wheel, detects the user's perspiration while driving, and measures the user's degree of tension.

インタフェース（Ｉ／Ｆ）１１９は、車両の各種センサに接続され、車速パルス、ハンドブレーキ信号、エンジン始動信号等の各種情報を取得する。 The interface (I / F) 119 is connected to various sensors of the vehicle and acquires various information such as a vehicle speed pulse, a handbrake signal, and an engine start signal.

メモリ１１８には、経路探索部１０７、地図情報読込部１０８、音声認識部１１４及び意図推定部１１５が格納される。これら各部はシステム制御部１０１によって実行されるプログラムである。 The memory 118 stores a route search unit 107, a map information reading unit 108, a voice recognition unit 114, and an intention estimation unit 115. Each of these units is a program executed by the system control unit 101.

経路探索部１０７は、ユーザの指示等によって設定された目的地への経路を、ＨＤＤ１０９に格納されている地図情報から探索する。 The route search unit 107 searches the map information stored in the HDD 109 for a route to the destination set by a user instruction or the like.

地図情報読込部１０８は、ＨＤＤ１０９に格納されている地図情報をメモリ１１８に読み出す。そして、ＧＰＳ受信機１０２が受信した情報から計算した位置情報、自律航法センサ１０３が得た位置情報及び読み出した地図情報を用いてマップマッチングを実行し、車両の現在位置を推定する。そして、ディスプレイ１０５に推定した自車位置の周辺の地図及び自車を現すシンボルを表示する。 The map information reading unit 108 reads the map information stored in the HDD 109 into the memory 118. Then, map matching is executed using the position information calculated from the information received by the GPS receiver 102, the position information obtained by the autonomous navigation sensor 103, and the read map information, and the current position of the vehicle is estimated. Then, a map around the estimated vehicle position and a symbol representing the vehicle are displayed on the display 105.

音声認識部１１４は、マイク１１０によって集音された音声コマンドを、音声コマンドに対応するテキスト又は記号を認識結果として得る。より具体的には、音声認識部１１４が保持する音声モデルと入力された音声コマンドとの類似度を算出し、類似度が高いものを認識結果とする。そして、認識結果とその類似度とを出力する。 The voice recognition unit 114 takes the voice command collected by the microphone 110 and obtains, as the recognition result, the text or symbol corresponding to that command. More specifically, it calculates the similarity between the voice models held by the voice recognition unit 114 and the input voice command, and takes the one with high similarity as the recognition result. It then outputs the recognition result together with its similarity.
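
The similarity-based selection can be sketched as follows. This is a toy stand-in, not the recognizer of the source: it compares text strings with Python's `difflib.SequenceMatcher` instead of comparing audio against acoustic models, and the command list `COMMAND_MODELS` and the `min_similarity` cutoff are assumptions introduced for illustration.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple

# Hypothetical stored "models"; a real recognizer would hold acoustic models.
COMMAND_MODELS = ["traffic information", "nearby restaurants", "play music"]


def recognize(utterance: str,
              min_similarity: float = 0.6) -> Optional[Tuple[str, float]]:
    """Return (best matching command, similarity), or None if nothing matches.

    Every stored model is scored against the input and the highest-scoring
    one is kept. An empty input or a best score below min_similarity is
    reported as "no recognition result".
    """
    if not utterance:
        return None
    scored = [(SequenceMatcher(None, utterance, model).ratio(), model)
              for model in COMMAND_MODELS]
    similarity, best = max(scored)
    if similarity < min_similarity:
        return None
    return best, similarity
```

Returning `None` rather than a low-scoring guess keeps "no result" distinguishable from a genuine match, which matters for any caller that handles the two cases differently.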

意図推定部１１５は、ＧＰＳ受信機１０２及び自律航法センサ１０３によって計算された位置情報、交通情報受信機１０４によって受信された交通状況、路面センサ１１６によって検出された路面の状況、インタフェース１１９から受信した車両の情報等から、ユーザが必要とする情報又は希望する操作内容を推定する。 The intention estimation unit 115 receives the position information calculated by the GPS receiver 102 and the autonomous navigation sensor 103, the traffic situation received by the traffic information receiver 104, the road condition detected by the road sensor 116, and received from the interface 119. Information required by the user or desired operation content is estimated from vehicle information and the like.

図２は、本発明の第１の実施例の情報端末１０１の音声対話処理のフローチャートである。 FIG. 2 is a flowchart of the voice interaction process of the information terminal 100 according to the first embodiment of the present invention.

このフローチャートは、ユーザがトークボタン１１１を押して音声コマンドを発声したときに、情報端末１００が、その音声コマンドを認識してどのような応答をするかの処理を示す。 This flowchart shows a process of how the information terminal 100 recognizes the voice command and responds when the user presses the talk button 111 to utter a voice command.

まず、ユーザはトークボタン１１１を押して、情報端末１００に音声コマンドの発声を予告する。このトークボタン１１１の押下によって、システム制御部１０１にトーク信号が送信される。 First, the user presses the talk button 111 to notify the information terminal 100 of the utterance of the voice command. When the talk button 111 is pressed, a talk signal is transmitted to the system control unit 101.

システム制御部１０１は、トークボタン１１１からのトーク信号を受信すると（Ｓ１００１）、ユーザに発声を促すための質問内容を生成する（Ｓ１００２）。このとき、すなわちトークボタン１１１が押された直後に生成する質問内容は、例えば「何でしょう？」、「音声コマンドをどうぞ」のような短い音声、又は、「ピッ」という短い信号音等が適している。 When the system control unit 101 receives the talk signal from the talk button 111 (S1001), it generates question content for prompting the user to speak (S1002). The question content generated at this point, that is, immediately after the talk button 111 is pressed, is suitably a short utterance such as “What is it?” or “Please say a voice command”, or a short signal tone such as a beep.

システム制御部１０１は、生成した質問内容を音声出力装置１０６に指示する。音声出力装置１０６がこの質問内容を音声として出力する（Ｓ１００３）。なお、このとき、同時にディスプレイ１０５に質問内容をテキストとして表示してもよい。 The system control unit 101 instructs the voice output device 106 about the generated question content. The voice output device 106 outputs the contents of the question as voice (S1003). At this time, the content of the question may be displayed as text on the display 105 at the same time.

ユーザは、この質問内容の出力に応えて音声コマンドを発声する（Ｓ１００４）。マイク１１０は、発声された音声コマンドを集音してＡ／Ｄ変換する。Ａ／Ｄ変換された音声コマンドは音声認識部１１４に送られる。音声認識部１１４は、この音声コマンドを音声認識処理することによって認識結果を得る（Ｓ１００５）。 In response to the output of the question content, the user utters a voice command (S1004). The microphone 110 collects voice commands that have been uttered and performs A / D conversion. The A / D converted voice command is sent to the voice recognition unit 114. The voice recognition unit 114 obtains a recognition result by performing voice recognition processing on the voice command (S1005).

なお、この音声認識処理の結果は、ユーザの発声した音声コマンドに対応したテキストや記号が音声認識結果として得られる場合と、ユーザの発声に対応したテキストや記号が音声認識結果として得られない場合の二通りがある。 The voice recognition processing has two possible outcomes: text or a symbol corresponding to the voice command uttered by the user is obtained as the voice recognition result, or no such text or symbol is obtained.

後者の場合は、例えば、ユーザがトークボタン１１１を押した後、一定時間何も発声しなかった場合（タイムアウト）、ユーザが発声した音声が、音声認識部１１４が想定していない音声であった場合、又は、ユーザがトークボタン１１１を押した後、周囲の雑音がマイク１１０によって集音され、音声認識部１１４がこの雑音をユーザの発声と間違えて音声認識処理をした場合等に起きる。このような場合は、音声認識部１１４が保持する音声モデルと入力された音声との類似度が低くなり、認識結果が得られない。 In the latter case, for example, when the user presses the talk button 111 and does not utter anything for a certain time (timeout), the voice uttered by the user is a voice that the voice recognition unit 114 does not assume. Or when the user presses the talk button 111 and ambient noise is collected by the microphone 110, and the speech recognition unit 114 mistakes this noise for the user's speech and performs speech recognition processing. In such a case, the similarity between the speech model held by the speech recognition unit 114 and the input speech is low, and a recognition result cannot be obtained.

システム制御部１０１は、Ｓ１００５の処理の結果、音声処理結果が得られたか否かを判定する（Ｓ１００６）。音声認識結果が得られた場合は、システム制御部１０１は、認識結果に含まれる情報を用いて、ユーザが必要とする情報や希望する操作の候補を絞り込む（Ｓ１００７）。音声認識結果が得られなかった場合は後述する。 The system control unit 101 determines whether a voice recognition result was obtained from the processing of S1005 (S1006). When a voice recognition result was obtained, the system control unit 101 uses the information included in the recognition result to narrow down the candidates for the information the user needs or the operation the user desires (S1007). The case where no voice recognition result is obtained is described later.

次に、システム制御部１０１は、現在の環境情報を取得する。環境情報とは、ＧＰＳ受信機１０２、自律航法センサ１０３、交通情報受信機１０４、経路探索部１０７、地図情報読込部１０８、路面センサ１１６、発汗センサ１１７等の少なくとも１つから取得した情報である（Ｓ１００８）。環境情報は、車や運転者の状況等に関する情報であって、上記音声入力された情報とは異なる。 Next, the system control unit 101 acquires current environment information. The environmental information is information acquired from at least one of the GPS receiver 102, the autonomous navigation sensor 103, the traffic information receiver 104, the route search unit 107, the map information reading unit 108, the road surface sensor 116, the sweat sensor 117, and the like. (S1008). The environmental information is information related to the situation of the car and the driver, and is different from the information inputted by voice.

次に、意図推定部１１５は、Ｓ１００７によって絞り込まれた認識結果の範囲において、Ｓ１００８で取得した環境情報を元に、ユーザが必要とする情報又は希望する操作内容を推定する（Ｓ１００９）。この推定の具体的な方式は後述する。 Next, the intention estimation unit 115 estimates the information required by the user or the desired operation content based on the environment information acquired in S1008 within the range of the recognition result narrowed down in S1007 (S1009). A specific method of this estimation will be described later.

意図推定部１１５は、この推定の結果、推定内容及び推定の信頼度を得る。そして、システム制御部１０１は、この推定の信頼度が予め定めた閾値以上であるか否かを判定する（Ｓ１０１０）。 As a result of this estimation, the intention estimation unit 115 obtains the estimation content and the reliability of the estimation. Then, the system control unit 101 determines whether or not the estimation reliability is equal to or higher than a predetermined threshold (S1010).

信頼度が閾値よりも高い場合は、推定内容がユーザの意図と等しいと判断する。そこで、推定内容に係る情報をユーザに提供する（Ｓ１０１１）。具体的には、システム制御部１０１は、ディスプレイ１０５への表示又は音声出力装置１０６による音声出力によって、推定内容に係る情報を提示する。又は、推定した操作内容を実行し、操作内容を実行したことをユーザにディスプレイ１０５への表示又は音声出力装置１０６による音声出力によって知らせる。 If the reliability is higher than the threshold, it is determined that the estimated content is equal to the user's intention. Therefore, information related to the estimated content is provided to the user (S1011). Specifically, the system control unit 101 presents information related to the estimated content by display on the display 105 or voice output by the voice output device 106. Alternatively, the estimated operation content is executed, and the user is informed that the operation content has been executed by display on the display 105 or sound output by the sound output device 106.
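
The estimation and threshold check of S1009 to S1011 might be sketched as below. The scoring table `RELEVANCE`, the environment flags, and the threshold of 0.7 are all hypothetical; the source leaves the concrete estimation method for later and gives no numeric values.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical relevance of each information type to each environment flag.
RELEVANCE = {
    "traffic_info": {"congested": 0.9, "raining": 0.2},
    "weather_info": {"congested": 0.1, "raining": 0.8},
    "restaurant_info": {"congested": 0.2, "raining": 0.1},
}


def estimate_intent(env: Dict[str, bool],
                    candidates: Optional[Iterable[str]] = None) -> Tuple[str, float]:
    """Return (estimated content, confidence) for the given environment.

    candidates, if given, restricts the search to the range narrowed down
    from a recognition result; otherwise every known type is considered.
    """
    pool = list(candidates) if candidates is not None else list(RELEVANCE)
    scores = {
        name: max((weight for flag, weight in RELEVANCE[name].items()
                   if env.get(flag)), default=0.0)
        for name in pool
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


def should_present(confidence: float, threshold: float = 0.7) -> bool:
    """Present the estimate only when its confidence reaches the threshold."""
    return confidence >= threshold
```

For example, `estimate_intent({"congested": True})` selects `"traffic_info"` with confidence 0.9 under these made-up weights, and `should_present(0.9)` then allows it to be shown to the user.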

一方、Ｓ１００６において、類似度が低く、認識結果が得られなかった場合について説明する。 On the other hand, the case where the similarity is low and the recognition result is not obtained in S1006 will be described.

従来の音声認識を用いたシステムでは、ユーザ発声の認識結果が得られなかった場合は音声認識失敗とみなす。そのため、ユーザに再度発声を求める質問内容を生成して、その質問内容を出力する。 In a conventional system using voice recognition, if a user utterance recognition result is not obtained, it is regarded as a voice recognition failure. Therefore, the question content which asks the user to speak again is generated and the question content is output.

これに対して、本発明では、認識結果が得られなかった場合には、環境情報を取得して（Ｓ１００８）、ユーザが必要とする情報又は希望する操作内容を、取得した環境情報を元に推定する（Ｓ１００９）。 On the other hand, in the present invention, when the recognition result is not obtained, the environment information is acquired (S1008), and the information required by the user or the desired operation content is obtained based on the acquired environment information. Estimate (S1009).

前述したように、結果が得られない状況とは、（Ａ）ユーザがトークボタン１１１を押した後、一定時間何も発声しなかった場合、（Ｂ）ユーザがシステムが想定していない音声を発した場合、（Ｃ）ユーザがトークボタン１１１を押した後、周囲の雑音がマイク１１０によって集音され、音声認識部１１４がこの雑音をユーザの発声と間違えて音声認識処理をした場合、に分類できる。これはユーザ視点で考えると、トークボタン１１１を押したにもかかわらず適切な音声コマンドを思いつけなかった場合や、音声コマンドを発声しようとしたのに周囲雑音に妨害された場合に相当する。このときユーザは、トークボタン１１１を押下して、音声によって、何らかの情報を必要としている又は何らかの操作をしようとしていることは確実である。 As described above, the situations in which no result is obtained can be classified as: (A) the user says nothing for a certain time after pressing the talk button 111; (B) the user utters speech that the system does not expect; and (C) after the user presses the talk button 111, ambient noise is picked up by the microphone 110 and the voice recognition unit 114 mistakes this noise for the user's utterance and runs voice recognition on it. From the user's point of view, these correspond to cases where the user pressed the talk button 111 but could not think of an appropriate voice command, or tried to utter a voice command but was disturbed by ambient noise. In either case, it is certain that the user, having pressed the talk button 111, needs some information or intends to perform some operation by voice.

従って、従来のシステムように、認識結果が得られなかった場合に再度発声を求めるよりも、ユーザ意図を推定する方が、ユーザに再度の発声を求めることがなくなり、ユーザの利便性が増す。特に、（Ａ）及び（Ｂ）の場合では、再度発声を求めても、ユーザは適切な音声コマンドを思いつけない可能性があるため、再度認識結果が得られない場合が発生する。これでは、情報端末１００の利便性が損なわれ、ユーザの満足度が低くなる。 Therefore, compared with asking the user to speak again when no recognition result is obtained, as a conventional system does, estimating the user's intention spares the user from having to speak again and increases convenience. In cases (A) and (B) in particular, the user may still be unable to think of an appropriate voice command even if asked to speak again, so recognition may fail a second time. That would impair the convenience of the information terminal 100 and lower user satisfaction.

そこで、Ｓ１００９において推定した情報をユーザに提供する（Ｓ１０１１）。すなわち、システム制御部１０１は、ディスプレイ１０５への表示又は音声出力装置１０６による音声出力によって、推定内容に係る情報を提示する。又は、推定した操作内容を実行し、操作内容を実行したことをユーザにディスプレイ１０５への表示又は音声出力装置１０６による音声出力によって知らせる。 Therefore, the information estimated in S1009 is provided to the user (S1011). That is, the system control unit 101 presents information related to the estimated content by display on the display 105 or sound output by the sound output device 106. Alternatively, the estimated operation content is executed, and the user is informed that the operation content has been executed by display on the display 105 or sound output by the sound output device 106.

このとき、ユーザに出力した推定内容又は操作内容は、必ずしも正しいとは限らない。そのため、ユーザに確認を求めたり、訂正の機会を与えることが必要である。 At this time, the estimated content or the operation content output to the user is not necessarily correct. Therefore, it is necessary to ask the user for confirmation or provide an opportunity for correction.

特に、意図推定部１１５がＳ１０１０において信頼度が低いと判定した場合は、ユーザに提供すべき情報や実行すべき操作は適切に推定できない。そこで、システム制御部１０１は、ユーザにさらなる情報を求める質問内容を生成する（Ｓ１００２）。生成した質問をディスプレイ１０５や音声出力装置１０６を通じてユーザに出力し（Ｓ１００３）、ユーザの発声を促す。 In particular, when the intention estimation unit 115 determines that the reliability is low in S1010, the information to be provided to the user and the operation to be performed cannot be appropriately estimated. Therefore, the system control unit 101 generates a question content that requests further information from the user (S1002). The generated question is output to the user through the display 105 and the audio output device 106 (S1003), and the user's utterance is prompted.

また、システム制御部１０１は、Ｓ１０１０において信頼度が高いと判定した場合は、まず、ユーザに情報を提供する（Ｓ１０１１）。そして、システム制御部１０１は、確認発声や訂正発声を促す質問内容を生成する（Ｓ１００２）。生成した質問をディスプレイ１０５や音声出力装置１０６を通じてユーザに出力し（Ｓ１００３）、ユーザの発声を促す。このときの質問内容は、例えば、「他に何かありますか？」や「この操作でよろしいですか？」のような内容とする。 If the system control unit 101 determines that the reliability is high in S1010, the system control unit 101 first provides information to the user (S1011). Then, the system control unit 101 generates a question content that prompts confirmation utterance or correction utterance (S1002). The generated question is output to the user through the display 105 and the audio output device 106 (S1003), and the user's utterance is prompted. The contents of the question at this time are, for example, contents such as “Is there anything else?” Or “Are you sure about this operation?”

これによって、情報端末１００は、ユーザの音声コマンドの認識結果が得られなかった場合には、環境情報を取得してユーザが必要とする情報又は希望する操作内容を推定してユーザに提示する。特に、この推定内容が間違っていた場合でも、ユーザは、再度トークボタン１１１を押す手間を省いて訂正発話ができ、ユーザの利便性が増す。さらに、ユーザにとっては、情報端末１００に要求するときは必ずトークボタン１１１を押して音声コマンドを発する。これによってユーザインタフェースの一貫性が保たれていることを直感的に把握でき、ユーザの利便性が高まる。 As a result, when the recognition result of the user's voice command is not obtained, the information terminal 100 acquires the environment information, estimates the information required by the user or the desired operation content, and presents it to the user. In particular, even when the estimated content is wrong, the user can correct the utterance without the need to press the talk button 111 again, and the convenience of the user is increased. Furthermore, for the user, when making a request to the information terminal 100, the user always presses the talk button 111 to issue a voice command. As a result, it is possible to intuitively understand that the consistency of the user interface is maintained, and the convenience of the user is enhanced.
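
The overall flow of FIG. 2 can be condensed into a short control function. This is a sketch under stated assumptions: the recognizer, the environment reader, and the estimator are passed in as callables with hypothetical signatures, the threshold value is invented, and the follow-up confirmation question after presenting information is omitted for brevity.

```python
from typing import Callable, Optional, Tuple


def handle_talk_signal(
    recognize: Callable[[], Optional[str]],
    get_environment: Callable[[], dict],
    estimate: Callable[[Optional[str], dict], Tuple[str, float]],
    threshold: float = 0.7,
) -> Tuple[str, str]:
    """Run one pass of the dialogue after a talk signal is received (S1001).

    Returns ("present", content) when the estimate is trusted, otherwise
    ("ask", question). A missing recognition result does not trigger a
    re-prompt by itself; the estimate is built from the environment instead.
    """
    command = recognize()                          # S1004-S1006: None means no result
    env = get_environment()                        # S1008
    content, confidence = estimate(command, env)   # S1007 and S1009 combined
    if confidence >= threshold:                    # S1010
        return "present", content                  # S1011
    return "ask", "Please tell me more."           # back to S1002
```

Passing the collaborators in as callables keeps the control logic testable without audio hardware or sensors, which is a design choice of this sketch rather than something the source prescribes.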

また、従来の、定期的にユーザに情報や操作を推定して提供する方法においては、ある状況である情報が必要とされる可能性が非常に高いと推定された場合でも、実際にはユーザがそのタイミングでは情報を欲していないというミスマッチが起こり得た。これに対して、本発明では、ユーザによるトークボタン１１１の押下を契機として、ユーザが情報が必要であるということが判った上で情報を推定するので、ユーザが情報を必要としているタイミングにて必要な情報が提供され、わずらわしさがなくなり、利便性が高まる。 In the conventional method of periodically estimating information or operations and offering them to the user, a mismatch could occur: even when certain information was estimated to be very likely needed in a given situation, the user did not actually want it at that moment. In contrast, in the present invention the estimation is triggered by the user pressing the talk button 111, so it is performed only once it is known that the user needs information. The necessary information is therefore provided at the moment the user needs it, which removes the annoyance and increases convenience.

図３は、本発明の情報端末１００とユーザとの対話の例の説明図である。 FIG. 3 is an explanatory diagram of an example of an interaction between the information terminal 100 of the present invention and the user.

なお、「Ｓ」は情報端末１００からの音声出力による情報提示や質問を表し、「Ｕ」はユーザによる発話を表す。 Note that “S” represents information presentation or question by voice output from the information terminal 100, and “U” represents utterance by the user.

図３（Ａ）の例は、まずユーザがトークボタン１１１を押下すると、トーク信号がシステム制御部１０１に送信される（Ｓ１００１）。システム制御部１０１はこれを受けて、応答のための質問内容を生成し（Ｓ１００２）、出力する（Ｓ１００３）。このときの質問内容は、ユーザに発声を促す「何でしょうか？」という音声とする。 In the example of FIG. 3A, when the user first presses the talk button 111, a talk signal is transmitted to the system control unit 101 (S1001). In response to this, the system control unit 101 generates question contents for response (S1002) and outputs them (S1003). The question content at this time is a voice “What is it?” Prompting the user to speak.

At this point the user cannot think of an appropriate voice command, so a predetermined period of silence elapses after the talk button 111 is pressed and the voice recognition unit 114 determines that a timeout has occurred. As a result, no voice recognition result is obtained (S1006), and the environment information is acquired (S1008). In the example of FIG. 3, the traffic information receiver 104 has acquired traffic jam information. In addition, the position information obtained by matching the position information from the GPS receiver 102, the position information from the autonomous navigation sensor 103, and the map information obtained by the map information reading unit 108 shows that the user is on a congested road.

この環境情報によって、意図推定部１１５は、ユーザが必要としている情報が渋滞情報であると推定する。そして、この推定内容の信頼度が所定の閾値よりも高いと判定する。 Based on this environmental information, the intention estimation unit 115 estimates that the information required by the user is traffic jam information. And it determines with the reliability of this presumed content being higher than a predetermined threshold value.

The system control unit 101 therefore provides the traffic jam information to the user, as shown in FIG. 3(A). It then outputs a question ("Is there anything else?") that prompts the user to make a confirmation or correction utterance.

ユーザは、トークボタン１１１押下時に必要としていた情報が渋滞情報であったため、その情報を得られたので、「いいえ」と返答し、情報端末１００との対話を終了する。 Since the information required when the talk button 111 is pressed is the traffic jam information, the user has obtained the information, and therefore responds “No” and ends the dialogue with the information terminal 100.

この例のように、ユーザは、適切な音声コマンドを思いつけなかった場合にも、必要としていた情報を得ることができる。 As in this example, even when the user does not come up with an appropriate voice command, the user can obtain the necessary information.

The example of FIG. 3(B), like that of FIG. 3(A), is a case in which no voice recognition result is obtained because the user cannot think of an appropriate voice command. The intention estimation unit 115 estimates that the information the user needs is traffic jam information, and the system control unit 101 provides this traffic jam information to the user. It then outputs a question ("Is there anything else?") that prompts the user to make a confirmation or correction utterance.

However, the user recognizes that the information provided differs from the information needed when the talk button 111 was pressed. The user therefore replies with the information actually needed: "bypass route information."

In response, the information terminal 100 uses the narrowing-down process (S1007) to narrow the candidates down to the bypass route information that the user needs, and responds with the result of a search for bypass route information.

この例のように、ユーザが適切な音声コマンドを思いつけず、さらに、環境情報により推定した情報がユーザの要求していた情報と異なっていた場合にも、ユーザが何度もトークボタン１１１を押下して音声を入力することなく、必要としていた情報を得ることができる。 As in this example, even when the user does not come up with an appropriate voice command and the information estimated by the environment information is different from the information requested by the user, the user repeatedly presses the talk button 111. Necessary information can be obtained without pressing and inputting voice.

The example of FIG. 3(C), like that of FIG. 3(A), is a case in which no voice recognition result is obtained because the user cannot think of an appropriate voice command. The intention estimation unit 115 estimates that the information the user needs is traffic jam information. In the example of FIG. 3(C), however, the reliability of this estimate is determined to be lower than the predetermined threshold.

信頼度が低いと判定した場合は、システム制御部１０１は、ユーザにさらなる情報を求める質問内容を生成する。ここでは、ユーザが渋滞情報を知ろうとしているのかを確認するための「渋滞情報をお知りになりたいのでしょうか？」を出力する。 If it is determined that the reliability is low, the system control unit 101 generates a question content for requesting further information from the user. Here, “Would you like to know the traffic information?” Is output to confirm whether the user wants to know the traffic information.

ユーザは、トークボタン１１１押下時に必要としていた情報は渋滞情報であるため、「はい」と応答する。システム制御部１０１は、これを受けて、絞り込み処理（Ｓ１００７）によって渋滞情報を絞り込み、環境情報より得た渋滞情報を提供する。 The user responds “Yes” because the information required when the talk button 111 is pressed is traffic jam information. In response to this, the system control unit 101 narrows down the traffic jam information by the narrowing-down process (S1007), and provides the traffic jam information obtained from the environment information.

この例のように、ユーザが適切な音声コマンドを思いつけず、さらに、環境情報により推定した情報の信頼度が低い場合にも、ユーザが何度もトークボタン１１１を押下して音声を入力することなく、必要としていた情報を得ることができる。 As in this example, even when the user does not come up with an appropriate voice command and the reliability of the information estimated from the environment information is low, the user presses the talk button 111 many times to input the voice. Without having to, you can get the information you need.

The example of FIG. 3(D) is a case in which the user cannot think of an appropriate voice command and instead makes a natural utterance that is difficult for the voice recognition unit 114 to recognize.

After the user presses the talk button 111, the system control unit 101 responds, "What is it?" The user then says, "How long until I get out of this traffic jam?", which is not a voice command. Because this is a natural utterance that is difficult for the voice recognition unit 114 to recognize, no voice recognition result is obtained (S1006). The system control unit 101 therefore acquires the environment information (S1008), and the intention estimation unit 115 estimates that the information the user needs is traffic jam information. In the example of FIG. 3(D), however, the reliability of this estimate is determined to be lower than the predetermined threshold, so, as in FIG. 3(C), the system control unit 101 asks the user to confirm whether traffic jam information is what is wanted.

ユーザは、トークボタン１１１押下時に意図していた質問内容は渋滞情報であるため、「はい」と応答する。システム制御部１０１は、これを受けて、絞り込み処理（Ｓ１００７）によって渋滞情報を絞り込み、環境情報より得た渋滞情報を提供する。 The user answers “Yes” because the question content intended when the talk button 111 is pressed is traffic jam information. In response to this, the system control unit 101 narrows down the traffic jam information by the narrowing-down process (S1007), and provides the traffic jam information obtained from the environment information.

この例のように、ユーザが適切な音声コマンドを思いつけず音声認識が難しい自然発話をし、さらに、環境情報により推定した情報の信頼度が低い場合にも、ユーザが何度もトークボタン１１１を押下して音声を入力することなく、必要としていた情報を得ることができる。 As in this example, even when the user utters a natural utterance that is difficult to recognize without thinking of an appropriate voice command, and the reliability of the information estimated from the environmental information is low, the user repeatedly presses the talk button 111. Necessary information can be obtained without pressing and inputting voice.

The example of FIG. 3(E) is also a case in which the user cannot think of an appropriate voice command and instead makes a natural utterance that is difficult for the voice recognition unit 114 to recognize.

As in the example of FIG. 3(D), the user says, "How long until I get out of this traffic jam?", which is not a voice command. Because this is a natural utterance that is difficult for the voice recognition unit 114 to recognize, no voice recognition result is obtained (S1006). The system control unit 101 therefore acquires the environment information (S1008), and the intention estimation unit 115 estimates that the information the user needs is traffic jam information. When the reliability of this estimate is determined to be low, the information to be provided or the operation to be executed cannot be estimated appropriately, so the system control unit 101 generates a question that asks the user for further information. Here, it outputs "Would you like to know the traffic jam information?" to confirm whether traffic jam information is what the user wants.

しかし、ユーザは、トークボタン１１１押下時に必要としていた情報と提供された情報が異なることを認識する。そこで、「いいえ」と応答する。 However, the user recognizes that the information required when the talk button 111 is pressed differs from the provided information. Therefore, it responds “No”.

これを受けて、システム制御部１０１は、渋滞情報を除外した絞り込み処理（Ｓ１００７）を実行する。この結果、到着時間情報が推定され、これをユーザに提供する。このとき、ユーザに確認発声や訂正発声を促す内容を付加して出力する。 In response to this, the system control unit 101 executes a narrowing process (S1007) excluding traffic jam information. As a result, arrival time information is estimated and provided to the user. At this time, contents for prompting the user to confirm or correct the utterance are added and output.

ユーザは、トークボタン１１１押下時に意図していた質問内容は到着時間情報であるため、「はい」と応答する。システム制御部１０１は、これを受けて、環境情報より得た到着時間情報を提供する。 The user responds “Yes” because the question content intended when the talk button 111 is pressed is arrival time information. In response to this, the system control unit 101 provides arrival time information obtained from the environment information.
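The exclusion step in the FIG. 3(E) dialogue can be illustrated with a small sketch. The source does not say how the narrowing-down process (S1007) ranks candidates; the function `narrow_candidates`, the scores, and the third candidate below are hypothetical and only show the idea of dropping a rejected intention and taking the next-best one.

```python
from typing import Dict, List, Set


def narrow_candidates(scores: Dict[str, float], rejected: Set[str]) -> List[str]:
    """Rank candidate intentions by score, dropping those the user has rejected."""
    remaining = {name: s for name, s in scores.items() if name not in rejected}
    return sorted(remaining, key=remaining.get, reverse=True)


# Hypothetical scores for the situation of FIG. 3(E).
scores = {
    "traffic jam information": 0.45,
    "arrival time information": 0.35,
    "highway toll information": 0.20,
}

# The user answered "No" to the traffic jam question, so that candidate is excluded.
ranked = narrow_candidates(scores, rejected={"traffic jam information"})
print(ranked[0])
```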

図４は環境情報の例を示す説明図である。 FIG. 4 is an explanatory diagram showing an example of environment information.

この図４に例示する環境情報は、ユーザが必要とする情報又は希望する操作内容に影響を与えるものである。システム制御部１０１は、この環境情報を取得して、ユーザの要求する情報を推定する。 The environment information illustrated in FIG. 4 affects information required by the user or desired operation content. The system control unit 101 acquires this environment information and estimates information requested by the user.

The environment information items (1) to (6) are determined by the system control unit 101 by matching changes in the position information received by the GPS receiver 102, the map information stored in the HDD 109, and the guidance information received by the traffic information receiver 104.

Whether the vehicle is in a traffic jam (7) is determined by the system control unit 101 from the result of matching the position information received by the GPS receiver 102 against the VICS traffic jam information received by the traffic information receiver 104, together with the short-term history of the vehicle speed.
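As a rough illustration of condition (7), the two inputs named above could be combined as follows. The source does not state how they are combined; requiring both signals, averaging the recent speeds, and the 20 km/h limit are assumptions made only for this sketch, as is the function name `is_congested`.

```python
from typing import List


def is_congested(
    on_vics_congested_link: bool,     # result of matching the position against VICS data
    recent_speeds_kmh: List[float],   # short-term vehicle speed history
    slow_kmh: float = 20.0,           # hypothetical limit, not given in the source
) -> bool:
    """Judge condition (7): congested only if VICS and the speed history agree."""
    if not recent_speeds_kmh:
        return False  # no speed history available, so do not claim congestion
    mean_speed = sum(recent_speeds_kmh) / len(recent_speeds_kmh)
    return on_vics_congested_link and mean_speed < slow_kmh
```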

走行中であるか否か（８）は、車速の短時間履歴又はサイドブレーキ信号によって判断する。 Whether or not the vehicle is traveling (8) is determined based on a short time history of the vehicle speed or a side brake signal.

長時間運転か否か（９）は、エンジン始動から現在までの経過時間と、予め設定した閾値とをシステム制御部１０１が比較して判断する。 Whether or not the operation is performed for a long time (9) is determined by the system control unit 101 comparing the elapsed time from the start of the engine to the present time and a preset threshold value.

Whether the user is sweating (10) is determined from the signal from the sweat sensor 117 installed on the steering wheel.

現在再生中のＣＤが既に１回再生済みであるか否か（１１）は、システム制御部１０１が、ＡＶ装置１１３のＣＤ再生装置からリピート信号によって判断する。 Whether or not the currently playing CD has been played once (11) is determined by the system control unit 101 based on the repeat signal from the CD playback device of the AV device 113.

ご飯時であるか否か（１２）は、システム制御部１０１に内蔵された時計から現在時刻を取得して判断する。 Whether it is a meal time (12) is determined by acquiring the current time from a clock built in the system control unit 101.

路面の状態（１３）は、路面センサ１１６からの信号によって判断する。 The state (13) of the road surface is determined by a signal from the road surface sensor 116.

音声コマンドを発したのはドライバーであるか否か（１４）は、システム制御部１０１が、複数のマイク１１０の音声の到来方向を推定して判断する。 Whether or not the driver issues the voice command (14) is determined by the system control unit 101 by estimating the voice arrival directions of the plurality of microphones 110.

システム制御部１０１は、これらの環境情報を、それぞれの条件の取りうる値に対する確率分布として表す。 The system control unit 101 represents the environment information as a probability distribution with respect to values that can be taken by the respective conditions.

For example, condition (11) takes one of two values, true or false, and can be judged with certainty. Therefore, letting P11(true) be the probability that condition (11) is true and P11(false) the probability that it is false, the values are either
P11(true) = 1, P11(false) = 0
or
P11(true) = 0, P11(false) = 1.

一方、（１４）の条件は、外部の雑音の影響などによって音源方向推定の精度が落ちる場合がある。そのため、システム制御部１０１は、マイク１１０に備えられている複数のマイクから取得した情報に基づいて、システム制御部１０１が推定した音源方向の信頼度をＰ１４（真）の値とする。 On the other hand, in the condition (14), the accuracy of sound source direction estimation may be reduced due to the influence of external noise. Therefore, the system control unit 101 sets the reliability of the sound source direction estimated by the system control unit 101 based on information acquired from a plurality of microphones included in the microphone 110 as a value of P14 (true).

Condition (13), by contrast, is not binary. In such a case, the system control unit 101 uses the probability of each possible value as the value of the environment information. Specifically, P13(dry), P13(wet), P13(water film), P13(snow), and P13(frozen) are derived from the information acquired from the road surface sensor 116.
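The three kinds of condition described above (certain binary, uncertain binary, and multi-valued) can all be held as distributions that sum to 1. The following sketch is illustrative only: the helper names, the dictionary keys, and the numeric values are assumptions, and setting P14(false) to 1 - P14(true) is also an assumption, since the source defines only P14(true).

```python
import math
from typing import Dict, List


def certain(value: str, values: List[str]) -> Dict[str, float]:
    """Distribution for a condition that can be judged with certainty, such as (11)."""
    return {v: (1.0 if v == value else 0.0) for v in values}


def binary_with_reliability(p_true: float) -> Dict[str, float]:
    """Distribution for a binary condition judged with reliability p_true, such as (14)."""
    return {"true": p_true, "false": 1.0 - p_true}


def normalized(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative sensor scores so that they sum to 1, as for condition (13)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {k: w / total for k, w in weights.items()}


environment = {
    "cd_already_played_once": certain("true", ["true", "false"]),          # (11)
    "speaker_is_driver": binary_with_reliability(0.8),                     # (14)
    "road_surface": normalized(
        {"dry": 6, "wet": 3, "water film": 1, "snow": 0, "frozen": 0}      # (13)
    ),
}

# Every condition is a proper probability distribution.
for dist in environment.values():
    assert math.isclose(sum(dist.values()), 1.0)
```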

図５は、ユーザが必要とする情報又は希望する操作内容の推定値の一例を示す説明図である。 FIG. 5 is an explanatory diagram showing an example of information required by the user or an estimated value of the desired operation content.

システム制御部１０１は、このような推定値を予め持っている。そして、図４のような環境情報の取得値から、推定値の近似度を算出する。 The system control unit 101 has such an estimated value in advance. Then, the degree of approximation of the estimated value is calculated from the acquired value of the environment information as shown in FIG.

Note that if every piece of information that can be provided, or every possible operation, were used as an estimated value, it would become difficult to associate the estimated values with the conditions of the environment information in FIG. 4. It is therefore preferable to limit the estimated values to a narrowed-down set of conceivable user intentions.

次に、推定の方法を説明する。 Next, an estimation method will be described.

システム制御部１０１は、図２のＳ１００８で取得した環境情報と、ユーザの発声した音声コマンドの認識結果とを用いて、図５に示すユーザが必要とする情報又は希望する操作内容を推定する。この推定にはベイジアンネットワークを用いることができる。 The system control unit 101 uses the environment information acquired in S1008 in FIG. 2 and the recognition result of the voice command uttered by the user to estimate information required by the user or desired operation content shown in FIG. A Bayesian network can be used for this estimation.

図６は、ベイジアンネットワークを用いた推定方法の説明図である。 FIG. 6 is an explanatory diagram of an estimation method using a Bayesian network.

システム制御部１０１は、図２のＳ１００８で取得した環境情報と、ユーザの発声した音声コマンドの認識結果とを条件とする。そしてこれらの条件全てを親ノードとし、ユーザが必要とする情報又は希望する操作の内容の種別を確率変数とするユーザの意図を子ノードとする。 The system control unit 101 uses the environment information acquired in S1008 of FIG. 2 and the recognition result of the voice command uttered by the user as a condition. All of these conditions are set as parent nodes, and the user's intention with the type of information required by the user or the content of desired operation as a random variable is set as a child node.
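One way to read this network structure is as the following computation. It is an interpretation rather than a formula given in the source: it assumes the parent conditions are mutually independent, that each condition k is described by its own distribution P_k (certain conditions put all mass on one value), and the symbols I, c_k, and n are introduced here for illustration.

```latex
P(I = i) \;=\; \sum_{c_1, \dots, c_n} P\bigl(I = i \mid c_1, \dots, c_n\bigr) \prod_{k=1}^{n} P_k(c_k)
```

Here I is the user's intention (the child node), c_k is a value of the k-th parent condition (an environment information item or the recognition result), and P(I = i | c_1, ..., c_n) is an entry of the conditional probability table.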

このベイジアンネットワークの条件付確率表は、経験則による考察から作成する。以下に、経験則から条件付確率表を作成する方式の一実施例を説明する。 The conditional probability table for this Bayesian network is created based on a rule of thumb. Hereinafter, an embodiment of a method for creating a conditional probability table from an empirical rule will be described.

図７は、経験則によって作成されるベイジアンネットワークの確率変数の一例の説明図である。 FIG. 7 is an explanatory diagram of an example of a random variable of a Bayesian network created by an empirical rule.

これは、図５に例示したユーザの意図（１）及び（２）に関して、それぞれの条件がこれらのユーザ意図を生起させる可能性を考慮して、予め作成され、情報端末１００に格納される。 This is created in advance and stored in the information terminal 100 with regard to the user intentions (1) and (2) illustrated in FIG. 5 in consideration of the possibility that each condition causes these user intentions.

For example, when condition (7), whether the vehicle is in a traffic jam, is true, and the vehicle is on a toll road and far from a toll gate, the user is more likely to need traffic jam information. When the vehicle is on an ordinary road and near an intersection, the user is more likely to need bypass route information.

このようにして、ユーザが必要とする情報又は希望する操作内容を推定する。 In this way, information required by the user or desired operation content is estimated.

In this way, for every combination of values that the conditions can take, the probability that the user needs traffic jam information and the probability that the user needs bypass route information under that combination are set to suitable values.
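A hand-built table of this kind can be evaluated with a few lines of code. The sketch below is an assumption-laden illustration: it treats the parent conditions as independent, uses made-up probabilities and a reduced set of two conditions, labels the shortcut intention "bypass route information", and takes the largest resulting probability as the estimate's reliability, none of which is specified in the source.

```python
from itertools import product
from typing import Dict, Tuple


def infer_intent(
    parents: Dict[str, Dict[str, float]],
    cpt: Dict[Tuple[str, ...], Dict[str, float]],
) -> Dict[str, float]:
    """Marginalize a conditional probability table over uncertain parent conditions.

    parents: {condition name: {value: probability}}
    cpt:     {(value of each condition, in the order of `parents`): {intent: probability}}
    """
    names = list(parents)
    result: Dict[str, float] = {}
    for combo in product(*(parents[name] for name in names)):
        weight = 1.0
        for name, value in zip(names, combo):
            weight *= parents[name][value]
        if weight == 0.0:
            continue  # impossible combination, contributes nothing
        for intent, p in cpt[combo].items():
            result[intent] = result.get(intent, 0.0) + weight * p
    return result


# Hypothetical example: congestion is certain, road type is slightly uncertain.
parents = {
    "congested": {"true": 1.0, "false": 0.0},
    "road_type": {"toll road": 0.9, "ordinary road": 0.1},
}
cpt = {
    ("true", "toll road"): {"traffic jam information": 0.7, "bypass route information": 0.3},
    ("true", "ordinary road"): {"traffic jam information": 0.3, "bypass route information": 0.7},
    ("false", "toll road"): {"traffic jam information": 0.5, "bypass route information": 0.5},
    ("false", "ordinary road"): {"traffic jam information": 0.5, "bypass route information": 0.5},
}

posterior = infer_intent(parents, cpt)
best = max(posterior, key=posterior.get)  # estimated content
reliability = posterior[best]             # compared against the threshold in S1010
```

With these numbers the traffic jam intention receives 0.9 × 0.7 + 0.1 × 0.3 = 0.66 and the bypass route intention 0.34.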

一方、どのような条件の取りうる値の組み合わせで、どのような情報提供の要求や、操作の要求が行われたかに関する実際のデータから、上記資料に記載されている方法により条件付確率表を推定する方法でベイジアンネットワークを完成させることもできる。 On the other hand, a conditional probability table is created by the method described in the above document from actual data on what information provision requests and operation requests were made in what combinations of possible values of conditions. A Bayesian network can also be completed by an estimation method.
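The method of the cited document is not reproduced in this section, so the following is only a stand-in: a relative-frequency estimate with additive (Laplace) smoothing, which is one common way to fill a conditional probability table from logged requests. The function `estimate_cpt`, the record format, and the smoothing constant `alpha` are assumptions made for this sketch.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Conditions = Tuple[str, ...]


def estimate_cpt(
    records: Iterable[Tuple[Conditions, str]],
    intents: List[str],
    alpha: float = 1.0,  # additive smoothing so unseen intents keep a nonzero probability
) -> Dict[Conditions, Dict[str, float]]:
    """Estimate P(intent | conditions) from logged (conditions, requested intent) pairs."""
    counts: Dict[Conditions, Counter] = defaultdict(Counter)
    for conditions, intent in records:
        counts[conditions][intent] += 1
    cpt = {}
    for conditions, c in counts.items():
        total = sum(c.values()) + alpha * len(intents)
        cpt[conditions] = {i: (c[i] + alpha) / total for i in intents}
    return cpt


# Hypothetical log: in a jam on a toll road, three requests for jam info, one for a bypass.
log = [(("true", "toll road"), "traffic jam information")] * 3
log += [(("true", "toll road"), "bypass route information")]
learned = estimate_cpt(log, ["traffic jam information", "bypass route information"])
```

For this log the smoothed estimates are (3 + 1) / 6 and (1 + 1) / 6 for the two intentions.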

As described above, the information terminal 100 according to the first embodiment of the present invention analyzes the user's voice command when triggered by the user pressing the talk button 111, and provides information corresponding to the voice command. If no analysis result is obtained for the user's voice command, the terminal acquires the environment information, estimates the information the user needs or the operation the user desires, and provides the estimated content. As a result, the user can obtain the needed information or the desired operation without repeatedly pressing the talk button 111 and repeatedly uttering voice commands, which improves convenience for the user.

次に、本発明の第２の実施例について説明する。 Next, a second embodiment of the present invention will be described.

前述した第１の実施例では、ユーザがトークボタンを押下し、ユーザの音声コマンドを受け付けた後、ユーザの意図を推定した。これに対して、第２の実施例では、ユーザがトークボタンを押下した後にユーザの意図を推定し、ユーザに情報を提供する。その後、ユーザの音声コマンドを受け付ける。 In the first embodiment described above, the user's intention is estimated after the user presses the talk button and receives the user's voice command. In contrast, in the second embodiment, the user's intention is estimated after the user presses the talk button, and information is provided to the user. Thereafter, the user's voice command is accepted.

図８は、本発明の第２の実施の情報端末１００の音声対話処理のフローチャートである。 FIG. 8 is a flowchart of the voice interaction process of the information terminal 100 according to the second embodiment of this invention.

システム制御部１０１は、トークボタン１１１からのトーク信号を受信すると（Ｓ１００１）、現在の環境情報を、ＧＰＳ受信機１０２、自律航法センサ１０３、交通情報受信機１０４、経路探索部１０７、地図情報読込部１０８、路面センサ１１６、発汗センサ１１７等から取得する（Ｓ１００８）。そして、ユーザが必要とする情報又は希望する操作内容を、取得した環境情報を元に推定する（Ｓ１００９）。 When the system control unit 101 receives the talk signal from the talk button 111 (S1001), the system control unit 101 reads the current environment information from the GPS receiver 102, the autonomous navigation sensor 103, the traffic information receiver 104, the route search unit 107, and the map information reading. Obtained from the unit 108, the road surface sensor 116, the sweat sensor 117, etc. (S1008). Then, the information required by the user or the desired operation content is estimated based on the acquired environment information (S1009).

意図推定部１１５は、この推定処理の結果、推定内容及び推定の信頼度を得る。そして、この推定の信頼度が、予め定めた閾値以上であるか否かを判定する（Ｓ１０１０）。 The intention estimation unit 115 obtains estimation contents and estimation reliability as a result of the estimation processing. Then, it is determined whether or not the reliability of the estimation is equal to or greater than a predetermined threshold (S1010).

信頼度が閾値よりも高い場合は、推定内容がユーザの意図と等しいと判断する。そこで、推定内容に係る情報をユーザに提供する（Ｓ１０１１）。具体的には、システム制御部１０１は、ディスプレイ１０５への表示又は音声出力装置１０６による音声出力をして、その情報を提示する。また、システム制御部１０１は、推定した操作内容を実行し、操作内容を実行したことをユーザにディスプレイ１０５への表示又は音声出力装置１０６による音声出力によって知らせる。 If the reliability is higher than the threshold, it is determined that the estimated content is equal to the user's intention. Therefore, information related to the estimated content is provided to the user (S1011). Specifically, the system control unit 101 performs display on the display 105 or audio output by the audio output device 106 and presents the information. Further, the system control unit 101 executes the estimated operation content, and notifies the user that the operation content has been executed by display on the display 105 or sound output by the sound output device 106.

一方、意図推定部１１５が、Ｓ１０１０において信頼度が低いと判定した場合は、システム制御部１０１は、ユーザにさらなる情報を求める質問内容を生成する（Ｓ１００２）。生成した質問をディスプレイ１０５や音声出力装置１０６を通じてユーザに出力し（Ｓ１００３）、ユーザの発声を促す。 On the other hand, when the intention estimation unit 115 determines that the reliability is low in S1010, the system control unit 101 generates a question content for further information from the user (S1002). The generated question is output to the user through the display 105 and the audio output device 106 (S1003), and the user's utterance is prompted.

次に、ユーザが発した音声コマンドは（Ｓ１００４）、音声認識部１１４によって音声認識処理することによって認識結果が得られる（Ｓ１００５）。 Next, a voice command issued by the user (S1004) is subjected to voice recognition processing by the voice recognition unit 114, and a recognition result is obtained (S1005).

Next, the voice recognition unit 114 determines whether a voice recognition result was obtained from the processing in S1005 (S1006). If a voice recognition result was obtained, the voice recognition unit 114 uses the information contained in the recognition result to narrow down the candidates for the information the user needs or the operation the user desires (S1007).

従って、Ｓ１００９の推定結果がユーザの意図と同じであった場合は、ユーザはトークボタン１１１を押すだけで必要とする情報を得たり、希望する操作が実行できたりするという利点が得られる。 Therefore, when the estimation result of S1009 is the same as the user's intention, the user can obtain the necessary information or perform the desired operation simply by pressing the talk button 111.

また、第１の実施例（図２）と同様に、推定結果が正しくなかった場合は、Ｓ１００２、Ｓ１００３のステップによって、ユーザは訂正発声や、本当に必要としている情報、操作に関する発話を行う機会が与えられる。 Similarly to the first embodiment (FIG. 2), when the estimation result is not correct, the user has an opportunity to perform correct utterance, utterance regarding the information and operation that are really necessary by the steps of S1002 and S1003. Given.
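The reordered flow of the second embodiment can be sketched as follows. As with any such illustration, the function name `handle_talk_signal_v2`, the injected callables, the tuple returned by `estimate_intent`, and the threshold 0.7 are assumptions; handling of a terminating reply such as "No" is omitted for brevity.

```python
from typing import Callable, List, Optional, Tuple


def handle_talk_signal_v2(
    acquire_environment: Callable[[], dict],
    estimate_intent: Callable[[dict], Tuple[str, float]],
    recognize: Callable[[], Optional[str]],
    threshold: float = 0.7,  # hypothetical value; the source only says "predetermined"
) -> List[str]:
    """Second embodiment: estimate first, listen afterwards (S1001 already received)."""
    outputs: List[str] = []
    env = acquire_environment()                   # S1008
    content, reliability = estimate_intent(env)   # S1009
    if reliability >= threshold:                  # S1010
        outputs.append(f"[provide: {content}]")   # S1011
        outputs.append("Is there anything else?") # invite confirmation or correction
    else:
        outputs.append(f"Would you like to know the {content}?")  # S1002-S1003
    command = recognize()                         # S1004-S1005: None models no result
    if command is not None:                       # S1006
        outputs.append(f"[narrow down using: {command}]")  # S1007
    return outputs
```

Compared with the first embodiment, the estimate is presented before any utterance is requested, so a correct estimate costs the user only the button press.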

図９は、第２の実施例の情報端末１００とユーザとの対話の例の説明図である。 FIG. 9 is an explanatory diagram of an example of a dialogue between the information terminal 100 and the user according to the second embodiment.

図９（Ａ）の例は、まずユーザがトークボタン１１１を押下すると、トーク信号がシステム制御部１０１に送信される（Ｓ１００１）。システム制御部１０１はこれを受けて、環境情報を取得する（Ｓ１００８）。なお、この図９の例では、経路探索部１０７が保持する経路情報と、ＧＰＳ受信機１０２からの位置情報、自律航法センサ１０３からの位置情報、と地図情報読込部１０８から得られた地図情報をマッチングして得られた位置情報によって、ユーザが高速道路を走行中であり、間もなく料金所に差し掛かることが判明する。 In the example of FIG. 9A, when the user first presses the talk button 111, a talk signal is transmitted to the system control unit 101 (S1001). In response to this, the system control unit 101 acquires environment information (S1008). In the example of FIG. 9, the route information held by the route search unit 107, the position information from the GPS receiver 102, the position information from the autonomous navigation sensor 103, and the map information obtained from the map information reading unit 108 From the position information obtained by matching, it is found that the user is traveling on the highway and will soon reach the toll gate.

この環境情報によって、意図推定部１１５は、ユーザが必要としている情報が高速道路料金情報であることを推定する。そして、この推定内容の信頼度は所定の閾値よりも高くなっている。 Based on this environmental information, the intention estimation unit 115 estimates that the information required by the user is highway toll information. The reliability of the estimated content is higher than a predetermined threshold value.

The system control unit 101 therefore provides the highway toll information to the user, as shown in FIG. 9(A). It then outputs a question ("Is there anything else?") that prompts the user to make a confirmation or correction utterance.

このときユーザは、トークボタン１１１押下時に意図していた質問内容と同等の情報を得られたので、「いいえ」と返答し、情報端末１００との対話を終了する。 At this time, since the user has obtained information equivalent to the question content intended when the talk button 111 is pressed, the user replies “No” and ends the dialogue with the information terminal 100.

この例のように、ユーザはトークボタン１１１を押した時点で、意図した質問内容と同等の結果を得ることができる。 As in this example, when the user presses the talk button 111, the user can obtain a result equivalent to the intended question content.

一方、図９（Ｂ）の例は、図９（Ａ）の例と同様に、ユーザがトークボタン１１１を押下した後、システム制御部１０１は、高速道路料金情報をユーザに提供する。その後に、ユーザに確認発声や訂正発声を促す質問内容（「他に何かありますか」）を出力する。 On the other hand, in the example of FIG. 9B, as in the example of FIG. 9A, after the user presses the talk button 111, the system control unit 101 provides the highway toll information to the user. After that, a question content ("Is there anything else?") That prompts the user to confirm or correct the utterance is output.

At this time, the user has been given information different from what was intended when the talk button 111 was pressed, and therefore replies with the information actually needed: "nearby convenience store." In response, the system control unit 101 identifies the user's intention through the narrowing-down process (S1007), identifies nearby convenience store information based on the environment information acquired in S1008, and provides this information to the user (S1011).

In the case of FIG. 9(B), it is not known whether the highway toll information provided first was actually needed by the user. If it was, however, a conventional system would have required the user to utter at least two voice commands ("highway toll" and "nearby convenience store") to obtain both pieces of information, whereas the system of the present invention reduces the number of utterances and thus improves convenience.

以上のように本発明の第２の実施例の情報端末１００は、ユーザがトークボタン１１１を押下したことを契機として、環境情報を取得し、ユーザが必要とする情報又は希望する操作内容を推定して、推定内容を提供する。このようにすることによって、ユーザは、音声コマンドを発することなく必要とする情報又は希望する操作内容を得られるので、ユーザの利便性が向上する。 As described above, the information terminal 100 according to the second embodiment of the present invention obtains environmental information when the user presses the talk button 111, and estimates information required by the user or desired operation content. And provide the estimated content. By doing so, the user can obtain necessary information or desired operation contents without issuing a voice command, so that convenience for the user is improved.

Moreover, even when the provided information is not the information the user needs or the operation the user desires, the user then utters a further voice command, which makes it easier to narrow down the user's intention and improves the accuracy of the estimation. As a result, the user can obtain the intended information without repeatedly pressing the talk button and repeatedly uttering voice commands, which improves convenience for the user.

本発明は、カーナビゲーションシステムなどの車載情報端末に適用し、利便性の高いユーザインタフェースを提供できる。 The present invention can be applied to an in-vehicle information terminal such as a car navigation system and can provide a highly convenient user interface.

FIG. 1 is a block diagram showing the configuration of the information terminal according to the first embodiment of the present invention.
FIG. 2 is a flowchart of the voice dialogue processing according to the first embodiment of the present invention.
FIG. 3 is an explanatory diagram of an example of a dialogue between the information terminal and the user according to the first embodiment of the present invention.
FIG. 4 is an explanatory diagram showing an example of the environment information according to the first embodiment of the present invention.
FIG. 5 is an explanatory diagram showing an example of estimated values of the information required by the user or the desired operation content according to the first embodiment of the present invention.
FIG. 6 is an explanatory diagram of the estimation method using a Bayesian network according to the first embodiment of the present invention.
FIG. 7 is an explanatory diagram of an example of the random variables of the Bayesian network according to the first embodiment of the present invention.
FIG. 8 is a flowchart of the voice dialogue processing according to the second embodiment of the present invention.
FIG. 9 is an explanatory diagram of an example of a dialogue between the information terminal and the user according to the second embodiment of the present invention.

Explanation of symbols

100 Information terminal
101 System control unit
102 GPS receiver
103 Autonomous navigation sensor
104 Traffic information receiver
105 Display
106 Voice output device
107 Route search unit
108 Map information reading unit
109 Hard disk (HDD)
110 Microphone
111 Talk button
112 GUI operation unit
113 AV device
114 Voice recognition unit
115 Intention estimation unit
116 Road surface sensor
117 Perspiration sensor
118 Memory
119 Interface

Claims

An information terminal that recognizes a user's voice as a command and provides information based on the recognized command, the information terminal comprising:
a talk signal unit that outputs a talk signal according to a user instruction;
a voice recognition unit that recognizes a voice uttered by the user as a command;
an environmental information acquisition unit that acquires environmental information about the environment around the information terminal;
an estimation unit that estimates information intended by the user from the recognized command and the acquired environmental information; and
a control unit that controls processing of the information terminal,
wherein the control unit,
upon receiving the talk signal, determines whether a user command recognized by the voice recognition unit has been obtained, and
when the command has not been obtained, provides the user with the information estimated by the estimation unit using the environmental information acquired by the environmental information acquisition unit.
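
For illustration only, the control flow recited in the claim above can be sketched as follows. Every name here (`Environment`, `estimate_intent`, `on_talk_signal`), the environment fields, and the fallback rules are hypothetical assumptions; the claim itself does not prescribe any particular estimation logic.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Environment:
    """Hypothetical environmental information (fields are assumptions)."""
    fuel_low: bool = False
    hour: int = 15


def estimate_intent(command: Optional[str], env: Environment) -> str:
    """Toy estimation unit combining the command (if any) with the environment."""
    if command is not None:
        return f"results for '{command}'"
    # No command was obtained: fall back on environmental information alone.
    if env.fuel_low:
        return "nearby gas stations"
    if 11 <= env.hour <= 13:
        return "nearby restaurants"
    return "traffic information"


def on_talk_signal(recognize: Callable[[], Optional[str]], env: Environment) -> str:
    """Toy control unit, invoked when the talk signal is received.

    `recognize` stands in for the voice recognition unit and returns None
    when no command could be obtained.
    """
    command = recognize()
    return estimate_intent(command, env)


# The user pressed the talk button but no command was obtained.
silent = on_talk_signal(lambda: None, Environment(fuel_low=True))
# The user pressed the talk button and a command was recognized.
spoken = on_talk_signal(lambda: "restaurant", Environment())
```

The point of the sketch is the branch on `command is None`: the terminal still returns something useful from the environment when the recognizer yields nothing.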

An information terminal that recognizes a user's voice as a command and provides information based on the recognized command, the information terminal comprising:
a talk signal unit that outputs a talk signal according to a user instruction;
a voice recognition unit that recognizes a voice uttered by the user as a command;
an environmental information acquisition unit that acquires environmental information about the environment around the information terminal;
an estimation unit that estimates information intended by the user from the recognized command and the acquired environmental information; and
a control unit that controls processing of the information terminal,
wherein the control unit,
upon receiving the talk signal,
provides the user with the information estimated by the estimation unit using the environmental information acquired by the environmental information acquisition unit, and acquires a command recognized by the voice recognition unit based on the user's utterance.

The information terminal according to claim 1, wherein the control unit determines that a command cannot be acquired when a similarity between a user's utterance recognized by the voice recognition unit and a voice model held by the voice recognition unit is low.
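
One way to picture the low-similarity test in the claim above is a simple score threshold over recognition hypotheses. This is a minimal sketch under stated assumptions: the function `accept_command`, the score scale of 0 to 1, and the threshold value are all hypothetical and are not specified by the source.

```python
from typing import Dict, Optional

REJECTION_THRESHOLD = 0.6  # assumed value, not taken from the source


def accept_command(hypotheses: Dict[str, float],
                   threshold: float = REJECTION_THRESHOLD) -> Optional[str]:
    """Return the best-matching command, or None if no command is obtained.

    `hypotheses` maps each candidate command to its similarity with the
    recognizer's voice model (hypothetical scores in [0, 1]).
    """
    if not hypotheses:
        return None
    best, score = max(hypotheses.items(), key=lambda item: item[1])
    # A low similarity is treated as "command could not be acquired".
    return best if score >= threshold else None
```

A return value of `None` would then trigger the environment-based estimation described in claim 1.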

The information terminal according to claim 1, wherein, when the control unit acquires a command recognized by the voice recognition unit, the control unit selects information corresponding to the user's command using the environmental information acquired by the environmental information acquisition unit, and provides the selected information to the user.

The information terminal according to claim 2, wherein, after acquiring the command recognized by the voice recognition unit, the control unit selects information corresponding to the user's command based on the estimated content, and provides the selected information to the user.

The information terminal according to claim 1, wherein the control unit adds a question to the information estimated by the estimation unit and presents the information to the user.

The information terminal according to claim 1, wherein the control unit outputs a question asking the user for further information when the reliability of the information estimated by the estimation unit is lower than a predetermined threshold.
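
The reliability threshold in the claim above can be sketched as a check on the probability of the most likely estimate. The function `respond`, the prompt wording, the probability values, and the threshold of 0.5 are hypothetical assumptions made for illustration; the source only states that a predetermined threshold exists.

```python
from typing import Dict

RELIABILITY_THRESHOLD = 0.5  # assumed value, not taken from the source


def respond(estimates: Dict[str, float],
            threshold: float = RELIABILITY_THRESHOLD) -> str:
    """Choose the terminal's spoken response from estimated intentions.

    `estimates` maps each candidate piece of information to its estimated
    probability (hypothetical values).
    """
    best, reliability = max(estimates.items(), key=lambda item: item[1])
    if reliability < threshold:
        # Not reliable enough: ask the user for further information.
        return "What would you like to know?"
    # Reliable enough: present the estimate, phrased as a question.
    return f"Shall I show {best}?"
```

For example, `respond({"nearby restaurants": 0.8, "gas stations": 0.2})` presents the restaurant estimate, whereas a flat distribution such as `{"a": 0.4, "b": 0.35, "c": 0.25}` falls below the assumed threshold and produces the open question.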

The information terminal according to claim 1, wherein the information terminal is provided in a vehicle and further comprises:
a GPS receiver that receives radio waves from GPS satellites;
an autonomous navigation sensor unit that includes a gyro and acquires the direction and acceleration of the vehicle by means of the gyro;
a map information storage unit that stores map information; and
a position calculation unit that calculates the position of the vehicle,
wherein the position calculation unit calculates the position of the vehicle on the map based on the radio waves received by the GPS receiver and the information acquired by the autonomous navigation sensor unit, and
the environmental information acquisition unit acquires the position of the vehicle calculated by the control unit as environmental information.
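
As a rough illustration of combining a GPS fix with autonomous navigation data, the sketch below advances the last known position by dead reckoning and then blends it with a GPS fix when one is available. The functions `dead_reckon` and `fuse`, the flat local x/y coordinates in metres, the use of speed rather than raw acceleration, and the weighting factor are all assumptions; the source does not describe the calculation, and map matching is omitted.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]  # (east, north) in metres, hypothetical local frame


def dead_reckon(pos: Point, heading_deg: float, speed_mps: float, dt_s: float) -> Point:
    """Advance the position using heading and speed from the sensors.

    The heading is measured clockwise from north, so east uses sin and
    north uses cos.
    """
    heading = math.radians(heading_deg)
    distance = speed_mps * dt_s
    return (pos[0] + distance * math.sin(heading),
            pos[1] + distance * math.cos(heading))


def fuse(gps_fix: Optional[Point], predicted: Point, gps_weight: float = 0.8) -> Point:
    """Blend a GPS fix with the dead-reckoned position (assumed weighting)."""
    if gps_fix is None:
        # No satellite reception (e.g. in a tunnel): rely on dead reckoning.
        return predicted
    return (gps_weight * gps_fix[0] + (1.0 - gps_weight) * predicted[0],
            gps_weight * gps_fix[1] + (1.0 - gps_weight) * predicted[1])


# Vehicle heading due east at 10 m/s for one second, with no GPS reception.
predicted = dead_reckon((0.0, 0.0), heading_deg=90.0, speed_mps=10.0, dt_s=1.0)
position = fuse(None, predicted)
```

A production navigation unit would typically use a proper filter and snap the result to the road network; the fixed weight here is only meant to show where the two information sources meet.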