JP6281202B2

JP6281202B2 - Response control system and center

Info

Publication number: JP6281202B2
Application number: JP2013158282A
Authority: JP
Inventors: 星野　賢一; 賢一星野; 健浩阿部田
Original assignee: Denso Corp
Current assignee: Denso Corp
Priority date: 2013-07-30
Filing date: 2013-07-30
Publication date: 2018-02-21
Anticipated expiration: 2033-07-30
Also published as: JP2015028566A

Description

The present invention relates to a technique for automatically responding to voice data input by a user.

Conventionally, there are response systems that perform voice recognition processing on voice data input by a user and provide the user with various information according to the recognition result (for example, Patent Document 1). The response system disclosed in Patent Document 1 includes an in-vehicle device and an information center that is provided outside the vehicle and communicates wirelessly with the in-vehicle device.

In the response system disclosed in Patent Document 1, the in-vehicle device acquires the user's voice data and transmits it to the information center. The information center performs voice recognition processing on the voice data and returns response voice data corresponding to the recognition result to the in-vehicle device. The in-vehicle device that receives the response voice data then outputs audio according to it.

Meanwhile, for mobile phones as well, response systems that provide various information in response to a user's voice input, through wireless communication between the mobile phone and an information center managed by the mobile phone company, have become widespread. Response systems that automatically respond to voice input by a user are coming into use not only on mobile phones but also in various other situations and environments.

Patent Document 1: JP 2004-348658 A

In recent years, techniques for operating an in-vehicle device and a mobile phone in cooperation with each other have also been developed. When the in-vehicle device and the mobile phone operate in cooperation in this way, a plurality of response systems become available through the in-vehicle device.

In such a configuration, however, the user must first select the response system suited to the purpose from among the plurality of response systems, for example by operating the in-vehicle device, and only then provide voice input. Moreover, to use a different response system after one has been selected, the user must perform an operation to switch the response system in use.

The present invention has been made in view of these circumstances. Its object is to provide a response control system and a center that, when a plurality of response systems that respond to a user's voice input are available, spare the user the trouble of selecting which response system should respond.

To achieve this object, the response control system of the invention comprises: a voice acquisition unit (18A) that acquires voice input by a user as input voice data via a microphone (12); a first response system (3) that performs voice recognition processing on the input voice data acquired by the voice acquisition unit and, based on the result of that processing, generates first response voice data as a response to the input voice data; a second response system (4) that performs voice recognition processing on the input voice data acquired by the voice acquisition unit and, based on the result of that processing, generates second response voice data, which is voice data serving as a response to the input voice data; a correspondence list storage unit (35) that stores a correspondence list (35A) describing the content of input voice data to which the first response system should respond; an answering-side system determination unit (32A) that determines, based on the result of the voice recognition processing by the first response system, which of the first and second response systems should respond to the input voice data; and an answer output unit (18D) that causes a speaker (14) to output, as audio, the response voice data generated by the answering-side response system, that is, the response system that the answering-side system determination unit has determined should respond to the input voice data. The answering-side system determination unit determines that the first response system should respond to the input voice data when the result of the voice recognition processing is associated with the correspondence list, and determines that the second response system should respond when the result is not associated with the correspondence list.

In the above configuration, the answering-side system determination unit determines, based on the result of the voice recognition processing by the first response system, which of the first and second response systems should respond to the input voice data. The answer output unit then causes the speaker to output the response voice data from the response system that the answering-side system determination unit has determined should respond.

With such a configuration, the answering-side system determination unit automatically selects a response system according to the content of the input voice data. The user is therefore spared the trouble of selecting, each time a question is asked, the response system that should answer it, which improves convenience for the user.
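The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function names, the use of plain strings in place of voice data, and exact string matching against the correspondence list are all hypothetical choices made for the example.

```python
def choose_responder(recognized_text: str, correspondence_list: set[str]) -> str:
    """Return "first" when the recognition result is found in the
    correspondence list, otherwise "second".
    Exact string matching is an assumption of this sketch."""
    return "first" if recognized_text in correspondence_list else "second"


def select_output(recognized_text: str,
                  correspondence_list: set[str],
                  first_response: str,
                  second_response: str) -> str:
    """Pick the response handed to the speaker.
    Strings stand in for response voice data."""
    if choose_responder(recognized_text, correspondence_list) == "first":
        return first_response
    return second_response


# Hypothetical list contents and responses, for illustration only.
correspondence_list = {"where is the convenience store"}
print(select_output("where is the convenience store", correspondence_list,
                    "reply from first system", "reply from second system"))
print(select_output("what is today's schedule", correspondence_list,
                    "reply from first system", "reply from second system"))
```

Because both response systems receive the same input, the only decision left on the output side is which of the two prepared responses to play, so the user never has to pick a system by hand.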

The center of the invention serves as the first response system and includes the answering-side system determination unit.

FIG. 1 is a block diagram showing an example of a schematic configuration of the response control system 100 according to the present embodiment.
FIG. 2 is a block diagram showing an example of a schematic configuration of the first center 3.
FIG. 3 is a block diagram showing an example of a schematic configuration of the control unit 18.
FIG. 4 is a flowchart showing the flow of the response switching process performed by the control unit 18.
FIG. 5 is a flowchart showing the flow of the first center response process performed by the first center side control unit 32.
FIG. 6 is an example of a display screen of the display device 13.
FIG. 7 is a block diagram showing an example of a schematic configuration of the response control system 100A of Modification 4.

Hereinafter, an embodiment of the present invention will be described with reference to FIGS. 1 to 6. FIG. 1 is a diagram showing an example of a schematic configuration of the response control system 100 according to the present embodiment. As shown in FIG. 1, the response control system 100 includes a navigation device 1, a mobile phone 2, a first center 3, and a second center 4. Data is transmitted and received between the navigation device 1 and the first center 3, between the navigation device 1 and the mobile phone 2, and between the mobile phone 2 and the second center 4, in each case using a known wireless communication technology.

In the present embodiment, the first center 3 and the second center 4 each operate as a response system, and the navigation device 1 operates as the user interface for using each response system (the in-vehicle device referred to in the background art section and the claims). The first center 3 corresponds to the first response system recited in the claims, and the second center 4 corresponds to the second response system recited in the claims. Hereinafter, the vehicle in which the navigation device 1 is mounted is referred to as the host vehicle.

The first center 3 is, as an example, an information center of an automobile company. It handles questions related to the operation of the navigation device, traffic congestion information, and the operation of the host vehicle, as well as voice-input instructions for using functions of the navigation device 1 (hereinafter, instruction commands). In outline, the first center 3 analyzes the content of the user's question by performing voice recognition processing on the voice data transmitted from the navigation device 1. It then determines whether it should itself respond to the user's question, generates response voice data as a response to the question, and returns that data to the navigation device 1. For example, in response to the user's question “Where is the convenience store?”, the first center 3 tells the user the location of the convenience store nearest to the user's current location. The configuration of the first center 3 is described in more detail below with reference to FIG. 2.

As shown in FIG. 2, the first center 3 includes a first center side communication unit 31, a first center side control unit 32, a voice recognition unit 33, a voice recognition database 34 (hereinafter, database is abbreviated as DB), a first center side memory 35, and a voice synthesis unit 36. These components are connected so that they can communicate with one another via a bus 37 that conforms to, for example, a known communication standard.

The first center side communication unit 31 has functions for performing various kinds of signal processing, such as modulation and demodulation, for transmitting and receiving data, and communicates with the navigation device 1 via a network such as a mobile phone network or the Internet. It outputs data sent from the navigation device 1 to the first center side control unit 32, and transmits data input from the first center side control unit 32 to the navigation device 1. The data sent from the navigation device 1 to the first center 3 includes, in addition to voice data uttered by the user, current location information of the navigation device 1. The current location information and the like are assumed to be sent successively (for example, every 100 milliseconds).

The voice recognition DB 34 stores data required for voice recognition processing: for example, an acoustic model describing the acoustic features of small units of human speech (phonemes), a recognition dictionary that associates the acoustic features of phonemes with words, and a language model that expresses the connection relationships between words. The voice recognition DB 34 in the present embodiment is assumed to be a large-scale database covering, for example, one thousand to several tens of thousands of words. The voice recognition DB 34 also stores the voice data that the voice synthesis unit 36 uses to generate response voice data.

The voice recognition unit 33 performs voice recognition processing on the input voice data supplied by the first center side control unit 32, using the various data stored in the voice recognition DB 34. Because known techniques can be used for the voice recognition processing, its description is omitted here. The result of the voice recognition processing is output to the first center side control unit 32. The voice synthesis unit 36 generates response voice data by synthesizing the voice waveform data stored in the voice recognition DB 34 in accordance with instructions from the first center side control unit 32. The generated response voice data is output to the first center side control unit 32. The response voice data generated at the first center 3 corresponds to the first response voice data recited in the claims.

The first center side memory 35 is a writable mass storage device such as an HDD (Hard Disk Drive). The first center side memory 35 stores a correspondence list 35A and response data 35B. The correspondence list 35A registers the list of questions and instruction commands that the center itself should handle. The response data 35B is the data used to respond to questions and instruction commands from the user.

For example, the response data 35B registers the answer corresponding to each question registered in the correspondence list 35A, as well as control signal patterns for controlling the operation of the navigation device 1 in response to instruction commands. Based on the response data 35B, the first center side control unit 32 returns response voice data answering the user's question, or returns to the navigation device 1 a control signal for controlling the operation of the navigation device 1.
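One way to picture the relationship between the correspondence list 35A and the response data 35B is a lookup table keyed by the registered utterances. The sketch below is a hypothetical data layout, not taken from the patent: the class name, field names, the placeholder answer strings, and the control signal name `MAP_ZOOM_IN` are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResponseEntry:
    # Text to be turned into response voice data (placeholder content).
    answer_text: Optional[str] = None
    # Hypothetical name of a control-signal pattern for the navigation device.
    control_signal: Optional[str] = None


# Hypothetical contents; the keys play the role of entries in the
# correspondence list, the values the role of the response data.
RESPONSE_DATA = {
    "where is the convenience store": ResponseEntry(
        answer_text="<guidance to the nearest store>"),
    "zoom in on the map": ResponseEntry(
        answer_text="<confirmation message>", control_signal="MAP_ZOOM_IN"),
}


def look_up_response(recognized_text: str) -> Optional[ResponseEntry]:
    """Return the registered entry, or None if the utterance is not listed."""
    return RESPONSE_DATA.get(recognized_text)
```

A question maps to an answer only, while an instruction command can carry both an answer and a control signal, which matches the two kinds of reply the first center side control unit 32 is described as returning.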

In the present embodiment, the response data 35B used to generate response voice data is a group of commands that specify the voice waveform data needed to generate the response voice data, but it is not limited to this. It may instead be voice data registered in advance as the answers to the questions registered in the correspondence list 35A. In that case, that voice data can simply be returned to the navigation device 1 as the response voice data, so the first center 3 does not need to include the voice synthesis unit 36.

Although an HDD is used as the first center side memory 35 in the present embodiment, other known storage media such as a DVD or flash memory may be used. The first center side memory 35 corresponds to the correspondence list storage unit recited in the claims.

The first center side control unit 32 is configured as a computer and includes a well-known CPU, nonvolatile memory such as ROM and EEPROM, volatile memory such as RAM, I/O, and a bus line connecting these components (none of which are shown). The nonvolatile memory stores programs for executing various processes. The first center side control unit 32 controls the processing performed by the first center side communication unit 31, the voice recognition unit 33, and the voice synthesis unit 36.

For example, when the first center side control unit 32 acquires, from the first center side communication unit 31, voice data sent from the navigation device 1 (hereinafter, input voice data), it outputs that voice data to the voice recognition unit 33 and causes the voice recognition unit 33 to perform voice recognition processing.

Then, based on the result of the voice recognition processing acquired from the voice recognition unit 33, it refers to the response data 35B stored in the first center side memory 35 and causes the voice synthesis unit 36 to generate response voice data corresponding to the input voice data. On acquiring the response voice data from the voice synthesis unit 36, it causes the first center side communication unit 31 to transmit that response voice data to the navigation device 1.

As shown in FIG. 2, the first center side control unit 32 also includes, as functional blocks, an answering-side determination unit 32A and an update unit 32B. The update unit 32B performs step S219 of the flowchart shown in FIG. 5 and is described in detail later.

The answering-side determination unit 32A determines, from the result of the voice recognition processing acquired from the voice recognition unit 33 and the correspondence list 35A, whether the center itself should respond to the input voice data. For example, if the voice recognition processing shows that the user's question falls within the range of questions registered in the correspondence list 35A, it determines that the center itself should respond to the input voice data.

Conversely, if the user's question falls outside the range registered in the correspondence list 35A, it determines that the center itself should not respond to the input voice data. The result of the determination is transmitted to the navigation device 1 as a determination result signal, prior to transmission of the response voice data. The answering-side determination unit 32A corresponds to the answering-side system determination unit recited in the claims.

Even when the answering-side determination unit 32A determines that the center itself should not answer, the first center side control unit 32 may be configured to cooperate with the voice synthesis unit 36 and other units to generate response voice data such as “Sorry, I could not recognize that” and return it to the navigation device 1. The processing performed when the answering-side determination unit 32A determines that the center itself should not answer may be designed as appropriate, but some response voice data is assumed to be generated and returned to the navigation device 1.

Even when a control signal for controlling the operation of the navigation device 1 is transmitted, response voice data that lets the user recognize what operation will be performed is also transmitted. However, as described later, when the content of the input voice data is a request to output the answer from another center, it suffices to transmit a control signal that causes the answer from the other center to be output, without returning any response voice data.

Accordingly, in the present embodiment, response voice data is returned whenever input voice data is acquired, except when the content of the input voice data is a request to output the answer from another center. As an alternative configuration, of course, response voice data need not be returned when the content of the input voice data is an instruction command.
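The rules for what the first center sends back can be summarized as a small decision function. The following is a sketch under stated assumptions: the four utterance categories, the `Reply` tuple, and the `voice_for_commands` flag (which models the alternative configuration in which instruction commands get no voice reply) are hypothetical names introduced here, not terms from the patent.

```python
from enum import Enum, auto
from typing import NamedTuple


class UtteranceKind(Enum):
    LISTED_QUESTION = auto()       # question covered by the correspondence list
    UNLISTED_QUESTION = auto()     # question outside the correspondence list
    INSTRUCTION_COMMAND = auto()   # voice command operating the navigation device
    OTHER_CENTER_REQUEST = auto()  # request to output the other center's answer


class Reply(NamedTuple):
    send_voice: bool
    send_control_signal: bool


def plan_reply(kind: UtteranceKind, voice_for_commands: bool = True) -> Reply:
    """Decide which parts of a reply the first center returns."""
    if kind is UtteranceKind.OTHER_CENTER_REQUEST:
        # Only a control signal is needed; no response voice data.
        return Reply(send_voice=False, send_control_signal=True)
    if kind is UtteranceKind.INSTRUCTION_COMMAND:
        # Control signal, plus voice unless the alternative configuration is used.
        return Reply(send_voice=voice_for_commands, send_control_signal=True)
    # Listed and unlisted questions both receive some response voice data.
    return Reply(send_voice=True, send_control_signal=False)
```

For example, `plan_reply(UtteranceKind.UNLISTED_QUESTION)` still requests a voice reply, reflecting that the first center returns some response voice data even for questions it should not answer.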

The mobile phone 2 is a well-known mobile phone that communicates with the second center 4 via a network and also communicates with the second communication unit 17 of the navigation device 1. For example, the mobile phone 2 converts a signal received from the second communication unit 17 into a signal conforming to the standard for communication between the mobile phone 2 and the second center 4, and transmits it to the second center 4. Likewise, it converts a signal received from the second center 4 into a signal conforming to the standard for communication between the mobile phone 2 and the second communication unit 17, and transmits it to the second communication unit 17.

The second center 4 is, for example, an information center of a mobile phone company and provides various services to users of mobile phones 2 that use the mobile phone network managed by that company. The second center 4 has the same configuration as the first center 3 except that it does not include the answering-side determination unit 32A. That is, it analyzes the content of the user's question by performing voice recognition processing on the voice data transmitted from the navigation device 1, generates response voice data as a response to the question, and returns that data to the navigation device 1. However, because the second center 4 has no function corresponding to the answering-side determination unit 32A, it does not determine whether it should itself respond to the user's question.

Taking as an example the case where the service provided by the mobile phone company is a schedule management function, the second center 4, in response to the user's question “What is today's schedule?”, tells the user the schedule registered in advance for that day. The data from which the response voice data is generated (such as the day's schedule information) may be acquired by the second center 4 through various data communications between the mobile phone 2 and the second center 4. The response voice data generated by the second center 4 corresponds to the second response voice data recited in the claims.

The navigation device 1 is mounted in a vehicle. In addition to the same route guidance function as a general navigation device, it has, for example, a function for communicating with the first center 3 and, via the mobile phone 2, with the second center 4. As shown in FIG. 1, the navigation device 1 includes a talk switch (hereinafter, talk SW) 11, a microphone 12, a display device 13, a speaker 14, a memory 15, a first communication unit 16, a second communication unit 17, and a control unit 18.

The talk SW 11 is used by the user (driver) to indicate that voice input is about to start, and is provided at a position that the user can easily operate, such as the side of the steering column cover or near the shift lever. As an example, the talk SW 11 is a so-called click-type switch; when the user turns it on (that is, clicks it), it outputs an ON signal to the control unit 18.

When the ON signal is input from the talk SW 11, the control unit 18 performs processing for acquiring voice data and starts connecting to the first center 3 and the second center 4. By starting to speak within a fixed time (for example, within 1.5 seconds) after turning on the talk SW 11, the user can input the spoken voice to the navigation device 1.

Note that when the user begins speaking within the fixed time after pressing the talk SW 11, voice input ends, as described later, at the point when the power level of the voice signal input from the microphone 12 falls to or below a fixed threshold.
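The end-of-input rule can be illustrated with a simple power check over successive chunks of the microphone signal. This is a minimal sketch under assumptions the text does not state: splitting the signal into frames, using mean-square amplitude as the power level, and the sample values and threshold are all hypothetical.

```python
from typing import Optional, Sequence


def frame_power(frame: Sequence[float]) -> float:
    """Mean-square amplitude of one frame (an assumed definition of power level)."""
    if not frame:
        return 0.0
    return sum(sample * sample for sample in frame) / len(frame)


def find_utterance_end(frames: Sequence[Sequence[float]],
                       threshold: float) -> Optional[int]:
    """Return the index of the first frame whose power is at or below
    the threshold, or None if the signal never drops that low."""
    for index, frame in enumerate(frames):
        if frame_power(frame) <= threshold:
            return index
    return None


# Hypothetical frames: two loud frames followed by a near-silent one.
frames = [[0.5, -0.5], [0.4, 0.4], [0.01, -0.01]]
print(find_utterance_end(frames, threshold=0.001))
```

A practical detector would usually require the level to stay low for some duration before ending input, so that short pauses between words do not cut the utterance off; the text here only states the threshold condition.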

The microphone 12 is, for example, a small omnidirectional microphone. It collects ambient sound, including the user's speech and noise, converts it into an electrical voice signal, and outputs the signal to the control unit 18. The microphone 12 is provided at a position where it can easily pick up the user's voice, such as the upper surface of the steering column cover or the sun visor on the driver's seat side.

The display device 13 displays text and images based on input from the control unit 18 and presents various information to the user. It is arranged, for example, at the center of the instrument panel or in the combination meter provided in front of the driver's seat. The display device 13 is capable of, for example, full-color display and can be configured using a liquid crystal display, an organic EL display, a plasma display, or the like.

The speaker 14 converts the electrical voice signal input from the control unit 18 into sound (including simple tones) and outputs it. The memory 15 is a storage device that stores various data, mainly the response voice data acquired from the first center 3 and the second center 4 via the communication units 16 and 17, and the user's voice data. The memory 15 may be configured using a known storage medium and may be an HDD or a removable memory with a relatively small storage capacity (for example, an SD card).

The first communication unit 16 includes a transmitting and receiving antenna (not shown) and communicates with the first center 3 via a communication network. Various devices can be used as the first communication unit 16, for example an in-vehicle communication module such as a DCM (Data Communication Module) used for telematics communication. The first communication unit 16 demodulates signals received from the first center 3 and inputs them to the control unit 18, and modulates data input from the control unit 18 and transmits it to the first center 3.

The second communication unit 17 includes a transmitting and receiving antenna (not shown) and exchanges information with the mobile phone 2 through communication conforming to the Bluetooth (registered trademark) standard (hereinafter, BT communication). Although the present embodiment uses BT communication between the mobile phone 2 and the second communication unit 17, the configuration is not necessarily limited to this. For example, the communication may be wireless communication conforming to a short-range wireless communication standard such as ZigBee (registered trademark) or a wireless LAN standard such as IEEE 802.11, or wired communication such as USB. The second communication unit 17 need only have functions, such as modulation and demodulation, appropriate to the standard used for communication with the mobile phone 2.

The control unit 18 is configured as an ordinary computer and includes a well-known CPU, nonvolatile memory such as ROM and EEPROM, volatile memory such as RAM, I/O, and a bus line connecting these components (none shown). The nonvolatile memory stores programs for executing various processes. As shown in FIG. 3, the control unit 18 includes, as functional blocks for executing the various processes, a voice acquisition unit 18A, an answer temporary storage unit 18B, an answer-side center setting unit 18C, an answer output unit 18D, an answer feedback unit 18E, an agent display control unit 18F, and an another-answer request determination unit 18G.

Based on the ON signal from the talk SW 11, the voice acquisition unit 18A acquires voice data obtained by removing noise components from the voice signal input from the microphone 12. For example, when the ON signal is input, the voice acquisition unit 18A enters a standby state in which the voice signal input from the microphone 12 can be converted into voice data. If no voice is input for a certain time (for example, 1.5 seconds) or longer after entering the standby state, the unit automatically enters a conversion-disabled state. If it is determined that voice was input within the certain time, the voice signal input from the microphone 12 is converted into voice data until it is determined that the voice input has ended.

Whether voice is being input, and whether the voice input has ended, may be determined using known techniques, for example according to whether the power level of the voice signal is at or above a predetermined threshold. In such a configuration, voice is determined to have been input when a voice signal whose power level is at or above the threshold is input. Known techniques may be used for acquiring the voice data. The voice data acquired by the voice acquisition unit 18A is converted into digital packet signals using a known technique such as VoIP (Voice over Internet Protocol) and transmitted to the first center 3 via the first communication unit 16.
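As a purely illustrative sketch, not part of the disclosure, the standby timeout and power-threshold determination described above could be written as follows. The function names, the 100 ms frame length, the threshold value, and the 500 ms end-of-speech silence are all assumptions chosen for the example; only the 1.5 second standby time comes from the text.

```python
from typing import List, Optional, Sequence

Frame = Sequence[float]


def frame_power(frame: Frame) -> float:
    """Mean squared amplitude of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def capture_utterance(
    frames: Sequence[Frame],
    frame_ms: int = 100,             # assumed frame length
    threshold: float = 0.01,         # assumed power threshold
    standby_timeout_ms: int = 1500,  # "for example, 1.5 seconds" in the text
    end_silence_ms: int = 500,       # assumed silence that ends an utterance
) -> Optional[List[Frame]]:
    """Return the frames of one utterance, or None if standby times out."""
    captured: List[Frame] = []
    last_voiced = -1  # index in `captured` of the last voiced frame
    silent_ms = 0
    for frame in frames:
        voiced = frame_power(frame) >= threshold
        if last_voiced < 0:
            # Standby state: waiting for the first voiced frame.
            if voiced:
                captured.append(frame)
                last_voiced = 0
                silent_ms = 0
            else:
                silent_ms += frame_ms
                if silent_ms >= standby_timeout_ms:
                    return None  # conversion-disabled state
        else:
            # Voice input in progress: keep frames until silence persists.
            captured.append(frame)
            if voiced:
                last_voiced = len(captured) - 1
                silent_ms = 0
            else:
                silent_ms += frame_ms
                if silent_ms >= end_silence_ms:
                    break
    if last_voiced < 0:
        return None
    # Drop the trailing silent frames after the last voiced frame.
    return captured[: last_voiced + 1]
```

Integer millisecond counters are used instead of floating-point seconds so that the timeout comparison is exact.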

The answer temporary storage unit 18B temporarily stores the response voice data received from the first center 3 and the second center 4 in the memory 15. The answer temporary storage unit 18B retains this voice data from the time it is received at least until, for example, NO is determined in step S115 of FIG. 4. The answer temporary storage unit 18B corresponds to the temporary storage unit recited in the claims.

In accordance with the determination result signal sent from the first center 3, the answer-side center setting unit 18C decides which response voice data, that acquired from the first center 3 or that acquired from the second center 4, is output from the speaker 14 as the answer to the question input by the user. It then sets the center that answers the user's question as the answer-side center, and sets the other center as the semi-answer-side center.

More specifically, when the determination result signal indicates that the first center 3 should respond, the unit adopts the first center 3 as the answer-side center and instructs the answer output unit 18D to output the response voice data acquired from the first center 3. When the determination result signal indicates that the first center 3 should not respond, it adopts the second center 4 as the answer-side center and instructs the answer output unit 18D to output the response voice data acquired from the second center 4.
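The role assignment just described can be summarized in a short sketch. The `Center` and `Roles` types and the function name are hypothetical, and the determination result signal is reduced here to a single boolean for illustration; the actual signal format is not specified in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Center(Enum):
    FIRST = "first center 3"
    SECOND = "second center 4"


@dataclass(frozen=True)
class Roles:
    answer_side: Center       # its response is output first
    semi_answer_side: Center  # its response is held in reserve


def set_answer_side_center(first_center_should_respond: bool) -> Roles:
    """Map the determination result onto the two roles."""
    if first_center_should_respond:
        return Roles(Center.FIRST, Center.SECOND)
    return Roles(Center.SECOND, Center.FIRST)
```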

In this embodiment, the answer-side center setting unit 18C sets the first center 3 and the second center 4 as the answer-side center and the semi-answer-side center in this way; however, the answer-side center setting unit 18C is an optional element for carrying out the invention. The answer-side center corresponds to the answer-side response system recited in the claims, and the semi-answer-side center corresponds to the semi-answer-side response system recited in the claims.

Based on an instruction from the answer-side center setting unit 18C or on a user operation, the answer output unit 18D converts the response voice data acquired from either the first center 3 or the second center 4 into a voice signal and outputs it to the speaker 14, causing the speaker 14 to output it as voice. The answer feedback unit 18E, the agent display control unit 18F, and the another-answer request determination unit 18G are described later.

(Response switching process)
The flow of the response switching process performed by the control unit 18 of the navigation device 1 is described here with reference to the flowchart shown in FIG. 4. The flowchart of FIG. 4 starts, for example, when the ignition switch of the host vehicle is turned on and power is supplied to the navigation device 1.

First, in step S101, it is determined whether an ON signal has been input from the talk SW 11. If an ON signal has been input, step S101 is YES and the process proceeds to step S103. If no ON signal has been input, step S101 is NO and the process returns to step S101. That is, the process waits until the user presses the talk SW 11.

In step S103, the control unit 18 connects to the first center 3 to make the response system of the first center 3 available, and also connects to the second center 4 via the mobile phone 2 to make the response system of the second center 4 available. That is, each response system is placed in a state of waiting to receive voice data transmitted from the navigation device 1.

When the response systems become available, the agent display control unit 18F causes the display device 13 to display agents A and B, corresponding to the respective response systems, at the same time, as shown in FIG. 6(A). In FIG. 6, agent A is the image of the agent corresponding to the response system of the first center 3, and agent B is the image of the agent corresponding to the response system of the second center 4. An agent is a character such as a fictional person or an anthropomorphized animal.

As a technique for displaying an agent corresponding to a given response system, the technique disclosed in, for example, JP-A-2006-195578 may be applied. However, Patent Document 1 does not display a plurality of agents corresponding to a plurality of response systems. The agent display control unit 18F of this embodiment uses a known technique such as that of Patent Document 1 to generate an agent image for each response system, and composites or superimposes the images so that the plurality of agents are displayed on a single screen.

When the processing in step S103 ends, the process proceeds to step S105. Note that the voice acquisition unit 18A continues to acquire voice data from the microphone 12 while step S103 is being performed.

In step S105, it is determined whether the user's voice has been input from the microphone 12. If no voice is input from the user within the certain time after the talk SW 11 was pressed in step S101, step S105 is NO and the process returns to step S101. If voice is input from the user, step S105 is YES and the process proceeds to step S107. In step S107, the voice data acquired by the voice acquisition unit 18A is transmitted to the first center 3 and the second center 4 using VoIP or a similar technique, and the process proceeds to step S109.

In step S109, the determination result signal sent from the first center 3 is acquired via the first communication unit 16, and the process proceeds to step S111. When the determination result signal is received, the answer-side center setting unit 18C sets each of the first center 3 and the second center 4 as the answer-side center or the semi-answer-side center based on that signal.

In step S111, the response voice data transmitted from the first center 3 and the second center 4 is acquired. The response voice data from the semi-answer-side center is temporarily stored in the memory 15 by the answer temporary storage unit 18B. In this embodiment, the process proceeds from step S111 to step S113 at the time the response voice data from the answer-side center is acquired, but it may instead proceed to step S113 after the response voice data has been acquired from both centers 3 and 4.

In step S113, the answer output unit 18D, as instructed by the answer-side center setting unit 18C, causes the speaker 14 to output the response voice data from the answer-side center as voice. While that response voice data is output from the speaker 14, the agent corresponding to the answer-side center is displayed relatively large, as shown in FIG. 6(B) and FIG. 6(C), and is displayed so that it appears to be speaking. After step S113 is performed, the process proceeds to step S115.

If text data corresponding to the content of the response voice data can also be acquired from each center together with the response voice data, the text may be displayed beside the agent. The visual effect that makes the agent appear to be speaking may be designed as appropriate.

In step S115, the another-answer request determination unit 18G determines whether an operation requesting another answer has been received from the user. Such an operation may be deemed received when, for example, a touch panel (not shown) laminated on the display device 13 detects that the user has touched the area in which the agent of the semi-answer-side center is displayed.

Alternatively, the user may request another answer from the control unit 18 by speaking the name of the agent corresponding to the semi-answer-side center. In this case, however, the names of the agents corresponding to the response systems of both centers must be registered in the correspondence list 35A. On detecting that the name of the semi-answer-side center's agent has been called, the first center-side control unit 32 transmits to the navigation device 1 a control signal that causes the response voice data of the semi-answer-side center to be output.

If no operation requesting another answer has been received from the user even after a certain time (for example, 2 seconds) has elapsed since the transition to step S115, it is determined that the user did not perform an operation requesting another answer.

If an operation requesting another answer has been received from the user, step S115 is YES and the process proceeds to step S117. If no such operation was performed, step S115 is NO and the process proceeds to step S119. In step S117, the response voice data from the semi-answer-side center is read from the memory 15 and output as voice from the speaker 14. While that response voice data is output from the speaker 14, the agent corresponding to the semi-answer-side center is displayed relatively large, as in FIG. 6(B) and FIG. 6(C), and is displayed so that it appears to be speaking. After step S117 is performed, the process proceeds to step S119.

In step S119, the answer feedback process is performed, and the process returns to the standby state of step S101. In the answer feedback process, whether the content of the determination result signal, that is, the determination result of the answer-side determination unit 32A, was correct is transmitted to the first center 3. For example, if no operation requesting another answer was performed by the user in step S115, a signal indicating that the determination of the answer-side determination unit 32A was correct is transmitted to the first center 3. If an operation requesting another answer was received from the user in step S115, a signal indicating that the determination was incorrect is transmitted to the first center 3.
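Steps S111 to S119 for a single utterance can be condensed into the following minimal sketch. It is an illustration under stated assumptions, not the disclosed implementation: the function name, the string keys "first" and "second", and the `play` and `another_answer_requested` callbacks are hypothetical stand-ins for the speaker output and for the touch or voice operation with its timeout.

```python
from typing import Callable, Dict


def respond_to_utterance(
    answer_side: str,                              # "first" or "second"
    responses: Dict[str, bytes],                   # response voice data per center
    another_answer_requested: Callable[[], bool],  # stands in for step S115
    play: Callable[[str, bytes], None],            # stands in for speaker output
) -> bool:
    """Run S111 to S119 once; return True if the determination was judged correct."""
    semi_answer_side = "second" if answer_side == "first" else "first"

    # S111: keep the semi-answer-side response in temporary storage.
    stored = responses[semi_answer_side]

    # S113: output the answer-side response.
    play(answer_side, responses[answer_side])

    # S115 / S117: on request, output the stored response as well.
    if another_answer_requested():
        play(semi_answer_side, stored)
        return False  # S119: report that the determination was incorrect

    return True  # S119: report that the determination was correct
```

The returned boolean corresponds to the content of the feedback signal sent to the first center 3 in step S119.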

In this embodiment, once step S103 has been performed, the connections with the first center 3 and the second center 4 are maintained thereafter, and the display device 13 continues to display FIG. 6(A). To maintain a connection, a signal for checking that the connection is still alive may be exchanged at a constant period (for example, every 200 milliseconds). If an ON signal is input while a connection is down, step S103 is performed again. Alternatively, the connections with the first center 3 and the second center 4 may be disconnected each time step S119 is performed.
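One possible way to express the periodic connection check is sketched below. The text specifies only the example period of 200 milliseconds; the function names and the rule that three consecutive missed replies mean the connection is down are assumptions made for this example.

```python
def connection_alive(
    last_reply_ms: int,
    now_ms: int,
    period_ms: int = 200,     # "for example, every 200 milliseconds" in the text
    missed_allowed: int = 3,  # assumed tolerance before declaring the link down
) -> bool:
    """True while the last check reply is recent enough."""
    return now_ms - last_reply_ms <= period_ms * missed_allowed


def needs_reconnect(on_signal: bool, last_reply_ms: int, now_ms: int) -> bool:
    """Step S103 is repeated only if the talk SW is pressed while the link is down."""
    return on_signal and not connection_alive(last_reply_ms, now_ms)
```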

(First center response process)
Next, the first center response process, which the first center-side control unit 32 performs in response to the steps of the response switching process performed by the navigation device 1, is described with reference to the flowchart shown in FIG. 5. The flowchart of FIG. 5 starts when a connection request signal is received from the navigation device 1 (step S103 in FIG. 4).

First, in step S201, a response signal is returned in reply to the connection request signal from the navigation device 1, and a connection between the navigation device 1 and the first center 3 is established. Once the connection is established, the process proceeds to step S203. In step S203, it is determined whether input voice data has been acquired from the navigation device 1. If input voice data has been acquired, step S203 is YES and the process proceeds to step S205. If not, step S203 is NO and the process returns to step S203. That is, the first center-side control unit 32 waits until input voice data is acquired.

In step S205, the input voice data is output to the voice recognition unit 33, which performs known voice recognition processing. When the result of the voice recognition processing is acquired from the voice recognition unit 33, the process proceeds to step S207. In step S207, the answer-side determination unit 32A determines, from the voice recognition result acquired from the voice recognition unit 33 and the correspondence list 35A, whether its own center should respond to the input voice data. If it is determined that its own center should respond, step S209 is YES and the process proceeds to step S211. If it is determined that its own center should not respond, step S209 is NO and the process proceeds to step S213.

In step S211, a determination result signal designating its own center as the answer-side center is returned to the navigation device 1, and the process proceeds to step S215. In step S213, a determination result signal designating the other center as the answer-side center is returned to the navigation device 1, and the process proceeds to step S215.

In step S215, based on the voice recognition result, the response data 35B stored in the first center-side memory 35 is referenced, and the voice synthesis unit 36 is caused to generate response voice data appropriate to the content of the input voice data. When the response voice data is acquired from the voice synthesis unit 36, the process proceeds to step S217. In step S217, the response voice data is transmitted from the first center-side communication unit 31 to the navigation device 1, and the process proceeds to step S219.

In step S219, the update unit 32B performs the feedback process. In this feedback process, the update unit first acquires from the navigation device 1 the result indicating whether the determination of the answer-side determination unit 32A was correct (step S119 in FIG. 4). If the navigation device 1 reports that the determination was correct, subsequent determinations may be made in the same way as this one. If it reports that the determination was incorrect, the correspondence list 35A is updated so that subsequent determinations differ from this one.
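The determination of step S207 and the update of step S219 might be sketched as follows. This section does not specify the structure of the correspondence list 35A, so the sketch assumes a hypothetical mapping from a recognized question category to the center that should answer; the category names, the default, and the rule of flipping an entry after a single negative feedback are illustrative assumptions only.

```python
from typing import Dict

OWN = "own center"
OTHER = "other center"


def decide_answer_side(
    correspondence: Dict[str, str], category: str, default: str = OWN
) -> str:
    """S207: look up which center should answer this category of question."""
    return correspondence.get(category, default)


def apply_feedback(
    correspondence: Dict[str, str], category: str, decided: str, was_correct: bool
) -> None:
    """S219: keep the entry if the determination was correct, otherwise flip it."""
    if was_correct:
        return
    correspondence[category] = OTHER if decided == OWN else OWN


# Hypothetical correspondence list: category -> center that should answer.
correspondence = {"route guidance": OWN, "weather": OTHER}

decided = decide_answer_side(correspondence, "route guidance")
# The user asked for another answer, so the determination is reported as incorrect.
apply_feedback(correspondence, "route guidance", decided, was_correct=False)
```

A production system collecting feedback from many users would more plausibly aggregate it before changing an entry; the immediate flip is used here only to keep the example short.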

The response process that the second center 4 performs in response to the response switching process by the navigation device 1 is the same as the first center response process except that steps S207 to S213 of FIG. 5 are not performed, so a detailed description is omitted here.

A series of operations of the response control system 100 described above in response to user operations is now described. First, when the user presses the talk SW 11, the display device 13 displays agents A and B, corresponding to the response systems of the first center 3 and the second center 4, as shown in FIG. 6(A) (step S103). Then, when a voice signal corresponding to an utterance such as "Where is a convenience store?" is input from the user, the navigation device 1 converts this voice signal into voice data and transmits it to the first center 3 and the second center 4 (step S107).

On acquiring this voice data (that is, the input voice data), the first center 3 performs the voice recognition processing (step S205) and the answer determination process (step S207). In this example, the input voice data asks for the location of a convenience store, so as a result of the voice recognition processing the question is determined to be a route guidance question, and it is determined that the first center itself should answer (YES in step S209). The first center then outputs a determination result signal designating itself (that is, the first center 3) as the answer-side center (step S211).

On receiving this determination result signal, the answer-side center setting unit 18C of the navigation device 1 sets the first center 3 as the answer-side center and the second center 4 as the semi-answer-side center, and waits for response voice data to be sent from each of the centers 3 and 4. When response voice data is then acquired from the centers 3 and 4, the response voice data from the first center 3 is output as voice from the speaker 14, and the answer temporary storage unit 18B temporarily stores the response voice data from the second center 4 in the memory 15 (step S113).

While the response voice data from the first center 3 is being output from the speaker 14, agent A, which corresponds to the first center 3, is displayed relatively large as shown in FIG. 6(B) and is displayed so that it appears to be speaking.

After that, if an operation requesting another answer is received from the user within the certain time (YES in step S115), the response voice data from the second center 4 is output from the speaker 14 (step S117). The navigation device 1 then performs the answer feedback process and ends the series of response switching steps for the user's utterance "Where is a convenience store?". Although the example above has the first center 3 as the answer-side center, the same processing applies when the second center 4 is the answer-side center.

(Summary of this embodiment)
In the above configuration, the answer-side determination unit 32A determines, based on the result of the voice recognition processing by the first center 3, which of the first and second centers should respond to the input voice data, and transmits the determination result to the navigation device 1. In the navigation device 1, the answer-side center setting unit 18C sets the answer-side center and the semi-answer-side center based on the determination result, and the answer output unit 18D causes the speaker 14 to output the response voice data from the answer-side center as voice.

With this configuration, the answer-side determination unit 32A automatically selects, according to the content of the input voice data, the center (that is, the response system) that should respond to the user's voice input. The user is therefore spared the trouble of selecting which response system should respond, which improves user convenience.

In addition, because this embodiment provides the answer-side determination unit 32A in a center outside the vehicle (here, the first center 3), answer feedback can be received from a larger number of users. As a result, the answer-side determination unit 32A becomes able to perform the answer determination process more accurately.

Furthermore, the answer temporary storage unit 18B temporarily stores the response voice data from the semi-answer-side center in the memory 15, and when a predetermined operation input from the user (such as touching an agent image) is received in step S115, the response voice from the semi-answer-side center is output. Thus, even if the determination result of the answer-side determination unit 32A was inappropriate, the user can promptly hear the answer from the other center without inputting the same utterance again.

The embodiment also performs the answer feedback process and the feedback process. The user's reaction to the determination result of the answer-side determination unit 32A (whether the user was satisfied with the response from the answer-side center) can thereby be reflected in subsequent determinations, further improving the accuracy of the answer determination process.

Further, the agent display control unit 18F displays agents A and B, corresponding to the respective centers, on the display device 13, and displays as if speaking the agent corresponding to the center that generated the response voice data being output from the speaker 14 by the answer output unit 18D. The user can thus recognize at a glance which center is responding to his or her voice input.

Although this embodiment uses the navigation device 1 as the interface for using a plurality of response systems, the interface is not limited to this. An in-vehicle device without a navigation function may also be used as the interface for using a plurality of response systems.

An embodiment of the present invention has been described above, but the present invention is not limited to that embodiment. The various modifications described below are also included in the technical scope of the present invention, and various changes other than those below can also be made without departing from the gist of the invention.

(Modification 1)
In the embodiment described above, the answer-side determination unit 32A is provided in the first center 3, but it may of course be provided in the second center 4 instead. Also, although the first center 3 is assumed to be an information center of an automobile company and the second center 4 an information center of a mobile phone company, other information centers may be used.

(Modification 2)
In the embodiment described above, communication with the second center 4 is performed via the mobile phone 2, but this is not a limitation. The navigation device 1 and the second center 4 may communicate without going through the mobile phone 2. Further, another mobile terminal having a wireless communication function (for example, a known tablet terminal) may be used in place of the mobile phone 2. The memory 15 may be an optional element.

(Modification 3)
In the embodiment described above, the answer-side determination unit 32A, which determines whether the first center 3 or the second center 4 should answer, is provided in the first center 3, but the configuration is not limited to this. For example, the navigation device 1 may be provided with functions corresponding to the correspondence list 35A and the answer-side determination unit 32A.

The voice recognition result used for the answer determination process may be obtained from the first center 3 or the second center 4, or the navigation device 1 may itself be provided with functions corresponding to the voice recognition unit 33 and the voice recognition DB 34. With the configuration of this modification, the answering center can be determined in the navigation device 1, so the configuration of the response control system can be simplified.

(Modification 4)
The description so far has shown a configuration in which all of the response systems that respond to the user's input voice data are provided in centers outside the vehicle, but the configuration is not limited to this. For example, as shown in FIG. 7, the navigation device 1 may be provided with a response system 19 corresponding to the first center 3. In this case, communication is performed only with the second center 4, so communication costs can be reduced compared with the embodiment described above.

(Modification 5)
In general, voice recognition processing demands relatively high processing capacity from the CPU. If the navigation device 1 is given functions such as voice recognition processing as in Modification 4, it will need a high-performance CPU, and there is a concern that the navigation device 1 will become expensive. Therefore, a known DSR (Distributed Speech Recognition) technique may be used to share the voice recognition processing between the navigation device 1 and the external centers 3 and 4.

For example, the navigation device 1 implements only the function of performing acoustic analysis on the user's voice data, and transmits the feature values obtained by the acoustic analysis to the external centers 3 and 4. The external centers 3 and 4 execute voice recognition processing based on the feature values sent from the navigation device 1 and return response voice data to the navigation device 1.
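The division of work described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-frame log energy used as the "feature value", the frame sizes, and the function names `frame_log_energies` and `build_feature_payload` are all hypothetical and do not come from the specification, which does not say which features the acoustic analysis produces.

```python
import math

def frame_log_energies(samples, frame_len=4, hop=2):
    """Split samples into overlapping frames and return one log energy per frame.

    Log energy is a stand-in feature chosen for brevity (an assumption).
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        features.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return features

def build_feature_payload(samples):
    """Hypothetical message the navigation device would send instead of raw audio."""
    return {"type": "features", "values": frame_log_energies(samples)}
```

In this sketch the navigation device performs only the light analysis step, and the centers would run recognition on the `values` they receive.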

(Modification 6)
In the embodiment described above, the answer feedback process in step S119 and the feedback process in step S219 are performed as the learning method for improving the accuracy of the determination by the answer-side determination unit 32A, but the learning method is not limited to this.

For example, the center-side control unit 32 may determine whether the user was satisfied with an answer from whether the user repeatedly inputs a question with the same or highly similar content. That is, if the voice recognition result indicates that a question presumed to have the same or highly similar content is asked again within a fixed time (for example, one minute), the center-side control unit 32 determines that the user was not satisfied with the answer output from the speaker 14 the previous (first) time. In that case, the answer-side determination unit 32A sets the center that was the semi-answering center the first time as the answering center, and modifies the correspondence list 35A so that this center is set as the answering center thereafter.

Whether the content of input voice data is the same as, or highly similar to, previously input voice data may be evaluated by the degree of agreement between the words (including their synonyms) that appear in each piece of input voice data according to the voice recognition results.
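The repeated-question check can be sketched as below. The one-minute window is the example given in the text; the Jaccard overlap measure, the 0.5 threshold, the synonym table, and the function names are assumptions made for illustration only.

```python
def word_overlap(words_a, words_b, synonyms=None):
    """Jaccard overlap of two word lists after mapping synonyms to one form."""
    synonyms = synonyms or {}
    a = {synonyms.get(w, w) for w in words_a}
    b = {synonyms.get(w, w) for w in words_b}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def is_repeated_question(prev, curr, window_s=60.0, min_overlap=0.5, synonyms=None):
    """prev and curr are (timestamp_seconds, recognized_words) tuples.

    Returns True when curr arrives within window_s of prev and the word
    overlap reaches min_overlap (both thresholds are assumed values).
    """
    prev_t, prev_words = prev
    curr_t, curr_words = curr
    within_window = 0 <= curr_t - prev_t <= window_s
    return within_window and word_overlap(prev_words, curr_words, synonyms) >= min_overlap
```

A `True` result would correspond to the case where the center-side control unit 32 judges that the user was not satisfied with the first answer.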

However, the fact that the user was not satisfied with the answer given by the first response voice data does not necessarily mean that the answering center was inappropriate. The user may be seeking a different answer from the same answering center.

Therefore, the answer-side determination unit 32A calculates a confidence for its determination result based on the degree of agreement between the voice recognition result and the correspondence list 35A. When that confidence is at or above a predetermined threshold, the correspondence list 35A may be left unchanged and the response data 35B modified instead, so that more appropriate response voice data can be generated.

The degree of agreement between the voice recognition result and the correspondence list 35A can be rephrased as an evaluation of how close in meaning the user's input voice is to the questions registered in the correspondence list.

For example, suppose that the first question is "Where is a nearby gas station?", the first center 3 presents the location of a nearby Company A gas station, and then the question "What about a Company B gas station?" is input.

Assume that "gas station location" is registered in the correspondence list as question content to which the first center 3 should respond. Because both the first and the second input voice ask about the location of a gas station, the confidence of the determination that the first center 3 should respond is at or above the predetermined threshold.

In such a case, the determination of the answering center is kept as it is, and the response data is modified so that Company B gas stations are presented preferentially from the next time onward. For this purpose, each center preferably holds response data 35B that reflects the preferences of each user.
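The choice between correcting the correspondence list and correcting the response data can be sketched as follows. The threshold value 0.8, the dictionary layout of `state`, and the function name are hypothetical; the text only states that a predetermined threshold exists.

```python
def handle_repeated_question(confidence, state, topic, followup_preference, threshold=0.8):
    """Decide what to correct after the user repeats a question.

    state["routing"] maps a topic to the center that answers it (stand-in
    for the correspondence list 35A); state["preferences"] maps a topic to
    the user's preferred item (stand-in for the response data 35B).
    """
    if confidence >= threshold:
        # Routing was probably right: keep it and record the preference.
        state["preferences"][topic] = followup_preference
        return "response_data_updated"
    # Routing was probably wrong: hand the topic to the other center.
    current = state["routing"].get(topic, "first")
    state["routing"][topic] = "second" if current == "first" else "first"
    return "correspondence_list_updated"
```

In the gas station example, a high confidence would leave the first center 3 as the answering center and store "Company B" as the preference for that topic.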

(Modification 7)
The first center 3, which includes the answer-side determination unit 32A, may take in the response voice data of the second center 4. By analyzing the response voice data of the second center 4 and reflecting the result in its own answer determination process and response data 35B, the first center 3 can improve the accuracy of its answers to the user's questions.

In particular, when the first center 3 and the second center 4 provide similar services (such as searching for nearby restaurants), taking in the response voice data of the second center 4 can be expected to help reflect the user's preferences and the like.

In this case, the navigation device 1 temporarily stores the response voice data acquired from the second center 4 in the answer temporary storage unit 18B and also transmits it to the first center 3 via the first communication unit 16. The process of taking in the response voice data of the second center is preferably performed independently of the first center response process.

(Modification 8)
When the voice recognition result shows that the content of a question is relatively ambiguous, the intent of the question may be inferred from the situation of the vehicle. For example, when input voice data such as "I'm hungry" is acquired while the driving behavior shows signs such as crawling along slowly, frequent braking, or abrupt turn signal operation, or while the current vehicle position is outside the user's usual range of activity, the system presumes that the user is looking for a restaurant or the like and determines that route guidance is needed.

The system then returns response voice data such as "There is a restaurant ahead. Do you want to set it as your destination?" The vehicle information used to estimate the vehicle situation (vehicle speed, brake, turn signal, vehicle position, and so on) is transmitted to the automobile company's center through the navigation device 1. The determination of whether the user is looking for something, is lost, and so on is made at the automobile company's center. Alternatively, the navigation device 1 may determine the vehicle situation from the vehicle information and transmit the determination result to the automobile company's center.

With this configuration, even when the content of the user's voice input is ambiguous, a more appropriate response can be given by estimating the user's situation from the vehicle information.
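A minimal sketch of this kind of inference is shown below. The `VehicleState` fields, the numeric thresholds, and the way the conditions are combined are assumptions for illustration; the text lists the cues (slow driving, frequent braking, abrupt turn signal use, unfamiliar location) without specifying how they are weighed.

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    speed_kmh: float
    brake_presses_per_min: int
    outside_usual_area: bool

def seems_to_be_searching(v, slow_kmh=20.0, frequent_brakes=6):
    """Heuristic (assumed): slow driving with frequent braking, or an unfamiliar area."""
    slow_and_braking = v.speed_kmh <= slow_kmh and v.brake_presses_per_min >= frequent_brakes
    return slow_and_braking or v.outside_usual_area

def respond_to_ambiguous(utterance, v):
    """Return a guidance prompt for an ambiguous utterance, or None if no inference is made."""
    if "hungry" in utterance.lower() and seems_to_be_searching(v):
        return "There is a restaurant ahead. Do you want to set it as your destination?"
    return None
```

When `respond_to_ambiguous` returns `None`, the utterance would simply be handled by the normal response process.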

(Modification 9)
In the embodiment described above, the correspondence list 35A registers the list of questions and commands that the first center 3 should handle, and does not register the lists of questions and commands that other centers should handle, but the configuration is not limited to this.

That is, in addition to the list of questions and commands that the first center 3 should handle, the correspondence list 35A may include the list of questions and commands that the second center 4 should handle. Then, when the content of the user's question is registered in the correspondence list 35A as a question the second center 4 should handle, the answer-side determination unit 32A can determine not merely that its own center 3 should not respond, but also that the second center 4 is the one that should respond to that question.

Accordingly, the answer-side determination unit 32A may calculate, for each center, the likelihood that the center should respond, based on the degree of agreement between the voice recognition result and the contents of the correspondence list 35A for that center, and set the center with the higher likelihood as the answering center. The higher the degree of agreement between the voice recognition result and the contents of the correspondence list 35A for a center, the higher that center's likelihood.
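One way to picture the per-center likelihood is the sketch below. The agreement measure (the best fraction of a registered phrase's words that appear in the recognized words) and the names `match_degree` and `pick_answering_center` are assumptions; the text only requires that a higher degree of agreement gives a higher likelihood.

```python
def match_degree(recognized_words, registered_phrases):
    """Best fraction of any registered phrase's words found in the recognized words."""
    recognized = set(recognized_words)
    best = 0.0
    for phrase in registered_phrases:
        words = set(phrase.split())
        if words:
            best = max(best, len(words & recognized) / len(words))
    return best

def pick_answering_center(recognized_words, lists_by_center):
    """Return (center with the highest likelihood, likelihoods for all centers)."""
    likelihoods = {
        center: match_degree(recognized_words, phrases)
        for center, phrases in lists_by_center.items()
    }
    return max(likelihoods, key=likelihoods.get), likelihoods
```

Here `lists_by_center` plays the role of a correspondence list 35A that holds one list per center.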

(Modification 10)
In Modification 9, the correspondence list 35A registers the list of questions and commands that each center should handle, but the configuration is not limited to this. For example, the correspondence list 35A may define, for each center, words that are strongly or weakly related to the services that center provides, and the answer-side determination unit 32A may determine the answering center by evaluating how strongly the words in the voice recognition result are related to each center.

For example, when the result contains words such as "route", "destination", "road", or "traffic jam", the first center 3 is set as the answering center, and when it contains words such as "phone", "Mr./Ms. (person's name)", or "schedule", the second center 4 is set as the answering center. These per-word weights may also be applied when evaluating the degree of agreement between the questions a center should handle and the input voice data.
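The word-based determination can be sketched as below. The weight values, the substring matching, and the fallback to the second center when no keyword matches are assumptions for illustration; only the example words themselves come from the text.

```python
# Hypothetical per-center keyword weights (higher means more strongly related).
CENTER_KEYWORDS = {
    "first": {"route": 1.0, "destination": 1.0, "road": 1.0, "traffic jam": 1.0},
    "second": {"phone": 1.0, "schedule": 1.0},
}

def route_by_keywords(text, keywords=CENTER_KEYWORDS, default="second"):
    """Sum the weights of the keywords found in text and pick the best center."""
    lowered = text.lower()
    scores = {
        center: sum(weight for word, weight in table.items() if word in lowered)
        for center, table in keywords.items()
    }
    best = max(scores, key=scores.get)
    # With no matching keyword, fall back to the default center (assumed behavior).
    return best if scores[best] > 0 else default
```

Weakly related words could be given small weights in the same table.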

(Modification 11)
The examples above describe a configuration in which two response systems (that is, centers) are available to the user, but the configuration is not limited to this. Three or more centers may be available to the user. That is, there may be a plurality of centers corresponding to the second response system for the one center corresponding to the first response system recited in the claims.

In this case, the correspondence list 35A registers, for each center, the content of the input voice to which that center should respond. The answer-side determination unit 32A then determines, based on the voice recognition result and the correspondence list, which of the plurality of response systems available to the user should respond to the input voice data.

For example, the answer-side determination unit 32A calculates a likelihood for each center based on the degree of agreement between the voice recognition result and the contents of the correspondence list 35A for that center. The response voice data is then output by voice with priority given to the center with the highest likelihood.
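With three or more centers, the priority ordering can be pictured as a simple sort. The function name `output_order` and the example likelihood values are hypothetical, and how the likelihoods are computed is left open here.

```python
def output_order(likelihoods):
    """Return center names sorted from highest to lowest likelihood."""
    return sorted(likelihoods, key=lambda center: likelihoods[center], reverse=True)

# Example with assumed likelihood values for three centers.
order = output_order({"first": 0.2, "second": 0.7, "third": 0.5})
```

The first entry of `order` would be treated as the answering center, and the remaining centers would act as semi-answering centers in that order.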

DESCRIPTION OF SYMBOLS
100: response control system, 1: navigation device (on-vehicle device), 12: microphone, 14: speaker, 18: control unit, 18A: voice acquisition unit, 18B: answer temporary storage unit, 18C: answering center setting unit, 18D: answer output unit, 18E: answer feedback unit, 18F: agent display control unit, 18G: separate answer request determination unit, 2: mobile phone, 3: first center (first response system), 32: first center side control unit, 32A: answer-side determination unit, 32B: update unit, 33: voice recognition unit, 34: voice recognition database, 35: first center side memory, 35A: correspondence list, 35B: response data, 4: second center (second response system)

Claims

A response control system comprising:
a voice acquisition unit (18A) that acquires, via a microphone (12), voice input by a user as input voice data;
a first response system (3) that performs voice recognition processing on the input voice data acquired by the voice acquisition unit and generates, based on the result of the voice recognition processing, first response voice data, which is voice data serving as a response to the input voice data;
a second response system (4) that performs voice recognition processing on the input voice data acquired by the voice acquisition unit and generates, based on the result of the voice recognition processing, second response voice data, which is voice data serving as a response to the input voice data;
a correspondence list storage unit (35) that stores a correspondence list (35A) describing the contents of the input voice data to which the first response system should respond;
an answer-side system determination unit (32A) that determines, based on the result of the voice recognition processing by the first response system, which of the first and second response systems should respond to the input voice data; and
an answer output unit (18D) that causes a speaker (14) to output by voice the response voice data generated by the answer-side response system, which is the response system determined by the answer-side system determination unit to be the one that should respond to the input voice data,
wherein the answer-side system determination unit
determines that the first response system should respond to the input voice data when the result of the voice recognition processing corresponds to the correspondence list, and
determines that the second response system should respond to the input voice data when the result of the voice recognition processing does not correspond to the correspondence list.

The response control system according to claim 1, further comprising
a separate answer request determination unit (18G) that determines whether a user operation has been accepted, the user operation requesting that the speaker output by voice the response voice data generated by the semi-answer-side response system, which is the response system that the answer-side system determination unit did not determine to be the one that should respond to the input voice data,
wherein, when the separate answer request determination unit determines that the user operation has been accepted, the answer output unit causes the speaker to output by voice the response voice data of the semi-answer-side response system.

The response control system according to claim 2, further comprising
a temporary storage unit (18B) that temporarily stores the response voice data acquired from the semi-answer-side response system until the separate answer request determination unit determines that the user operation is not accepted,
wherein, when the separate answer request determination unit determines that the user operation has been accepted, the response voice data of the semi-answer-side response system stored in the temporary storage unit is output by voice from the speaker.

The response control system according to claim 3, further comprising
an update unit (32B) that, when the separate answer request determination unit determines that the user operation has been accepted, updates the contents of the correspondence list so that, for the result of the voice recognition processing on the input voice data, the response system that was the semi-answer-side response system this time becomes the answer-side response system from the next time onward.

The response control system according to any one of claims 2 to 4,
wherein, when input voice data determined from the result of the voice recognition processing by the first response system to have the same or highly similar content is input again within a predetermined time,
the response system that was the semi-answer-side response system the previous time is set as the answer-side response system, and
the contents of the correspondence list are updated so that the response system that was the semi-answer-side response system the previous time becomes the answer-side response system from the next time onward.

The response control system according to any one of claims 1 to 5, further comprising
a display device (13), and
an agent display control unit (18F) that simultaneously displays, on the display device, agent images corresponding respectively to the first and second response systems,
wherein the agent display control unit displays the agent image corresponding to the response system that generated the response voice data being output from the speaker by the answer output unit so that the agent image appears to be speaking.

The response control system according to any one of claims 1 to 6,
wherein the response control system includes an on-vehicle device (1) mounted on a vehicle, a first center (3) provided outside the vehicle to perform wireless communication with the on-vehicle device, and a second center (4) provided outside the vehicle to perform wireless communication with the on-vehicle device,
the on-vehicle device includes the voice acquisition unit and the answer output unit,
the first center serves as the first response system and includes the answer-side system determination unit, and
the second center serves as the second response system.

A center having the functions of the first center according to claim 7.