JP2006202205A

JP2006202205A - Speech dialog system

Info

Publication number: JP2006202205A
Application number: JP2005015619A
Authority: JP
Inventors: Takashi Kondo; 剛史金銅; Noboru Katsuta; 昇勝田; Takashi Akita; 貴志秋田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2005-01-24
Filing date: 2005-01-24
Publication date: 2006-08-03

Abstract

PROBLEM TO BE SOLVED: To provide a speech dialog system in which the user is hardly bothered by registration operations.

SOLUTION: This speech dialog system 1 includes a speech dialog device 11 and a dialog controller 13 that controls a dialog between an originating-side user and a terminating-side user. The originating-side user operates a dialog start part of the speech dialog device 11 to start a dialog. After the dialog start part is operated, the originating-side user inputs at least speech representing search contents through a microphone. An acquisition part acquires the additional information required to search for the terminating-side user, and a transmitting part then sends the speech data and the additional information to a network 14. In the dialog controller 13, after the speech data and additional information are received through the network 14, a retrieval part searches a database using them as a retrieval key to specify the terminating-side user. A grouping part groups the terminating-side users specified by the retrieval part and holds the group.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a voice interaction system, and more particularly to a voice interaction system comprising a voice interaction device and a dialogue control device that controls a dialogue between a first user who uses the voice interaction device and a second user who uses another device.

In recent years, a push-to-talk (hereinafter, PTT) function, which lets a user communicate with one or more parties simply by pressing a talk button, has begun to be installed in voice interaction devices such as mobile phones. With the PTT function, speech uttered by one user reaches, via a digital network, the voice interaction devices carried by all persons grouped in advance. In PTT, speech started in this way is exchanged by half-duplex communication.
JP 2003-271193 A

However, in such a voice interaction device, dialogue partners must be grouped and registered in advance, so the user must perform a troublesome registration operation even for a partner with whom the user will talk only once.

Moreover, because even a partner with whom the user talks only once must be registered in the voice interaction device, the capacity of the device's storage may be wasted.

Therefore, an object of the present invention is to provide a voice interaction system in which the user is unlikely to find registration operations troublesome.

To achieve the above object, a first aspect of the present invention is directed to a voice interaction system. The voice interaction system comprises a voice interaction device and a dialogue control device that controls a dialogue between a first user who uses the voice interaction device and a second user who uses another device. The voice interaction device comprises: a dialogue start unit that the first user operates to start a dialogue; a microphone to which, after the dialogue start unit is operated, at least speech representing search content is input by the first user, and which outputs voice data representing the input search content; an acquisition unit that acquires, together with the voice data output from the microphone, additional information needed to search for the second user; and a user-side transmission unit that sends the voice data and additional information acquired by the acquisition unit to a network. The voice interaction control device comprises: a reception unit that receives the voice data and additional information transmitted from the voice interaction device via the network; a storage unit that stores a database configured so that the second user can be searched for using the received voice data and additional information as search keys; a search unit that searches the database stored in the storage unit using the received voice data and additional information to identify the second user with whom the first user can currently have a dialogue; and a grouping unit that puts the second users identified by the search unit into one group and holds the second users registered in the group at least until the dialogue between the first and second users ends.

When the voice interaction device implements a push-to-talk function, the dialogue start unit is typically a button operated by the first user.

When the voice interaction device can be mounted in a vehicle, the additional information includes at least one item selected from the group consisting of the current position, heading, moving speed, destination, and route to the destination of the voice interaction device. The search unit refers to the additional information to search for second users located near the current position of the voice interaction device, near the destination, and/or near the route to the destination.
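The position-based search can be illustrated with a short sketch. It assumes that positions are latitude/longitude pairs in degrees and that "near" means within a fixed great-circle radius; the patent specifies neither, and the names `haversine_km`, `search_nearby`, and the 2 km default are hypothetical.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def search_nearby(devices, current=None, destination=None, route=(), radius_km=2.0):
    """Return ids of devices near the current position, destination, or route.

    `devices` is a list of dicts with hypothetical keys "id" and "position".
    The radius is an assumption, not a value taken from the patent.
    """
    # Collect every reference point supplied in the additional information.
    refs = [p for p in (current, destination) if p is not None] + list(route)
    return [d["id"] for d in devices
            if any(haversine_km(d["position"], p) <= radius_km for p in refs)]

devices = [
    {"id": "12a", "position": (34.70, 135.50)},
    {"id": "12b", "position": (35.68, 139.77)},
]
nearby_now = search_nearby(devices, current=(34.70, 135.50))
nearby_any = search_nearby(devices, current=(34.70, 135.50),
                           destination=(35.68, 139.77))
```

A production system would more likely use a spatial index than a linear scan, but the filtering condition would be the same.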

The database preferably includes status information indicating the current state of the other devices. In this case, the search unit searches for the second user by referring to the status information in the database.

The status information typically indicates whether each of the other devices is in use. In this case, the search unit refers to the status information in the database and searches for second users whose devices are not currently in use.

The voice interaction control device preferably further comprises: a voice accumulation unit that buffers the voice data received by the reception unit; and a control-device-side transmission unit that, after a connection is established between the voice interaction device and the devices used by the second users grouped by the grouping unit, sends the voice data buffered in the voice accumulation unit to the network for delivery to the devices of the grouped second users.
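The buffer-then-forward behaviour can be sketched as a simple FIFO queue. This is only an illustration under the assumption that voice data arrives as discrete byte frames; the class name `VoiceStore` and the `send` callback are hypothetical and do not come from the patent.

```python
from collections import deque

class VoiceStore:
    """Holds voice frames received before the connection is established."""

    def __init__(self):
        self._frames = deque()

    def buffer(self, frame: bytes) -> None:
        # Frames are queued in arrival order while the search and
        # connection setup are still in progress.
        self._frames.append(frame)

    def flush_to(self, send) -> int:
        """Forward every buffered frame, oldest first, and return the count."""
        sent = 0
        while self._frames:
            send(self._frames.popleft())
            sent += 1
        return sent

store = VoiceStore()
store.buffer(b"frame-1")
store.buffer(b"frame-2")

delivered = []                       # stands in for the network transmitter
count = store.flush_to(delivered.append)
```

Because the frames are forwarded in arrival order, the called side hears the first utterance intact even though it was spoken before any connection existed.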

The voice interaction device preferably further comprises a release request unit that, in response to an operation by the first user, outputs a release request signal requesting release of the connection to one of the devices with which the voice interaction device currently has a connection. Here, the user-side transmission unit further sends the release request signal from the release request unit to the network, and the reception unit further receives the release request signal transmitted from the voice interaction device via the network. The dialogue control unit then releases the connection with the device specified by the received release request signal.

The voice interaction device preferably further comprises a maintenance request unit that, in response to an operation by the first user, outputs a maintenance request signal requesting that the connection be maintained with only one of the devices with which the voice interaction device currently has a connection. Here, the user-side transmission unit further sends the maintenance request signal from the maintenance request unit to the network, and the reception unit further receives the maintenance request signal transmitted from the voice interaction device via the network. The dialogue control unit then releases the connections with all devices other than the one specified by the received maintenance request signal.
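The release request and the maintenance request are complementary operations on the set of currently connected devices. The following minimal sketch models the connections as a Python set of device ids; the function names are hypothetical and no actual signalling is shown.

```python
def apply_release(connected: set, target: str) -> set:
    """Release request: drop only the specified device, keep the rest."""
    return connected - {target}

def apply_maintain(connected: set, target: str) -> set:
    """Maintenance request: keep only the specified device, drop the rest.

    If the target is not currently connected, nothing remains connected.
    That edge case is an assumption; the patent does not discuss it.
    """
    return connected & {target}

connected = {"12a", "12b", "12c"}
after_release = apply_release(connected, "12a")
after_maintain = apply_maintain(connected, "12b")
```

With a large group, a single maintenance request narrows a one-to-many dialogue down to one partner without releasing each of the others individually.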

A second aspect of the present invention is directed to a dialogue control device that controls a dialogue between a first user who uses a voice interaction device and a second user who uses another device. The voice interaction device comprises: a dialogue start unit that the first user operates to start a dialogue; a microphone to which, after the dialogue start unit is operated, at least speech representing search content is input by the first user, and which outputs voice data representing the input search content; an acquisition unit that acquires, together with the voice data output from the microphone, additional information needed to search for the second user; and a transmission unit that sends the voice data and additional information acquired by the acquisition unit to a network. The voice interaction control device comprises: a reception unit that receives the voice data and additional information transmitted from the voice interaction device via the network; a storage unit that stores a database configured so that the second user can be searched for using the received voice data and additional information as search keys; a search unit that searches the database stored in the storage unit using the received voice data and additional information to identify the second user with whom the first user can currently have a dialogue; and a grouping unit that puts the second users identified by the search unit into one group and holds the second users registered in the group at least until the dialogue between the first and second users ends.

A third aspect of the present invention is directed to a dialogue control method for controlling a dialogue between a first user who uses a voice interaction device and a second user who uses another device. The dialogue control method comprises: a dialogue start step executed by the voice interaction device in response to an operation by the first user to start a dialogue; a voice input step, executed on the voice interaction device side, in which, after the dialogue start step, at least speech representing search content is input by the first user and voice data representing the input search content is output; an acquisition step, executed on the voice interaction device side, of acquiring, together with the voice data output in the voice input step, additional information needed to search for the second user; a transmission step, executed on the voice interaction device side, of sending the voice data and additional information acquired in the acquisition step to a network; a reception step, executed on the voice interaction control device side, of receiving the voice data and additional information transmitted from the voice interaction device via the network; a search step of searching a database stored in the voice interaction control device, using the voice data and additional information received in the reception step as search keys, to identify the second user with whom the first user can currently have a dialogue; and a grouping step, executed on the voice interaction control device side, of putting the second users identified in the search step into one group and holding the second users registered in the group at least until the dialogue between the first and second users ends.

A fourth aspect of the present invention is directed to a computer program for controlling a dialogue between a first user who uses a voice interaction device and a second user who uses another device. The voice interaction device comprises: a dialogue start unit that the first user operates to start a dialogue; a microphone to which, after the dialogue start unit is operated, at least speech representing search content is input by the first user, and which outputs voice data representing the input search content; an acquisition unit that acquires, together with the voice data output from the microphone, additional information needed to search for the second user; and a transmission unit that sends the voice data and additional information acquired by the acquisition unit to a network. The computer program comprises: a reception step, executed on the voice interaction control device side, of receiving the voice data and additional information transmitted from the voice interaction device via the network; a search step of searching a database stored in the voice interaction control device, using the voice data and additional information received in the reception step as search keys, to identify the second user with whom the first user can currently have a dialogue; and a grouping step, executed on the voice interaction control device side, of putting the second users identified in the search step into one group and holding the second users registered in the group at least until the dialogue between the first and second users ends.

As described above, according to each aspect of the present invention, the voice interaction control device on the network responds to voice data from the voice interaction device and holds, on behalf of the voice interaction device, information on the dialogue partners identified by the search. The first user therefore no longer needs to group and register second users, who will be dialogue partners, in the voice interaction device in advance. Furthermore, because second users need not be registered in the voice interaction device, the capacity of the device's storage is not wasted.

The above and other objects, features, aspects, and advantages of the present invention will become more apparent from the following detailed description when read in conjunction with the accompanying drawings.

(Embodiment)
FIG. 1 is a schematic diagram showing the overall configuration of a voice interaction system 1 according to an embodiment of the present invention. In FIG. 1, the voice interaction system 1 comprises at least a voice interaction device (originating side) 11, one or more voice interaction devices (terminating side) 12, and a voice interaction control device 13. The voice interaction devices 11 and 12 and the voice interaction control device 13 are connected so that they can communicate with one another via a network 14 such as the Internet and/or a cellular network. FIG. 1 shows two voice interaction devices 12a and 12b as examples of the one or more voice interaction devices (terminating side) 12.

The voice interaction device 11 is, by way of example, an in-vehicle terminal device connected to a mobile phone 15 having a hands-free function (see FIG. 2), and it has a PTT (Push-to-Talk) function. An example of such an in-vehicle terminal device is a navigation device that can be installed in a vehicle.

Each of the voice interaction devices 12a and 12b is, by way of example, a POS (Point of Sale) terminal device installed in a facility or store such as a restaurant. As a specific example, the voice interaction device 12a is installed in store A and the voice interaction device 12b in another store B. The voice interaction devices 12a and 12b accept a call originated by the voice interaction device 11 through the PTT function and then carry out voice communication with the voice interaction device 11.

The voice interaction control device 13 is typically incorporated in a server that performs call control or in a packet switch in a cellular network. It controls one-to-many voice dialogue (PTT) between the user of the voice interaction device 11 (hereinafter, the originating-side user) and the users of the voice interaction devices 12a and 12b (hereinafter, the terminating-side users).

Next, the voice interaction device 11 is described. FIG. 2 is a block diagram showing the detailed configuration of the voice interaction device 11 shown in FIG. 1. In FIG. 2, the voice interaction device 11 comprises a communication unit 111, a microphone 112, a speaker 113, a dialogue start unit 114, a CODEC 115, an additional information acquisition unit 116, a control unit 117, and a message generation unit 118.

The communication unit 111 is connected to the network 14. It sends originating-side messages generated by the voice interaction device 11 (described in detail later) to the network 14 and receives terminating-side messages (described in detail later) sent via the network 14.

At a minimum, the user inputs into the microphone 112 speech (hereinafter, search content) from which the voice interaction control device 13 can search for the dialogue partner that the user of the voice interaction device 11 wants to find. The user's subsequent spoken words are also input into the microphone 112. The microphone 112 converts this input speech into an analog audio signal and outputs it to the CODEC 115.

The speaker 113 outputs the dialogue partner's spoken words in accordance with the analog audio signal sent from the CODEC 115, described later.

The dialogue start unit 114 is typically a PTT button. The user speaks into the microphone 112 after operating the dialogue start unit 114 or, more specifically, while operating it. As a result, the microphone 112 outputs an analog audio signal representing the input speech to the CODEC 115.

The CODEC 115 converts the analog audio signal from the microphone 112 into a digital audio signal and outputs it to the control unit 117. The CODEC 115 also obtains, via the control unit 117, the digital audio signal received by the communication unit 111, converts it into an analog audio signal, and outputs that signal to the speaker 113.

The additional information acquisition unit 116 acquires, as additional information, position information from which the user's current position can be identified and/or personal information from which the user can be identified.

The position information is the current position itself, the name of the road currently being traveled, the name of the intersection at the current position, the position of the destination, the user's moving speed, the user's heading, or the route to the destination, or a combination of two or more of these. The additional information acquisition unit 116 can typically obtain such position information from a well-known navigation system.

The personal information is the user's name, age, address, telephone number, e-mail address, nickname, or preference information, or a combination of two or more of these. Such personal information is stored in a storage device (not shown) that is built into the voice interaction device 11 or connected to it. The voice interaction device 11 preferably controls transmission so that part of the personal information is not sent to unspecified voice interaction devices 12. Specifically, only the items that the user has permitted in advance (for example, the nickname) are sent to unspecified voice interaction devices 12.
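The permission-based control of personal information amounts to an allow-list filter applied before transmission. The sketch below assumes the personal information is held as a key/value mapping; the function name and the sample values are hypothetical.

```python
def filter_personal_info(personal_info: dict, permitted_items: set) -> dict:
    """Keep only the items the user has explicitly permitted for sharing."""
    return {key: value for key, value in personal_info.items()
            if key in permitted_items}

# Sample values are illustrative only.
personal_info = {
    "name": "Taro Yamada",
    "age": 35,
    "telephone": "000-0000-0000",
    "nickname": "Taro",
}
shared = filter_personal_info(personal_info, permitted_items={"nickname"})
```

An allow-list is the safer default here: any item added to the stored profile later stays private until the user explicitly permits it.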

The control unit 117 comprises, for example, a CPU, a ROM, and a RAM, and controls each component of the voice interaction device 11.

The message generation unit 118 first acquires, through the control unit 117, the digital audio signal output from the CODEC 115. It also acquires, through the control unit 117, the additional information output from the additional information acquisition unit 116. The message generation unit 118 basically creates an originating-side message containing at least the acquired digital audio signal and/or additional information and outputs it to the communication unit 111. Specifically, the originating-side message contains a digital audio signal representing the search content.
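One possible shape for an originating-side message is sketched below. The patent does not define a wire format, so the JSON encoding, the base64 handling of the audio, and the names `OriginatingMessage`, `encode`, and `decode` are all assumptions made for illustration.

```python
import base64
import json
from dataclasses import dataclass, field

@dataclass
class OriginatingMessage:
    """Hypothetical container: digital audio and/or additional information."""
    audio: bytes = b""
    additional_info: dict = field(default_factory=dict)

    def encode(self) -> str:
        # Audio bytes are base64-encoded so the message fits in JSON text.
        return json.dumps({
            "audio": base64.b64encode(self.audio).decode("ascii"),
            "additional_info": self.additional_info,
        })

    @classmethod
    def decode(cls, text: str) -> "OriginatingMessage":
        # Mirrors the extraction a receiving side would perform.
        raw = json.loads(text)
        return cls(base64.b64decode(raw["audio"]), raw["additional_info"])

message = OriginatingMessage(
    audio=b"\x00\x01\x02",
    additional_info={"nickname": "Taro", "position": [34.70, 135.50]},
)
restored = OriginatingMessage.decode(message.encode())
```

Both fields default to empty, which mirrors the "and/or" wording: a message may carry audio only, additional information only, or both.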

Next, the voice interaction devices 12 are described. Each voice interaction device 12 is connected to the network 14 and is capable of PTT-based dialogue. Like the voice interaction device 11, each voice interaction device 12 sends internally created terminating-side messages to the network 14. A terminating-side message contains a digital audio signal representing the spoken words of the user of that voice interaction device 12. Each voice interaction device 12 also receives, via the network 14, the digital audio signal contained in an originating-side message sent by the voice interaction device 11. Like the voice interaction device 11, it processes the received digital audio signal and outputs the speech that the signal represents.

Next, the voice interaction control device 13 is described. FIG. 3 is a block diagram showing the detailed configuration of the voice interaction control device 13 shown in FIG. 1. In FIG. 3, the voice interaction control device 13 comprises a communication unit 131, a message decoding unit 132, a database storage unit 133, a search unit 134, a grouping unit 135, a dialogue control unit 136, a message accumulation unit 137, and a control unit 138 in order to control the voice communication carried out between the voice interaction device 11 and one or more voice interaction devices 12.

The communication unit 131 is connected to the network 14 and receives, via the network 14, originating-side messages sent from the voice interaction device 11 and terminating-side messages sent from the voice interaction devices 12.

In accordance with the processing of the dialogue control unit 136, described later, the communication unit 131 also sends to the network 14, together with a connection establishment request, the digital audio signal that was sent from the voice interaction device 11.

The message decoding unit 132 obtains the originating-side message received by the communication unit 131 and extracts the digital audio signal and additional information that it contains. The extracted digital audio signal and additional information are sent to the control unit 138.

The database storage unit 133 stores a database containing various kinds of information about each voice interaction device 12 connected to the network 14. FIG. 4 is a schematic diagram showing the structure of the records that make up the database. In FIG. 4, a record is assigned, by way of example, to one voice interaction device 12 and contains identification information, a store name, an address, position information, store information, and status information.

The identification information uniquely identifies the target voice interaction device 12.

The store name is the name of the store when the target voice interaction device 12 is installed in a store.

The address is the information needed for the grouping process (described in detail later) that is executed before the voice interaction device 11 carries out a PTT voice dialogue with a voice interaction device 12. Examples of such information include a group ID, an IP (Internet Protocol) address, a URI (Uniform Resource Identifier), or a telephone number assigned to the target voice interaction device 12.

The position information indicates the position where the target voice interaction device 12 is installed.

The store information, when the target voice interaction device 12 is installed in a store, represents the services that the store can provide to the user of the voice interaction device 11. The store information preferably includes summary information that broadly describes the services the store can provide and detailed information that describes them more specifically. An example of summary information is "restaurant"; examples of detailed information are "Chinese restaurant" and "French restaurant". The summary information and detailed information are not specifically shown in FIG. 4.

The status information indicates whether the target voice interaction device 12 is currently engaged in a PTT voice dialogue. Specifically, when a connection is established between the target voice interaction device 12 and a voice interaction device 11, that is, when a voice dialogue begins, the dialogue control unit 136 sets the status to, for example, "in use". Conversely, when the connection is released, the dialogue control unit 136 sets the status to, for example, "available". The status information may also indicate that the target voice interaction device 12 is powered off.

The database is made up of a collection of such records. In the present embodiment, for convenience, one record is described as being assigned to one voice dialogue device 12, but in some cases one record may be assigned to a plurality of voice dialogue devices 12.
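As an illustration only, a record of the kind described above could be modelled as follows. The field names, the `Status` enum, and the sample values are assumptions made for this sketch; the description specifies the kinds of information a record holds, not a concrete data layout.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Status information; the labels are illustrative."""
    AVAILABLE = "available"
    IN_USE = "in use"
    POWERED_OFF = "powered off"


@dataclass
class DeviceRecord:
    """One database record for a called-side voice dialogue device 12."""
    device_id: str                  # identification information
    address: str                    # group ID, IP address, URI or telephone number
    position: tuple[float, float]   # position information (latitude, longitude assumed)
    summary_info: str               # store information, summary level
    detailed_info: str              # store information, detailed level
    status: Status = Status.AVAILABLE


# Hypothetical record for a device installed in a Chinese restaurant.
example = DeviceRecord(
    device_id="12a",
    address="sip:shop-a@example.com",
    position=(34.70, 135.50),
    summary_info="restaurant",
    detailed_info="Chinese restaurant",
)
```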

Refer again to FIG. 3. The search unit 134 searches the database stored in the database storage unit 133, using as a keyword the first piece of digital audio information extracted by the message decoding unit 132, that is, the search content. Preferably, the search unit 134 also uses the additional information sent together with the search content when searching the database. From the identification information obtained by this search, the search unit 134 identifies the called users with whom the calling user can currently interact.

The grouping unit 135 groups and holds the identification information retrieved by the search unit 134 (that is, the called users with whom the calling user can currently interact), in addition to the search content and additional information contained in the caller message.

The dialogue control unit 136 establishes a connection for push-to-talk between the calling user and the called users grouped by the grouping unit 135. Using a call control protocol such as SIP (Session Initiation Protocol), the dialogue control unit 136 establishes a connection between the voice dialogue device 11 that sent the search content and the voice dialogue device 12 retrieved from the database. After the connection is established, the voice dialogue devices 11 and 12 can carry out PTT-based voice communication directly, without going through the voice dialogue control device 13, although the dialogue control unit 136 can also monitor this communication. When the connection is established, the dialogue control unit 136 sets the status information of the record assigned to the target voice dialogue device 12 to "in use"; when the connection is released, it sets that status information to "available".

The voice storage unit 137 stores the search content extracted from the caller message sent from the voice dialogue device 11.

The control unit 138 comprises, for example, a CPU, a ROM, and a RAM, and controls each component of the voice dialogue control device 13 in accordance with a computer program stored in advance in the ROM.

Next, the operation of the voice dialogue system according to the present embodiment is described with reference to the drawings. FIG. 5 is a sequence diagram showing the operations of the voice dialogue device 11, the voice dialogue control device 13, and the voice dialogue device 12 up to the point where the voice dialogue between the calling user and the called user begins.

First, in the voice dialogue device 11, the message generation unit 20 generates a caller message (sequence S11). As described above, the first caller message contains at least digital audio information representing the search content and the additional information. The caller message generated in step S11 is sent out to the network 14 via the communication unit 111 and the mobile phone 15, and is transmitted to the voice dialogue control device 13 (sequence S12).

In the voice dialogue control device 13, the message decoding unit 132 decodes the received caller message and obtains the search content and additional information it contains (sequence S13). The search content and additional information are passed to the control unit 138, which stores the search content in the message storage unit 137 and passes the search content and additional information to the search unit 134.

Next, the search unit 134 uses the received search content and additional information to search the database stored in the database storage unit 133 (sequences S14 and S15), and thereby selects the candidate called users (that is, voice dialogue devices 12) with whom the calling user will conduct the voice dialogue. For example, if the search content is "I want to eat Chinese food" and the additional information is the current position of the calling user, the search unit 134 checks the store information and position information in the database and retrieves at least one record assigned to a Chinese restaurant near the calling user's current position. Because PTT-based voice dialogue supports broadcast communication, a plurality of records is preferably retrieved. More preferably, during the search the search unit 134 checks the status information, retrieves records whose status information is set to "available", and does not retrieve records whose status information is set to "unavailable".

FIG. 6 is a flowchart showing the detailed procedure of the search process performed by the search unit 134. In the following, the search process is described on the assumption that the search content is "I want to eat Chinese food." and that the additional information is the current position of the calling user.

First, the search unit 134 compares the "position information" field in the database with the additional information and searches for all stores near the calling user's current position (step S40). If one or more records are found (Yes in step S41), the search unit 134 proceeds to step S42. If no record is found (No in step S41), the search unit 134 proceeds to step S47, completes the extraction with zero results, and ends the process.

In step S42, the search unit 134 compares the search content with the "store information" field of the records found in step S41 and searches for records of stores that satisfy the requested condition, "Chinese food". If one or more records exist (Yes in step S43), the search unit 134 proceeds to step S44. If no record is found (No in step S43), the process proceeds to step S47 described above.

Next, the search unit 134 checks the status information of the records found and looks for records set to "available" (step S44). If one or more such records exist (Yes in step S45), the search unit 134 proceeds to step S46, completes the extraction of the called users who will be the dialogue partners this time (one or more results), and ends the process. If none is found (No in step S45), the search unit 134 proceeds to step S47.

In the above description, the search process is performed in the order of steps S40, S42, and S44, but this order may be changed. The three conditions may also be evaluated in a single search.
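The flow of FIG. 6 can be sketched as three successive filters. This is a minimal sketch under stated assumptions, not the implementation described in the publication: the `Record` type, the planar kilometre coordinates, the 2 km default radius, and the substring keyword match are all hypothetical, and the spoken search content is assumed to have already been reduced to a text keyword such as "Chinese".

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    device_id: str
    position: tuple[float, float]  # planar (x, y) in km; an assumption of this sketch
    store_info: str
    status: str                    # "available" or "unavailable"


def search(records, caller_position, keyword, radius_km=2.0):
    """Return the records that pass all three checks; an empty list means step S47."""
    # Step S40: keep stores near the caller's current position.
    nearby = [r for r in records
              if math.dist(r.position, caller_position) <= radius_km]
    if not nearby:            # No in step S41
        return []
    # Step S42: keep stores whose store information matches the search content.
    matching = [r for r in nearby if keyword in r.store_info]
    if not matching:          # No in step S43
        return []
    # Step S44: keep devices whose status information is "available".
    # A non-empty result corresponds to step S46, an empty one to step S47.
    return [r for r in matching if r.status == "available"]
```

Because each stage only narrows the candidate list, the stages can be reordered or merged into one predicate without changing the result, which matches the remark that the order of the steps may be changed.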

In the above description, the additional information is the current position, but it is not limited to this. As mentioned earlier, the additional information may be the current position, the name of the road currently being traveled, the name of the intersection at the current position, the position of the destination, the user's moving speed, the user's direction of travel, the route to the destination, or a combination of two or more of these. In such cases, the search unit 134 refers to the additional information and searches the database for called users who are near the current position of the voice dialogue device 11 (the calling user), near the destination, and/or near the route to the destination. In addition, when the voice dialogue device 11 is implemented in an in-vehicle navigation device, it is possible to determine whether the vehicle is traveling in an urban area or in the suburbs, and to narrow the search area for urban travel or widen it for suburban travel. The search unit 134 can also search for called users located within the area shown on the screen, based on the scale of the map currently displayed on the in-vehicle navigation device. Furthermore, when the number of called users found in the current search is small, the search unit 134 can search a wider area in the next search; conversely, when the number of called users found is large, it can gradually reduce the search area.
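The search-area adjustment described above could be expressed as a simple rule. The thresholds, the scaling factor, and the starting radii below are hypothetical values chosen for illustration; the description states only that the area is widened when few called users are found and narrowed when many are found.

```python
def initial_radius_km(urban: bool) -> float:
    """Start with a narrower area in urban travel, a wider one in the suburbs."""
    return 1.0 if urban else 5.0  # illustrative values


def next_radius_km(current_km: float, hits: int,
                   min_hits: int = 2, max_hits: int = 10,
                   factor: float = 2.0) -> float:
    """Pick the radius for the next search from the number of hits in this one."""
    if hits < min_hits:
        return current_km * factor   # too few called users: widen the area
    if hits > max_hits:
        return current_km / factor   # too many called users: narrow the area
    return current_km                # acceptable number: keep the area
```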

Assume that the search process described above has returned two records, and that the voice dialogue devices 12a and 12b are installed in the stores to which those records are assigned.

Next, the grouping unit 135 places the voice dialogue device 11 that sent the caller message and the voice dialogue devices 12 found by the search into one group and holds that group (sequence S16).

The dialogue control unit 136 then establishes a connection for PTT-based voice communication between the voice dialogue device 11 and the voice dialogue devices 12 grouped in step S16 (sequences S17 to S20). After that, the calling user, using the voice dialogue device 11, and the called users, using the voice dialogue devices 12, begin the voice dialogue (sequence S21). Under the assumption above, a connection is established between the voice dialogue device 11 and the voice dialogue devices 12a and 12b, and voice communication starts among them.

Once the connection is established, the grouped voice dialogue devices 12 are engaged in voice communication, so the dialogue control unit 136 sets the status information of the database records assigned to those devices to "unavailable". As described above, the dialogue control unit 136 can monitor the voice communication between the voice dialogue devices 11 and 12, so when that communication ends it sets the status information of the relevant records back to "available".

If no record can be found in the search process of step S14, the voice dialogue control device 13 may connect to a predetermined destination, for example a concierge service center. In that case, this destination and the voice dialogue device 11 are grouped together.

After grouping by the grouping unit 135 and before the connection is established, the voice dialogue control device 13 may ask the calling user whether or not to conduct a voice dialogue with the called users found by the search. This allows the calling user to decline the voice dialogue. In that case, the voice dialogue control device 13 does not perform the connection establishment process (sequences S17 to S20).

The voice communication carried out among the voice dialogue device 11, the voice dialogue control device 13, and the voice dialogue device 12 after the connection establishment process is described next. FIG. 7 is a sequence diagram showing the operations of the voice dialogue device 11, the voice dialogue control device 13, and the voice dialogue device 12 shown in FIG. 1 after the connection establishment process.

First, as described above, the calling user speaks the search content into the microphone 112 while operating the dialogue start unit 114 of the voice dialogue device 11. An example of such speech is "I want to eat Chinese food." In response, the voice dialogue device 11 creates the first caller message, as described with reference to FIG. 5. The voice dialogue control device 13 stores the search content contained in this caller message in the message storage unit 137 and, as described with reference to FIG. 5, uses the caller message to search for the called users with whom the calling user should interact; a connection is then established between the voice dialogue devices 11 and 12.

After that, as shown in FIG. 7, the voice dialogue control device 13 transmits the digital audio information representing the search content, stored in its message storage unit 137, to the target voice dialogue devices 12a and 12b (sequence S22).
On receiving this digital audio information, the called user of the voice dialogue device 12a and the called user of the voice dialogue device 12b each respond independently. Each response is sent as a called-side message from the respective voice dialogue device. The called-side message sent from the voice dialogue device 12a is delivered to the voice dialogue devices 11 and 12b (sequence S23), and the called-side message from the voice dialogue device 12b is delivered to the voice dialogue devices 11 and 12a (sequence S24).

The calling user, in turn, responds to each called-side message by speaking a reply into the voice dialogue device 11. The voice dialogue device 11 generates a further caller message from this reply and transmits it to the voice dialogue devices 12a and 12b (sequence S25). The same sequence is then repeated until the voice dialogue ends (sequences S26 to S29).

As described above, in the voice dialogue system according to the present embodiment, the voice dialogue control device 13 on the network 14 responds to the first caller message from the voice dialogue device 11 by searching for called users with the search content and additional information contained in that message, and then groups and holds the called users found and the calling user on behalf of the voice dialogue device 11. The calling user therefore no longer needs to group and register prospective dialogue partners in the voice dialogue device 11 in advance. Moreover, because called users need not be registered in the voice dialogue device 11, the capacity of its storage device is not wasted.

Implementing the voice dialogue device 11 as an in-vehicle terminal also reduces the operating burden of searching for information and allows the user to search without knowing a store's address information.

Because a called-side message from one voice dialogue device 12 also reaches the other voice dialogue devices 12, a called user is not always talking with the calling user, and the parties may confuse one another. For this reason, the system may be controlled so that called-side messages are not exchanged among the voice dialogue devices 12.
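The delivery rule for sequences S22 to S25, together with the optional restriction on exchanges between called-side devices, can be illustrated with a small routing function. The function name, the device identifiers, and the `allow_called_to_called` flag are assumptions of this sketch rather than terms from the description.

```python
def recipients(sender: str, caller: str, called: set[str],
               allow_called_to_called: bool = True) -> set[str]:
    """Return the devices that should receive a message from `sender`."""
    if sender == caller:
        # A caller message goes to every called-side device in the group.
        return set(called)
    if sender not in called:
        raise ValueError(f"{sender} is not a member of the group")
    # A called-side message always reaches the caller.
    targets = {caller}
    if allow_called_to_called:
        # By default it is also relayed to the other called-side devices.
        targets |= set(called) - {sender}
    return targets
```

With `allow_called_to_called=False`, each called user hears only the calling user, which is the restricted mode mentioned above.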

When the voice dialogue device 11 is implemented in an in-vehicle navigation device, the navigation device can obtain the position information from a called-side message, search for a route from the current position to the called user's store, and guide the calling user to that store.

In the above embodiment, the voice dialogue device 12 is described as having the same configuration as the voice dialogue device 11 and performing the same processing, but it is not limited to this. When the voice dialogue device 12 implements a short-range wireless communication protocol such as Bluetooth, a plurality of headsets can take part in the voice dialogue via the voice dialogue device 12. In this case, status information for each headset that can access the voice dialogue device 12 is added to the relevant record in the database.

One of the input devices (not shown) of the voice dialogue device 11 is preferably assigned a release request function which, in response to an operation by the calling user, outputs a release request signal requesting release of the connection to one or more of the voice dialogue devices 12 with which a connection is currently established. In this case, when the calling user operates that input device, the communication unit 111 of the voice dialogue device 11 sends the release request signal from the input device out to the network 14, as shown in FIG. 8, and the signal is transmitted to the voice dialogue control device 13 (sequence S31). In the voice dialogue control device 13, the communication unit 131 receives the release request signal transmitted via the network 14, and the dialogue control unit 136 releases the connection with the voice dialogue device 12 specified by that signal (sequences S31 and S32). As a result, the voice dialogue device 11 continues voice communication only with the remaining voice dialogue devices 12, whose connections were not released (sequence S33).

Alternatively, one of the input devices (not shown) of the voice dialogue device 11 is preferably assigned a maintenance request function which, in response to an operation by the calling user, outputs a maintenance request signal requesting that the connection be maintained only with specific voice dialogue devices 12 among those currently connected. In this case, when the calling user operates that input device, the communication unit 111 of the voice dialogue device 11 sends the maintenance request signal from the input device out to the network 14, and the signal is transmitted to the voice dialogue control device 13. In the voice dialogue control device 13, the dialogue control unit 136 releases the connections with all voice dialogue devices 12 other than those specified by the received maintenance request signal. As a result, the voice dialogue device 11 continues voice communication only with the remaining voice dialogue devices 12, whose connections were not released.

With either of these two methods, the calling user can end the voice dialogue with called users with whom further conversation is no longer needed.
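The release request and the maintenance request are complementary: one names the connections to drop, the other names the connections to keep. A minimal sketch with hypothetical function names, treating the set of connected called-side devices as a Python set:

```python
def apply_release_request(connected: set[str], targets: set[str]) -> set[str]:
    """Release request: drop the named devices, keep the rest."""
    return connected - targets


def apply_maintenance_request(connected: set[str], targets: set[str]) -> set[str]:
    """Maintenance request: keep only the named devices, drop the rest."""
    return connected & targets
```

For a group connected to 12a and 12b, releasing {12b} and maintaining {12a} both leave the calling user talking only to 12a.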

Furthermore, because the voice dialogue control device 13 monitors the voice communication between the voice dialogue devices 11 and 12, it may determine that the dialogue has ended when digital audio data stops flowing between them, that is, when a silent interval of a certain length occurs, and then release the connection automatically.
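Automatic release after a silent interval can be sketched as a timestamp comparison. The class name, the 30-second default, and the use of plain numeric timestamps are assumptions of this sketch; the description says only that the connection may be released after silence of a certain length.

```python
class SilenceMonitor:
    """Tracks when audio was last seen on a monitored connection."""

    def __init__(self, timeout_s: float = 30.0, start_time: float = 0.0):
        self.timeout_s = timeout_s      # hypothetical silence threshold
        self.last_audio = start_time    # time the last audio data was observed

    def on_audio(self, t: float) -> None:
        """Record that audio data flowed at time t (seconds)."""
        self.last_audio = max(self.last_audio, t)

    def should_release(self, now: float) -> bool:
        """True once the silent interval has reached the timeout."""
        return now - self.last_audio >= self.timeout_s
```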

When the voice dialogue device 11 is implemented in an in-vehicle navigation device, the connection with the target voice dialogue device 12 may be maintained until the vehicle arrives at the destination set from that device's store information, and released automatically on arrival.

On the other hand, because the voice dialogue device 11 may be mobile, the voice dialogue devices 12 belonging to the group can become unsuitable dialogue partners for the calling user as time passes, that is, as the device moves. For example, when the search content is "I want to eat Chinese food." and the additional information is the current position, the calling user may have moved far from the position at the time of the search yet still be in a voice dialogue with called users who were grouped around that earlier position. To avoid this, the voice dialogue control device 13 can have the voice dialogue device 11 send additional information periodically during the voice dialogue, and can then release the connections established between the grouped voice dialogue devices 11 and 12 and regroup them. In this way, a group that can satisfy the calling user's search request is maintained as the position of the voice dialogue device 11 changes from moment to moment.
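Regrouping on a position update amounts to comparing the current group with the result of a fresh search. The sketch below assumes the fresh search result is already available as a set of device identifiers; the function name and return shape are hypothetical.

```python
def regroup(current_group: set[str], new_candidates: set[str]):
    """Compare the held group with a fresh search result.

    Returns (to_release, to_connect, new_group).
    """
    to_release = current_group - new_candidates   # no longer near the caller
    to_connect = new_candidates - current_group   # newly in range
    return to_release, to_connect, set(new_candidates)
```

Devices that appear in both sets keep their existing connection, so only the differences need call control signalling.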

In the above description, the control unit 138 operates in accordance with a computer program stored in advance in the ROM. However, this is not a limitation, and the processing described above may be implemented in hardware. The computer program may also be distributed recorded on a storage medium such as a CD-ROM, or it may be stored on a server device connected to a network so that a terminal device can download it.

Because the above embodiment describes a PTT-based voice dialogue device 11, the dialogue start unit 114 is described as a button. However, this is not a limitation: if the dialogue is started by voice input, the dialogue start unit 114 is a voice input device.

Although the present invention has been described in detail above, the description is illustrative in every respect and not restrictive. It will be understood that many other modifications and variations are possible without departing from the scope of the invention.

The voice dialogue system according to the present invention is particularly useful for PTT-based voice dialogue devices and similar equipment that require the technical effect of making registration operations less burdensome for the user.

FIG. 1 is a schematic diagram showing the overall configuration of the voice dialogue system 1 according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the voice dialogue device 11 shown in FIG. 1.
FIG. 3 is a block diagram showing the detailed configuration of the voice dialogue control device 13 shown in FIG. 1.
FIG. 4 is a schematic diagram showing the structure of the records that make up the database stored in the database storage unit 133 shown in FIG. 3.
FIG. 5 is a sequence diagram showing the operations of the voice dialogue device 11, the voice dialogue control device 13, and the voice dialogue device 12 up to the start of the voice dialogue between the calling user and the called user in the voice dialogue system according to the present embodiment.
FIG. 6 is a flowchart showing the detailed procedure of the search process performed by the search unit 134 shown in FIG. 3.
FIG. 7 is a sequence diagram showing the operations of the voice dialogue device 11, the voice dialogue control device 13, and the voice dialogue device 12 shown in FIG. 1 after the connection establishment process.
FIG. 8 is a sequence diagram relating to the release of the connection between the voice dialogue devices 11 and 12 shown in FIG. 1 after the connection establishment process.

Explanation of symbols

1 Voice dialogue system
11 Voice dialogue device (calling side)
12 Voice dialogue device (called side)
13 Voice dialogue control device
14 Network

Claims

A voice dialogue system comprising a voice dialogue device and a dialogue control device that controls a dialogue between a first user who uses the voice dialogue device and a second user who uses another device, wherein
the voice dialogue device comprises:
a dialogue start unit operated by the first user to start a dialogue;
a microphone into which, after the dialogue start unit is operated, the first user inputs at least a voice representing search content, and which outputs voice data representing the input search content;
an acquisition unit that acquires, together with the voice data output from the microphone, additional information necessary for searching for the second user; and
a user-side transmission unit that sends the voice data and the additional information acquired by the acquisition unit to a network, and
the voice dialogue control device comprises:
a receiving unit that receives, via the network, the voice data and the additional information sent from the voice dialogue device;
a storage unit that stores a database configured so that the second user can be searched for using the voice data and the additional information received by the receiving unit as search keys;
a search unit that searches the database stored in the storage unit using the voice data and the additional information received by the receiving unit, and identifies a second user with whom the first user can currently interact; and
a grouping unit that groups each second user identified by the search unit into one group and holds the second users registered in the group at least until the dialogue between the first and second users ends.
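
As an illustration only, the behaviour of the grouping unit recited above can be sketched in a few lines of Python. The class and method names (`DialogueGroup`, `GroupingUnit`, `group`, `members`, `end_dialogue`) are hypothetical and do not appear in the publication; the sketch assumes users are identified by plain strings and that a group is keyed by the first user.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueGroup:
    """Hypothetical record: one first user plus the second users grouped for them."""
    first_user: str
    second_users: set[str] = field(default_factory=set)


class GroupingUnit:
    """Illustrative stand-in for the grouping unit; not the patented implementation."""

    def __init__(self) -> None:
        self._groups: dict[str, DialogueGroup] = {}

    def group(self, first_user: str, found_users: list[str]) -> DialogueGroup:
        # Put every second user identified by the search into a single group.
        g = DialogueGroup(first_user, set(found_users))
        self._groups[first_user] = g
        return g

    def members(self, first_user: str) -> set[str]:
        # Return a copy so callers cannot mutate the held group.
        g = self._groups.get(first_user)
        return set(g.second_users) if g else set()

    def end_dialogue(self, first_user: str) -> None:
        # The group is held at least until the dialogue ends; here it is dropped at that point.
        self._groups.pop(first_user, None)
```

In this sketch the group is discarded as soon as the dialogue ends; the claim only requires that it be held at least until then, so a real system could keep it longer.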

The voice dialogue system according to claim 1, wherein, when the voice dialogue device implements a push-to-talk function, the dialogue start unit is a button operated by the first user.

The voice dialogue system according to claim 1, wherein, when the voice dialogue device can be mounted on a vehicle, the additional information includes at least one selected from the group including the current position, traveling direction, moving speed, destination, and route to the destination of the voice dialogue device, and
the search unit refers to the additional information to search for the second user near the current position of the voice dialogue device, near the destination, and/or near the route to the destination.
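
One possible reading of the position-based search in the claim above is a simple distance filter. The following Python sketch assumes, purely for illustration, that positions are (latitude, longitude) pairs in degrees, that the route is a list of such points, and that "near" means within a fixed radius; the field names (`user_id`, `position`, `current_position`, `destination`, `route`) and the 2 km default are hypothetical and are not taken from the publication.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def search_nearby(records, additional_info, radius_km=2.0):
    """Return user ids whose position lies within radius_km of the current
    position, the destination, or any point on the route (assumed keys)."""
    reference_points = []
    for key in ("current_position", "destination"):
        if additional_info.get(key) is not None:
            reference_points.append(additional_info[key])
    reference_points.extend(additional_info.get("route", []))
    return [
        r["user_id"]
        for r in records
        if any(haversine_km(r["position"], p) <= radius_km for p in reference_points)
    ]
```

For example, with one record near the current position and one near the destination, both are returned in record order; a record far from every reference point is dropped.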

The voice dialogue system according to claim 1, wherein the database includes status information indicating the current status of the other device, and
the search unit searches for the second user with reference to the status information included in the database.

The voice dialogue system according to claim 4, wherein the status information indicates whether the other device is in use, and
the search unit refers to the status information included in the database to search for a second user whose device is not currently in use.
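
The status-based narrowing described in the two claims above amounts to a filter over database records. A minimal sketch follows, assuming a hypothetical record type `DeviceRecord` with a boolean `in_use` field standing in for the status information; neither name comes from the publication.

```python
from dataclasses import dataclass


@dataclass
class DeviceRecord:
    """Hypothetical database record for one second user's device."""
    user_id: str
    in_use: bool  # assumed encoding of the status information


def available_users(records):
    """Keep only the users whose device is not currently in use."""
    return [r.user_id for r in records if not r.in_use]
```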

The voice dialogue system according to claim 1, wherein the voice dialogue control device further comprises:
an audio storage unit that buffers the voice data received by the receiving unit;
a dialogue control unit that establishes a connection between the voice dialogue device and the device used by each second user grouped by the grouping unit; and
a control-device-side transmission unit that, when the connection is established by the dialogue control unit, sends the voice data buffered in the audio storage unit to the network for transmission to the device of each second user grouped by the grouping unit.
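
The buffer-then-forward behaviour in the claim above can be illustrated with a small first-in, first-out queue. This is a sketch under stated assumptions: the class name `VoiceBuffer`, its methods, and the `send(member, chunk)` callback are hypothetical, and voice data is modelled as opaque byte chunks.

```python
from collections import deque


class VoiceBuffer:
    """Illustrative stand-in for the audio storage unit (names are assumptions)."""

    def __init__(self):
        self._chunks = deque()

    def store(self, chunk: bytes) -> None:
        # Buffer voice data that arrives before any connection exists.
        self._chunks.append(chunk)

    def flush_to(self, send, group_members) -> int:
        """Once connections are established, send each buffered chunk to every
        grouped device in arrival order, then leave the buffer empty.
        Returns the number of send calls made."""
        sent = 0
        while self._chunks:
            chunk = self._chunks.popleft()
            for member in group_members:
                send(member, chunk)
                sent += 1
        return sent
```

Buffering lets the first user start speaking immediately after operating the dialogue start unit, while the search and connection establishment are still in progress.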

The voice dialogue system according to claim 6, wherein
the voice dialogue device further comprises a release request unit that, in response to an operation by the first user, outputs a release request signal requesting release of the connection to one of the devices currently connected to the voice dialogue device,
the user-side transmission unit further sends the release request signal from the release request unit to the network,
the receiving unit further receives, via the network, the release request signal sent from the voice dialogue device, and
the dialogue control unit further releases the connection with the device specified by the release request signal received by the receiving unit.

The voice dialogue system according to claim 6, wherein
the voice dialogue device further comprises a maintenance request unit that, in response to an operation by the first user, outputs a maintenance request signal requesting that the connection be maintained with only one of the devices currently connected to the voice dialogue device,
the user-side transmission unit further sends the maintenance request signal from the maintenance request unit to the network,
the receiving unit further receives, via the network, the maintenance request signal sent from the voice dialogue device, and
the dialogue control unit further releases the connections with the devices other than the device designated by the maintenance request signal received by the receiving unit.
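
The release request and the maintenance request in the two claims above are complementary operations on the set of current connections: the first drops one named device, the second keeps one named device and drops the rest. A minimal sketch, assuming connections are tracked as a set of device identifiers and using hypothetical function names:

```python
def apply_release_request(connections: set[str], target: str) -> set[str]:
    """Release only the connection to the device named in the request."""
    return connections - {target}


def apply_maintenance_request(connections: set[str], target: str) -> set[str]:
    """Keep only the device named in the request; release all other connections."""
    return connections & {target}
```

With connections `{"a", "b", "c"}`, a release request for `"b"` leaves `{"a", "c"}`, while a maintenance request for `"b"` leaves `{"b"}`.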

A dialogue control device for controlling a dialogue between a first user who uses a voice dialogue device and a second user who uses another device, wherein
the voice dialogue device comprises:
a dialogue start unit operated by the first user to start a dialogue;
a microphone into which, after the dialogue start unit is operated, the first user inputs at least a voice representing search content, and which outputs voice data representing the input search content;
an acquisition unit that acquires, together with the voice data output from the microphone, additional information necessary for searching for the second user; and
a transmission unit that sends the voice data and the additional information acquired by the acquisition unit to a network, and
the voice dialogue control device comprises:
a receiving unit that receives, via the network, the voice data and the additional information sent from the voice dialogue device;
a storage unit that stores a database configured so that the second user can be searched for using the voice data and the additional information received by the receiving unit as search keys;
a search unit that searches the database stored in the storage unit using the voice data and the additional information received by the receiving unit, and identifies a second user with whom the first user can currently interact; and
a grouping unit that groups each second user identified by the search unit into one group and holds the second users registered in the group at least until the dialogue between the first and second users ends.

A dialogue control method for controlling a dialogue between a first user who uses a voice dialogue device and a second user who uses another device, the method comprising:
a dialogue start step, executed on the voice dialogue device side, in response to an operation by the first user to start a dialogue;
a voice input step, executed on the voice dialogue device side after the dialogue start step, in which at least a voice representing search content is input by the first user and voice data representing the input search content is output;
an acquisition step, executed on the voice dialogue device side, of acquiring, together with the voice data output in the voice input step, additional information necessary for searching for the second user;
a transmission step, executed on the voice dialogue device side, of sending the voice data and the additional information acquired in the acquisition step to a network;
a reception step, executed on the voice dialogue control device side, of receiving, via the network, the voice data and the additional information sent from the voice dialogue device;
a search step of searching the database stored in the voice dialogue control device, using the voice data and the additional information received in the reception step as search keys, to identify a second user with whom the first user can interact; and
a grouping step, executed on the voice dialogue control device side, of grouping each second user identified in the search step into one group and holding the second users registered in the group at least until the dialogue between the first and second users ends.

A computer program for controlling a dialogue between a first user who uses a voice dialogue device and a second user who uses another device, wherein
the voice dialogue device comprises:
a dialogue start unit operated by the first user to start a dialogue;
a microphone into which, after the dialogue start unit is operated, the first user inputs at least a voice representing search content, and which outputs voice data representing the input search content;
an acquisition unit that acquires, together with the voice data output from the microphone, additional information necessary for searching for the second user; and
a transmission unit that sends the voice data and the additional information acquired by the acquisition unit to a network, and
the computer program comprises:
a reception step, executed on the voice dialogue control device side, of receiving, via the network, the voice data and the additional information sent from the voice dialogue device;
a search step of searching the database stored in the voice dialogue control device, using the voice data and the additional information received in the reception step as search keys, to identify a second user with whom the first user can interact; and
a grouping step, executed on the voice dialogue control device side, of grouping each second user identified in the search step into one group and holding the second users registered in the group at least until the dialogue between the first and second users ends.