JP2006186893A

JP2006186893A - Voice conversation control apparatus

Info

Publication number: JP2006186893A
Application number: JP2004380781A
Authority: JP
Inventors: Takashi Akita; 貴志秋田; Takashi Kondo; 剛史金銅; Noboru Katsuta; 昇勝田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-12-28
Filing date: 2004-12-28
Publication date: 2006-07-13

Abstract

PROBLEM TO BE SOLVED: To provide a voice conversation control apparatus with which a call originating side user can easily identify the communicating party.

SOLUTION: In a voice conversation control apparatus 13, a central control unit 137 selects two or more call terminating side voice conversation devices based on additional information in the current call originating side message and attribute information stored in advance. A conversation control unit 135 establishes a first connection between one or more of the selected call terminating side voice conversation devices and the call originating side voice conversation device. After the first connection is cut off, the conversation control unit 135 establishes a new connection between the call originating side voice conversation device and a call terminating side voice conversation device that was selected by the central control unit 137 but to which the voice information of the current call originating side message has not yet been transmitted. Each time a connection is established, a transmission/reception unit 131 transmits the voice information in the call originating side message to the target call terminating side voice conversation device.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a voice conversation control apparatus and a voice conversation system, and more particularly to a voice conversation control apparatus that, in half-duplex voice conversation, controls connections between an originating-side voice conversation device and a plurality of terminating-side voice conversation devices.

Conventionally, one example of such a voice conversation device is a device that can be carried by a user and enables multi-party broadcast calls among a plurality of wireless terminals (see, for example, Patent Document 1).

In recent years, a push-to-talk (hereinafter, PTT) function, which allows a user to communicate with one or more parties simply by pressing a talk button, has begun to be installed in voice conversation devices such as mobile phones. With the PTT function, a voice uttered by one user is delivered over a digital network to the voice conversation devices carried by all members of a previously defined group. In PTT, the voice communication started in this way is exchanged by half-duplex communication.

Patent Document 1: JP-A-6-152470

In PTT, however, when one user speaks, many people often respond almost simultaneously, which makes it difficult for that user to recognize whom he or she is talking with.

Therefore, an object of the present invention is to provide a voice conversation control apparatus that makes it easy for the originating-side user to identify the other party to the call.

To achieve the above object, a first aspect of the present invention is directed to a voice conversation control apparatus that is connected via a network to an originating-side voice conversation device and two or more terminating-side voice conversation devices, and that controls half-duplex voice communication performed between the originating-side voice conversation device and the two or more terminating-side voice conversation devices. The voice conversation control apparatus comprises: a receiving unit that receives an originating-side message sent to the network from the originating-side voice conversation device, the message including at least voice information and additional information generated at the originating-side voice conversation device; an attribute information storage unit that stores attribute information representing attributes of each terminating-side voice conversation device; a central control unit that selects, based on the additional information included in the originating-side message received by the receiving unit and the attribute information stored in the attribute information storage unit, two or more terminating-side voice conversation devices to which the voice information included in the originating-side message is to be forwarded; a conversation control unit that establishes a first connection between the originating-side voice conversation device and one or more of the terminating-side voice conversation devices selected by the central control unit; and a transmitting unit that transmits the voice information included in the originating-side message received by the receiving unit to the terminating-side voice conversation device with which the first connection has been established by the conversation control unit. After disconnecting the first connection, the conversation control unit further establishes a new connection between the originating-side voice conversation device and a terminating-side voice conversation device that has been selected by the central control unit and to which the voice information of the originating-side message received by the receiving unit has not yet been transmitted. The transmitting unit further transmits the voice information included in the originating-side message received by the receiving unit to the terminating-side voice conversation device with which the new connection has been established by the conversation control unit.

The voice conversation control apparatus additionally comprises a priority setting unit that assigns, as priorities, the order in which the conversation control unit establishes connections to the two or more terminating-side voice conversation devices selected by the central control unit. The conversation control unit establishes the first connection with the terminating-side voice conversation device having the highest priority assigned by the priority setting unit.

Typically, the additional information includes at least position information from which the current position of the originating-side voice conversation device can be identified and/or personal information relating to the originating-side user. The position information includes at least the current position of the originating-side voice conversation device, the position of a destination, the moving speed, the heading, or the route to the destination. The personal information includes at least the name, age, address, telephone number, e-mail address, nickname, or preference information of the originating-side user. The priority setting unit assigns the priorities with reference to the position information and/or the personal information included in the originating-side message.

The voice conversation control apparatus additionally comprises a voice recognition unit that analyzes the voice information included in the originating-side message received by the receiving unit and recognizes the content of the originating-side user's utterance. The priority setting unit assigns the priorities with reference to the recognition result of the voice recognition unit.

Preferably, the conversation control unit establishes connections with the originating-side voice conversation device in descending order of the priority assigned to the terminating-side voice conversation devices by the priority setting unit.

Typically, the conversation control unit establishes a new connection when a silent period has continued for a predetermined time between the voice conversation devices for which the previous connection was established.
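The silence-based trigger described above can be illustrated with a minimal sketch. The class name `SilenceMonitor`, its fields, and the use of timestamps in seconds are assumptions made for illustration; the source does not specify how silence is measured or what the predetermined time is.

```python
from dataclasses import dataclass


@dataclass
class SilenceMonitor:
    """Tracks how long the currently connected pair has been silent.

    timeout_s stands in for the "predetermined time"; its value is an
    assumption, since the source gives no figure.
    """
    timeout_s: float
    last_voice_at: float = 0.0

    def on_voice(self, now: float) -> None:
        # Call whenever voice is detected from either connected device.
        self.last_voice_at = now

    def should_switch(self, now: float) -> bool:
        # True once the silent period has lasted at least timeout_s,
        # i.e. the point at which a new connection would be established.
        return (now - self.last_voice_at) >= self.timeout_s
```

In use, a caller would feed timestamps from a clock of its choice, for example `monitor.on_voice(t)` on each voice frame and `monitor.should_switch(t)` on a periodic check.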

A second aspect of the present invention is directed to a voice conversation system. The voice conversation system comprises an originating-side voice conversation device, two or more terminating-side voice conversation devices, and a voice conversation control apparatus that controls half-duplex voice communication performed between the originating-side voice conversation device and the terminating-side voice conversation devices. The originating-side voice conversation device includes an originating-side generating unit that generates an originating-side message including at least voice information based on voice input by the user of the originating-side voice conversation device and predetermined additional information, and an originating-side transmitting unit that transmits the originating-side message generated by the originating-side generating unit to the voice conversation control apparatus.

The voice conversation control apparatus includes: a control-side receiving unit that receives the originating-side message transmitted from the originating-side transmitting unit; an attribute information storage unit that stores attribute information representing attributes of each terminating-side voice conversation device; a central control unit that selects, based on the additional information included in the originating-side message received by the control-side receiving unit and the attribute information stored in the attribute information storage unit, two or more terminating-side voice conversation devices to which the voice information included in the originating-side message is to be forwarded; a conversation control unit that establishes a first connection between the originating-side voice conversation device and one or more of the terminating-side voice conversation devices selected by the central control unit; and a control-side transmitting unit that transmits the voice information included in the originating-side message received by the control-side receiving unit to the terminating-side voice conversation device with which the first connection has been established by the conversation control unit. After disconnecting the first connection, the conversation control unit further establishes a new connection between the originating-side voice conversation device and a terminating-side voice conversation device that has been selected by the central control unit and to which the voice information of the received originating-side message has not yet been transmitted. The control-side transmitting unit further transmits the voice information included in the received originating-side message to the terminating-side voice conversation device with which the new connection has been established by the conversation control unit.

Each terminating-side voice conversation device includes at least a terminating-side receiving unit that, once a connection has been established for that terminating-side voice conversation device, receives the voice information transmitted from the control-side transmitting unit.

According to the voice conversation control apparatus of each of the above aspects, even in half-duplex voice communication, the processing of the voice conversation control apparatus ensures that no connection is established between the originating-side voice conversation device and any other terminating-side voice conversation device until the connection with the selected terminating-side voice conversation device has been disconnected. Consequently, at any given time the originating-side voice conversation device performs voice communication only with the selected voice conversation device. In other words, even when the user of the originating-side voice conversation device speaks, many people are prevented from responding almost simultaneously, so that a voice conversation control apparatus that makes it easy for the originating-side user to identify the other party can be provided.

The above and other objects, features, aspects, and advantages of the present invention will become more apparent from the following detailed description of the present invention when read in conjunction with the accompanying drawings.

(Embodiment)
FIG. 1 is a schematic diagram showing the overall configuration of a voice conversation system 1 according to an embodiment of the present invention. In FIG. 1, the voice conversation system 1 comprises at least a voice conversation device (originating side) 11, one or more voice conversation devices (terminating side) 12, and a voice conversation control apparatus 13. The voice conversation devices 11 and 12 and the voice conversation control apparatus 13 are connected so as to be able to communicate with one another via a network 14 such as the Internet and/or a cellular network. FIG. 1 shows two voice conversation devices 12a and 12b as an example of the one or more voice conversation devices (terminating side) 12.

The voice conversation device 11 is typically an in-vehicle terminal device connected to a mobile phone having a hands-free function, and has a PTT (push-to-talk) function.

Each of the voice conversation devices 12a and 12b is typically a POS (point of sale) terminal device installed in a facility or store such as a restaurant. As a specific example, the voice conversation device 12a is installed in a yakiniku (grilled meat) restaurant and the voice conversation device 12b is installed in a Chinese restaurant. The voice conversation devices 12a and 12b accept a call originated from the voice conversation device 11 by the PTT function and thereafter perform voice communication with the voice conversation device 11.

The voice conversation control apparatus 13 is typically incorporated in a server that performs call control or in a packet switch of a cellular network, and controls one-to-many voice conversation (PTT) between the user of the voice conversation device 11 and the respective users of the voice conversation devices 12a and 12b.

Next, the voice conversation device 11 will be described. FIG. 2 is a block diagram showing the detailed configuration of the voice conversation device 11 shown in FIG. 1. In FIG. 2, the voice conversation device 11 comprises a communication unit 111, a microphone 112, a speaker 113, a voice input control unit 114, a CODEC 115, an additional information acquisition unit 116, a control unit 117, and a message generation unit 118.

The communication unit 111 is connected to the network 14. It sends an originating-side message (described in detail later) generated by the voice conversation device 11 to the network 14, and receives a terminating-side message (described in detail later) sent via the network 14.

The voice input control unit 114 is typically a PTT button. The user speaks into the microphone 112 while operating the voice input control unit 114. As a result, analog voice information representing the input voice is output from the microphone 112 to the CODEC 115.

The CODEC 115 converts the analog voice information from the microphone 112 into a digital voice signal and outputs it to the control unit 117. The CODEC 115 also obtains, via the control unit 117, a digital voice signal received by the communication unit 111, converts it into an analog voice signal, and outputs the analog voice signal to the speaker 113. As a result, voice is output from the speaker 113.

The additional information acquisition unit 116 acquires, as additional information, position information from which the user's current position can be identified, personal information from which the user can be identified, category information on the party with whom the user wishes to have a voice conversation, or a combination of two or more selected from this group.

The position information is the current position itself, the position of a destination, the user's moving speed, the user's heading, the route to the destination, or a combination of two or more selected from this group. The additional information acquisition unit 116 can typically acquire such position information from a well-known navigation system.

The personal information is the user's name, age, address, telephone number, e-mail address, nickname, or preference information, or a combination of two or more selected from this group. Such personal information is stored in a storage device (not shown) that is either built into the voice conversation device 11 itself or connected to the voice conversation device 11.

The category information represents a category such as "Japanese restaurant" or "hotel". When the user wishes to obtain information by talking directly with a person at such a store or facility, the user enters the category information by operating an input device such as a remote control or touch keys. The additional information acquisition unit 116 acquires the category information entered in this way.

The control unit 117 is composed of, for example, a CPU, a ROM, and a RAM, and controls each component of the voice conversation device 11.

The message generation unit 118 first acquires, through the control unit 117, the digital voice information output from the CODEC 115. The message generation unit 118 also acquires, through the control unit 117, the additional information output from the additional information acquisition unit 116. Basically, the message generation unit 118 creates an originating-side message that includes the acquired digital voice information and/or additional information together with identification information that uniquely identifies the voice conversation device 12 with which the voice conversation device 11 is to perform voice communication, and outputs the message to the communication unit 111. As an exception, at a stage where the voice conversation device 12 with which the voice conversation device 11 is to communicate has not yet been determined, the message generation unit 118 creates a message indicating that fact.
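As a rough illustration of the contents of an originating-side message, the following sketch models it as a pair of Python data classes. The type names, field names, and the convention that a missing `target_id` means "target not yet determined" are all assumptions; the source does not define a message format or encoding.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class AdditionalInfo:
    """Additional information; any subset of the fields may be present."""
    position: Optional[Tuple[float, float]] = None  # assumed (latitude, longitude)
    category: Optional[str] = None                  # e.g. "Japanese restaurant"
    personal: Optional[Dict[str, str]] = None       # e.g. {"nickname": "..."}


@dataclass(frozen=True)
class OriginatingMessage:
    """Hypothetical originating-side message built by the message generation unit."""
    voice: bytes                      # digital voice information from the CODEC
    additional: AdditionalInfo
    target_id: Optional[str] = None   # identifier of device 12; None if undetermined

    @property
    def target_undetermined(self) -> bool:
        # Models the exceptional case where the peer device 12 is not yet fixed.
        return self.target_id is None
```

A first call-out, before any peer has been chosen, would then be built with `target_id` left unset, while later messages within an established connection would carry the identifier of the connected device 12.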

Next, the voice conversation devices 12 will be described. FIG. 3 is a block diagram showing the detailed configuration of the voice conversation device 12 shown in FIG. 1. In FIG. 3, each voice conversation device 12 comprises a communication unit 121, a wireless headset 122, a voice input control unit 123, a short-range wireless communication unit 124, a CODEC 125, and a control unit 126.

The communication unit 121 is connected to the network 14. It sends a terminating-side message (described in detail later) created by the voice conversation device 12 to the network 14, and receives, via the network 14, the digital voice information included in the originating-side message sent by the voice conversation device 11.

The voice input control unit 123 is typically a PTT button. The user of each voice conversation device 12 (hereinafter, the terminating-side user) speaks into a microphone that is a component of the headset 122 while operating the voice input control unit 123. In the headset 122, analog voice information representing the terminating-side user's input voice is converted into a digital voice signal, which is then output to the short-range wireless communication unit 124 over a wireless link in accordance with a short-range wireless communication protocol such as Bluetooth.

The CODEC 125 encodes the digital voice information received through the short-range wireless communication unit 124 and outputs it to the control unit 126. The CODEC 125 also obtains, via the control unit 126, a digital voice signal received by the communication unit 121, decodes it, and outputs the decoded signal to the short-range wireless communication unit 124. The short-range wireless communication unit 124 then outputs the decoded digital voice signal to the headset 122 over the wireless link in accordance with the above protocol. In the headset 122, the input digital voice signal is converted into an analog voice signal, and voice is output from the speaker of the headset 122.

The control unit 126 is composed of, for example, a CPU, a ROM, and a RAM, and controls each component of the voice conversation device 12. Typically, the control unit 126 passes digital voice information obtained through the communication unit 121 to the CODEC 125, and passes digital voice information obtained through the CODEC 125 to the communication unit 121 as a terminating-side message.

The voice conversation device 12 has been described above as having the wireless headset 122. This assumes that the voice conversation device 12 is installed in a store and is intended to let the terminating-side user (typically a store clerk) move freely around the voice conversation device 12. However, the components for voice input and output in the voice conversation device 12 are not limited to the wireless headset 122 and may be a combination of a microphone and a speaker, as in the voice conversation device 11.

Next, the voice conversation control apparatus 13 will be described. FIG. 4 is a block diagram showing the detailed configuration of the voice conversation control apparatus 13 shown in FIG. 1. In FIG. 4, in order to control voice communication performed between the voice conversation device 11 and one or more voice conversation devices 12, the voice conversation control apparatus 13 comprises a transmission/reception unit 131, a message decoding unit 132, an attribute information storage unit 133, a priority setting unit 134, a conversation control unit 135, a message storage unit 136, and a central control unit 137.

The transmission/reception unit 131 is connected to the network 14 and receives, via the network 14, originating-side messages sent from the voice conversation device 11 and terminating-side messages sent from the voice conversation devices 12.

In accordance with the processing of the conversation control unit 135 described later, the transmission/reception unit 131 also sends to the network 14 the digital voice information that was sent from the voice conversation device 11 together with a connection establishment request.

The message decoding unit 132 obtains the originating-side message received by the transmission/reception unit 131 and extracts the digital voice information and additional information it contains. The extracted digital voice information and additional information are sent to the central control unit 137.

The attribute information storage unit 133 stores attribute information representing the attributes of each voice conversation device 12 connected to the network 14. The attribute information includes position information from which the location of the store or facility where the voice conversation device 12 is installed can be identified, and category information representing the category to which that store or facility belongs.

The priority setting unit 134 obtains from the central control unit 137 the additional information extracted from the originating-side message, and also obtains the attribute information that the central control unit 137 has retrieved from the attribute information storage unit 133. Based on the obtained additional information and attribute information, the priority setting unit 134 then determines the voice conversation device 12 (terminating side) to which the digital voice information included in the current originating-side message should be forwarded first. When there are a plurality of candidate voice conversation devices 12, the priority setting unit 134 assigns the order (that is, the priority) in which the digital voice information is forwarded to each candidate.
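One possible way to turn the additional information and attribute information into an ordering is sketched below. It assumes that the distance from the originating-side user to each candidate store has already been computed and that nearer stores should be contacted first; this criterion is an illustrative assumption, since the passage above does not state the rule by which priorities are assigned. The function name `assign_priorities` is likewise hypothetical.

```python
from typing import Dict


def assign_priorities(distance_km: Dict[str, float]) -> Dict[str, int]:
    """Map each candidate device id to a priority rank (1 = contacted first).

    Candidates are ranked by ascending distance (an assumed criterion);
    ties are broken by device id so the ordering is deterministic.
    """
    ordered = sorted(distance_km, key=lambda dev: (distance_km[dev], dev))
    return {dev: rank for rank, dev in enumerate(ordered, start=1)}
```

For example, with candidates 12a at 2.4 km and 12b at 0.8 km, this sketch would rank 12b first and 12a second. Other criteria mentioned in the source, such as personal information or voice recognition results, could replace the sort key.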

The conversation control unit 135 establishes connections for PTT-based voice conversation between the voice conversation device 11 and each voice conversation device 12, and controls the sending and receiving of the digital voice information sent from each. Specifically, after a connection establishment request is received from the voice conversation device 11, the conversation control unit 135 establishes voice conversation connections between the voice conversation device 11 and the voice conversation devices 12 in accordance with the priorities determined by the priority setting unit 134. More specifically, the conversation control unit 135 transmits, via the transmission/reception unit 131, the digital voice information sent from the voice conversation device 11 together with the connection establishment request to the voice conversation devices 12 in descending order of priority.
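The one-connection-at-a-time behavior described above can be sketched as a small controller. The class `ConversationController`, the `send` callback, and the in-memory queue are assumptions for illustration; actual connection signaling over the network is outside the scope of this sketch.

```python
from typing import Callable, List, Optional


class ConversationController:
    """Connects the originating device to one terminating device at a time."""

    def __init__(self, ordered_devices: List[str], voice: bytes) -> None:
        self._pending = list(ordered_devices)  # highest priority first
        self._voice = voice                    # stored originating-side voice
        self.active: Optional[str] = None      # currently connected device, if any

    def connect_next(self, send: Callable[[str, bytes], None]) -> Optional[str]:
        """Drop the current connection, then connect to the next pending device.

        The stored voice information is forwarded to the newly connected
        device through `send`. Returns the device id, or None when every
        selected device has already received the voice information.
        """
        self.active = None  # the previous connection is disconnected first
        if not self._pending:
            return None
        self.active = self._pending.pop(0)
        send(self.active, self._voice)
        return self.active
```

Because `active` holds at most one device id, the originating-side user is never connected to two terminating-side devices at once, which is the property the apparatus relies on to keep the other party identifiable.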

The voice storage unit 136 (the message storage unit 136 listed above) stores the digital voice information extracted from the originating-side message sent from the voice conversation device 11.

The central control unit 137 is composed of, for example, a CPU, a ROM, and a RAM, and controls each component of the voice conversation control apparatus 13.

Next, an example of the communication procedure between the voice conversation device 11 and each voice conversation device 12 in the voice conversation system 1 configured as described above will be described. FIG. 5 is a sequence chart showing a typical example of voice communication in the voice conversation system 1. The example of FIG. 5 shows a case in which the originating-side user, in order to find a restaurant that is near the user's current position and belongs to category A, uses the PTT function of the voice conversation device 11 to call the candidate restaurants directly.

First, in the voice dialogue device 11, the message generation unit 118 generates a caller-side message (sequence S11). The caller-side message contains digital voice information generated from the voice input by the user, together with additional information. Under the assumption above, the additional information contains the calling user's current position as position information and category A as category information.
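The content of the caller-side message generated in sequence S11 can be pictured as a small record. The following is a minimal sketch only: the class names, field names, and the use of latitude/longitude pairs are assumptions made for illustration, since the description does not specify a message format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AdditionalInfo:
    """Additional information sent with the voice data (hypothetical layout)."""
    position: Optional[Tuple[float, float]] = None  # assumed (latitude, longitude) of the calling user
    category: Optional[str] = None                  # e.g. "A" for restaurant category A


@dataclass(frozen=True)
class CallerMessage:
    """Caller-side message of sequence S11: voice information plus additional information."""
    voice: bytes          # digital voice information produced by the CODEC
    info: AdditionalInfo


def build_caller_message(voice: bytes,
                         position: Optional[Tuple[float, float]],
                         category: Optional[str]) -> CallerMessage:
    """Bundle the encoded voice and the additional information into one message."""
    return CallerMessage(voice=voice, info=AdditionalInfo(position=position, category=category))


# Example: the user presses PTT, speaks, and is searching for category A.
message = build_caller_message(b"\x01\x02", (34.70, 135.50), "A")
```

Keeping the voice data and the additional information in one immutable record mirrors the way the message decoding unit 132 later extracts both parts from a single received message.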

The communication unit 111 obtains the caller-side message generated in sequence S11 from the message generation unit 118 and sends it to the network 14 (sequence S12).

In the voice dialogue control device 13, the transmission/reception unit 131 receives, through the network 14, the caller-side message sent from the voice dialogue device 11 and passes it to the message decoding unit 132. The message decoding unit 132 decodes the received caller-side message, extracts the digital voice information and the additional information it contains, and passes them to the central control unit 137 (sequence S13).

As described above, the attribute information storage unit 133 stores, for each voice dialogue device 12, the position information and category information of the store or facility as attribute information. The central control unit 137 selects the attribute information that satisfies the following two conditions and stores the digital voice information in the voice storage unit 136 (sequence S14). The first condition is that the attribute information matches the category information contained in the additional information received from the message decoding unit 132. The second condition is that the attribute information is assigned to two or more called-side voice dialogue devices 12 located within a predetermined distance of the position given in the additional information. The attribute information selected in this way is sent to the priority setting unit 134 together with the additional information. In the description of this embodiment, it is assumed that the attribute information of the voice dialogue devices 12a and 12b shown in FIG. 1 is selected.
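The two selection conditions of sequence S14 can be sketched as a filter over the stored attribute information. This is an illustrative sketch, not the patented implementation: the `Attribute` type, the function names, the haversine distance metric, and the device identifiers other than 12a and 12b are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Attribute:
    """Attribute information held for one called-side device (hypothetical layout)."""
    device_id: str
    position: Tuple[float, float]  # assumed (latitude, longitude) in degrees
    category: str


def distance_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance by the haversine formula (an assumed metric)."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def select_candidates(attrs: List[Attribute],
                      caller_pos: Tuple[float, float],
                      category: str,
                      max_km: float) -> List[Attribute]:
    """Keep devices whose category matches and which lie within max_km of the caller."""
    return [a for a in attrs
            if a.category == category
            and distance_km(a.position, caller_pos) <= max_km]


stored = [
    Attribute("12a", (34.705, 135.50), "A"),
    Attribute("12b", (34.720, 135.50), "A"),
    Attribute("other-1", (34.701, 135.50), "B"),  # wrong category
    Attribute("other-2", (35.700, 135.50), "A"),  # too far away
]
selected = select_candidates(stored, (34.70, 135.50), "A", max_km=5.0)
print([a.device_id for a in selected])
```

With these sample values only 12a and 12b pass both conditions, matching the assumption made in the description.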

Next, the priority setting unit 134 assigns priorities to the voice dialogue devices 12 selected in sequence S14, based on the received additional information and each piece of attribute information (sequence S15). When, as assumed above, the additional information contains the calling user's current position, the priority setting unit 134 uses the store or facility position information in each piece of attribute information and assigns a higher priority to a voice dialogue device 12 the closer it is to the calling user's current position. In this embodiment it is further assumed that the voice dialogue device 12a is closer to the calling-side voice dialogue device 11 than the voice dialogue device 12b is. Under this assumption, the voice dialogue device 12a receives the higher priority and the voice dialogue device 12b the lower priority.
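The distance-based priority assignment of sequence S15 amounts to sorting the selected devices by their distance from the caller. The sketch below assumes planar coordinates in kilometres and a hypothetical function name; the description does not say how distance is computed.

```python
import math
from typing import Dict, List, Tuple


def assign_priorities(caller_pos: Tuple[float, float],
                      device_positions: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return device ids from highest to lowest priority (nearest device first)."""
    return sorted(device_positions,
                  key=lambda dev: math.dist(caller_pos, device_positions[dev]))


# 12a is closer to the caller than 12b, so it receives the higher priority.
order = assign_priorities((0.0, 0.0), {"12b": (3.0, 4.0), "12a": (1.0, 0.0)})
print(order)
```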

When sequence S15 ends, the central control unit 137 obtains from the priority setting unit 134 the information on the voice dialogue device 12 that was assigned the highest priority, and notifies the dialogue control unit 135 of it together with the digital voice information stored in the voice storage unit 136. The dialogue control unit 135 establishes the connection needed for voice communication between the notified voice dialogue device 12 and the calling-side voice dialogue device 11 (sequence S16) and immediately afterwards sends the digital voice information to the network 14, so that it is transmitted to that voice dialogue device 12 (sequence S17). A PTT-type voice dialogue then begins between the two users (sequence S18). Under the assumptions above, the digital voice information sent from the voice dialogue device 11 is delivered to the voice dialogue device 12a, which has the higher priority, and a voice dialogue begins between the users of the voice dialogue devices 11 and 12a. The user of the voice dialogue device 11 can thus ask the user of the voice dialogue device 12a various questions about the store or facility. At this stage, by contrast, the digital voice information is not transmitted to the voice dialogue device 12b, so no PTT-type voice dialogue begins between the users of the voice dialogue devices 11 and 12b.

As described above, in the voice dialogue control device 13 according to this embodiment, the dialogue control unit 135 at this point establishes a connection between the voice dialogue device 11 and the single voice dialogue device 12 selected first by the priority setting unit 134 (the voice dialogue device 12a under the assumptions above), and sends the voice information from the voice dialogue device 11 only to that single selected device 12 (12a). The other voice dialogue device 12 (the voice dialogue device 12b under the assumptions above) therefore does not receive the voice information from the voice dialogue device 11 at this point. As a result, the calling user conducts a PTT-based voice dialogue with a single called user and receives no responses from other called users, which makes it easy to identify the dialogue partner.

If, as a result of the dialogue with the first called user, the calling user decides to go to the place where the selected voice dialogue device 12 (12a) is installed, the calling user's business is finished and no further voice dialogue is needed. In this case, although not illustrated, the voice dialogue device 11 creates a caller-side message indicating that the business is finished and sends it to the voice dialogue control device 13 via the network 14. In response, the voice dialogue control device 13 disconnects the connection between the voice dialogue devices 11 and 12 (12a) and discards the various information generated in this series of processes, such as the voice information from the voice dialogue device 11.

When the calling user has made this decision and the voice dialogue device 11 is built into a navigation device or can communicate with one, the navigation device can guide the calling user from the voice dialogue device 11 to the place where the voice dialogue device 12 (12a) is installed.

On the other hand, if the calling user judges that further voice dialogue is needed, the voice dialogue device 11 creates a caller-side message requesting the establishment of another connection and sends it to the voice dialogue control device 13 via the network 14.

In response to this caller-side message, the dialogue control unit 135 in the voice dialogue control device 13 disconnects the connection that had been established between the voice dialogue devices 11 and 12 (12a) (sequence S19).

The central control unit 137 then obtains from the priority setting unit 134 the information on the voice dialogue device 12 with the next-highest priority and notifies the dialogue control unit 135 of it together with the digital voice information stored in the voice storage unit 136. The dialogue control unit 135 establishes the connection needed for voice communication between the notified voice dialogue device 12 and the calling-side voice dialogue device 11 (sequence S20) and immediately afterwards sends the digital voice information to the network 14, so that it is transmitted to that voice dialogue device 12 (sequence S21). A PTT-type voice dialogue then begins between the two users (sequence S22). The subsequent processing is the same as for the voice dialogue device 12a described above, so its description is omitted.
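The cycle of connecting, forwarding the stored voice, disconnecting, and moving to the next device (sequences S16 to S22) can be sketched as a small controller. This is a simplified illustration under stated assumptions: the class name, its methods, and the `send` callback are hypothetical, and real connection signalling over the network 14 is reduced to a function call.

```python
from typing import Callable, List, Optional


class DialogueController:
    """Connects the caller to one called-side device at a time, in priority order."""

    def __init__(self, ordered_callees: List[str], stored_voice: bytes,
                 send: Callable[[str, bytes], None]) -> None:
        self._pending = list(ordered_callees)  # highest priority first
        self._voice = stored_voice             # copy held by the voice storage unit
        self._send = send                      # stand-in for the transmission/reception unit
        self.current: Optional[str] = None     # device currently connected, if any

    def connect_next(self) -> Optional[str]:
        """Drop the current connection (S19), connect the next device (S16/S20),
        and forward the stored voice information to it (S17/S21)."""
        self.current = None
        if not self._pending:
            return None
        self.current = self._pending.pop(0)
        self._send(self.current, self._voice)
        return self.current

    def finish(self) -> None:
        """Caller's business is done: disconnect and discard the stored information."""
        self.current = None
        self._pending.clear()
        self._voice = b""


sent = []
controller = DialogueController(["12a", "12b"], b"voice", lambda dev, v: sent.append((dev, v)))
controller.connect_next()  # dialogue with 12a
controller.connect_next()  # caller asked for another party: dialogue with 12b
print(sent)
```

Because the stored voice is forwarded only when a device becomes the current party, a device that has not yet been connected never receives the caller's voice.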

Under the assumptions above, the digital voice information sent from the voice dialogue device 11 is delivered to the voice dialogue device 12b, which has the second priority, and a voice dialogue begins between the users of the voice dialogue devices 11 and 12b. The calling user can thus ask the new called user various questions about the store or facility. In this case as well, the digital voice information is not transmitted to any device other than the voice dialogue device 12b, so the calling user's voice reaches no one other than the new called user.

As described above, with the voice dialogue control device 13 according to this embodiment, even in half-duplex communication (specifically, communication between the voice dialogue devices 11 and 12 using the PTT function), the processing of the voice dialogue control device 13 on the network 14 ensures that no connection with another voice dialogue device 12 is established until the connection with the single called-side voice dialogue device 12 has been disconnected. At any given time, therefore, the voice dialogue device 11 performs voice communication with only one voice dialogue device 12. In other words, when the calling user speaks using the PTT function, several people do not respond almost simultaneously, so it is possible to provide a voice dialogue control device 13 with which the calling user can easily identify the other party.

In the embodiment above, as a preferred example, higher priority is assigned to voice dialogue devices 12 located closer to the calling-side voice dialogue device 11. The invention is not limited to this, however; for example, the voice dialogue control device 13 may assign priorities to the voice dialogue devices 12 based on personal information about the calling user contained in the caller-side message.

Furthermore, before establishing a connection, the voice dialogue control device 13 may select one of the voice dialogue devices 12 at random and establish a connection between the selected voice dialogue device 12 and the voice dialogue device 11.

In the embodiment above, the voice dialogue control device 13 disconnects the connection when a predetermined caller-side message arrives from the voice dialogue device 11. The invention is not limited to this, however; the voice dialogue control device 13 may disconnect the connection when it detects silence lasting a predetermined time between the voice dialogue devices 11 and 12 that are currently connected.
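The silence-based disconnection can be sketched with a timestamp of the last voice activity. The class name, the method names, and the practice of passing the current time explicitly are assumptions chosen for illustration; the description only states that a predetermined period of silence triggers disconnection.

```python
class SilenceMonitor:
    """Tracks the last voice activity on a connection and reports a silence timeout."""

    def __init__(self, timeout_s: float, now: float) -> None:
        self.timeout_s = timeout_s  # the predetermined silent period, in seconds
        self._last_voice = now      # time of the most recent voice packet

    def on_voice(self, now: float) -> None:
        """Call whenever voice information passes in either direction."""
        self._last_voice = now

    def should_disconnect(self, now: float) -> bool:
        """True once no voice has been observed for at least timeout_s seconds."""
        return now - self._last_voice >= self.timeout_s


monitor = SilenceMonitor(timeout_s=10.0, now=0.0)
monitor.on_voice(now=8.0)
print(monitor.should_disconnect(now=12.0), monitor.should_disconnect(now=18.0))
```

Passing the time as an argument keeps the sketch deterministic; a real device would presumably read a monotonic clock instead.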

In the embodiment above, voice information is described as being sent to the voice dialogue device 12. The invention is not limited to this, however; the additional information may also be sent to the voice dialogue device 12. The voice dialogue device 12 can then show the additional information on a display it may be equipped with, and can thereby obtain the calling user's position information and personal information.

The priorities set in the embodiment above may also be updated periodically. This makes it possible, as the voice dialogue device 11 moves, to assign a high priority to the voice dialogue device 12 closest to its current position. In addition, by lowering the priority of the voice dialogue device 12 of a called user who has started a call with another calling user, it is possible to avoid assigning a high priority to a party that cannot talk with the calling-side voice dialogue device 11.
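A periodic priority update that follows the caller's movement and demotes devices already engaged in another call can be sketched as a re-sort. The function name, the planar coordinates, and the rule that busy devices are placed after all free devices are assumptions for illustration; the description only says that the priority of a busy device is lowered.

```python
import math
from typing import Dict, List, Set, Tuple


def reprioritize(caller_pos: Tuple[float, float],
                 device_positions: Dict[str, Tuple[float, float]],
                 busy: Set[str]) -> List[str]:
    """Re-rank devices: free devices first (nearest first), then busy ones (nearest first)."""
    return sorted(device_positions,
                  key=lambda dev: (dev in busy, math.dist(caller_pos, device_positions[dev])))


positions = {"12a": (1.0, 0.0), "12b": (2.0, 0.0)}
print(reprioritize((0.0, 0.0), positions, busy=set()))    # nearest first
print(reprioritize((0.0, 0.0), positions, busy={"12a"}))  # 12a is in another call
print(reprioritize((3.0, 0.0), positions, busy=set()))    # caller has moved
```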

Furthermore, in some cases it is preferable that the voice dialogue control device 13 not establish a connection with a voice dialogue device 12 whose assigned priority is lower than a predetermined reference value. This prevents, for example, a connection to a voice dialogue device 12 located very far from the voice dialogue device 11, or to a voice dialogue device 12 installed in a store or facility that does not match the calling user's preferences.
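The reference-value cut-off can be sketched as a filter applied before any connection is attempted. The numeric priority scale (a larger number means a higher priority), the function name, and the sample values are assumptions; the description does not define how priorities are represented.

```python
from typing import Dict, List


def connectable_devices(priorities: Dict[str, float], reference: float) -> List[str]:
    """Drop devices whose priority is below the reference value and
    return the rest from highest to lowest priority."""
    kept = [dev for dev, p in priorities.items() if p >= reference]
    return sorted(kept, key=lambda dev: priorities[dev], reverse=True)


# "other-1" is a hypothetical distant device that falls below the cut-off.
print(connectable_devices({"12b": 0.6, "other-1": 0.2, "12a": 0.9}, reference=0.5))
```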

The voice dialogue control device 13 may also include a voice recognition unit (not shown) that analyzes the voice information from the voice dialogue device 11 and recognizes what the calling user said. In this case, the priority setting unit 134 can also assign priorities according to the content of the utterance.

In the embodiment above, the voice dialogue device 11 is described as performing voice communication with a single voice dialogue device 12 at any given time. The invention is not limited to this, however; as long as the dialogue between the calling side and the called side can proceed smoothly, the voice dialogue control device 13 may establish connections between the voice dialogue device 11 and a predetermined small number of voice dialogue devices 12.
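Connecting to a predetermined small number of devices at a time can be sketched by splitting the priority-ordered list into groups. The function name and the idea of connecting group by group are assumptions for illustration; the description only states that a small number of simultaneous connections is permitted.

```python
from typing import List


def connection_groups(ordered_devices: List[str], group_size: int) -> List[List[str]]:
    """Split a priority-ordered device list into consecutive groups of at most group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [ordered_devices[i:i + group_size]
            for i in range(0, len(ordered_devices), group_size)]


# "other-1" is a hypothetical third device; with group_size=1 this reduces to
# the one-at-a-time behaviour of the main embodiment.
print(connection_groups(["12a", "12b", "other-1"], group_size=2))
```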

In the embodiment above, the voice dialogue device 11 is described as an in-vehicle terminal device connected to a mobile phone. It is not limited to this, however; it may be the mobile phone itself, or an in-vehicle terminal device that implements mobile phone functions. Similarly, although the voice dialogue device 12 is described as a POS terminal device, it may instead be a portable terminal device or a stationary terminal device.

Furthermore, although the voice dialogue device 11 is described as the calling side and the voice dialogue device 12 as the called side, the invention is not limited to this; each of the voice dialogue devices 11 and 12 may have both calling-side and called-side functions.

The voice dialogue control device according to the present invention is useful for terminal devices and the like in which the user of the calling-side voice dialogue device needs to be able to identify the dialogue partner easily.

FIG. 1 is a schematic diagram showing the overall configuration of the voice dialogue system 1 according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the voice dialogue device 11 shown in FIG. 1.
FIG. 3 is a block diagram showing the detailed configuration of the voice dialogue device 12 shown in FIG. 1.
FIG. 4 is a block diagram showing the detailed configuration of the voice dialogue control device 13 shown in FIG. 1.
FIG. 5 is a sequence chart showing a typical example of the communication procedure in the voice dialogue system 1 shown in FIG. 1.

Explanation of symbols

1 Voice dialogue system
11 Voice dialogue device (calling side)
111 Communication unit
112 Microphone
113 Speaker
114 Voice input control device
115 CODEC
116 Additional information acquisition unit
117 Control unit
118 Message generation unit
12 Voice dialogue device (called side)
121 Communication unit
122 Wireless headset
123 Voice input control unit
124 Short-range wireless communication unit
125 CODEC
126 Control unit
13 Voice dialogue control device
131 Transmission/reception unit
132 Message decoding unit
133 Attribute information storage unit
134 Priority setting unit
135 Dialogue control unit
136 Voice storage unit
137 Central control unit
14 Network

Claims

A voice dialogue control device that is connected via a network to a calling-side voice dialogue device and two or more called-side voice dialogue devices, and that controls half-duplex voice communication performed between the calling-side voice dialogue device and the two or more called-side voice dialogue devices,
the voice dialogue control device comprising:
a reception unit that receives a caller-side message sent to the network from the calling-side voice dialogue device, the caller-side message including at least voice information generated by the calling-side voice dialogue device and additional information;
an attribute information storage unit that stores attribute information representing the attributes of each of the called-side voice dialogue devices;
a central control unit that selects, based on the additional information included in the caller-side message received by the reception unit and the attribute information stored in the attribute information storage unit, two or more called-side voice dialogue devices to which the voice information included in the caller-side message should be transferred;
a dialogue control unit that establishes an initial connection between one or more of the called-side voice dialogue devices selected by the central control unit and the calling-side voice dialogue device; and
a transmission unit that transmits the voice information included in the caller-side message received by the reception unit to the called-side voice dialogue device with which the initial connection has been established by the dialogue control unit,
wherein, after the initial connection is disconnected, the dialogue control unit further establishes a new connection between the calling-side voice dialogue device and a called-side voice dialogue device that was selected by the central control unit and to which the voice information of the caller-side message received by the reception unit has not yet been transmitted, and
the transmission unit further transmits the voice information included in the caller-side message received by the reception unit to the called-side voice dialogue device with which the new connection has been established by the dialogue control unit.

The voice dialogue control device according to claim 1, further comprising a priority setting unit that assigns to the two or more called-side voice dialogue devices selected by the central control unit, as a priority, the order in which the dialogue control unit establishes connections,
wherein the dialogue control unit establishes the initial connection with the called-side voice dialogue device to which the priority setting unit has assigned the highest priority.

The voice dialogue control device according to claim 2, wherein the additional information includes at least position information capable of identifying the current position of the calling-side voice dialogue device and/or personal information about the calling-side user,
the position information includes at least one of the current position of the calling-side voice dialogue device, the position of a destination, a moving speed, a traveling direction, or a route to the destination,
the personal information includes at least one of a name, an age, an address, a telephone number, an e-mail address, a nickname, or preference information, and
the priority setting unit assigns the priority with reference to the position information and/or the personal information included in the caller-side message.

The voice dialogue control device according to claim 1, further comprising a voice recognition unit that analyzes the voice information included in the caller-side message received by the reception unit and recognizes the utterance content of the calling-side user,
wherein the priority setting unit assigns the priority with reference to the recognition result of the voice recognition unit.

The voice dialogue control device according to claim 2, wherein the dialogue control unit establishes connections with the calling-side voice dialogue device in order, starting from the called-side voice dialogue device to which the priority setting unit has assigned the highest priority.

6. The voice dialogue control device according to claim 5, wherein the dialogue control unit establishes a new connection when a silent period has continued for a predetermined time between the voice dialogue devices for which the previous connection was established.

A voice dialogue system comprising:
a calling-side voice dialogue device;
two or more called-side voice dialogue devices; and
a voice dialogue control device that controls half-duplex voice communication performed between the calling-side voice dialogue device and the called-side voice dialogue devices,
wherein the calling-side voice dialogue device comprises:
a generation unit that generates a caller-side message including at least voice information based on voice input by the user of the calling-side voice dialogue device and predetermined additional information; and
a calling-side transmission unit that transmits the caller-side message generated by the generation unit to the voice dialogue control device,
the voice dialogue control device comprises:
a control-side reception unit that receives the caller-side message transmitted from the calling-side transmission unit;
an attribute information storage unit that stores attribute information representing the attributes of each of the called-side voice dialogue devices;
a central control unit that selects, based on the additional information included in the caller-side message received by the control-side reception unit and the attribute information stored in the attribute information storage unit, two or more called-side voice dialogue devices to which the voice information included in the caller-side message should be transferred;
a dialogue control unit that establishes an initial connection between one or more of the called-side voice dialogue devices selected by the central control unit and the calling-side voice dialogue device; and
a control-side transmission unit that transmits the voice information included in the caller-side message received by the control-side reception unit to the called-side voice dialogue device with which the initial connection has been established by the dialogue control unit,
after the initial connection is disconnected, the dialogue control unit further establishes a new connection between the calling-side voice dialogue device and a called-side voice dialogue device that was selected by the central control unit and to which the voice information of the caller-side message received by the reception unit has not yet been transmitted,
the control-side transmission unit further transmits the voice information included in the caller-side message received by the reception unit to the called-side voice dialogue device with which the new connection has been established by the dialogue control unit, and
each called-side voice dialogue device comprises at least a called-side reception unit that receives the voice information transmitted from the control-side transmission unit when a connection with that called-side voice dialogue device has been established.