JP2017191531A - Communication system, server, and communication method

Info

Publication number: JP2017191531A
Application number: JP2016081671A
Authority: JP
Inventors: 義博中橋; Yoshihiro Nakahashi; 貴史鹿田; Takashi Shikata; 尋満山内; Hiromitsu Yamauchi
Original assignee: Robot Start Inc
Current assignee: Robot Start Inc
Priority date: 2016-04-15
Filing date: 2016-04-15
Publication date: 2017-10-19

Abstract

PROBLEM TO BE SOLVED: To increase the amount of information in a database used for conversations between a user and a robot, and to improve the quality of those conversations. SOLUTION: The robot includes: a microphone for collecting the user's speech; sending/receiving means for sending the collected speech to a server through a network and receiving a response; sound editing means for editing the response as speech of the robot and generating an edited sound signal; and a speaker for outputting the edited sound signal. The server includes: connection management means for managing the sending and receiving of signals to and from the robots; a conversation database for storing a collection of questions and answers used in conversations; a conversation engine for retrieving or generating a response to the user's speech with reference to the conversation database and sending the response to the robot; sending/receiving means for, when the conversation engine cannot retrieve or generate a response, sending the user's speech to at least one robot of another user and sending the response received from that robot to the user's robot; and response collecting means for registering the response sent from the other user's robot in the conversation database as a response to the user's speech. SELECTED DRAWING: Figure 1

Description

The present invention relates to a communication system, a server, and a communication method.

In recent years, communication systems that establish a dialogue between a person and a robot (for example, a humanoid robot) have been proposed.

One type is called task-oriented: a dialogue system for having a robot perform a specific task. For example, in response to a user's utterance (command) such as "Tell me today's weather", the robot reports today's weather forecast by voice. These command-and-answer sets are registered uniquely in a dictionary in advance.

The other type is called chat-oriented: rather than having the robot perform a specific task, it lets the user enjoy conversation with the robot (Non-Patent Document 1). It applies a chatbot dialogue system. A chatbot dialogue system is a system for carrying on everyday conversation with a user, and is broadly classified into dictionary (scenario) types, log types, Markov sentence-generation (text-generation) types, and so on. The basic idea is to store predetermined dialogue patterns in a database, search for a suitable response according to the input during the dialogue, and output it from the system. For example, when the user inputs "What do you like?" to the dialogue system through a keyboard, a microphone, or the like, the system searches for the response data that best matches the word sequence "what / do / you / like / ?". A large number of input examples and their corresponding response sentences are stored in the database in advance. The dialogue system retrieves the response sentence chosen by the search and outputs it to the user through a speaker or a monitor. By devising how the responses are stored in the database, part of the user's input can also be inserted into the response sentence.

[Valerie] Valerie Web Site: http://www.roboceptionist.com/

However, none of these methods, whether dictionary (scenario) type, log type, or Markov sentence-generation (text-generation) type, is perfect at present, and conversations between a human and a robot often break down. This is because the database referred to by the chat engine holds too little information, and increasing that amount of information has required a great deal of effort.

Meanwhile, robot users want communication with their robots that is complete to a reasonable degree.

The present invention has been made in view of the above problems. Its object is to provide a communication system, a server, and a communication method that can increase, without effort, the amount of information in the conversation database used for conversations between a user and a robot, and that can improve the quality of those conversations.

One aspect of the present invention is a communication system having a plurality of robots installed on the user side and a server. Each robot has: a microphone that collects a user's utterance; transmission/reception means that transmits the user's utterance collected by the microphone to the server through a network and receives a response to the user's utterance sent through the network; voice editing means that edits the response to the user's utterance as an utterance of the robot and generates an edited voice signal; and at least one speaker that outputs the edited voice signal. The server has: connection management means that manages transmission and reception of signals to and from the robots; a conversation database that stores a set of questions and answers used in conversation; a conversation engine that retrieves or generates a response to the user's utterance with reference to the conversation database and transmits it to the robot via the connection management means; transmission/reception means that, when the conversation engine cannot retrieve or generate a response to the user's utterance, transmits the user's utterance via the connection management means to at least one robot of another user and transmits the response to the user's utterance sent from the other user's robot to the user's robot; and response collection means that registers the response sent from the other user's robot in the conversation database as a response to the user's utterance.

One aspect of the present invention is a server having: connection management means that manages transmission and reception of signals to and from a plurality of robots installed on the user side; a conversation database that stores a set of questions and answers used in conversation; a conversation engine that retrieves or generates a response to a user's utterance with reference to the conversation database and transmits it to the robot via the connection management means; transmission/reception means that, when the conversation engine cannot retrieve or generate a response to the user's utterance, transmits the user's utterance via the connection management means to at least one robot of another user and transmits the response to the user's utterance sent from the other user's robot to the user's robot; and response collection means that registers the response sent from the other user's robot in the conversation database as a response to the user's utterance.

One aspect of the present invention is a server having: connection management means that manages transmission and reception of signals to and from a plurality of robots installed on the user side; transmission/reception means that transmits a user's utterance via the connection management means to at least one robot of another user and transmits the response to the user's utterance sent from the other user's robot to the user's robot; means for receiving a user image that captures the user at the time the response to the user's utterance is output; evaluation means that evaluates, on the basis of the user image, the user's reaction to the response; and response collection means that registers the evaluation, the user's utterance, and the response to the user's utterance in a conversation database in association with one another.

The present invention also provides a communication method in which: a first robot installed on the user side collects a user's utterance and transmits the collected utterance to a server through a network; the server receives the user's utterance, retrieves or generates a response to it with reference to a conversation database, and transmits the response to the first robot; when the server cannot retrieve or generate a response to the received utterance with reference to the conversation database, the server transmits the user's utterance to at least one second robot of another user; the server receives the response to the user's utterance transmitted from the second robot and transmits it to the first robot; the server registers the response transmitted from the second robot in the conversation database as a response to the user's utterance; and the first robot receives the response to the user's utterance, edits it as an utterance of the robot, and outputs it.

One aspect of the present invention is a communication method in which: a first robot installed on the user side collects a user's utterance and transmits the collected utterance to a server through a network; the server receives the user's utterance and transmits it to at least one second robot of another user; the server receives the response to the user's utterance transmitted from the second robot and transmits it to the first robot; the first robot outputs the received response; the first robot captures an image of the user at the time the response is output and transmits the user image to the server; the server evaluates, on the basis of the user image, the user's reaction to the response; and the server registers the evaluation, the user's utterance, and the response to the user's utterance in a conversation database in association with one another.

The present invention can increase, without effort, the amount of information in the conversation database used for conversations between a user and a robot, and can improve the quality of those conversations.

FIG. 1 is a diagram schematically showing a communication robot system according to a first embodiment of the present invention.
FIG. 2 is a diagram for explaining a normal conversation between a user and a robot.
FIG. 3 is a diagram for explaining the operation when the conversation engine cannot retrieve or generate an utterance in response to an utterance.
FIG. 4 is a block diagram showing the configuration of the robot 1 in the first embodiment.
FIG. 5 is a block diagram of the server 3 in the first embodiment.
FIG. 6 is a diagram showing an example of the connection management database 32.
FIG. 7 is a diagram showing another example of the connection management database 32.
FIG. 8 is a diagram schematically showing a communication robot system according to a second embodiment of the present invention.
FIG. 9 is a block diagram of the robot 1 according to the second embodiment of the present invention.
FIG. 10 is a block diagram of the server 3 according to the second embodiment of the present invention.

<First Embodiment>
A first embodiment of the present invention will be described.

FIG. 1 is a diagram schematically showing a communication robot system according to the first embodiment of the present invention.

In FIG. 1, reference numeral 1 denotes a robot installed on the user A side, 2 denotes a robot installed on the user B side, and 3 denotes a server connected to the robot 1 and the robot 2.

An outline of the communication robot system according to the first embodiment of the present invention will be described.

The server 3 provides connection management for exchanging data with the robots 1 and 2, and a conversation engine function for carrying out conversation between the users and the robots 1 and 2.

First, a normal conversation between a user and a robot will be described. FIG. 2 is a diagram for explaining a normal conversation between a user and a robot.

When the user A makes an utterance X, the robot 1 collects the utterance X and transmits its voice signal to the server 3. On receiving the voice signal of the utterance X, the server 3 uses the conversation engine to retrieve or generate an utterance Y in response to the utterance X, and transmits the utterance Y to the robot 1. On receiving the utterance Y, the robot 1 outputs it as its own utterance. A conversation between the user A and the robot 1 is thereby established.

Next, the case where the conversation engine cannot retrieve or generate an utterance in response to an utterance will be described. FIG. 3 is a diagram for explaining the operation in that case.

When the user A makes an utterance M, the robot 1 collects the utterance M and transmits its voice signal to the server 3. On receiving the voice signal of the utterance M, the server 3 uses the conversation engine to try to retrieve or generate an utterance in response to the utterance M. If no such utterance can be retrieved or generated, the server 3 transmits the utterance M to a robot that can be connected to, or is connected to, the server 3 (the robot 2 in FIG. 3).

On receiving the utterance M, the robot 2 outputs it as its own utterance. The user B, hearing the utterance M from the robot 2, makes an utterance N in response. When the user B makes the utterance N, the robot 2 collects it and transmits its voice signal to the server 3.

On receiving the voice signal of the utterance N, the server 3 transmits the utterance N to the robot 1. On receiving the utterance N, the robot 1 outputs it as its own utterance. A conversation between the user A and the robot 1 is thereby established.

In addition, the server 3, having been unable to retrieve or generate an utterance in response to the utterance M, records the utterance N in the conversation log as the response to the utterance M.

With this configuration, utterances that the current conversation engine alone cannot cover can still be handled, and because conversation logs are collected automatically, a high-quality conversation engine can be built.
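The fallback flow described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class `Server`, the callback `ask_other_robot`, and the exact-match dictionary lookup are all assumptions introduced here, and speech recognition and voice editing are left out.

```python
from typing import Callable, Dict, Optional


class Server:
    """Toy stand-in for the server 3; every name here is hypothetical."""

    def __init__(self, ask_other_robot: Callable[[str], str]) -> None:
        # Maps an utterance to a logged response (stand-in for the conversation log).
        self.conversation_log: Dict[str, str] = {}
        # Callback that plays the utterance on another user's robot and returns
        # that user's reply (assumed to be already converted to text).
        self.ask_other_robot = ask_other_robot

    def find_response(self, utterance: str) -> Optional[str]:
        # Exact-match lookup only; a real conversation engine would be fuzzier.
        return self.conversation_log.get(utterance)

    def handle_utterance(self, utterance: str) -> str:
        response = self.find_response(utterance)
        if response is None:
            # No response available: relay the utterance to another user's robot,
            # then record the reply so the engine can answer next time.
            response = self.ask_other_robot(utterance)
            self.conversation_log[utterance] = response
        return response


calls = []


def user_b(utterance: str) -> str:
    """Simulates the user B answering through the robot 2."""
    calls.append(utterance)
    return "utterance N"


server = Server(ask_other_robot=user_b)
first = server.handle_utterance("utterance M")   # answered by the user B
second = server.handle_utterance("utterance M")  # answered from the log
```

In this sketch the second call never reaches the other user, which is the point of logging the relayed response.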

A specific embodiment is described below.

FIG. 4 is a block diagram showing the configuration of the robot in the first embodiment. Since the robot 1 and the robot 2 are identical, the configuration is described using the robot 1 as an example.

As shown in FIG. 4, the robot 1 has a microphone 11, a voice editing unit 12, a speaker 13, and a control unit 14.

The microphone 11 collects the voice of the user A.

The voice editing unit 12 edits the voice signal of the user B, sent from the server 3 through the network, as an utterance of the robot 1 and generates an edited voice signal. Here, editing the voice signal of the user B as an utterance of the robot 1 means editing (converting) the voice (timbre and tone) of the user B in that signal into the voice (timbre and tone) of the robot 1. For example, the voice of a male or female user is edited (converted) into a neutral voice characteristic of the robot, or into a robot voice customized by the user.

The speaker 13 comprises at least one speaker that outputs the edited voice signal edited (converted) by the voice editing unit 12.

Next, the server 3 will be described.

FIG. 5 is a block diagram of the server 3 in the first embodiment.

The server 3 includes an inter-robot connection management unit 31, a connection management database 32, a voice recognition unit 33, a conversation engine 34, a conversation log database 35, and a conversation log collection unit 36.

As shown in FIG. 6, the connection management database 32 stores robot identification information (ID), a connection status (connected or disconnected), and the robot identification information (ID) of a connection destination in association with one another. Here, the robot identification information (ID) identifies a robot that is connected to the server 3 and to which a response to the user's utterance should be returned using the conversation engine or another robot. The connection status indicates whether the robot is currently connected to the server 3. The connection-destination robot identification information (ID) identifies the robot that is made to speak the user's utterance when the conversation engine cannot generate a response to it.
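One way to picture the three associated fields is as a small record type. The sketch below is an assumption made for illustration: the patent specifies only the fields (robot ID, connection status, connection-destination robot ID), not any schema, and the names `ConnectionRecord`, `ConnectionManagementDB`, and the robot ID strings are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ConnectionStatus(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"


@dataclass
class ConnectionRecord:
    robot_id: str                                   # robot identification information (ID)
    status: ConnectionStatus = ConnectionStatus.DISCONNECTED
    destination_robot_id: Optional[str] = None      # robot that speaks the utterance


class ConnectionManagementDB:
    """In-memory stand-in for the connection management database 32."""

    def __init__(self) -> None:
        self._records: Dict[str, ConnectionRecord] = {}

    def register(self, robot_id: str) -> ConnectionRecord:
        # Returns the existing record, or creates a disconnected one.
        return self._records.setdefault(robot_id, ConnectionRecord(robot_id))

    def connect(self, robot_id: str) -> None:
        self.register(robot_id).status = ConnectionStatus.CONNECTED

    def set_destination(self, robot_id: str, destination_robot_id: str) -> None:
        self.register(robot_id).destination_robot_id = destination_robot_id

    def get(self, robot_id: str) -> ConnectionRecord:
        return self._records[robot_id]


db = ConnectionManagementDB()
db.register("robot-2")                    # known to the server, not in a session
db.connect("robot-1")                     # robot 1 requested a response
db.set_destination("robot-1", "robot-2")  # robot 2 will speak the utterance
```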

The inter-robot connection management unit 31 uses the connection management database 32 to manage connections between robots, in this example between the robot 1 and the robot 2.

Specifically, when a robot requests a response to a user's utterance, the inter-robot connection management unit 31 establishes a connection with that robot and sets its connection status in the connection management database 32 to connected.

When the conversation engine cannot generate a response to the user's utterance, the inter-robot connection management unit 31 searches the robots that are not currently connected (disconnected) for a robot to speak the user's utterance. If user attribute information is stored in the connection management database 32 as shown in FIG. 7, the robot may be searched for on the basis of that information. Here, the user attribute information is the age, gender, address, hobbies, and the like of the user who owns the robot.

For example, when the user attribute information in the connection management database 32 shows that the user requesting a response is a woman in her twenties, the unit searches for the robot identification information of a disconnected robot whose user is also a woman in her twenties, and establishes a connection with the corresponding robot.
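The attribute-based search in this example can be sketched as a simple filter. The matching rule used here (same ten-year age bracket and same gender, first match wins) is an assumption chosen to mirror the example; the patent does not fix a matching rule, and `RobotEntry` and `find_peer` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RobotEntry:
    robot_id: str
    connected: bool   # connection status from the connection management database
    age: int          # user attribute: age of the owner
    gender: str       # user attribute: gender of the owner


def age_bracket(age: int) -> int:
    """Returns the decade of an age, e.g. 27 -> 20."""
    return (age // 10) * 10


def find_peer(entries: List[RobotEntry], requester: RobotEntry) -> Optional[RobotEntry]:
    """Returns the first disconnected robot whose owner matches the requester."""
    for entry in entries:
        if entry.robot_id == requester.robot_id or entry.connected:
            continue  # skip the requester itself and robots already in a session
        if (age_bracket(entry.age) == age_bracket(requester.age)
                and entry.gender == requester.gender):
            return entry
    return None


entries = [
    RobotEntry("robot-1", connected=True, age=24, gender="female"),
    RobotEntry("robot-2", connected=False, age=41, gender="male"),
    RobotEntry("robot-3", connected=True, age=27, gender="female"),
    RobotEntry("robot-4", connected=False, age=29, gender="female"),
]
peer = find_peer(entries, requester=entries[0])
no_peer = find_peer(entries, requester=RobotEntry("robot-9", True, 65, "male"))
```

Here `robot-3` matches on attributes but is skipped because it is already connected, so `robot-4` is chosen.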

The voice recognition unit 33 converts the voice signal sent from the robot 1 into text using conventional speech recognition technology.

The conversation engine 34 is, for example, a log-type conversation engine. The conversation engine 34 analyzes the text data passed from the voice recognition unit 33, searches the conversation log database 35 according to its content, and selects an utterance (content) suitable as a response. It then sends the selected utterance (content) to the robot 1 via the inter-robot connection management unit 31. If, on the other hand, the search of the conversation log database 35 finds no utterance (content) suitable as a response, the conversation engine 34 sends that result to the conversation log collection unit 36.
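A log-type lookup of this kind can be sketched as a best-match search with a cutoff. The scoring used below (Jaccard overlap of lowercased word sets, threshold 0.5) is an assumption for illustration; the patent does not say how the engine analyzes text or decides that no suitable response exists, and Japanese input would need a morphological analyzer rather than the regular-expression tokenizer used here.

```python
import re
from typing import Dict, Optional, Set


def tokenize(text: str) -> Set[str]:
    """Splits text into a set of lowercased word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_response(utterance: str, log: Dict[str, str],
                    threshold: float = 0.5) -> Optional[str]:
    """Returns the logged response whose utterance best matches, or None."""
    query = tokenize(utterance)
    best_score, best_response = 0.0, None
    for logged_utterance, response in log.items():
        score = jaccard(query, tokenize(logged_utterance))
        if score > best_score:
            best_score, best_response = score, response
    # Below the threshold, report "no suitable response" so the caller can
    # hand the utterance over to another user's robot.
    return best_response if best_score >= threshold else None


log = {"I will make and eat an apple pie.": "Sounds good. I want some."}
hit = select_response("I will make an apple pie", log)
miss = select_response("Oh, I forgot to buy the pie dough.", log)
```

Returning `None` corresponds to the case in which the result is passed to the conversation log collection unit 36.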

On receiving the result from the conversation engine 34, the conversation log collection unit 36 sends the voice signal corresponding to the text data passed from the voice recognition unit 33 to the inter-robot connection management unit 31. The conversation log collection unit 36 also collects the utterance from the robot that responded in place of the conversation engine 34, converts the content of that utterance into text, and registers it in the conversation log database 35.
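Because the other user's reply arrives some time after the utterance is relayed, the collection step has to remember which utterance each robot was asked about. The sketch below shows one way to do that; the class `ConversationLogCollector`, its method names, and the choice to keep several responses per utterance are assumptions, and both utterances are taken to be text already.

```python
from typing import Dict, List, Optional


class ConversationLogCollector:
    """Toy stand-in for the conversation log collection unit 36."""

    def __init__(self, log_db: Dict[str, List[str]]) -> None:
        self.log_db = log_db                 # stand-in for the conversation log database 35
        self._pending: Dict[str, str] = {}   # peer robot ID -> relayed utterance

    def relay(self, peer_robot_id: str, utterance: str) -> None:
        # Remember which utterance was sent to which robot.
        self._pending[peer_robot_id] = utterance

    def collect(self, peer_robot_id: str, response: str) -> Optional[str]:
        """Registers the reply; returns the utterance it answers, or None."""
        utterance = self._pending.pop(peer_robot_id, None)
        if utterance is None:
            return None  # nothing was relayed to this robot, so do not log
        self.log_db.setdefault(utterance, []).append(response)
        return utterance


log_db: Dict[str, List[str]] = {}
collector = ConversationLogCollector(log_db)
collector.relay("robot-2", "Oh, I forgot to buy the pie dough.")
answered = collector.collect("robot-2", "Too bad. Next time, then.")
ignored = collector.collect("robot-2", "unrelated remark")
```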

Next, a specific operation of the present embodiment will be described.

First, the user A speaks to the robot 1. The voice of the user A is collected by the microphone 11, and the voice signal is sent to the server 3. For example, when the user A says "I will make and eat an apple pie.", the voice signal of "I will make and eat an apple pie." is sent to the server 3.

On receiving the voice signal of "I will make and eat an apple pie.", the voice recognition unit 33 of the server 3 converts the voice signal sent from the robot 1 into text using speech recognition technology, and sends the resulting text data to the conversation engine 34.

The conversation engine 34 analyzes the text data "I will make and eat an apple pie." passed from the voice recognition unit 33, searches the conversation log database 35 according to its content, and selects an utterance (content) suitable as a response. Assuming that "Sounds good. I want some." is selected, the conversation engine 34 sends "Sounds good. I want some." to the robot 1 via the inter-robot connection management unit 31.

In the robot 1, the voice editing unit 12 edits the received voice signal of "Sounds good. I want some." as an utterance of the robot 1 and generates an edited voice signal. "Sounds good. I want some." is then output in the voice characteristic of the robot 1.

Next, the user A replies to the voice uttered by the robot 1 by speaking to the robot 1. The voice of the user A is collected by the microphone 11, and the voice signal is sent to the server 3. For example, when the user A replies "Oh, I forgot to buy the pie dough." to "Sounds good. I want some.", the voice signal of "Oh, I forgot to buy the pie dough." is sent to the server 3.

On receiving the voice signal of "Oh, I forgot to buy the pie dough.", the voice recognition unit 33 of the server 3 converts the voice signal sent from the robot 1 into text using speech recognition technology, and sends the resulting text data to the conversation engine 34.

The conversation engine 34 analyzes the text data "Oh, I forgot to buy the pie dough." passed from the voice recognition unit 33 and searches the conversation log database 35 according to its content for an utterance (content) suitable as a response. Assuming that the conversation engine 34 cannot find a suitable utterance (content), it sends that result to the conversation log collection unit 36.

On receiving the result from the conversation engine 34, the conversation log collection unit 36 sends the voice signal corresponding to the text data passed from the voice recognition unit 33 to the inter-robot connection management unit 31.

The inter-robot connection management unit 31 searches the robots that are not currently connected (disconnected) for a robot to speak the user's utterance, on the basis of the user attribute information in the connection management database 32. Assuming here that the robot 2 (the robot of the user B) is selected, the inter-robot connection management unit 31 of the server 3 sends the text data "Oh, I forgot to buy the pie dough." to the robot 2.

In the robot 2, the voice editing unit 12 edits the received signal as an utterance of the robot 2 and generates an edited voice signal. For example, the received text data "Oh, I forgot to buy the pie dough." is edited as an utterance of the robot 2 to generate an edited voice signal. The edited voice signal is then output from the speaker 13. For example, "Oh, I forgot to buy the pie dough." is output in the voice characteristic of the robot 2.

The user B replies to the voice uttered by the robot 2 by speaking to the robot 2. The voice of the user B is collected by the microphone 11, and the voice signal is sent to the server 3. For example, when the user B replies "Too bad. Next time, then." to "Oh, I forgot to buy the pie dough.", the voice signal of "Too bad. Next time, then." is sent to the server 3.

The server 3 receives "Too bad. Next time, then.", the reply from the user B, and sends it to the robot 1.

In the robot 1, the voice editing unit 12 edits the received voice signal as an utterance of the robot 1 and generates an edited voice signal. For example, the received voice signal of "Too bad. Next time, then." is edited as an utterance of the robot 1 to generate an edited voice signal. "Too bad. Next time, then." is then output from the speaker 13 in the voice characteristic of the robot 1.

また、サーバ３の会話ログ収集部３６では、ユーザＢが発した「あ。パイ生地買い忘れた。」に対する返答として「残念。今度ね。」を、会話ログデータベース３５に登録する。 In the conversation log collection unit 36 of the server 3, “Sorry. This time” is registered in the conversation log database 35 as a response to “A. I forgot to buy puff pastry” issued by the user B.

In this way, even when no response to the user's utterance is registered in the database, the response is obtained from another robot, so the conversation can continue without interruption. Furthermore, conversation logs of many natural exchanges can be collected, and the collected logs can be used as data for, for example, a log-type conversation engine, so that a high-quality conversation engine can be constructed.
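
The flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class `ConversationServer`, the callback `ask_other_robot`, and the in-memory dictionary and list standing in for the databases are all hypothetical, speech recognition and voice editing are omitted so that utterances are handled as plain text, and reusing a logged reply for later lookups is an assumption about how the collected log might be used.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ConversationServer:
    """Hypothetical, text-only model of the server 3."""

    def __init__(self, ask_other_robot: Callable[[str], Optional[str]]):
        # Conversation database: known utterance -> response.
        self.conversation_db: Dict[str, str] = {}
        # Stand-in for the conversation log database 35.
        self.conversation_log: List[Tuple[str, str]] = []
        # Stand-in for relaying an utterance to another user's robot
        # and waiting for that user's reply (assumption).
        self.ask_other_robot = ask_other_robot

    def respond(self, utterance: str) -> Optional[str]:
        # 1. The conversation engine first tries its own database.
        known = self.conversation_db.get(utterance)
        if known is not None:
            return known
        # 2. No response available: relay to another user's robot.
        reply = self.ask_other_robot(utterance)
        if reply is None:
            return None
        # 3. Register the human reply as a response to the utterance.
        self.conversation_log.append((utterance, reply))
        # Assumption: logged replies are reused for later lookups.
        self.conversation_db[utterance] = reply
        return reply


def user_b(utterance: str) -> Optional[str]:
    # Simulated reply of the user B, taken from the example in the text.
    return "Too bad. Next time, then."


server = ConversationServer(ask_other_robot=user_b)
first = server.respond("Ah, I forgot to buy pie dough.")
second = server.respond("Ah, I forgot to buy pie dough.")
```

In this sketch the first call is answered by the simulated user B and logged, while the second call is answered from the database without contacting another robot.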

Note that, when the conversation engine cannot retrieve or generate an utterance (content) suitable as a response and the response is obtained from another robot, the user of the other robot may make an utterance that is inappropriate for the conversation (hereinafter referred to as a prohibited term).

To prevent such a case, the server 3 may include, in addition to the above configuration, a filtering unit that filters prohibited terms and a prohibited term database in which a group of prohibited terms is stored.

The filtering unit refers to the prohibited term database and determines whether the utterance made in response to the user's utterance contains a prohibited term. If it does, the filtering unit deletes the corresponding portion of the voice or converts it into another term. The voice after deletion or conversion is then transferred to the destination robot.

In this way, even if inappropriate words are uttered during the conversation, the conversation partner is not made uncomfortable.
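
The filtering step might look like the following sketch. It assumes that filtering is applied to the recognized text rather than to the audio itself, and the function name `filter_response`, the set `prohibited_db`, and its single placeholder entry are hypothetical and not taken from the source.

```python
import re
from typing import Iterable


def filter_response(text: str, prohibited_terms: Iterable[str],
                    replacement: str = "") -> str:
    """Delete (default) or convert each prohibited term found in text."""
    result = text
    for term in prohibited_terms:
        if not term:
            continue  # an empty term would match everywhere
        # re.escape makes the term match literally, not as a pattern;
        # the lambda keeps the replacement from being parsed for escapes.
        result = re.sub(re.escape(term), lambda _m: replacement, result,
                        flags=re.IGNORECASE)
    # Collapse whitespace left behind by deletions.
    return re.sub(r"\s+", " ", result).strip()


# Stand-in for the prohibited term database (placeholder entry).
prohibited_db = {"stupid"}
deleted = filter_response("That is a stupid idea.", prohibited_db)
converted = filter_response("That is a stupid idea.", prohibited_db, "***")
clean = filter_response("Too bad. Next time, then.", prohibited_db)
```

Passing an empty replacement corresponds to deletion and a non-empty one to conversion; a response without prohibited terms passes through unchanged.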

<Second Embodiment>
A second embodiment of the present invention will be described.

In the second embodiment, when the conversation engine has no utterance (content) suitable as a response to a certain utterance, that utterance is transmitted to a plurality of robots, and those robots are made to speak it. The plurality of responses thus collected are then spoken by the robot that collected the original utterance, and logs better suited to the user are collected based on the user's reactions.

An outline of the second embodiment will be described.

FIG. 8 is a diagram schematically showing a communication robot system according to the second embodiment of the present invention.

When the user A makes an utterance M, the robot 1 collects the utterance M and transmits the voice signal of the utterance M to the server 3. The server 3 that has received the voice signal of the utterance M attempts, using the conversation engine, to retrieve or generate an utterance that is a response to the utterance M. However, if no such response can be retrieved or generated, the server 3 transmits the utterance M to a plurality of robots connected to the server 3 (robots 2 and 4 in FIG. 3).

The robots 2 and 4 that have received the utterance M each output the utterance M as their own utterance.

The user B, having heard the utterance M from the robot 2, makes an utterance N in response to the utterance M. The robot 2 collects the utterance N and transmits the voice signal of the utterance N to the server 3. Likewise, the user C, having heard the utterance M from the robot 4, makes an utterance O in response to the utterance M. The robot 4 collects the utterance O and transmits the voice signal of the utterance O to the server 3.

The server 3 that has received the voice signals of the utterance N and the utterance O first transmits the utterance N to the robot 1.

The robot 1 that has received the utterance N outputs the utterance N as its own utterance. At this time, the robot 1 captures an image of the user A, such as the facial expression, at the time the user A hears the utterance N, and transmits the image to the server 3.

Subsequently, the server 3 transmits the utterance O to the robot 1.

The robot 1 that has received the utterance O outputs the utterance O as its own utterance. At this time, the robot 1 captures an image of the user A, such as the facial expression, at the time the user A hears the utterance O, and transmits the image to the server 3.

The server 3 analyzes the image of the user A captured when the utterance N was heard and the image captured when the utterance O was heard, and evaluates which utterance (response) drew the better reaction from the user A. It then records the utterance N and the utterance O, together with the evaluation results, as the conversation log of responses to the utterance M.

By recording a plurality of responses to a certain utterance together with an evaluation of each response, the conversation engine can select the utterance that the user is more likely to prefer.
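
As an illustration of how recorded evaluations could guide selection, the sketch below stores several rated responses per utterance and returns the highest-rated one. The names `RatedLog`, `record`, and `best_response` and the numeric ratings are hypothetical; the source does not specify how evaluations are represented or how the conversation engine uses them.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# utterance -> list of (response, rating); a higher rating means a
# better observed reaction from the user (assumed numeric scale).
RatedLog = Dict[str, List[Tuple[str, float]]]


def record(log: RatedLog, utterance: str, response: str,
           rating: float) -> None:
    """Store one response to an utterance together with its evaluation."""
    log[utterance].append((response, rating))


def best_response(log: RatedLog, utterance: str) -> Optional[str]:
    """Return the highest-rated response, or None if none is recorded."""
    candidates = log.get(utterance)
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]


log: RatedLog = defaultdict(list)
record(log, "M", "N", 0.2)  # the user A reacted poorly to N
record(log, "M", "O", 0.9)  # the user A reacted well to O
chosen = best_response(log, "M")
missing = best_response(log, "unknown utterance")
```

Here the utterance O is chosen for M because it carries the higher rating, and an utterance with no recorded responses yields `None`.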

Hereinafter, a specific configuration will be described.

As shown in FIG. 9, the robot 1 includes, in addition to the configuration of the first embodiment, a camera 14 with which the robot 1 photographs the user. When the robot 1 speaks, it uses the camera 14 to photograph the user who hears the utterance, and transmits the captured user image to the server 3 together with information identifying the utterance.

As shown in FIG. 10, the server 3 includes an evaluation unit 37 in addition to the configuration of the first embodiment. The evaluation unit 37 determines the user's reaction at the time the robot 1 spoke, based on the user image transmitted to the server 3, attaches a rating to that utterance, and registers it in the conversation log database 35.

For example, if the user's face is judged to be expressionless based on the user image captured when the robot 1 output the utterance N as a response to the utterance M, the utterance N is given a low evaluation as a response to the utterance M. On the other hand, if the user's face is judged to be smiling based on the user image captured when the robot 1 output the utterance O as a response to the utterance M, the utterance O is given a high evaluation as a response to the utterance M. The utterance N and the utterance O are then registered in the conversation log database 35 together with these evaluations.
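
The rating step of the evaluation unit 37 could be modeled roughly as below. The sketch assumes that image analysis has already reduced the user image to an expression label; the labels, the integer scores in `EXPRESSION_SCORES`, and the names `LogEntry` and `evaluate_and_register` are hypothetical, since the source only distinguishes a low evaluation (expressionless) from a high one (smiling).

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical scores; the source only says "low" and "high".
EXPRESSION_SCORES: Dict[str, int] = {"expressionless": 0, "smile": 1}


@dataclass
class LogEntry:
    utterance: str  # the original utterance, e.g. M
    response: str   # the response played back to the user, e.g. N or O
    rating: int     # evaluation derived from the user's expression


def evaluate_and_register(log: List[LogEntry], utterance: str,
                          response: str, expression: str) -> LogEntry:
    """Rate a response from the observed expression and log it."""
    # Unknown expression labels raise KeyError rather than being guessed.
    entry = LogEntry(utterance, response, EXPRESSION_SCORES[expression])
    log.append(entry)
    return entry


# Stand-in for the conversation log database 35.
conversation_log: List[LogEntry] = []
evaluate_and_register(conversation_log, "M", "N", "expressionless")
evaluate_and_register(conversation_log, "M", "O", "smile")
```

Keeping the expression-to-score mapping in one table makes it easy to add further labels later without changing the registration logic.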

The second embodiment is configured so that, when the conversation engine has no utterance (content) suitable as a response to the user's utterance, responses collected from a plurality of robots are presented to the user as replies and the user's reaction to each is evaluated. With this configuration, responses that draw a good reaction from the user can be distinguished when registered in the database and referred to when the conversation engine generates a response, which improves the accuracy of the conversation engine.

In the above-described embodiments, each unit is configured by hardware, but each unit may also be configured by a program that causes an information processing apparatus (CPU) to perform the processing of the operations described above.

Although the present invention has been described with reference to preferred embodiments, the present invention is not necessarily limited to the above-described embodiments, and can be modified and implemented in various ways within the scope of its technical idea.

DESCRIPTION OF SYMBOLS
1 Robot
2 Robot
3 Server
4 Robot
11 Microphone
12 Voice editing unit
13 Speaker
14 Camera
31 Inter-robot connection management unit
32 Connection management database
33 Voice recognition unit
34 Conversation engine
35 Conversation log database
36 Conversation log collection unit
37 Evaluation unit

Claims

A communication system,
A plurality of robots installed on the user side and a server;
The robot is
A microphone that collects user utterances;
Transmitting / receiving means for transmitting a user's utterance collected by the microphone to the server through a network and receiving a response to the user's utterance transmitted through the network;
A voice editing means for editing a response to the user's utterance as an utterance of the robot and generating an edited voice signal;
At least one speaker for outputting the edited audio signal;
Have
The server
Connection management means for managing transmission and reception of signals of the robot;
A conversation database that stores a set of questions and answers used in the conversation;
A conversation engine that searches or generates a response to the user's utterance with reference to the conversation database and transmits the response to the robot via the connection management means;
Transmission/reception means for, when the conversation engine cannot retrieve or generate a response to the user's utterance, transmitting the user's utterance via the connection management means to at least one robot of another user other than the user, and transmitting a response to the user's utterance sent from the robot of the other user to the robot of the user;
A communication system comprising: response collection means for registering a response to the user's utterance sent from the robot of the other user as a response to the user's utterance in the conversation database.

The robot is
A camera that captures the user when outputting a response to the user's utterance sent from the robot of the other user;
Means for transmitting a user image taken by the camera to the server;
The server response collection means includes:
The communication system according to claim 1, wherein the user's reaction to the response to the user's utterance is evaluated based on the user image, and the evaluation and the response to the user's utterance are associated with each other and registered in the conversation database.

The communication system according to claim 1, wherein the server has:
A prohibited term database that stores terms prohibited in conversations; and
A filtering unit that refers to the prohibited term database, determines whether the response to the user's utterance sent from the robot of the other user includes a prohibited term, and, if a prohibited term is included, deletes the response to the user's utterance or converts a part of it.

The communication system according to claim 1, wherein the server has:
A user attribute information database storing attribute information of the users who own the robots; and
A matching control unit that refers to the user attribute information database, matches users based on the user attribute information, and establishes a connection between the robots corresponding to the matched users.

Connection management means for managing transmission and reception of signals of a plurality of robots installed on the user side;
A conversation database that stores a set of questions and answers used in the conversation;
A conversation engine that searches or generates a response to the user's utterance with reference to the conversation database and transmits the response to the robot via the connection management means;
Transmission/reception means for, when the conversation engine cannot retrieve or generate a response to the user's utterance, transmitting the user's utterance via the connection management means to at least one robot of another user other than the user, and transmitting a response to the user's utterance sent from the robot of the other user to the robot of the user;
A server having response collection means for registering a response to the user's utterance sent from the robot of the other user in the conversation database as a response to the user's utterance;

The server according to claim 5, having:
A prohibited term database that stores terms prohibited in conversations; and
Filtering means that refers to the prohibited term database, determines whether the response to the user's utterance sent from the robot of the other user includes a prohibited term, and, if a prohibited term is included, deletes the response to the user's utterance or converts a part of it.

The server
A user attribute information database storing attribute information of a user who owns the robot;
Matching control means that refers to the user attribute information database, matches users based on the user attribute information, and establishes a connection between the robots corresponding to the matched users.

A server,
Connection management means for managing transmission and reception of signals of a plurality of robots installed on the user side;
Via the connection management means, the user's utterance is transmitted to at least one robot of other users other than the user, and a response to the user's utterance sent from the robot of the other user is received. Transmitting / receiving means for transmitting to the user's robot;
Means for receiving a user image taken of the user at the time of outputting a response to the user's utterance;
Evaluation means for evaluating the user's reaction to the response to the user's utterance based on the user image;
A server having response collection means for associating the evaluation, the user's utterance, and a response to the user's utterance and registering it in a conversation database.

The first robot installed on the user side collects the user's utterance, transmits the collected user's utterance to the server through the network,
The server receives the user's utterance, searches or generates a response to the user's utterance with reference to a conversation database, and transmits the response to the first robot.
When the server receives the user's utterance and cannot retrieve or generate a response to the user's utterance with reference to the conversation database, the server transmits the user's utterance to at least one second robot of another user other than the user,
The server receives a response to the user's utterance transmitted from the second robot, transmits the response to the first robot,
The server registers a response to the user's utterance transmitted from the second robot in a conversation database as a response to the user's utterance,
The first robot receives a response to the user's utterance, and edits and outputs the response to the user's utterance as the utterance of the robot.

The communication method according to claim 9, wherein the first robot captures an image of the user when outputting the response to the user's utterance and transmits the captured user image to the server, and
The server evaluates the user's reaction to the response to the user's utterance based on the user image, associates the evaluation with the response to the user's utterance, and registers them in the conversation database.

The communication method according to claim 9 or 10, wherein the server refers to a prohibited term database, determines whether the response to the user's utterance transmitted from the second robot includes a prohibited term, and, if a prohibited term is included, deletes the response to the user's utterance or converts a part of it.

The first robot installed on the user side collects the user's utterance, transmits the collected user's utterance to the server through the network,
The server receives the user's utterance, transmits the user's utterance to at least one second robot of other users other than the user,
The server receives a response to the user's utterance transmitted from the second robot, transmits the response to the first robot,
The first robot outputs a response to the received user utterance,
The first robot captures a user at the time of outputting a response to the user's utterance, transmits a user image to the server,
The server evaluates the user's reaction to the response to the user's utterance based on the user image,
A communication method in which the server associates the evaluation, the user's utterance, and the response to the user's utterance with each other and registers them in a conversation database.