JP6909189B2

JP6909189B2 - Programs, servers and methods to switch agents according to user utterance text

Info

Publication number: JP6909189B2
Application number: JP2018158795A
Authority: JP
Inventors: 俊一田原; 啓一郎帆足
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2018-08-27
Filing date: 2018-08-27
Publication date: 2021-07-28
Anticipated expiration: 2038-08-27
Also published as: JP2020034626A

Description

The present invention relates to agent dialogue systems that converse with a user according to a scenario.

Dialogue systems that converse naturally with users are becoming widespread on smartphones and tablets. In particular, the function by which a computer-graphics character converses with a user by voice or text is called an "agent". From the user's point of view, an agent carries on a dialogue suited to the user's situation, tastes, and emotions without the user having to be especially conscious of it.

Conventionally, there is a chat dialogue system in which the user can select one agent from among several agents modeled on characters from comics and the like (see, for example, Patent Document 1). In this technique, the dialogue system acquires chat data for the selected agent and uses that data to generate replies during the dialogue. During the dialogue, the agent's likeness is shown on the smartphone display (chatbot style). However, the system cannot generate a reply to a user utterance that is not covered by the agent's chat data.

There is also a study of users' impressions of dialogue systems (see, for example, Non-Patent Document 1). In that study, users interacted with a dialogue system having a single agent with a distinct character and with one having three such agents. However, the study was only a Wizard of Oz experiment and was not implemented as a working system.

Japanese Unexamined Patent Publication No. 2014-98844

Ana Paula Chaves, Marco Aurelio Gerosa, "Single or Multiple Conversational Agents?: An Interactional Coherence Comparison", ACM CHI Conference on Human Factors in Computing Systems (April 2018)

According to Patent Document 1 described above, when a user converses with a single agent, as in a chatbot-style system, the dialogue breaks down whenever the agent has no candidate reply to the user's utterance.

Accordingly, an object of the present invention is to provide a program, a server, and a method that can switch agents according to the user utterance text.

According to the present invention, there is provided an agent management program that causes a computer to manage a plurality of agents, each of which receives a user utterance text and replies with an agent utterance text, the program causing the computer to function as:
an agent database that associates, for each agent, user-assumed texts, agent utterance texts, and an agent persona text;
similarity calculation means that calculates the similarity between the user utterance text addressed by the user to the currently selected agent and each of the user-assumed texts of all agents, weighted by the similarity between that user utterance text and the agent persona text of the respective agent; and
agent selection means that, when none of the user-assumed texts of the currently selected agent has a similarity equal to or greater than a predetermined threshold, switches to another agent that has the user-assumed text whose similarity is equal to or greater than the predetermined threshold and is the highest.

According to another embodiment of the program of the present invention, it is also preferable that, when the agent selection means switches to another agent, the computer replies with the agent utterance text corresponding to the user-assumed text with the highest similarity for that other agent.

According to another embodiment of the program of the present invention, it is also preferable that the computer further functions as user profile storage means that stores a user profile text for each user, and that the similarity calculation means weights the similarities of all user-assumed texts of each agent by the similarity between the user's user profile text and that agent's agent persona text and/or agent utterance texts.

According to another embodiment of the program of the present invention, it is also preferable that the user profile storage means acquires each user's user profile text from an SNS (Social Networking Service).

According to another embodiment of the program of the present invention, it is also preferable that the agent persona text indicates whether honorific expressions are used, and that, when the presence or absence of honorific expressions in the user utterance text does not match that of an agent among the plurality of agents, the similarity calculation means further sets the similarities of all user-assumed texts of that agent to zero.
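The honorific rule in this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the dictionary layout, and the similarity values are hypothetical, and detecting honorifics in the utterance is assumed to happen elsewhere and arrive as a boolean.

```python
def apply_honorific_mismatch(scores, user_uses_honorifics, agent_uses_honorifics):
    """Zero every similarity of agents whose honorific usage differs from the user's.

    scores: {agent_id: {user_assumed_text: similarity}}
    agent_uses_honorifics: {agent_id: bool}, assumed to come from each agent persona text
    """
    adjusted = {}
    for agent_id, texts in scores.items():
        if agent_uses_honorifics[agent_id] != user_uses_honorifics:
            # Mismatch: all user-assumed texts of this agent get similarity zero.
            adjusted[agent_id] = {text: 0.0 for text in texts}
        else:
            # Match: similarities are passed through unchanged (copied).
            adjusted[agent_id] = dict(texts)
    return adjusted


# Hypothetical similarity values for two agents.
scores = {2: {"Work is tiring.": 0.8}, 3: {"I go to the gym.": 0.75}}
adjusted = apply_honorific_mismatch(
    scores,
    user_uses_honorifics=False,
    agent_uses_honorifics={2: True, 3: False},
)
```

Because a zeroed agent can never reach the threshold, this acts as a hard filter applied before agent selection.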

According to another embodiment of the program of the present invention, it is also preferable that the similarity calculation means converts both the user utterance text and the user-assumed text into vectors based on their character elements and calculates the distance between the two vectors as a cosine similarity.

According to another embodiment of the program of the present invention, it is also preferable that the user utterance text is either text obtained by converting the user's spoken voice through speech recognition or text entered by the user.

According to another embodiment of the program of the present invention, it is also preferable that, when switching from the currently selected agent to another agent, the agent selection means first sends an agent utterance text in which the currently selected agent announces the switch to the other agent, and then sends the agent utterance text corresponding to the user-assumed text with the highest similarity for that other agent.

According to the present invention, there is also provided a user program that causes a computer mounted on a device communicating with the agent management program described above to display the character of the agent selected by the agent management program on a display, present the agent utterance text to the user, and send the user utterance text entered or spoken by the user to the agent management program.

According to the present invention, there is also provided a dialogue server that manages a plurality of agents, each of which replies with an agent utterance text in response to a user utterance text from a terminal operated by a user, the dialogue server comprising:
an agent database that associates, for each agent, user-assumed texts, agent utterance texts, and an agent persona text;
similarity calculation means that calculates the similarity between the user utterance text addressed by the user to the currently selected agent and each of the user-assumed texts of all agents, weighted by the similarity between that user utterance text and the agent persona text of the respective agent; and
agent selection means that, when none of the user-assumed texts of the currently selected agent has a similarity equal to or greater than a predetermined threshold, switches to another agent that has the user-assumed text whose similarity is equal to or greater than the predetermined threshold and is the highest.

According to the present invention, there is also provided an agent management method for a device that manages a plurality of agents, each of which replies with an agent utterance text in response to a user utterance text from a terminal operated by a user, wherein
the device has an agent database that associates, for each agent, user-assumed texts, agent utterance texts, and an agent persona text, and
the device executes:
a first step of calculating the similarity between the user utterance text addressed by the user to the currently selected agent and each of the user-assumed texts of all agents, weighted by the similarity between that user utterance text and the agent persona text of the respective agent; and
a second step of, when none of the user-assumed texts of the currently selected agent has a similarity equal to or greater than a predetermined threshold, switching to another agent that has the user-assumed text whose similarity is equal to or greater than the predetermined threshold and is the highest.

According to the program, server, and method of the present invention, agents can be switched according to the user utterance text. In particular, when the currently selected agent has no agent utterance text that is a candidate reply to the user utterance text, switching to another agent enriches the dialogue from the user's point of view.

FIG. 1 is a functional block diagram of the dialogue server of the present invention.
FIG. 2 is a table of the agents registered in the agent database.
FIG. 3 is an explanatory diagram showing the calculation of similarity between texts in the present invention.
FIG. 4 is an explanatory diagram showing the calculation of similarity between the user utterance text and the user-assumed texts.
FIG. 5 is an explanatory diagram showing the calculation of similarity between the user utterance text and the agent persona text.
FIG. 6 is an explanatory diagram showing the calculation of similarity based on honorific expressions between the user and the agents.
FIG. 7 is an explanatory diagram showing the calculation of similarity between the user profile text and the agent persona text.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a functional block diagram of the dialogue server of the present invention.

As shown in FIG. 1, the dialogue server 1, which runs the agents, is connected over a network to the terminal 2, which runs the user program.

The terminal 2 is, for example, a smartphone, and has a microphone and a speaker for the user and a display that shows the agent's character.
The terminal 2 runs a user program that serves as the interface for dialogue with the user. It shows the agent character received from the dialogue server 1 on the display and presents the agent utterance text to the user. The user utterance text spoken or entered by the user is sent to the dialogue server 1.
The user utterance text is either text obtained by converting the user's spoken voice through speech recognition or text entered by the user. For voice input, a microphone input button is shown on the display of the terminal 2; for text input, an input form is shown on the display of the terminal 2.

The dialogue server 1 manages a plurality of agents, each of which receives a user utterance text and replies with an agent utterance text. The agents of the dialogue server 1 advance a scenario interactively with the user.

As shown in FIG. 1, the dialogue server 1 has, as existing functions, a dialogue interaction unit 100 and an agent dialogue unit 101.
As functions of the present invention, the dialogue server 1 also has an agent database 110, a similarity calculation unit 111, an agent selection unit 112, and a user profile storage unit 113.
These functional units are realized by executing a program that causes a computer mounted on the device to function. The processing flow of these functional units can also be understood as an agent selection method.

[Dialogue Interaction Unit 100]
The dialogue interaction unit 100 serves as the interface between the user program on the terminal 2 and the agents of the agent dialogue unit 101. It sends the character and the agent utterance text from the agent dialogue unit 101 to the terminal 2, and it receives the user utterance text from the terminal 2 and passes it to the agent dialogue unit 101.

[Agent Dialogue Unit 101]
The agent dialogue unit 101 manages a plurality of agents and activates the agent designated by the agent selection unit 112.
The agent dialogue unit 101 passes the user utterance text received from the dialogue interaction unit 100 to the similarity calculation unit 111. It also outputs to the dialogue interaction unit 100 the agent utterance text of the agent designated by the agent selection unit 112.
The agent dialogue unit 101 switches agents in response to instructions from the agent selection unit 112.

[Agent Database 110]
The agent database 110 associates, for each agent, "user-assumed texts" with "agent utterance texts".

FIG. 2 is a table of the agents registered in the agent database.

According to the agent database 110 of FIG. 2, when the user utters a user-assumed text, the agent replies with the corresponding agent utterance text.
It is also preferable that the agent database 110 further associates an "agent persona text" with each agent. Because an agent is a personified entity, it has an agent persona text describing its character traits.

Four agents are registered in FIG. 2.
For example, agent 1 has "age: 22", "occupation: student", "hobby: motorcycles", and so on registered as its agent persona text.
When the user utters the user-assumed text "My class got canceled", agent 1 replies to the user with the agent utterance text "Lucky! This game is fun, by the way."
When the user utters the user-assumed text "I'm looking for a part-time job", agent 1 replies to the user with the agent utterance text "There's this job opening in Saitama."
Likewise, agent 2 ("occupation: office worker"), agent 3 ("gym trainer"), and agent 4 ("occupation: part-time worker") are also registered.
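One way to picture the table of FIG. 2 is as a small in-memory structure. The sketch below is illustrative only: the class name, field names, and the `reply` helper are hypothetical, the strings are English renderings of the entries named in the description, and exact-match lookup stands in for the similarity search described later.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Agent:
    """One agent: its persona text and its (user-assumed text -> agent utterance text) pairs."""
    name: str
    persona: List[str]
    responses: Dict[str, str] = field(default_factory=dict)


# Entries named in the FIG. 2 description; everything else is left empty.
AGENT_DB = {
    1: Agent(
        name="Agent 1",
        persona=["age: 22", "occupation: student", "hobby: motorcycles"],
        responses={
            "My class got canceled": "Lucky! This game is fun, by the way.",
            "I'm looking for a part-time job": "There's this job opening in Saitama.",
        },
    ),
    2: Agent(name="Agent 2", persona=["occupation: office worker"]),
    3: Agent(name="Agent 3", persona=["gym trainer"]),
    4: Agent(name="Agent 4", persona=["occupation: part-time worker"]),
}


def reply(agent_id: int, user_assumed_text: str) -> Optional[str]:
    """Return the agent utterance text paired with an exactly matching user-assumed text."""
    return AGENT_DB[agent_id].responses.get(user_assumed_text)
```

For instance, `reply(1, "My class got canceled")` returns agent 1's registered answer, while an unregistered text returns `None`, which is the situation that motivates switching agents.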

The agent database 110 may also arrange the pairs of user-assumed text and agent utterance text in a tree. The dialogue with the user can then proceed by traversing the tree according to the user's replies.

[User Profile Storage Unit 113]
The user profile storage unit 113 stores a user profile text for each user.
The user profile text may be acquired for each user from an SNS (Social Networking Service). For example, the user's SNS profile may be used as the user profile text.
Alternatively, information about the user's tastes and current circumstances may be extracted from the user's SNS posts and used as the user profile text. For example, it is also preferable to extract words from the user's posts by morphological analysis and to use the words that TF-IDF (Term Frequency - Inverse Document Frequency) identifies as characteristic as the user profile text.
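The TF-IDF step can be sketched as below. This is an assumption-laden illustration: posts are assumed to be already split into words (the morphological analysis is not shown), each user's collected posts are treated as one document, a smoothed IDF is used to avoid division by zero, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def characteristic_words(user_posts, all_users_posts, top_k=3):
    """Rank the words in one user's posts by TF-IDF and return the top_k words.

    user_posts: list of posts, each post a list of words
    all_users_posts: list of such post lists, one per user (one "document" each)
    """
    # Term frequency over all of this user's posts.
    tf = Counter(word for post in user_posts for word in post)
    total = sum(tf.values())
    n_docs = len(all_users_posts)

    scores = {}
    for word, count in tf.items():
        # Document frequency: number of users who used the word at least once.
        df = sum(1 for posts in all_users_posts if any(word in post for post in posts))
        # Smoothed IDF (an assumption of this sketch).
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = (count / total) * idf
    return sorted(scores, key=lambda w: scores[w], reverse=True)[:top_k]


# Hypothetical, pre-tokenized posts for three users.
POSTS = {
    "u1": [["gym", "today"], ["gym", "protein"]],
    "u2": [["work", "today"]],
    "u3": [["movie", "today"]],
}
profile_u1 = characteristic_words(POSTS["u1"], list(POSTS.values()), top_k=2)
```

Here "today" appears for every user and is ranked low, so the profile for `u1` is dominated by the words specific to that user.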

[Similarity Calculation Unit 111]
<Similarity between the user utterance text and the user-assumed texts>
The similarity calculation unit 111 calculates the similarity between the "user utterance text" addressed by the user to the currently selected agent and each "user-assumed text" of all agents, including the currently selected agent.
The similarity calculation unit 111 converts both the user utterance text and the user-assumed text into vectors based on their character elements and calculates the distance between the two vectors as a "cosine similarity".

Specifically, the user utterance text is split into words by morphological analysis and converted into a feature vector using Bag-of-Words. A "Bag-of-Words" vector represents only how often each word occurs in the text; word order is ignored. With words as axes and occurrence counts as values, the feature vector places the text at a single point in that space.
Likewise, each user-assumed text of every agent is split into words by morphological analysis and converted into a feature vector using Bag-of-Words.

The similarity calculation unit 111 then calculates the cosine similarity S(io, aj) between the average feature vector of the user utterance text io and the average feature vector of each user-assumed text aj of all agents.
i: user
io: identifier of a user utterance text (the o-th utterance of user i)
a: identifier of an agent
aj: identifier of a user-assumed text
For example, it is calculated by the following conceptual formula.
V_io: feature vector of the words in the user utterance text io
V_aj: feature vector of the words in the user-assumed text aj of agent a
S(io, aj) = cos θ = (V_io · V_aj) / (|V_io| |V_aj|)
The cosine similarity S(io, aj) takes a value from 0 to 1 and approaches 1 as the texts become more similar.
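The conceptual formula above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the texts are assumed to be already split into words (the word lists below are hypothetical stand-ins for the output of morphological analysis), and the function names are illustrative.

```python
import math
from collections import Counter


def bag_of_words(words):
    """Bag-of-Words vector: word -> occurrence count (word order is discarded)."""
    return Counter(words)


def cosine_similarity(v1, v2):
    """S = (V1 . V2) / (|V1| |V2|); returns 0.0 if either vector is empty."""
    dot = sum(count * v2[word] for word, count in v1.items())
    norm1 = math.sqrt(sum(c * c for c in v1.values()))
    norm2 = math.sqrt(sum(c * c for c in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical word lists for a user utterance and a user-assumed text.
utterance = bag_of_words(["gym", "go", "stamina", "build"])
assumed = bag_of_words(["gym", "go"])
s = cosine_similarity(utterance, assumed)
```

Two of the four utterance words are shared, so `s` equals 2 / (2 · √2) ≈ 0.707. Because the counts are non-negative, the result always lies between 0 and 1, matching the range stated above.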

If no user-assumed text is found whose cosine similarity S to the user utterance text exceeds the predetermined threshold, the cosine similarity S is calculated again using Word2vec. "Word2vec" is a technique that represents words as vectors in a reduced number of dimensions so as to capture their meaning and grammar.
Specifically, the user utterance text is split into words by morphological analysis and converted into a feature vector using Word2vec. Likewise, each user-assumed text of every agent is split into words by morphological analysis and converted into a feature vector using Word2vec.
The method is of course not limited to Bag-of-Words or Word2vec; any conversion into a feature vector that reflects the part of speech or meaning of each word may be used.
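The embedding-based fallback can be illustrated as follows. This is a sketch, not the described system: the tiny two-dimensional embedding table is invented for illustration (a real system would load trained Word2vec vectors), averaging the word vectors is one reading of the "average feature vector" mentioned above, and all names are hypothetical.

```python
import math

# Hypothetical 2-dimensional word embeddings, for illustration only.
TOY_EMBEDDINGS = {
    "gym": (0.9, 0.1),
    "workout": (0.8, 0.2),
    "office": (0.1, 0.9),
    "meeting": (0.2, 0.8),
}


def average_vector(words, embeddings):
    """Average the embeddings of the known words; None if no word is known."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def cosine(u, v):
    """Cosine of the angle between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_similarity(words_a, words_b, embeddings=TOY_EMBEDDINGS):
    """Cosine similarity of averaged word embeddings; 0.0 if either text is unknown."""
    u = average_vector(words_a, embeddings)
    v = average_vector(words_b, embeddings)
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


near = embedding_similarity(["gym"], ["workout"])
far = embedding_similarity(["gym"], ["office"])
```

With Bag-of-Words, "gym" and "workout" share no word and score 0; with embeddings, `near` is much larger than `far`, which is why the fallback can find a match the first pass missed.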

[Agent Selection Unit 112]
When none of the user-assumed texts of the currently selected agent has a similarity equal to or greater than the predetermined threshold, the agent selection unit 112 switches to another agent that has the user-assumed text whose similarity is equal to or greater than the predetermined threshold and is the highest.
When it switches to another agent, the agent selection unit 112 also instructs the agent dialogue unit 101 to reply with the agent utterance text corresponding to the user-assumed text with the highest similarity for that other agent.

FIG. 3 is an explanatory diagram showing the calculation of similarity between texts in the present invention.

Assume that agent 2 is initially selected for the user.
(S31) Suppose the user utters the user utterance text "I went to work today, and I'm so tired."
The cosine similarity between the user utterance text and every user-assumed text registered for agent 2 is then calculated.
User utterance text: "I went to work today, and I'm so tired."
(Agent 2)
User-assumed texts:
  "Work is tiring." ★ highest cosine similarity S
  "I had a presentation."
  "There were a lot of meetings."
  ...
  "I haven't been going to the gym."
Here, assume that the user-assumed text "Work is tiring." has the highest cosine similarity S to the user utterance text "I went to work today, and I'm so tired.", and that this cosine similarity S is equal to or greater than the predetermined threshold (for example, 0.7).

(S32) Because the highest cosine similarity S is equal to or greater than the predetermined threshold, agent 2 is not replaced. The agent dialogue unit 101 is instructed to have agent 2 reply with the agent utterance text "Just hang in there until the weekend.", which corresponds to the user-assumed text "Work is tiring."
In other words, as long as the currently selected agent has at least one user-assumed text whose cosine similarity S to the user utterance text is equal to or greater than the predetermined threshold, the agent is not replaced.

(S33) Suppose the user then utters the user utterance text "I need to go to the gym and build up my stamina."
The cosine similarity between the user utterance text and every user-assumed text registered for agent 2 is then calculated.
User utterance text: "I need to go to the gym and build up my stamina."
(Agent 2)
User-assumed texts:
  "Work is tiring."
  "I had a presentation."
  "There were a lot of meetings."
  ...
  "I haven't been going to the gym." ★ highest cosine similarity S
Here, assume that the user-assumed text "I haven't been going to the gym." has the highest cosine similarity S to the user utterance text "I need to go to the gym and build up my stamina." This time, however, assume that the cosine similarity S is below the predetermined threshold (for example, 0.7).

In this case, the cosine similarity S between the user utterance text and every user-assumed text of the agents other than agent 2 is calculated.

FIG. 4 is an explanatory diagram showing the calculation of similarity between the user utterance text and the user-assumed texts.

In FIG. 4, the similarity is calculated between the following texts.
User utterance text: "I need to go to the gym and build up my stamina."
(Agent 1)
User-assumed texts:
  "My class got canceled."
  "I'm looking for a part-time job."
  "My report is a lot of work."
  ...
  "The train was crowded and it was awful."
(Agent 3)
User-assumed texts:
  "Weight training is great, isn't it?"
  "I go to the gym." ★ highest cosine similarity S
  "I work a lot of overtime."
  ...
  "The train is crowded."
(Agent 4)
User-assumed texts:
  "It's the dry season."
  "I'm going shopping with friends."
  "I worked my part-time job until late at night."
  ...
  "The train is so crowded, it's awful."
Here, assume that agent 3's user-assumed text "I go to the gym." has the highest cosine similarity S to the user utterance text "I need to go to the gym and build up my stamina.", and that this cosine similarity S is equal to or greater than the predetermined threshold (for example, 0.7).

(S34) Because agent 2 has no user-assumed text whose cosine similarity S is equal to or greater than the predetermined threshold, agent 2 is replaced by agent 3. The agent dialogue unit 101 is instructed to have agent 3 reply with the agent utterance text "Do you go to the gym on your way home from work?", which corresponds to the user-assumed text "I go to the gym."
In other words, when the currently selected agent has no user-assumed text whose cosine similarity to the user utterance text is equal to or greater than the predetermined threshold, and another agent has at least one such user-assumed text, the system switches to that agent.
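The decision made in S32 and S34 can be sketched as a single function. This is a minimal sketch under stated assumptions: the similarities are taken as already computed, the numeric values are hypothetical, the function name and data layout are illustrative, and keeping the current agent with no reply when no agent clears the threshold is an assumption of this sketch rather than something stated in the passage above.

```python
def select_agent(current_agent, scores, threshold=0.7):
    """Return (agent_id, user_assumed_text) to answer with.

    scores: {agent_id: {user_assumed_text: cosine similarity S}}
    """
    # Best match within the currently selected agent.
    current_text, current_best = max(
        scores.get(current_agent, {}).items(),
        key=lambda item: item[1],
        default=(None, 0.0),
    )
    if current_text is not None and current_best >= threshold:
        return current_agent, current_text  # S32: no switch

    # Otherwise look for the highest match at or above the threshold among other agents.
    candidates = [
        (score, agent_id, text)
        for agent_id, texts in scores.items()
        if agent_id != current_agent
        for text, score in texts.items()
        if score >= threshold
    ]
    if not candidates:
        return current_agent, None  # assumption: stay put when nobody qualifies
    _, best_agent, best_text = max(candidates, key=lambda c: c[0])
    return best_agent, best_text  # S34: switch


# Hypothetical similarity values for the two turns of the walkthrough.
S31_SCORES = {2: {"Work is tiring.": 0.8, "I had a presentation.": 0.2}}
S33_SCORES = {
    2: {"I haven't been going to the gym.": 0.55, "Work is tiring.": 0.10},
    3: {"I go to the gym.": 0.75, "Weight training is great, isn't it?": 0.40},
    4: {"I worked my part-time job until late at night.": 0.05},
}
```

With these values, `select_agent(2, S31_SCORES)` keeps agent 2, while `select_agent(2, S33_SCORES)` switches to agent 3 and picks "I go to the gym."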

As another embodiment, FIG. 3 shows the agent utterance text of S331 being returned after S33.
From the user's point of view, the following sequence feels abrupt and unnatural: agent 2 says the agent utterance text of S32, "Just hang in there until the weekend."; the user replies with the user utterance text of S33, "I need to go to the gym and build up my stamina."; and then a different agent, agent 3, suddenly says the agent utterance text of S34, "Do you go to the gym on your way home from work?" The agent 2 that the user had been talking to vanishes without warning, and agent 3 speaks up out of nowhere.

For this reason, when switching from the currently selected agent to another agent, the agent selection unit 112 first transmits an agent utterance text in which the currently selected agent announces the switch to the other agent, and then transmits the agent utterance text corresponding to the user assumed text with the highest similarity in the other agent.

According to S331 in FIG. 3, agent 2, before the switch, utters the agent utterance text "Come to think of it, the gym trainer said the same thing", which makes the user aware of agent 3, the gym trainer, in advance.
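The ordering of the two transmissions at a switch can be sketched as below. This is a minimal sketch under assumptions: the function name `switch_with_handover` and the (speaker, text) tuple format are hypothetical, and the texts in the usage note are English renderings of the FIG. 3 example.

```python
from typing import List, Tuple


def switch_with_handover(current_agent: str,
                         next_agent: str,
                         handover_text: str,
                         reply_text: str) -> List[Tuple[str, str]]:
    """Return the utterances to transmit, in order, when switching agents.

    First the currently selected agent announces the switch (S331), then
    the other agent returns the agent utterance text tied to its most
    similar user assumed text (S34).
    """
    return [
        (current_agent, handover_text),  # spoken by the agent being replaced
        (next_agent, reply_text),        # spoken by the newly selected agent
    ]
```

Calling `switch_with_handover("agent2", "agent3", "Come to think of it, the gym trainer said the same thing", "Do you go to the gym on your way home from work?")` yields the agent 2 utterance followed by the agent 3 utterance.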

According to the above-described embodiment, the similarity calculation unit 111 calculates <similarity S between the user utterance text and the user assumed text>.
As another embodiment, it is also preferable to weight this similarity by the following similarities.
<Similarity W₁ between the user utterance text and the agent persona text>
<Relationship K between the honorific expression of the user utterance text and the honorific expression of the agent utterance text>
<Similarity W₂ between the user profile text and the agent persona text>
<Similarity W₃ between the user profile text and the user assumed text>

<Similarity W₁ between the user utterance text and the agent persona text>
FIG. 5 is an explanatory diagram showing the calculation of the similarity between the user utterance text and the agent persona text.
The similarity calculation unit 111 weights the similarity S of each user assumed text of an agent a by the similarity W₁ (cosine similarity) between the user utterance text io and the agent persona text Pa of that agent.
Pa: agent persona text of agent a
W₁(io, Pa)
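As an illustration of weighting S by W₁, the following sketch computes both as cosine similarities over bag-of-words vectors and multiplies them. The whitespace tokenization is an assumption made for brevity (Japanese text would need a morphological analyzer instead), and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace, which is only a stand-in for proper tokenization.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def weighted_similarity(utterance: str, assumed_text: str, persona_text: str) -> float:
    """S(io, aj) multiplied by W1(io, Pa)."""
    s = cosine_similarity(utterance, assumed_text)   # S(io, aj)
    w1 = cosine_similarity(utterance, persona_text)  # W1(io, Pa)
    return s * w1
```

With this weighting, a user assumed text scores highly only when the agent's persona text also resembles the user utterance text.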

<Relationship K between the honorific expression of the user utterance text and the honorific expression of the agent utterance text>
FIG. 6 is an explanatory diagram showing the calculation of the similarity based on honorific expressions between the user and the agent.
As another embodiment, it is also preferable that the agent persona text includes the presence or absence of honorific expressions.

To determine whether the user utterance text is an honorific expression, the similarity calculation unit 111 may judge from the predicate obtained by morphological analysis of the user utterance text, or may rely on what is described in the user profile.
Similarly, to determine whether the agent utterance text is an honorific expression, it may judge from the predicate obtained by morphological analysis of the agent utterance text, or may rely on what is described in the agent persona text.
The similarity calculation unit 111 then determines whether the honorific expression (present/absent) of the user utterance text matches the honorific expression (present/absent) of the agent utterance text of each agent. Here, if the user utterance text has honorific expressions and the agent utterance text does not, that agent is not selected. In that case, it is preferable to set the cosine similarity of all user assumed texts of that agent to zero. When the user speaks in honorifics and the agent replies in plain language, the dialogue cannot be called natural.
i: user
a: agent
K(i, a) = 0, or 0 < K(i, a) ≤ 1
If the honorific expressions of the user utterance text and the agent utterance text match, K is set to an arbitrary number greater than 0.00 and at most 1.00. If they do not match, K = 0.

The age in the user profile text may also be compared with the age in the agent profile text.
If the user is older than the agent, the agent utterance text is preferably in honorific form. In that case, the cosine similarity of all user assumed texts of any agent without honorific expressions is set to zero so that such an agent is not selected. This is because it is common to use honorifics when talking with an older person.
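The honorific factor K and the age rule can be sketched together as a single gating function. This is a sketch under assumptions: the name `honorific_factor`, the boolean flags, and the default value 1.0 used when the expressions match are illustrative choices; the description only requires that K lie in (0, 1] on a match and equal 0 otherwise.

```python
from typing import Optional


def honorific_factor(user_honorific: bool,
                     agent_honorific: bool,
                     user_age: Optional[int] = None,
                     agent_age: Optional[int] = None,
                     match_value: float = 1.0) -> float:
    """Return K(i, a) for user i and agent a."""
    if not 0.0 < match_value <= 1.0:
        raise ValueError("match_value must be in (0, 1]")
    # Mismatch in the presence/absence of honorifics: K = 0.
    if user_honorific != agent_honorific:
        return 0.0
    # Age rule: an agent younger than the user should speak in honorifics.
    if (user_age is not None and agent_age is not None
            and user_age > agent_age and not agent_honorific):
        return 0.0
    return match_value
```

Because K multiplies the similarity, K = 0 removes every user assumed text of that agent from consideration.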

<Similarity W₂ between the user profile text and the agent persona text>
FIG. 7 is an explanatory diagram showing the calculation of the similarity between the user profile text and the agent persona text.
The similarity calculation unit 111 weights the similarity S of all user assumed texts of an agent a by the similarity W₂ (cosine similarity) between the user profile text u of the user and the agent persona text Pa of that agent.
u: user profile text
Pa: agent persona text of agent a
W₂(u, Pa)
The greater the similarity between the user profile text and the agent persona text, the closer the agent is to the user's own hobbies and tastes, and the more affinity the user feels for it. In the case of FIG. 3 described above, agent 2, whose age and occupation are similar to the user's, is selected.

<Similarity W₃ between the user profile text and the user assumed text>
The similarity calculation unit 111 weights the similarity S of all user assumed texts of an agent a by the similarity W₃ (cosine similarity) between the user profile text u of the user and the user assumed text aj of that agent.
u: user profile text
aj: user assumed text j of agent a
W₃(u, aj)

Finally, the total similarity Sall, obtained by weighting the similarity S(io, aj) of the user assumed text by the other similarities W, is calculated as follows.
There are two cases: agent change and initial agent selection.
<In the case of agent change>
Sall(io, aj) = S(io, aj) × W₁(io, Pa) × K(i, a)
When the honorific factor K = 0, Sall(io, aj) = 0.
The user assumed text with the highest total similarity Sall is then detected, and the agent in which that user assumed text is registered is selected. The agent utterance text corresponding to that user assumed text is instructed to the agent dialogue unit 101.
Alternatively, Sall(io, aj) = S(io, aj) × W₁(io, Pa) × K(i, a) × W₂(u, Pa) × W₃(u, aj) may be used. In that case, however, the contribution of the weight W₂(u, Pa) × W₃(u, aj) is preferably kept low, because the agent change decision should not be dominated by the user profile.
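The agent change case, Sall = S × W₁ × K, can be sketched as a search for the highest-scoring user assumed text across agents. This is an illustrative sketch only: the function name `best_by_total` and the dictionary layout are assumptions, and the numbers in the usage note are made up.

```python
from typing import Dict, Optional, Tuple


def best_by_total(s: Dict[str, Dict[str, float]],
                  w1: Dict[str, float],
                  k: Dict[str, float]) -> Optional[Tuple[str, str, float]]:
    """Return (agent, user assumed text, Sall) with the highest Sall.

    s[agent][text] is S(io, aj); w1[agent] is W1(io, Pa); k[agent] is
    K(i, a). Sall = S * W1 * K, so K = 0 forces Sall to 0.
    """
    best: Optional[Tuple[str, str, float]] = None
    for agent, texts in s.items():
        for text, value in texts.items():
            total = value * w1[agent] * k[agent]
            if best is None or total > best[2]:
                best = (agent, text, total)
    return best
```

For instance, if agent 4 has the higher raw similarity (0.9 versus 0.8 for agent 3) but K = 0 for agent 4, the function returns agent 3's text, since agent 4's total similarity collapses to zero.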

<In the case of initial agent selection>
Sall(io, aj) = W₂(u, Pa) × W₃(u, aj)
Since no user utterance text exists at the start of the dialogue, it is preferable to select the agent by using the user profile as the basis for comparison.
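Initial agent selection, which relies only on the user profile, can be sketched as follows. The function name `initial_agent` and the data layout are hypothetical, and scoring each agent by its best user assumed text is an assumption about how the per-text values W₂ × W₃ are reduced to a single agent choice.

```python
from typing import Dict


def initial_agent(w2: Dict[str, float],
                  w3: Dict[str, Dict[str, float]]) -> str:
    """Pick the first agent from profile-based similarities only.

    w2[agent] is W2(u, Pa); w3[agent][text] is W3(u, aj). Each agent is
    scored by the highest W2 * W3 over its user assumed texts (assumption).
    """
    totals = {
        agent: w2[agent] * max(texts.values(), default=0.0)
        for agent, texts in w3.items()
    }
    return max(totals, key=totals.get)
```

With made-up values `w2 = {"agent2": 0.9, "agent3": 0.3}` and best W₃ values of 0.5 and 0.6 respectively, agent 2 is chosen because its persona is much closer to the user profile.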

As described in detail above, according to the program, server, and method of the present invention, the agent can be switched according to the user utterance text. In particular, when the currently selected agent has no agent utterance text that is a response candidate for the user utterance text, switching to another agent enriches the dialogue content as seen by the user.

With respect to the various embodiments of the present invention described above, various changes, modifications, and omissions within the scope of the technical idea and viewpoint of the present invention can easily be made by those skilled in the art. The above description is merely an example and is not intended to be restrictive in any way. The present invention is limited only as defined by the claims and their equivalents.

1 Dialogue server
100 Dialogue interaction unit
101 Agent dialogue unit
110 Agent database
111 Similarity calculation unit
112 Agent selection unit
113 User profile storage unit
2 Terminal

Claims

An agent management program that causes a computer to function so as to manage a plurality of agents, each of which receives a user utterance text and returns an agent utterance text, the program causing the computer to function as:
an agent database that associates, for each agent, a user assumed text, an agent utterance text, and an agent persona text;
similarity calculation means for calculating the similarity between the user utterance text from the user to the currently selected agent and the user assumed texts included in all agents, weighted by the similarity between the user utterance text and the agent persona text of the respective agent; and
agent selection means for switching, when none of the user assumed texts of the currently selected agent has a similarity equal to or higher than a predetermined threshold, to another agent that includes the user assumed text whose similarity is equal to or higher than the predetermined threshold and is the highest.

The agent management program according to claim 1, wherein, when the agent selection means switches to another agent, the computer is caused to function so as to return the agent utterance text corresponding to the user assumed text having the highest similarity for that other agent.

The agent management program according to claim 1 or 2, wherein the computer is further caused to function as user profile storage means for storing a user profile text for each user, and
the similarity calculation means weights the similarity of all user assumed texts of each agent by the similarity between the user profile text of the user and the agent persona text and/or the agent utterance text of that agent.

The agent management program according to claim 3, wherein the user profile storage means causes the computer to function so as to acquire the user profile text of each user through an SNS (Social Networking Service).

The agent management program according to any one of claims 1 to 4, wherein the agent persona text includes the presence or absence of honorific expressions, and
when the presence or absence of honorific expressions in the user utterance text does not match that of an agent among the plurality of agents, the similarity calculation means sets the similarity of all user assumed texts of that agent to zero.

The agent management program according to any one of claims 1 to 5, wherein the similarity calculation means converts both the user utterance text and the user assumed text into vectors based on character elements, and calculates the distance between the two vectors as the cosine similarity.

The agent management program according to any one of claims 1 to 6, wherein the user utterance text is a text obtained by converting speech uttered by the user through speech recognition processing, or a text input by the user.

The agent management program according to any one of claims 1 to 7, wherein, when switching from the currently selected agent to another agent, the agent selection means transmits an agent utterance text in which the currently selected agent announces the switch to the other agent, and then transmits the agent utterance text corresponding to the user assumed text with the highest similarity in the other agent.

A user program that causes a computer mounted on a device communicating with the agent management program according to any one of claims 1 to 8 to function so as to:
display on a display the character of the agent selected by the agent management program, present the agent utterance text to the user, and transmit the user utterance text input or spoken by the user to the agent management program.

A dialogue server that manages a plurality of agents, each of which returns, to a terminal operated by a user, an agent utterance text in response to a user utterance text, the dialogue server comprising:
an agent database that associates, for each agent, a user assumed text, an agent utterance text, and an agent persona text;
similarity calculation means for calculating the similarity between the user utterance text from the user to the currently selected agent and the user assumed texts included in all agents, weighted by the similarity between the user utterance text and the agent persona text of the respective agent; and
agent selection means for switching, when none of the user assumed texts of the currently selected agent has a similarity equal to or higher than a predetermined threshold, to another agent that includes the user assumed text whose similarity is equal to or higher than the predetermined threshold and is the highest.

An agent management method of a device that manages a plurality of agents, each of which returns, to a terminal operated by a user, an agent utterance text in response to a user utterance text, wherein
the device includes an agent database that associates, for each agent, a user assumed text, an agent utterance text, and an agent persona text, and
the device executes:
a first step of calculating the similarity between the user utterance text from the user to the currently selected agent and the user assumed texts included in all agents, weighted by the similarity between the user utterance text and the agent persona text of the respective agent; and
a second step of switching, when none of the user assumed texts of the currently selected agent has a similarity equal to or higher than a predetermined threshold, to another agent that includes the user assumed text whose similarity is equal to or higher than the predetermined threshold and is the highest.