JP2023074852A

JP2023074852A - Conversation database

Info

Publication number: JP2023074852A
Application number: JP2021188005A
Authority: JP
Inventors: 直人高柳; Naoto Takayanagi; 牧子平石; Makiko Hiraishi; 一彦楊井; Kazuhiko Yanagii
Original assignee: Kao Corp
Current assignee: Kao Corp
Priority date: 2021-11-18
Filing date: 2021-11-18
Publication date: 2023-05-30

Abstract

To allow a non-task-type chatbot terminal to hold a dialogue with a user without breakdown.

SOLUTION: A dialogue method between a user H and a non-task-type chatbot terminal 1 includes the following steps performed by the terminal 1: (1) acquiring an utterance of the user; (2) estimating the user's utterance intention and attaching an utterance intention label to the utterance; (3) where the type of intention of a response by the non-task-type chatbot is indicated by a response label, referring to a conversation database that stores, for each utterance intention label of the user, a plurality of response patterns each formed of a sequence of response labels, together with a selection probability for each response pattern, and selecting the response pattern corresponding to the utterance intention label attached in (2) on the basis of the selection probabilities; (4) selecting a response phrase corresponding to each response label forming the response pattern; and (5) creating a response sentence on the basis of the response phrases.

SELECTED DRAWING: Figure 3

Description

The present invention relates to a conversation database and a dialogue method.

Promoting communication through everyday conversation plays an important role in building and maintaining interpersonal relationships. For the elderly in particular, maintaining opportunities for social interaction with others through conversation is thought to contribute to various health benefits, such as improved QOL and preserved memory.

In recent years, however, the declining birthrate, the aging of the population, and the shift toward nuclear families have reduced the opportunities elderly people have to communicate with their families. In addition, as local community activities shrink, horizontal ties such as neighborly relations are also weakening. As a result, even when elderly people want to talk, they do not have as many opportunities to communicate as they would like. This is not limited to elderly people living alone. The same holds in nursing care facilities and housing for the elderly, where residents have few opportunities to chat with care staff who are occupied with busy daily duties.

To address the problem that the elderly have few opportunities to communicate, attempts have been made to use systems such as robots and dialogue agents equipped with artificial intelligence as substitute conversation partners for the elderly.

Systems equipped with artificial intelligence can be broadly classified into two types. One is a system, such as a smart speaker, that speaks in response to a task such as "Tell me today's weather" or "Tell me about XX". This type of system is called a "task-type chatbot". The other is a system that can converse as an activity for spending time together with the other party, without a clear task, as in everyday conversation or small talk. This type of system is called a "non-task-type chatbot".

Because a non-task-type chatbot has far more freedom in its choice of words than a task-type chatbot, dialogue breakdown, in which the exchange between the user and the non-task-type chatbot terminal fails to mesh, is likely to occur. For this reason, dialogue systems have been proposed in recent years that use deep learning technology and are trained on enormous amounts of text data so that dialogue breakdown becomes less likely. One example is BlenderBot, a Transformer-based model announced by Facebook in 2020 (Non-Patent Document 1). However, because BlenderBot is a dialogue model based on deep learning, it requires enormous amounts of text data, and its training data therefore depend on data collected through the Internet and similar sources rather than on actual conversations between people. Consequently, it is difficult to realize the kind of "human-like conversation" that people actually have by using a deep-learning-based dialogue model alone.

On the other hand, a dialogue system has been proposed that interacts with people in a human conversational style by using a model generated from an actual human dialogue corpus (conversation data between people) (Patent Document 1). In this technique, words matching a question (words in a collocation relationship) are predicted from a mapping, generated from the human dialogue corpus, between keywords in questions and keywords in answers, and an answer sentence is generated by adjusting the word order according to grammatical structure. With this method, however, the dialogue system cannot read the intention behind a person's utterance and may say something that does not correspond to that intention. Moreover, because the sentence is generated by adjusting word order from grammatical structure after keyword extraction, a sentence that does not hold together in context may be generated.

Patent Document 1: Japanese Patent No. 6726800

Non-Patent Document 1: Roller et al., Recipes for building an open-domain chatbot, arXiv:2004.13637 (2020)

In view of the prior art described above, an object of the present invention is to eliminate dialogue breakdown when a non-task-type chatbot engages in dialogue, such as small talk, with a user.

The present inventors found that, to eliminate dialogue breakdown between a non-task-type chatbot terminal and a user, it is effective not simply to output the chatbot's response phrase probabilistically in reaction to the user's utterance, but to consider probabilistically, before selecting the response phrase, the pattern of types of response intention of the chatbot terminal. The present invention was completed on the basis of this finding.

That is, the present invention provides a conversation database used for generating a conversation between a user and a non-task-type chatbot terminal, the conversation database comprising:
an utterance intention storage unit that stores utterance intention labels, each indicating a type of intention of a user's utterance, and example sentences corresponding to each utterance intention label; and
a response pattern storage unit that, where the type of intention of a response by the non-task-type chatbot terminal is indicated by a response label, stores, for each utterance intention label of the user, a plurality of response patterns each formed of one response label or a sequence of multiple response labels, together with the probability that each response pattern is selected in responding to the user's utterance.

The present invention also provides a dialogue method of a non-task-type chatbot terminal for interacting with a user, in which the non-task-type chatbot terminal performs:
(1) a step of acquiring an utterance of the user;
(2) a step of referring to the utterance intention storage unit of the conversation database according to claim 1, estimating the user's utterance intention, and attaching an utterance intention label to the user's utterance;
(3) a step of referring to the response pattern storage unit of the conversation database according to claim 1 and selecting a response pattern corresponding to the utterance intention label attached in (2) on the basis of the response pattern probabilities stored in the response pattern storage unit;
(4) a step of selecting a response phrase corresponding to each response label forming the response pattern selected in step (3); and
(5) a step of creating a response sentence on the basis of the response phrases selected in step (4).
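The flow of steps (2) to (5) can be sketched as follows. This is only an illustrative skeleton under stated assumptions: the function name `respond`, its parameter names, the label strings, and the toy phrases are hypothetical and do not come from the publication, and the three callables stand in for whatever estimator and storage units an implementation actually uses.

```python
from typing import Callable, Sequence


def respond(
    utterance: str,
    estimate_intent: Callable[[str], str],
    select_pattern: Callable[[str], Sequence[str]],
    select_phrase: Callable[[str, str], str],
) -> str:
    """Run steps (2) to (5) on an utterance already acquired in step (1)."""
    intent_label = estimate_intent(utterance)  # step (2): attach an intention label
    pattern = select_pattern(intent_label)     # step (3): pick a response pattern
    # Step (4): one response phrase per response label in the pattern.
    phrases = [select_phrase(label, utterance) for label in pattern]
    # Step (5): here the response sentence is simply the phrases joined in order.
    return " ".join(phrases)


# Toy stand-ins for illustration only (hypothetical labels and phrases).
toy_phrases = {"reaction": "Oh, really?", "question": "When did you go?"}
reply = respond(
    "I have been to the Kao Museum.",
    estimate_intent=lambda text: "information_notification",
    select_pattern=lambda intent: ("reaction", "question"),
    select_phrase=lambda label, text: toy_phrases[label],
)
print(reply)
```

Passing the three stages in as callables keeps the ordering of the steps visible while leaving open how each stage is implemented.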

According to the database of the present invention, when the non-task-type chatbot terminal responds, a response pattern formed of response labels representing types of response intention can be selected probabilistically in correspondence with the type of intention of the user's utterance, so that response phrases can be created with the intention of the response taken into account. Dialogue breakdown, in which the intention of the chatbot terminal's response fails to mesh with the user's utterance intention, therefore becomes less likely, and the chatbot terminal can continue a non-task-type dialogue with the user.

FIG. 1 is a schematic diagram of a dialogue between a user and a non-task-type chatbot terminal.
FIG. 2 is a block diagram of the conversation database of the embodiment.
FIG. 3 is a flowchart of the dialogue method between a user and a non-task-type chatbot terminal of the embodiment.

The present invention is described in detail below with reference to the drawings.
(Configuration of the conversation database)
The conversation database of the present invention is a database that the non-task-type chatbot terminal 1 can use to generate conversation when, for example, an elderly user H and the non-task-type chatbot terminal 1 hold a dialogue as shown in FIG. 1. The conversation database of the present invention is not limited to use by a non-task-type chatbot terminal in dialogue with a user. It can also be used in person-facing conversation by various automatic devices, for example in a task-type dialogue system when a user gives an atypical response to a questionnaire survey device.

FIG. 2 is a block diagram of a conversation database 10 according to one embodiment of the present invention. The database 10 comprises an utterance intention storage unit 11 and a response pattern storage unit 12, and further comprises a response phrase storage unit 13 as necessary.

The utterance intention storage unit 11 stores utterance intention labels, each indicating a type of intention of a user's utterance, and example sentences corresponding to each utterance intention label. An utterance intention label expresses the user's utterance intention concisely in a few words. Examples include the 10 types shown in Table 1. The labels may be encoded as symbols, for example as shown in the same table. The types of utterance intention label are not limited to the 10 listed in Table 1, but roughly classifying utterance intentions in the everyday conversation of elderly people into these 10 types makes it possible to respond in a way that takes the other party's intention into account.

Table 2 shows example sentences of user utterances corresponding to the utterance intention labels shown in Table 1. By storing the utterance intention labels in the database in association with their example sentences, the non-task-type chatbot terminal 1 can acquire the user's utterance and estimate its intention by referring to the utterance intention storage unit 11.

A response label, on the other hand, is a label for the type of intention of the response that the non-task-type chatbot terminal 1 gives to the user's utterance. Table 3 lists the types of response label stored in the response pattern storage unit 12 of this embodiment. Table 3 lists 12 types of response label, but the types of response label in the present invention are not limited to these.

The response pattern storage unit 12 stores the rate (hereinafter also called the selection probability) at which each response pattern, formed of one response label or a sequence of multiple response labels, is selected in the responses of the non-task-type chatbot terminal 1 for each intention of the user's utterance. That is, the response pattern storage unit 12 stores a plurality of response patterns for each utterance intention label of the user, together with the probability that each response pattern is selected for that utterance intention label. For example, Table 4 shows the response patterns of the non-task-type chatbot terminal 1 when the user's utterance intention label is "information notification", and the selection probability of each pattern.
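One possible in-memory representation of the response pattern storage unit, together with probability-weighted selection, is sketched below. The label strings and the probability values are hypothetical placeholders (the contents of Table 4 are not reproduced here), and the use of `random.choices` is an implementation choice for this sketch rather than something the publication specifies.

```python
import random

# Hypothetical contents: utterance intention label ->
# list of (response pattern, selection probability).
RESPONSE_PATTERNS = {
    "information_notification": [
        (("reaction",), 0.5),
        (("reaction", "question"), 0.3),
        (("understanding", "own_story", "question"), 0.2),
    ],
}


def select_response_pattern(intent_label, rng=random):
    """Draw one response pattern according to the stored selection probabilities."""
    entries = RESPONSE_PATTERNS[intent_label]
    patterns = [pattern for pattern, _ in entries]
    weights = [probability for _, probability in entries]
    # random.choices performs a weighted draw and returns a list of length k.
    return rng.choices(patterns, weights=weights, k=1)[0]


print(select_response_pattern("information_notification"))
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible, which is convenient when testing.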

Table 4 lists response patterns with up to three response labels, but the number of response labels forming a response pattern in the present invention is not limited to this. Four or more labels may be used. However, up to three is sufficient for the non-task-type chatbot terminal 1 to respond without breakdown. Specifically, the collection of actual conversation data showed that utterances with four or more response labels often contain multiple labels, such as "reaction" or "agreement", representing reactions that do not change the theme of the dialogue and therefore play no part in dialogue breakdown. Accordingly, for the non-task-type chatbot terminal 1, limiting the number of response labels forming a response pattern to three or fewer is preferable because it speeds up the creation of the response sentence. On the other hand, so that the dialogue between the user H and the non-task-type chatbot terminal 1 does not become task-type and everyday conversation continues, the number of response labels forming a response pattern is preferably two or more. In the dialogue method of the present invention, it poses no obstacle to continuing everyday conversation if the response pattern selected on the basis of the selection probabilities happens to consist of a single response label.

The response phrase storage unit 13 accumulates and stores response phrases so that, in the process by which the non-task-type chatbot terminal 1 creates a response sentence, once a response pattern has been selected according to the user's utterance intention label, a response phrase can be selected for each of the one or more response labels forming that pattern.

For example, Table 5 shows examples of response phrases accumulated in the response phrase storage unit 13 for the response label "reaction (Re)". In Table 5, [named entity] is a notation for quoting a named entity, such as "Kao Museum", when the user used one in the immediately preceding utterance.
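The [named entity] notation can be read as a template placeholder. A minimal sketch, assuming the named entity has already been extracted from the user's previous utterance by some upstream step (the extraction method is not covered here) and using hypothetical template strings:

```python
PLACEHOLDER = "[named entity]"


def fill_named_entity(template, named_entity):
    """Fill the placeholder in a stored response phrase.

    Returns the phrase unchanged if it has no placeholder, and None if the
    phrase needs a named entity but none was found in the user's utterance,
    so that the caller can fall back to a different phrase.
    """
    if PLACEHOLDER not in template:
        return template
    if named_entity is None:
        return None
    return template.replace(PLACEHOLDER, named_entity)


# Hypothetical stored phrases for the "reaction" label.
print(fill_named_entity("Oh, [named entity]!", "Kao Museum"))
print(fill_named_entity("Is that so?", None))
```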

As described later, response phrases corresponding to a response pattern can also be formed by, for example, a text generation method using Transformer-based deep learning technology, or a method that calculates TF-IDF values and cosine similarity to select a text. The response phrase storage unit 13 is therefore provided in the database of the present invention only as necessary. However, for response labels relating to reactions, such as "reaction", "acceptance/understanding", "agreement", "favorable reaction", and "follow-up", it is preferable to accumulate response phrases for each label in the response phrase storage unit 13 in advance, based on previously acquired human-to-human conversation data. A phrase selected at random from those accumulated in the response phrase storage unit 13 can then be used as the response phrase for the response label, which makes responding simple.
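The TF-IDF and cosine-similarity selection mentioned above can be sketched in plain Python as follows. This is a simplified illustration, not the publication's implementation: the smoothed IDF formula, the whitespace tokenizer, the function names, and the toy utterance/response pairs are all assumptions. Japanese text would need a morphological tokenizer in place of `split()`.

```python
import math
from collections import Counter


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (dict) per tokenized document."""
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    # Smoothed IDF (an assumption of this sketch).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_response(user_text, pairs):
    """pairs: list of (stored user utterance, stored response)."""
    stored_tokens = [utterance.lower().split() for utterance, _ in pairs]
    vectors, idf = tfidf_vectors(stored_tokens)
    query_tokens = [t for t in user_text.lower().split() if t in idf]
    if not query_tokens:
        return None  # nothing in common with any stored utterance
    counts = Counter(query_tokens)
    query_vec = {t: (c / len(query_tokens)) * idf[t] for t, c in counts.items()}
    scores = [cosine(query_vec, v) for v in vectors]
    best = max(range(len(pairs)), key=scores.__getitem__)
    return pairs[best][1]


# Hypothetical stored data.
PAIRS = [
    ("i went to the museum yesterday", "Museums are nice. I like the quiet."),
    ("i cooked curry for dinner", "Curry sounds delicious."),
]
print(most_similar_response("i visited the museum", PAIRS))
```

Random selection from a per-label phrase stock, by contrast, needs no similarity computation, which is why the text treats it as the simpler route for reaction-type labels.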

For response labels with a higher degree of freedom, such as "own story or thoughts", "information notification", and "answer", it is preferable to accumulate in the response phrase storage unit 13 a dataset, taken from human-to-human conversation data, of similar user utterances and the responses given to them.

(Method of creating the conversation database)
The conversation database 10 is preferably created by acquiring conversation data from actual conversations between people. This makes it possible to implement the structure of human conversation, that is, the sequence of a speaker's utterance intention and the listener's corresponding response intention that arises in actual conversation between people. More specifically, it makes it possible to implement the pairing of a user's utterance intention label with the response patterns, each formed of one or more response labels, that the non-task-type chatbot terminal uses for that label.

Actual human-to-human conversation data can be acquired by simulating a dialogue between a non-task-type chatbot terminal and a user: a "listener" playing the role of the terminal and a "speaker" playing the role of the user hold an everyday conversation for a fixed time, talking about whatever they wish without any set task, and the conversation data are collected. A larger number of speakers is preferable in order to increase the variety of conversation content. As one example, with ordinary elderly women in their 60s or 70s as speakers and an engineer involved in creating the database (a man in his 30s) as the listener, the listener held a 10-minute conversation with each of 22 speakers on everyday topics such as hobbies and recent activities. This yielded enough conversation data for a non-task-type chatbot terminal to continue a dialogue with an elderly user without breakdown. Creating the conversation database of the present invention does not require the enormous volume of text data needed to build a dialogue system with deep learning technology.

The age group of the speakers used when collecting conversation data may be chosen according to, for example, the age group of the users expected to interact with the non-task-type chatbot terminal 1. For example, by using children as the speakers when collecting conversation data, conversation data can be obtained that enable a non-task-type chatbot terminal to continue everyday conversation with children without breakdown.

The collected human-to-human conversation data are transcribed as text data for storage in the conversation database 10. Backchannel responses such as "un", "hai", and "ee" (roughly "uh-huh", "yes", and "yeah") need not be transcribed.

To accumulate the user's utterance intention labels and their example sentences in the utterance intention storage unit 11, an utterance intention label is determined for each of the speaker's utterances in the collected conversation data, the utterance is taken as an example sentence for that label, and the sets of utterance intention label and example sentence are accumulated in the utterance intention storage unit 11 as shown in Table 2. In doing so, multiple sentences may be combined into a single example sentence for one utterance intention label, for example when a single utterance intention is expressed over several sentences, as in "About a week, I suppose. It was ages ago.", or when the utterance as a whole carries the nuance of a question, as in "What did you do yesterday? I went fishing, though."

Meanwhile, to accumulate in the response pattern storage unit 12 the multiple response patterns of the non-task-type chatbot terminal for each utterance intention label of the user, together with the probability that each response pattern is selected as the response to the user's utterance, a response label is determined for each of the listener's responses in the collected conversation data, and the response is taken as an example sentence for that response label. When a response consists of multiple sentences, a response label is determined for each sentence, along with the order of the labels. The response patterns formed of the response labels or label sequences determined in this way, and the corresponding example sentences, are accumulated in the response pattern storage unit 12.

When determining response labels, a response may be, for example, a "reaction" such as "Is that so?", an "acceptance/understanding" such as "I see, so that's how it is.", or an "agreement" such as "That's right, isn't it." These are alike in being reactions, but it is preferable to define "reaction", "acceptance/understanding", and "agreement" as separate response labels. This makes it possible to create a variety of response phrases and, as a result, to respond to the user in a more fine-grained way.

For the listener's responses in the actual conversation data, the rate at which each response pattern, formed of a response label or a sequence of response labels, was selected for each of the speaker's utterance intention labels is calculated as the selection probability, and the selection probabilities are also stored in the response pattern storage unit 12. In this way, the response pattern storage unit 12 accumulates the selection probability of each response pattern for each utterance intention of the user, as shown for example in Table 4.
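The selection probability described here is a relative frequency, so it can be computed by counting. The sketch below assumes the annotated conversation data are available as pairs of (speaker's utterance intention label, listener's response label sequence). The function name, label strings, and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def selection_probabilities(annotated_turns):
    """Relative frequency of each response pattern per utterance intention label.

    annotated_turns: iterable of (intent_label, sequence of response labels).
    """
    counts = defaultdict(Counter)
    for intent_label, pattern in annotated_turns:
        counts[intent_label][tuple(pattern)] += 1
    return {
        intent: {pattern: n / sum(c.values()) for pattern, n in c.items()}
        for intent, c in counts.items()
    }


# Hypothetical annotated turns.
turns = [
    ("information_notification", ["reaction"]),
    ("information_notification", ["reaction"]),
    ("information_notification", ["reaction", "question"]),
    ("information_notification", ["understanding"]),
    ("question", ["answer"]),
]
table = selection_probabilities(turns)
print(table["information_notification"])
```

Within each utterance intention label the probabilities sum to 1, so the result can be used directly as weights for a weighted random draw.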

When the theme of the conversation goes deeper than everyday conversation, for example "discussion of a certain event", the 10 utterance intention labels shown in Table 1 are not sufficient, and labels such as "affirmation of the other party's utterance" and "denial of the other party's utterance" can be added to those in Table 1. Conversely, depending on the theme of the conversation, "self-deprecating remark" among the labels in Table 1 may not occur. The user's utterance intention labels can therefore be changed as appropriate according to the collected conversation data. Likewise, the types of response label can be changed as appropriate according to the collected conversation data.

In addition, when the speaker's utterance is an "answer" such as "My hobby is reading", the listener is highly likely to ask a "question" next, and listeners tend to dig further into the speaker's answer. On the other hand, when the speaker's utterance is an "answer with a negative nuance" such as "I don't really like soccer.", the listener tends first to choose a "reaction" such as "Is that so". Subdividing the intentions of the speaker's and the listener's utterances in this way makes it possible to implement a more human-like conversation structure. Conversely, when the number of labels for the speaker's utterances needs to be reduced, for example to speed up the system's computation and thereby the response speed of the chatbot terminal, the two closely related utterance intentions "answer" and "answer with a negative nuance" may be merged into a single utterance intention, "answer".

(Method of creating the response phrase storage unit)
Based on the human-to-human conversation data described above, response phrases are collected for each of the listener's response labels.

When the response label is "information notification", a dataset of the speaker's utterances and the listener's response phrases is accumulated, as shown for example in Table 6. Datasets are created in the same way for response labels such as "own thoughts" and "answer", and accumulated in the response phrase storage unit.

When the listener's response label is "question", the speaker's utterance is analyzed into morphemes, and by referring to that information, the response phrase asks about 5W1H information missing from what the speaker said. Alternatively, a phrase stock of questions about 5W1H information is created from human conversation data. For example, as shown in Table 7, a response phrase that serves as a "question" is created for each missing item of 5W1H information.
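The idea of asking about missing 5W1H information can be sketched as a lookup over unfilled slots. In this illustration the morphological analysis is assumed to have already produced the set of 5W1H slots present in the utterance. The question phrases, the slot priority order, and the function name are hypothetical and are not taken from Table 7.

```python
# Hypothetical question phrases, one per 5W1H slot.
QUESTION_PHRASES = {
    "when": "When was that?",
    "where": "Where was that?",
    "who": "Who were you with?",
    "what": "What did you do?",
    "why": "Why was that?",
    "how": "How was it?",
}

# Assumed priority order in which missing slots are asked about.
SLOT_ORDER = ("when", "where", "who", "what", "why", "how")


def question_for_missing_slot(filled_slots):
    """Return a question about the first 5W1H slot absent from the utterance.

    filled_slots: set of slot names detected in the speaker's utterance.
    Returns None when every slot is already filled.
    """
    for slot in SLOT_ORDER:
        if slot not in filled_slots:
            return QUESTION_PHRASES[slot]
    return None


# "I have been to the Kao Museum." supplies who/what/where but not when.
print(question_for_missing_slot({"who", "what", "where"}))
```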

When the response label is "topic offering", "greeting", "farewell greeting", "reaction", "acceptance/understanding", "agreement", "favorable reaction", or "follow-up", this can likewise be handled by preparing a phrase stock from human-to-human conversation data. For example, Table 8 shows an example of the phrase stock used when "topic offering" is selected as the response label.

(Dialogue method)
The dialogue method of the present invention is performed using the conversation database described above. FIG. 3 is a flowchart of the dialogue method of one embodiment of the present invention.

In this dialogue method, as step (1), the non-task-type chatbot terminal 1 acquires the user's utterance. For example, when the non-task-type chatbot terminal 1 has a voice input function and the user H faces the terminal and speaks to it as shown in FIG. 1, the terminal records the user's speech, converts the recording to text with a speech recognition application, and acquires it as text data. The speech may also be recorded using the voice input function of a smartphone or tablet and converted to text with a general-purpose speech recognition application. Alternatively, the user's utterance may be acquired as text information through an Internet-based messaging tool such as LINE (registered trademark), or by having the user type the utterance as text on a personal computer.

図３は、ユーザの発言として「花王ミュージアムには行ったことはありますね。」が取得されたことを示している。 FIG. 3 shows that "I have been to the Kao Museum." was acquired as the user's utterance.

次に、（２）ステップとして、本発明の会話データベースの発言意図記憶部を参照し、（１）で取得したユーザの発言内容からユーザの発言意図を推定し、ユーザの発言に発言意図ラベルを付する。具体的には、発言意図記憶部１１に記憶されたユーザ発言意図と対応文のデータセットを用いて、発言意図推定モデルの作成を行い、そのモデルからユーザの発言意図を取得する。発言意図推定モデルは、ＢＥＲＴ(Bidirectional Encoder Representations from Transformers)を始めとするＴｒａｎｓｆｏｒｍｅｒベースの機械学習手法などを利用して作成することができる。また、ＭｉｃｒｏｓｏｆｔのＬＵＩＳ(Language Understanding)などの、機械学習から情報を抽出するクラウドベースのＡＰＩサービスを利用して作成することも可能である。 Next, as step (2), the utterance intention storage unit of the conversation database of the present invention is referenced, the user's utterance intention is estimated from the content of the utterance acquired in (1), and an utterance intention label is attached to the user's utterance. Specifically, an utterance intention estimation model is created from the data set of user utterance intentions and corresponding sentences stored in the utterance intention storage unit 11, and the user's utterance intention is obtained from that model. The utterance intention estimation model can be created using a Transformer-based machine learning method such as BERT (Bidirectional Encoder Representations from Transformers). It can also be created using a cloud-based API service that uses machine learning to extract information, such as Microsoft's LUIS (Language Understanding).

本実施例では、ＭｉｃｒｏｓｏｆｔのＬＵＩＳを利用した。ＬＵＩＳにて、会話データベースに記憶させたユーザ発言意図と対応する例文のデータセットを全て入力し、事前に学習させておく。これにより、データセットにない入力文を入力することで、ユーザの発言意図を推定し、出力することが可能となる。例えば、予めデータセットを学習させたＬＵＩＳに、「花王ミュージアムには行ったことはありますね。」という文を入力すると「情報通知」という発言意図が推定される。この「情報通知」という発言意図ラベルを用いて、後段のステップで返答文を作成していく。 Microsoft's LUIS was used in this embodiment. The entire data set of user utterance intentions and corresponding example sentences stored in the conversation database is input to LUIS and learned in advance. As a result, the user's utterance intention can be estimated and output even for an input sentence that is not in the data set. For example, when the sentence "I have been to the Kao Museum." is input to LUIS trained on the data set in advance, the utterance intention "information notification" is estimated. This utterance intention label "information notification" is then used in the later steps to create a reply sentence.

次に（２’）ステップとして、固有表現を抽出する。固有表現の抽出は、（１）ステップで取得したユーザの発言を形態素解析して行う。ここで、形態素解析とは、予め辞書等に基づいて登録された単語の品詞などの情報に基づき、入力文を、言語的に意味を持つ最小単位に分割していく解析手法である。例えば、ユーザの入力文が「花王ミュージアムには行ったことはありますね。」であるとき、Ｍｅｃａｂというオープンソース形態素解析エンジンを利用することで、「花王ミュージアム／に／は／行っ／た／こと／は／あり／ます／ね／。」といった形態素に分割することができる。この情報より、固有名詞である「花王ミュージアム」という単語を抽出する。ここで得られた固有表現は、後述する「返答ラベルごとに、返答フレーズを取得するステップ」にて必要に応じて使用する。特許文献１に記載の対話方法のように、固有表現をキーワードとして扱うことは必要ではない。 Next, as step (2'), a named entity is extracted. The named entity is extracted by morphologically analyzing the user's utterance acquired in step (1). Here, morphological analysis is an analysis method that divides an input sentence into the smallest linguistically meaningful units, based on information such as the parts of speech of words registered in advance in a dictionary or the like. For example, when the user's input sentence is "花王ミュージアムには行ったことはありますね。" ("I have been to the Kao Museum."), the open-source morphological analysis engine MeCab can divide it into the morphemes "花王ミュージアム / に / は / 行っ / た / こと / は / あり / ます / ね / 。". From this information, the proper noun "花王ミュージアム" ("Kao Museum") is extracted. The named entity obtained here is used as necessary in the "step of obtaining a reply phrase for each reply label" described later. Unlike the dialogue method described in Patent Document 1, it is not necessary to treat named entities as keywords.

次に（３）ステップとして、本発明の会話データベースの返答パターン記憶部１２を参照して返答ラベルを取得し、（２）で付した発言意図ラベルに対応する返答パターンを、返答パターン記憶部１２に記憶されている返答パターンの確率に基づいて選択する。具体的には、会話データベース１０の返答パターン記憶部１２が記憶している、ユーザの発言意図ラベルに応じた返答パターンの選択確率を参照し、選択確率に応じて返答パターンを決定する。より具体的には、ユーザの入力文が「花王ミュージアムには行ったことはありますね。」だった場合、ユーザの発言意図ラベルは上述の（２）ステップで「情報通知」として出力されているので、返答パターン記憶部１２に記憶されている「情報通知」に対する返答パターンの選択確率から返答パターンが決定される。 Next, as step (3), the reply pattern storage unit 12 of the conversation database of the present invention is referenced to obtain reply labels: the reply pattern corresponding to the utterance intention label attached in (2) is selected based on the probabilities of the reply patterns stored in the reply pattern storage unit 12. Specifically, the selection probabilities of the reply patterns for the user's utterance intention label, stored in the reply pattern storage unit 12 of the conversation database 10, are referenced, and the reply pattern is determined according to those probabilities. More specifically, when the user's input sentence is "I have been to the Kao Museum.", the user's utterance intention label was output as "information notification" in step (2) above, so the reply pattern is determined from the selection probabilities of the reply patterns for "information notification" stored in the reply pattern storage unit 12.

選択確率に応じて選ぶとは、多数回繰り返したときにその選択確率になるように、毎回ランダムに選択することをいう。具体的には、表４の１行目は３．４％の確率であるが、この場合、一千回行うと３４回選択されるように、プログラムされている。 To select according to the selection probability means to make a random selection each time so that the selection probability is reached when repeated many times. Specifically, the first row of Table 4 has a probability of 3.4%, and in this case, the program is programmed so that 34 selections are made when 1,000 repetitions are performed.

より詳細に述べると、本実施形態では、０から９９９までの数字がランダムに、かつ等確率で発生する乱数発生プログラムを用いて、乱数を取得している。そして、取得される各々の数字は、表４の確率に応じて予め各行に対応させてある。すなわち、０から３３を１行目、３４から３０個を２行目、次の４０個を３行目のパターンに対応させておく。例えば、乱数発生プログラムにより「８０」という乱数を取得した場合は、３行目が選択されるので、表４により、返答パターンは、返答ラベル１が「反応（Ｒｅ）」、返答ラベル２が「話題提供（Ｗa）」となる。 More specifically, in this embodiment, random numbers are obtained using a random number generation program that generates the numbers 0 to 999 randomly and with equal probability. Each number that can be obtained is associated in advance with a row of Table 4 according to the probabilities in that table. That is, 0 to 33 correspond to the pattern in the first row, the 30 numbers starting from 34 (34 to 63) to the second row, and the next 40 numbers (64 to 103) to the third row. For example, if the random number generation program returns "80", the third row is selected, so according to Table 4 the reply pattern consists of reply label 1 "reaction (Re)" and reply label 2 "topic provision (Wa)".

このようにすることで、十分に多い回数繰り返せば、目標の選択確率に収束する、選択確率に応じた選択を実現できる。 By doing so, it is possible to realize selection according to the selection probability, which converges to the target selection probability if it is repeated a sufficiently large number of times.
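The mapping from a uniform random integer to a row of the probability table can be sketched as follows. This is a minimal illustration, not the embodiment's actual program: only the first three row widths (34, 30, 40 out of 1000) and the labels of the third row come from the text; the labels of rows 1 and 2, the final placeholder row, and all function names are assumptions made for the example.

```python
import random
from bisect import bisect_right

# Hypothetical excerpt of Table 4: (width out of 1000, reply pattern).
# Widths 34, 30, 40 correspond to 3.4%, 3.0%, 4.0% as described in the text.
PATTERNS = [
    (34, ["Re"]),               # row 1 (labels assumed)
    (30, ["Re", "Q"]),          # row 2 (labels assumed)
    (40, ["Re", "Wa"]),         # row 3: reaction + topic provision (from the text)
    (896, ["Re", "In", "Q"]),   # placeholder row absorbing the remaining mass
]


def cumulative_bounds(patterns):
    """Return the exclusive upper bound of each row's number range."""
    bounds, total = [], 0
    for width, _ in patterns:
        total += width
        bounds.append(total)
    if total != 1000:
        raise ValueError("row widths must sum to 1000")
    return bounds


def select_pattern(patterns, draw=None, rng=random):
    """Pick a reply pattern from a draw in 0..999 (random if not given)."""
    bounds = cumulative_bounds(patterns)
    if draw is None:
        draw = rng.randrange(1000)  # uniform over 0..999
    return patterns[bisect_right(bounds, draw)][1]


print(select_pattern(PATTERNS, draw=80))  # falls in 64..103, i.e. row 3
```

Passing `draw` explicitly makes the row boundaries easy to check; in normal use it is omitted so that a fresh random number is drawn on each call.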

なお、確率に応じた選択は、種々の方法が知られている。ここで述べた以外にも、例えばハードウエアを用いて乱数を発生させることなども行われている。また、いわゆる疑似乱数を用いた選択も、本発明で目指す自然な会話の実現においては有効である。 Various methods are known for the selection according to the probability. In addition to the methods described here, for example, hardware is also used to generate random numbers. Selection using so-called pseudo-random numbers is also effective in realizing natural conversation aimed at by the present invention.

次に（４）ステップとして、（３）ステップで選択した返答パターンを構成する各返答ラベルに対応する返答フレーズを選択する。 Next, as step (4), a reply phrase corresponding to each reply label constituting the reply pattern selected in step (3) is selected.

一例として、（３）ステップで、「反応（Ｒｅ）」、「情報通知（Ｉｎ）」、「質問（Ｑ）」という３つの返答ラベルで形成された返答パターンが選択されたとする（表４の網掛け部分）。この場合、まず返答ラベル「反応（Ｒｅ）」に対応する返答フレーズを作成する。 As an example, in step (3), it is assumed that a response pattern made up of three response labels of "reaction (Re)", "information notification (In)", and "question (Q)" is selected (see Table 4). shaded area). In this case, first, a reply phrase corresponding to the reply label "reaction (Re)" is created.

会話データベース１０が、返答ラベルと、該返答ラベルに対応した返答フレーズを記憶している返答フレーズ記憶部１３を有するときには、返答ラベル「反応（Ｒｅ）」に対応する返答フレーズを作成するため、返答フレーズ記憶部１３を参照し、返答フレーズ記憶部１３に記憶された「反応（Ｒｅ）」に対する返答フレーズ（表５）からランダムに１つ選択する。なお、返答ラベル「反応（Ｒｅ）」に対応する返答フレーズの作成は、後述する「情報通知」と同様に、ＴＦ－ＩＤＦ値を計算し、コサイン類似度を計算する方法や、Ｔｒａｎｓｆｏｒｍｅｒベースのディープラーニング技術を利用した方法により行ってもよい。 When the conversation database 10 has a reply phrase storage unit 13 that stores reply labels and the reply phrases corresponding to them, the reply phrase for the reply label "reaction (Re)" is created by referring to the reply phrase storage unit 13 and randomly selecting one of the reply phrases stored there for "reaction (Re)" (Table 5). The reply phrase for "reaction (Re)" may also be created, as with "information notification" described later, by calculating TF-IDF values and cosine similarity, or by a method using Transformer-based deep learning technology.

この際、[固有表現]を含む返答フレーズが選択された場合は、ユーザの入力文から取得した固有表現を代入する。もし、ユーザの入力文に固有表現が含まれなかった場合は固有表現を含む返答フレーズを選択しないようにする。例えば、ユーザの入力文が「花王ミュージアムには行ったことはありますね。」の場合、「反応（Ｒｅ）」の返答フレーズとして、「花王ミュージアムですか。」という返答フレーズを選択することができる。なお、返答ラベルが「反応」以外の、「納得・理解」、「同意」、「好意的反応」、「フォロー」などのリアクションに関する返答ラベルや、「話題提供」、「挨拶」、「別れの挨拶」といった返答ラベルについても、「反応」と同様に、返答フレーズ記憶部１３に記憶されている返答フレーズのストックからランダムに1つ選択することで返答フレーズの作成が可能である。 At this time, if a reply phrase containing [named entity] is selected, the named entity obtained from the user's input sentence is substituted into it. If the user's input sentence contains no named entity, reply phrases that contain [named entity] are not selected. For example, if the user's input sentence is "I have been to the Kao Museum.", the reply phrase "The Kao Museum, is it?" can be selected as the reply phrase for "reaction (Re)". For reaction-type reply labels other than "reaction", such as "consent/understanding", "agreement", "favorable reaction", and "follow", as well as for reply labels such as "topic provision", "greeting", and "farewell greeting", a reply phrase can be created in the same way as for "reaction", by randomly selecting one phrase from the stock of reply phrases stored in the reply phrase storage unit 13.
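The placeholder handling described above can be sketched as follows. This is only an illustration under assumptions: the placeholder string, the sample phrases, and the function name are hypothetical stand-ins for the contents of Table 5, which is not reproduced here.

```python
import random

PLACEHOLDER = "[固有表現]"  # named-entity slot; the exact marker is an assumption

# Hypothetical "reaction (Re)" phrase stock, with and without the slot.
REACTION_PHRASES = ["[固有表現]ですか。", "そうなんですね。", "へえ。"]


def choose_phrase(phrases, entity=None, rng=random):
    """Randomly pick a phrase; skip slot phrases when no entity is available."""
    if entity:
        candidates = list(phrases)
    else:
        candidates = [p for p in phrases if PLACEHOLDER not in p]
    if not candidates:
        raise ValueError("no usable phrase for this label")
    phrase = rng.choice(candidates)
    # Substitute the extracted named entity into the slot, if present.
    return phrase.replace(PLACEHOLDER, entity) if entity else phrase


print(choose_phrase(REACTION_PHRASES, entity="花王ミュージアム"))
print(choose_phrase(REACTION_PHRASES))  # never contains the placeholder
```

Filtering before the random choice, rather than re-drawing after an unusable pick, keeps the selection uniform over the phrases that can actually be used.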

返答ラベルとして「情報通知」が選択された場合には、会話データベース１０の返答フレーズ記憶部１３に記憶された返答ラベル「情報通知」の全ての文に対して形態素解析を実施し、各形態素についてＴＦ－ＩＤＦ値の計算を行う。ＴＦ（Term Frequency）とは、対象の文章の中でどれだけその単語が重要かを表す指標である。全文における全単語数の内、その単語の出現回数の割合として算出される。例えば「する」「ある」といった単語は出現回数としては多いためＴＦは高値を示す。一方、ＩＤＦ（Inverse Document Frequency）とは、その単語がどれだけ特徴的な言葉かを表す指標である。例えば「花王ミュージアム」といった全文中でもあまり出現しない言葉はＩＤＦが高値を示す。ＴＦ－ＩＤＦはこれらの積の値であり、各文の特徴量を示す指標であるともいえる。 When "information notification" is selected as the reply label, morphological analysis is performed on all sentences with the reply label "information notification" stored in the reply phrase storage unit 13 of the conversation database 10, and a TF-IDF value is calculated for each morpheme. TF (Term Frequency) is an index of how important a word is within the target text. It is calculated as the number of occurrences of the word divided by the total number of words in the text. For example, words such as "する" (to do) and "ある" (to be) occur frequently, so their TF is high. IDF (Inverse Document Frequency), on the other hand, is an index of how distinctive a word is. For example, a word such as "花王ミュージアム" (Kao Museum), which appears only rarely across all the texts, has a high IDF. TF-IDF is the product of the two and can be regarded as an index of the features of each sentence.

その後、ユーザの入力文のＴＦ－ＩＤＦ値とデータセット内のユーザの発言のＴＦ－ＩＤＦ値をもとにコサイン類似度を計算し、ユーザの発言に最も近く対応する応答フレーズを選択する。コサイン類似度が最も近いということは、ユーザの入力文に最も近いデータベース内の発言を参照し、それに対応する返答フレーズを選択できていることになる。 Then, the cosine similarity is calculated based on the TF-IDF value of the user's input sentence and the TF-IDF value of the user's utterance in the dataset, and the response phrase that most closely corresponds to the user's utterance is selected. If the cosine similarity is the closest, it means that the utterance in the database closest to the user's input sentence can be referred to and the corresponding reply phrase can be selected.

例えば、ユーザの入力文「花王ミュージアムには行ったことはありますね。」に対し、返答ラベル「情報通知」に応答した返答フレーズを参照したい場合、「花王ミュージアムには行ったことはありますね。」というＴＦ－ＩＤＦベクトルに最もコサイン類似度が高値を示すユーザの発言は表６の「花王ミュージアムは昔行ったことがあるな」となった。そのため、その発言に対応する「私は花王の工場見学には行きました。」が返答フレーズとして選択される。 For example, suppose a reply phrase for the reply label "information notification" is to be looked up for the user's input sentence "I have been to the Kao Museum.". The user utterance whose TF-IDF vector has the highest cosine similarity to that of the input sentence was "I went to the Kao Museum a long time ago." in Table 6. Therefore, "I went on a Kao factory tour.", the reply corresponding to that utterance, is selected as the reply phrase.
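The TF-IDF and cosine-similarity lookup can be sketched in a few lines. This is a minimal illustration under stated assumptions: sentences are taken as already split into morphemes (for example by MeCab), the smoothed IDF formula `log((1 + N) / (1 + df)) + 1` is one common choice and is not specified in the text, the second data-set entry is invented for contrast, and all function names are hypothetical.

```python
import math
from collections import Counter


def make_vectorizer(docs):
    """Build a TF-IDF vectorizer from tokenized data-set utterances."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed formula; rare tokens get larger weights).
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}

    def vectorize(tokens):
        counts = Counter(tokens)
        # TF = occurrences / total tokens; tokens unseen in the data set are dropped.
        return {t: (c / len(tokens)) * idf[t] for t, c in counts.items() if t in idf}

    return vectorize


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def best_reply(user_tokens, pairs):
    """Return the reply paired with the data-set utterance closest to the input."""
    vectorize = make_vectorizer([utt for utt, _ in pairs])
    query = vectorize(user_tokens)
    return max(pairs, key=lambda p: cosine(query, vectorize(p[0])))[1]


PAIRS = [
    (["花王ミュージアム", "は", "昔", "行っ", "た", "こと", "が", "ある", "な"],
     "私は花王の工場見学には行きました。"),
    (["昨日", "は", "雨", "でし", "た"], "こちらも雨でした。"),  # invented entry
]
USER = ["花王ミュージアム", "に", "は", "行っ", "た", "こと", "は", "あり", "ます", "ね", "。"]
print(best_reply(USER, PAIRS))
```

Because the lookup always returns the closest utterance, even a poor match yields a reply; a minimum-similarity threshold would be one way to detect inputs that have no similar utterance in the data set.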

返答ラベルが「自分の話や考え」や「回答」についても同様の手法でデータセットの発言に対応する返答フレーズを出力することで文脈上整合性のあるフレーズを準備することが可能である。 It is possible to prepare contextually consistent phrases by outputting reply phrases corresponding to utterances in the data set using the same method for reply labels such as "my talk and thoughts" and "answers."

また、本技術は、返答フレーズ記憶部１３に記録されたデータを参照して対応文を選択するのではなく、Ｔｒａｎｓｆｏｒｍｅｒベースのディープラーニング技術を利用して文を生成することで返答フレーズを作成することも可能である。データを参照する方法の場合、ユーザの入力文と類似する発言がデータセット内にない場合、見当違いの発言を参照してしまう可能性も考えられるが、Ｔｒａｎｓｆｏｒｍｅｒベースの機械学習を導入することで、ユーザの発言を入力情報とし、各返答ラベルに応じた返答フレーズを学習させることで文を新たに生成することができるため、より幅広い入力情報にも対応することが可能となる。 In addition, the present technology does not refer to the data recorded in the response phrase storage unit 13 to select the corresponding sentence, but creates the response phrase by generating the sentence using the Transformer-based deep learning technology. is also possible. In the case of the method of referring to data, if there is no utterance similar to the user's input sentence in the data set, it is possible to refer to an irrelevant utterance, but by introducing Transformer-based machine learning , the user's utterances are used as input information, and response phrases corresponding to each response label can be learned to generate new sentences, making it possible to handle a wider range of input information.

返答ラベルとして「質問（Ｑ）」が選択された場合の返答フレーズの作成方法としては、まず、ユーザの入力文を形態素に解析する。各形態素についてユーザが５Ｗ１Ｈ情報を述べているかを調べ、ユーザが述べていない５Ｗ１Ｈ情報の中から１つの情報を指定し、会話データベースの返答フレーズ記憶部に記憶された「質問」のフレーズストックからこの情報について問いかけるフレーズを参照する。例えばユーザの入力文が「花王ミュージアムには行ったことはありますね。」だった場合、「花王ミュージアム」というのは固有表現の情報から場所(Where)を示しており、花王ミュージアム直後の「に」という助詞も含めて読み取ることができる。一方、場所（Where）以外の情報については不足しているため、それ以外の４Ｗ１Ｈ情報については質問することが可能であるとわかる。これらの情報を返答フレーズ記憶部１３に記憶させておき、そこから質問フレーズを参照する。例えば４Ｗ１Ｈ情報の中から理由（Why）が選択された場合、あらかじめ取得した人間同士の会話データから準備した「きっかけは何かあったんですか？」というフレーズを選択する。 To create a reply phrase when "question (Q)" is selected as the reply label, the user's input sentence is first morphologically analyzed. Each morpheme is checked to see whether the user has stated 5W1H information, one piece of 5W1H information that the user has not stated is chosen, and a phrase asking about it is looked up in the "question" phrase stock stored in the reply phrase storage unit of the conversation database. For example, if the user's input sentence is "I have been to the Kao Museum.", the named-entity information shows that "Kao Museum" indicates a place (Where), which can be read together with the particle "に" (ni) immediately after it. Information other than the place (Where) is missing, so the remaining 4W1H information can be asked about. Question phrases for these items are stored in the reply phrase storage unit 13, and a question phrase is looked up from there. For example, when the reason (Why) is chosen from the 4W1H information, the phrase "Was there something that prompted it?", prepared in advance from conversation data between humans, is selected.
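The choice of a question about missing 5W1H information can be sketched as follows. The sketch assumes that an upstream analysis (morphemes plus named-entity information) has already produced the set of 5W1H slots the user covered; that analysis is not shown. Apart from the "why" phrase quoted in the text, the stock entries and all names are hypothetical.

```python
import random

FIVE_W_ONE_H = ("who", "what", "when", "where", "why", "how")

# Hypothetical "question" phrase stock keyed by 5W1H slot.
QUESTION_STOCK = {
    "who": ["どなたと行かれたんですか？"],
    "when": ["いつ頃のことですか？"],
    "why": ["きっかけは何かあったんですか？"],  # phrase quoted in the text
    "how": ["どうやって行かれたんですか？"],
}


def choose_question(covered, stock, rng=random):
    """Pick one missing slot that has stock phrases, then one phrase for it."""
    missing = [s for s in FIVE_W_ONE_H if s not in covered and stock.get(s)]
    if not missing:
        return None  # nothing left to ask about
    slot = rng.choice(missing)
    return slot, rng.choice(stock[slot])


# "Kao Museum" + particle "に" covers the place (Where) slot.
print(choose_question({"where"}, QUESTION_STOCK))
```

Returning `None` when every slot is covered lets the caller fall back to another reply pattern instead of asking about something the user has already said.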

次に（５）ステップとして、（４）ステップで選択した返答ラベルごとの返答フレーズに基づいて返答文を作成する。通常、（４）ステップで選択した返答ラベルごとの返答フレーズを、返答パターンを形成する返答ラベルの順に繋げることで返答文を作成することができる。例えば、「花王ミュージアムには行ったことはありますね。」というユーザの入力文に対し、返答パターンが、「反応」、「情報通知」、「質問」という３つの返答ラベルで形成され、「反応」に対応する返答フレーズが「花王ミュージアムですか。」であり、「情報通知」に対応する返答フレーズが「私は花王の工場見学には行きました。」であり、「質問」に対応する返答フレーズが「きっかけは何かあったんですか？」である場合、この３つの返答フレーズを繋げることで、返答文として「花王ミュージアムですか。私は花王の工場見学には行きました。きっかけは何かあったんですか？」を出力する。 Next, as step (5), a reply sentence is created based on the reply phrases selected for each reply label in step (4). Normally, the reply sentence can be created by joining the reply phrases selected in step (4) in the order of the reply labels that form the reply pattern. For example, suppose that for the user's input sentence "I have been to the Kao Museum." the reply pattern is formed by the three reply labels "reaction", "information notification", and "question", that the reply phrase for "reaction" is "The Kao Museum, is it?", that the reply phrase for "information notification" is "I went on a Kao factory tour.", and that the reply phrase for "question" is "Was there something that prompted it?". Joining these three reply phrases gives the output reply sentence "The Kao Museum, is it? I went on a Kao factory tour. Was there something that prompted it?".
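Step (5) amounts to concatenating the selected phrases in label order, which can be sketched as below. The function name and the dictionary layout are assumptions for illustration; the phrases are the ones from the example above.

```python
def build_reply(pattern, phrase_for_label, separator=""):
    """Join the selected reply phrases in the order of the reply pattern."""
    return separator.join(phrase_for_label[label] for label in pattern)


pattern = ["Re", "In", "Q"]  # reaction, information notification, question
phrases = {
    "Re": "花王ミュージアムですか。",
    "In": "私は花王の工場見学には行きました。",
    "Q": "きっかけは何かあったんですか？",
}
print(build_reply(pattern, phrases))
```

Japanese phrases are joined without a separator; for a language written with spaces, `separator=" "` would be the natural choice.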

こうして本発明の対話方法によれば、非タスク型チャットボット端末１とユーザＨとの間で破綻することなく対話を続けることができる。 Thus, according to the dialogue method of the present invention, the dialogue between the non-task type chatbot terminal 1 and the user H can be continued without failure.

これに対し、返答意図ラベルを考慮しない市販の対話方法では、例えばチャットボット端末とユーザとの次の対話において、チャットボット端末の発話２で破綻が生じた。 On the other hand, in the commercially available dialogue method that does not consider the reply intention label, for example, in the next dialogue between the chatbot terminal and the user, the utterance 2 of the chatbot terminal fails.

チャットボット端末（発話１）「ゲストさんは好きな歌とかありますか？もしあれば教えて下さい。」 Chatbot terminal (utterance 1): "Guest, do you have any favorite songs? If so, please tell me."
ゲスト（発話１）「Ａさんが好きです。」 Guest (utterance 1): "I like A."
チャットボット端末(発話２)「気にされているようですね。女性歌手の歌は女性の共感を呼びやすいです。」 Chatbot terminal (utterance 2): "It seems that you are interested. Songs by female singers tend to resonate with women."

本発明の対話方法によれば、ゲストの「Ａさんが好きです。」に、発言意図ラベル「回答」が付され、それに対し、返答パターンとして、返答ラベルが「反応」、「私の考え」で形成される返答パターンが選択されるとする。その場合、１つ目の返答ラベルとして「反応」に対応するフレーズが選択されるため、「気にされているようですね。」という話の流れに沿わないフレーズは選択されない。本発明の対話方法からは、上記返答ラベルより、「Ａさんですね。最近は女性歌手の活躍が著しいですよね。」といったフレーズが形成される。 With the dialogue method of the present invention, the guest's utterance "I like A." is given the utterance intention label "answer", and suppose that a reply pattern formed by the reply labels "reaction" and "my thoughts" is selected in response. In that case, a phrase corresponding to "reaction" is selected for the first reply label, so the phrase "It seems that you are interested.", which does not follow the flow of the conversation, is not selected. Instead, the above reply labels yield a phrase such as "A, is it? Female singers have been remarkably active recently, haven't they?".

なお、返答文は入力文と同様に、音声合成技術を利用することで、システムの音声としてユーザに返答してもよく、また、ＬＩＮＥ（登録商標）等のメッセージアプリで表示してもよく、パソコンを利用することで、ディスプレイに視覚的に表示してもよい。 As with the input text, the response text may be returned to the user as a system voice by using speech synthesis technology, or may be displayed in a message application such as LINE (registered trademark). By using a personal computer, it may be displayed visually on a display.

１ 非タスク型チャットボット端末 1 non-task type chatbot terminal
１０ 会話データベース 10 conversation database
１１ 発言意図記憶部 11 utterance intention storage unit
１２ 返答パターン記憶部 12 reply pattern storage unit
１３ 返答フレーズ記憶部 13 reply phrase storage unit
Ｈ ユーザ H user

Claims

A conversation database used for generating a conversation between a user and a non-task type chatbot terminal, comprising:
an utterance intention storage unit storing utterance intention labels, each indicating a type of utterance intention of the user, and example sentences corresponding to each utterance intention label; and
a reply pattern storage unit that, where the type of reply intention of the non-task type chatbot terminal is indicated by a reply label, stores, for each utterance intention label of the user, a plurality of reply patterns each formed by one reply label or a sequence of multiple reply labels, and stores the probability that each reply pattern is selected in response to the user's utterance.

The conversation database according to claim 1, further comprising a reply phrase storage unit storing reply labels and reply phrases corresponding to the reply labels, wherein the reply phrase storage unit stores reply phrases into which a named entity is inserted and reply phrases into which no named entity is inserted.

3. The conversation database according to claim 1 or 2, wherein a reply pattern to one utterance intention label is formed by three or less reply labels.

A dialogue method between a user and a non-task type chatbot terminal for interacting with the user, wherein the non-task type chatbot terminal performs the steps of:
(1) acquiring the user's utterance;
(2) referring to the utterance intention storage unit of the conversation database according to claim 1, estimating the user's utterance intention, and attaching an utterance intention label to the user's utterance;
(3) referring to the reply pattern storage unit of the conversation database according to claim 1, and selecting a reply pattern corresponding to the utterance intention label attached in (2) based on the probabilities of the reply patterns stored in the reply pattern storage unit;
(4) selecting a reply phrase corresponding to each reply label forming the reply pattern selected in step (3); and
(5) creating a reply sentence based on the reply phrases selected in step (4).

The dialogue method according to claim 4, wherein, in step (4) of selecting the reply phrase, the non-task type chatbot terminal selects the reply phrase by referring to a database storing reply labels and reply phrases corresponding to the reply labels.