JP2019185230A

JP2019185230A - Conversation processing device and conversation processing system and conversation processing method and program

Info

Publication number: JP2019185230A
Application number: JP2018072745A
Authority: JP
Inventors: Yoshinao Sakurai (櫻井 義尚); Setsuo Tsuruta (鶴田 節夫)
Original assignee: Meiji University
Current assignee: Meiji University
Priority date: 2018-04-04
Filing date: 2018-04-04
Publication date: 2019-10-24
Anticipated expiration: 2038-04-04
Also published as: JP7252690B2

Abstract

PROBLEM TO BE SOLVED: To continue a conversation while respecting the user and staying close to the user's feelings, and to draw out the user's inner world so that the user's stress, loneliness, and anxiety are relieved and the user is led to a spontaneous resolution of his or her troubles and problems.

SOLUTION: A conversation processing device 1 comprises an output response sentence creation unit 60 that takes at least a part of the natural language text included in a user's utterance as an input sentence and creates an output response sentence based on one or more of the following response sentences: an input sentence with a confirmation-prompting response sentence, in which a response sentence that prompts confirmation of the content of the user's utterance is appended to the end of the input sentence; an empathy response sentence that shows empathy for the user's utterance; and a conversation-content concretization response sentence that makes the content of the conversation with the user more concrete.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a conversation processing device, a conversation processing system, a conversation processing method, and a program.

According to the Patient Survey of the Ministry of Health, Labour and Welfare, the estimated number of patients with mood [affective] disorders exceeded 1 million in 2008 and was still about 1.12 million in 2014; in recent years the number of users who have various troubles and problems has been increasing. A user may be unable to put the essence of his or her troubles and problems into words, and in that case it is difficult for the user to resolve them unaided.

One way to resolve such troubles and problems is for the user to receive support by talking with a counselor or undergoing counseling. However, counselors who are well versed in users' troubles and problems are still few, and a counselor can handle only about 20 to 30 people in total per month, so there is an overwhelming shortage of personnel.

There is therefore a need for a software agent that can support users in place of a counselor. For example, ELIZA, disclosed in Non-Patent Document 1, creates so-called parroting response sentences that restate the text entered by the user with the same content.
To keep the conversation going, it also gives responses that steer the conversation, including shifts to a new topic.
However, because ELIZA responds with the same content as the user's input and broadens the topic to continue the conversation, it is considered difficult for it to focus on the user's troubles and problems, make them concrete, and resolve deep psychological troubles and problems.

To resolve such deep psychological troubles and problems, the introspection support device described in Patent Document 1 has been proposed. This device comprises a storage device that stores attribute data associating occupations with the problems corresponding to them and with keywords related to the occupations and problems, problem identification means for identifying the user's problem, and problem-solving support means for supporting the user in solving the identified problem. Using keywords for the matters and emotions the user talks about, the device repeats responses that maintain the context while interleaving inquiries about the details of the problem. In this way it identifies the user's troubles and problems, encourages the user to sort them out through introspection, and prompts awareness of possible improvements.

Patent Document 1: JP 2014-229180 A

Non-Patent Document 1: Yoshinobu Kano, "Can we communicate with computers: the present state of dialogue systems", Information Management (情報管理), Vol. 59, No. 10, pp. 658-665 (2016)

As described above, the software agent described in Non-Patent Document 1 has the problem that it is difficult to make the user's troubles and problems concrete and to resolve deep psychological troubles and problems. The introspection support device described in Patent Document 1 asks the user questions to identify an occupation and the problem corresponding to it, but appropriate questions require an enormous amount of knowledge information, so the device is costly to build. Depending on the user's troubles and problems, appropriate questions are often difficult to pose, and when a question is inappropriate, or simply because a question is asked at all, the user feels that the conversation has gone off track and experiences stress and anxiety. Moreover, because the device creates its response sentences with a focus on identifying the problem, the user may feel disrespected or misunderstood, or may feel lonely. This sense of distance between the user and the device makes it hard for the conversation to last, so the user may not reach a spontaneous resolution of deep troubles, and even if a resolution is reached the user may be left with stress.

The present invention has been made to solve the above problems. It provides a conversation processing device, a conversation processing system, a conversation processing method, and a program that continue a conversation while respecting the user and staying close to the user's feelings, and that draw out the user's inner world, so that the user relieves his or her own stress, loneliness, and anxiety and is led to a spontaneous resolution of his or her troubles and problems.

To solve the above problems, one aspect of the present invention is a conversation processing device comprising an output response sentence creation unit that takes at least a part of the natural language elements included in a user's utterance as an input sentence and creates an output response sentence based on response sentences including one or more of: an input sentence with a confirmation-prompting response sentence, in which a confirmation-prompting response sentence that prompts confirmation of the content of the utterance is appended to the end of the input sentence; an empathy response sentence that shows empathy for the utterance; and a conversation-content concretization response sentence that makes the content of the conversation with the user more concrete.

In one aspect of the present invention, the output response sentence creation unit creates the output response sentence while omitting one or more of a second-person expression referring to the user, a modifier, and a predicate, and the output response sentence it creates includes at least the input sentence with the confirmation-prompting response sentence and the conversation-content concretization response sentence.

One aspect of the present invention is a conversation processing device further comprising a natural language element extraction failure determination unit that determines, based on the user's input, whether extraction of the natural language elements from the utterance has failed, and that asks the user to speak again when it determines that extraction has failed.

One aspect of the present invention is a conversation processing system comprising: the conversation processing device described above; an input sentence storage unit that extracts the natural language elements included in the utterance and stores at least a part of them as the input sentence; a log storage unit that records the user's past conversation content, including the input sentence; and a response sentence storage unit that stores the empathy response sentence, the input sentence together with the confirmation-prompting response sentence appended to it, and the conversation-content concretization response sentence. The output response sentence creation unit reads the input sentence from the input sentence storage unit and the log storage unit, reads the empathy response sentence, the input sentence with the appended confirmation-prompting response sentence, and the conversation-content concretization response sentence from the response sentence storage unit, and creates the output response sentence.

One aspect of the present invention is a conversation processing system further comprising an input device to which the user's utterance is input and an output device that outputs the output response sentence as audio information and visual information.

One aspect of the present invention is a conversation processing system in which the input device acquires environment information on the input side, including the user, and the output response sentence creation unit creates the output response sentence based on that environment information.

One aspect of the present invention is a conversation processing system in which the input sentence storage unit and the log storage unit store the user's utterances from past conversations as previously input sentences, and in which the output response sentence creation unit, when it determines that the input sentence matches a previously input sentence, creates an output response sentence that includes a past response sentence indicating the past, the previously input sentence, and the confirmation-prompting response sentence.

One aspect of the present invention is a conversation processing system further comprising one or more of a head, a torso, hands, and legs that move in accordance with the output response sentence.

One aspect of the present invention is a conversation processing system in which one or more of the head, the torso, the hands, and the legs are represented virtually.

One aspect of the present invention is a conversation processing method comprising: an input sentence acquisition step of extracting the natural language elements included in a user's utterance to acquire an input sentence; and an output response sentence creation step of creating an output response sentence using response sentences including one or more of an input sentence with a confirmation-prompting response sentence, in which a confirmation-prompting response sentence that prompts confirmation of the content of the utterance is appended to the end of the input sentence, an empathy response sentence that shows empathy for the utterance, and a conversation-content concretization response sentence that makes the content of the conversation with the user more concrete.

One aspect of the present invention is a program that causes execution of: an input sentence acquisition step of extracting the natural language elements included in a user's utterance to acquire an input sentence; and an output response sentence creation step of creating an output response sentence using response sentences including one or more of an input sentence with a confirmation-prompting response sentence, in which a confirmation-prompting response sentence that prompts confirmation of the content of the utterance is appended to the end of the input sentence, an empathy response sentence that shows empathy for the utterance, and a conversation-content concretization response sentence that makes the content of the conversation with the user more concrete.

According to the present invention, by continuing a conversation while respecting the user and staying close to the user's feelings, and by drawing out the user's inner world, the user's stress, loneliness, and anxiety can be relieved while the user is led to a spontaneous resolution of his or her troubles and problems.

A block diagram showing an example of a conversation processing system according to the first embodiment of the present invention.
A flowchart showing the procedure of conversation processing by the conversation processing device and the conversation processing system.
An example of a conversation conducted by the conversation processing device and the conversation processing system.
A block diagram showing an example of a conversation processing system according to the second embodiment of the present invention.
A schematic diagram of a part of the robot device of the conversation processing system.

Hereinafter, a conversation processing device, a conversation processing system, a conversation processing method, and a program according to the first embodiment of the present invention will be described with reference to the drawings.

[First Embodiment]
(Conversation processing device, conversation processing system)
FIG. 1 is a functional block diagram showing an example of a conversation processing system according to the first embodiment. As shown in FIG. 1, the conversation processing system 100 includes at least a conversation processing device 1. The conversation processing device 1 includes at least an output response sentence creation unit 60, and further includes a speech recognition unit 12, a determination unit (natural language element extraction failure determination unit) 80, a response sentence selection unit 94, a learning unit 92, and a voice generation unit 13. In addition to the conversation processing device 1, the conversation processing system 100 includes an input device 2, a storage device 3, and an output device 4. The storage device 3 includes an input sentence storage unit 30, a response sentence storage unit 40, and a log storage unit 91.

The input device 2 is a device through which the user's utterance can be input, such as a microphone, a keyboard, or a digital pen. The form in which the user's utterance is input to the input device 2 is not particularly limited as long as the natural language elements included in the utterance can be extracted; it broadly includes the user's voice, characters representing the content of the utterance, and so on. The following description assumes that the user's voice is input to the input device 2. When the user's utterance is input to the conversation processing device 1 in a form other than voice, the speech recognition unit 12 and the voice generation unit 13 are replaced with a recognition unit and a generation unit suited to that input/output form.

In addition to the functions described above, the input device 2 may have a function of acquiring environment information on the input side, including the user. This environment information is, for example, information from non-verbal communication during the conversation, such as the user's facial expression, complexion, tone of voice, number of blinks, mouth movements, gestures, and line of sight, as well as biological information such as body temperature and fluctuations in pulse waves and brain waves. The means by which the input device 2 acquires environment information is not particularly limited; examples include a gaze tracking device, an infrared sensor, a thermometer, a depth sensor, a seating sensor, a pulse wave sensor, and an electroencephalogram sensor.
Based on physical information acquired by the input device 2, such as line of sight, facial expression, body movement, and changes in vocalization, and on biological signals such as fever, pulse, and brain waves, the output response sentence creation unit 60 estimates the user's emotions and their changes. The method of estimating emotions is not particularly limited, and a known method can be used.

The input sentence storage unit 30 extracts the natural language elements included in the user's utterance and stores at least a part of the extracted elements as an input sentence. The input sentence storage unit 30 receives the natural language elements included in the user's voice (utterance) from the input device 2, for example via the speech recognition unit 12. Specifically, the input sentence storage unit 30 is a computer memory or the like. It may be a volatile memory that stores data temporarily, a non-volatile memory or hard disk that stores data long term, or a storage medium on a network server.
The input sentence storage unit 30 may also serve as the log storage unit 91 described later. Unless otherwise noted, the following description assumes that the input sentence storage unit 30 also serves as the log storage unit 91.

The speech recognition unit 12 performs speech recognition on the user's voice input from the input device 2 using an arbitrary speech recognition server and converts the voice into natural language text (natural language elements). The converted natural language text undergoes predetermined checks and corrections.

To carry out natural language processing such as restating and confirming the gist of the user's utterance, the speech recognition unit 12 performs, for example, morphological analysis, syntactic analysis, and parsing of the input sentence obtained from the user's voice (utterance). When the input sentence is Japanese, the tool used for morphological analysis and the like is not particularly limited; standard tools such as MeCab (http://taku910.github.io/mecab/) and KAKASI (http://kakasi.namazu.org/) can be used. An example of morphological analysis of the user's voice (utterance) is shown below. Each triple gives the surface form, the part of speech, and the katakana reading.
<Example 1>
User voice (utterance): 「就職活動で悩んでいます。」 ("I'm worried about job hunting.")
→ (就職活動, noun, シュウショクカツドウ) (で, particle, デ) (悩ん, verb, ナヤン) (で, particle, デ) (い, verb, イ) (ます, auxiliary verb, マス)
<Example 2>
User voice (utterance): 「自分はＳＥの仕事をしたい。」 ("I want to work as an SE.")
→ (自分, noun, ジブン) (は, particle, ハ) (ＳＥ, noun, エスイー) (の, particle, ノ) (仕事, noun, シゴト) (し, verb, シ) (たい, auxiliary verb, タイ)
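
The (surface form, part of speech, reading) triples in the examples above can be modeled with a small data structure. The following is a minimal sketch, not the analyzer itself: it hard-codes the Example 1 output instead of calling MeCab, and the names `Morpheme`, `surface_text`, and `content_words` are hypothetical and introduced only for illustration.

```python
from typing import List, NamedTuple


class Morpheme(NamedTuple):
    """One analyzed token, mirroring the triples shown in the examples."""
    surface: str  # surface form as it appears in the utterance
    pos: str      # part of speech
    reading: str  # katakana reading


# Hand-copied from Example 1; a real system would obtain this list from a
# morphological analyzer (assumption: the analyzer returns the same fields).
EXAMPLE_1: List[Morpheme] = [
    Morpheme("就職活動", "noun", "シュウショクカツドウ"),
    Morpheme("で", "particle", "デ"),
    Morpheme("悩ん", "verb", "ナヤン"),
    Morpheme("で", "particle", "デ"),
    Morpheme("い", "verb", "イ"),
    Morpheme("ます", "auxiliary verb", "マス"),
]


def surface_text(morphemes: List[Morpheme]) -> str:
    """Rebuild the utterance text by concatenating the surface forms."""
    return "".join(m.surface for m in morphemes)


def content_words(morphemes: List[Morpheme]) -> List[str]:
    """Keep only nouns and verbs, dropping particles and auxiliary verbs."""
    return [m.surface for m in morphemes if m.pos in ("noun", "verb")]
```

A structure like this would let later stages, such as restating the gist of the utterance, work on tokens rather than raw strings.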

The input sentence storage unit 30 stores the natural language text extracted from the user's voice, for example by speech recognition in the speech recognition unit 12, as an input sentence for creating an output response sentence (hereinafter sometimes called the input sentence for response sentence creation).
Tags indicating purpose, desire, means, reason, resulting emotion, and the like are added to the natural language text extracted from the user's voice before it is stored in the input sentence storage unit 30.
When environment information has been acquired in synchronization with the user's voice, tags for emotions such as joy, anger, sorrow, and pleasure are added as well before the text is stored in the input sentence storage unit 30.
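
The tagged storage described above can be sketched as follows. This is an illustrative assumption rather than the patented implementation: the source does not say how tags are assigned or how the store is laid out, so the class names, field names, and the tags given to the two sample utterances are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class StoredInputSentence:
    """An input sentence plus the tags attached before storage."""
    text: str
    content_tags: Set[str] = field(default_factory=set)  # e.g. purpose, desire, reason
    emotion_tags: Set[str] = field(default_factory=set)  # from environment information


class InputSentenceStore:
    """In-memory stand-in for the input sentence storage unit 30 (hypothetical)."""

    def __init__(self) -> None:
        self.sentences: List[StoredInputSentence] = []

    def store(self, text: str,
              content_tags: Iterable[str] = (),
              emotion_tags: Iterable[str] = ()) -> StoredInputSentence:
        record = StoredInputSentence(text, set(content_tags), set(emotion_tags))
        self.sentences.append(record)
        return record

    def find_by_tag(self, tag: str) -> List[StoredInputSentence]:
        """Return stored sentences carrying the given content tag."""
        return [s for s in self.sentences if tag in s.content_tags]


# Tag choices below are assumptions made for this example only.
store = InputSentenceStore()
store.store("自分はＳＥの仕事をしたい。", content_tags={"desire"})
store.store("就職活動で悩んでいます。",
            content_tags={"resulting_emotion"}, emotion_tags={"sorrow"})
```

Keeping content tags and emotion tags separate reflects that the latter exist only when environment information was captured alongside the voice.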

The determination unit 80 determines whether extraction of the natural language text from the user's voice has failed, and if it has, the predetermined corrections described above are made. This determination can be made, for example, by a recognition-result correctness checker that uses a Japanese sentence predictor based on a recurrent neural network (RNN) or N-grams (see, for example, "Flick: Japanese Input Method Editor using N-gram and Recurrent Neural Network Language Model based Predictive Text Input", Yukino Ikegami, Yoshitaka Sakurai, Ernesto Damiani, Rainer Knauf, Setsuo Tsuruta: SITIS2017, Jaipur, India, December 4-7, 2017).

The determination unit 80 may also give the user an opportunity to judge whether extraction of the natural language text from the user's voice has failed, and base its determination on the user's input. In this case, the determination unit 80 causes the output device 4, described later, to display the natural language text extracted from the user's voice. The user judges whether the displayed text is faulty and enters the result through the input device 2. When the user has judged that there is a failure, the determination unit 80 makes the predetermined corrections described above.

When it is determined that extraction of the natural language text has failed, the output response sentence creation unit 60 generates natural language text such as 「もう一度お話しください」 ("Please speak again") as the output response sentence. When this output response sentence is to be output as audio, the voice generation unit 13 converts it into audio information. By outputting it to the user through the output device 4 as audio or visual information, the determination unit 80 can ask the user to speak again. However, so as not to interrupt the conversation with the user, and from the standpoint of response time, load, and software ecology, it is preferable that trivial or superficial input failures arising in recognition or interpretation of the voice by the speech recognition unit 12 (for example, mishearing), which have little to do with mental, cognitive, or philosophical awareness, be output as visual information by the output response sentence creation unit 60 before its main processing.

To handle input failures that can be judged only by the user who has the psychological trouble or by a person in contact with the user, the determination unit 80 may display the input result as visual information in a part of the display, such as a confirmation window, and wait for a fixed time, for example 2 to 10 seconds. If the user finds an input failure, the user re-enters the utterance through the input device 2.

The input device 2 may be configured so that re-input is permitted when the user presses a switch or button provided on the input device 2.

For the re-input voice, the speech recognition unit 12 may use a conversion method or speech recognition method different from the one used in the immediately preceding extraction. Examples of different conversion methods are alternative conversion table entries and conversion dictionaries; examples of different speech recognition methods are an alternative speech recognizer and multiplexed speech recognizers. The conversion method may be changed according to the input failure or the number of re-inputs.
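
Switching to a different recognition method on re-input can be sketched as a simple fallback chain. This is a minimal sketch under stated assumptions: the source does not specify a selection policy, so choosing the recognizer by re-input count is one possible reading, and `primary_recognizer`, `alternate_recognizer`, and `recognize_with_fallback` are hypothetical stand-ins, not real speech recognition APIs.

```python
from typing import Callable, Sequence

Recognizer = Callable[[str], str]


def primary_recognizer(audio: str) -> str:
    """Stub for the default recognizer (hypothetical)."""
    return "primary:" + audio


def alternate_recognizer(audio: str) -> str:
    """Stub for an alternative recognizer used after a failed attempt (hypothetical)."""
    return "alternate:" + audio


def recognize_with_fallback(audio: str,
                            recognizers: Sequence[Recognizer],
                            reinput_count: int) -> str:
    """Pick a recognizer based on how many times the user has re-input.

    The first attempt uses recognizers[0]; each re-input moves one step
    down the list, staying on the last entry once the list is exhausted.
    """
    if not recognizers:
        raise ValueError("at least one recognizer is required")
    index = min(reinput_count, len(recognizers) - 1)
    return recognizers[index](audio)
```

For example, `recognize_with_fallback(audio, [primary_recognizer, alternate_recognizer], 1)` routes the first re-input to the alternate recognizer.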

To prevent misrecognition caused by the limits of the computing and processing capacity of the conversation processing system 100 itself, and to prevent the conversation from breaking off during the waiting time needed for confirmation, the output response sentence creation unit 60 may output audio, or the visual information of the second embodiment described later, during or around the waiting time, and may output such audio or visual information as a re-input promotion output.
A re-input promotion output is, for example, a back-channel response given by voice or motion, such as 「フムフム」 ("hm, hm"), text displayed in a part of the display such as a confirmation window, or a displayed motion such as nodding.

The result of the determination unit 80's judgment on whether extraction of the natural language text has failed, and the results of correction by dictionary use, predictive automatic correction, or correction through conversation, are stored in the input sentence storage unit 30. The determination result and the like are also displayed on the output device 4.

The determination unit 80 can also check for failures automatically using the return value from the speech recognition unit 12 and have the speech recognition unit 12 retry recognition when a failure occurs. Confirmation of the conversation by displaying the determination result (that is, re-input), followed by a retry of recognition in the speech recognition unit 12 on failure, is also possible. The correction result is stored in the input sentence storage unit 30 and reflected in a dictionary or the like maintained for each user.

The response sentence storage unit 40 stores response sentences for empathy, which show empathy for the user's utterance; confirmation-prompting response sentences, which prompt confirmation of the utterance content, that is, the content of the user's immediately preceding utterance; and response sentences for specifying the conversation content, which make concrete the conversation content, that is, the content of the user's multiple utterances over a certain period. Examples of the response sentence for empathy include 「ふーん」 ("hmm"), 「ふむふむ」 ("I see, I see"), and 「うんうん」 ("uh-huh"). The response sentence for empathy helps the conversation processing device 1 and the conversation processing system 100 respect the user and stay close to the user's feelings during the conversation. Examples of the confirmation-prompting response sentence include the sentence endings 「なんですね」 and 「なんだ」 (roughly, "so it is that ..., isn't it"). The confirmation-prompting response sentence has the effect of making the user naturally aware of what the user has said. Examples of the response sentence for specifying the conversation content include 「それで」 ("and so?"), 「もう少し詳しくお話しください」 ("please tell me a little more"), 「具体的にお話しください」 ("please tell me specifically"), and 「もっと詳しく」 ("in more detail"). The response sentence for specifying the conversation content has the effect of drawing out the user's inner world without straying from the user's utterance or its gist.
The response sentence for specifying the conversation content also functions as a problem-focusing response sentence, which narrows the focus of the user's worries and problems and makes it easier for the user to notice the problem.

The response sentences for empathy, the confirmation-prompting response sentences, and the response sentences for specifying the conversation content are preferably concise, so that they do not feel out of place to the user in any context and do not divert the topic.
To avoid boring the user, it is preferable to prepare a plurality of each of these response sentences, even if they have similar meanings.
Each of these response sentences may be a full sentence or a phrase of one or more words.
Each of these response sentences may also be selected from a plurality of types based on a predetermined rule.

The output response sentence creation unit 60 reads the input sentence from the input sentence storage unit 30 and reads the response sentence for empathy, the confirmation-prompting response sentence, and the response sentence for specifying the conversation content from the response sentence storage unit 40. The output response sentence creation unit 60 creates an output response sentence using one or more of the following: an input sentence with a confirmation-prompting response sentence, in which the confirmation-prompting response sentence is attached to the end of the read input sentence; a response sentence for empathy; and a response sentence for specifying the conversation content. As the basic form, the unit creates an output response sentence containing the input sentence with a confirmation-prompting response sentence. As an extended form, it creates an output response sentence containing the input sentence with a confirmation-prompting response sentence and a response sentence for specifying the conversation content. As a further extended form, it creates an output response sentence to which a response sentence for empathy is added. Because it gives the user a sense of closeness during the conversation, the response sentence for empathy is preferably placed at the beginning of the output response sentence.

After the natural language text has been extracted from the user's utterance as described above, the output response sentence creation unit 60 may omit from it second-person expression phrases, that is, first-person expressions referring to the user such as 「私は」, 「自分は」, and 「俺は」 (all meaning "I") rephrased as 「あなたは」 ("you") or the like (hereinafter simply "second-person expression phrases"), as well as modifiers such as adjectives and adverbs, predicates including verbs, and so on, so that the gist of the utterance is stated concisely and in natural Japanese.
For example, when the output response sentence is created with the second-person expression phrase omitted, in the morphological analysis example <Example 2> described above, the first-person expression that would be rephrased as a second-person expression phrase consists of the two morphemes (自分) and (は).

Specifically, on the result of morphological analysis of the user's voice (utterance) obtained via the speech recognition unit 12, logging is performed to create a response sentence that confirms the gist of the user's immediately preceding voice (utterance) and continues the conversation in a narrowing-down manner, with one or more of the second-person expression phrases, modifiers such as adjectives and adverbs, predicates including verbs, and the like removed. Such logging can be performed, for example, by selecting and executing a condition-firing rule such as the one illustrated in <Example 3>, which uses regular-expression matching.
<Example 3>
User voice (utterance): 「自分はＳＥの仕事をしたいがプログラムが苦手なので就職できるか不安です。」 ("I want to work as an SE, but I am not good at programming, so I am anxious about whether I can get a job.")
Output response sentence: 「ＳＥの仕事をしたいがプログラムが苦手なので就職できるか不安なんですね。もう少し詳しくお話しください。」 ("So you want to work as an SE, but you are not good at programming, so you are anxious about whether you can get a job. Please tell me a little more.")
Condition-firing rule:
・Condition statement…
[ur'自分は( (?:ＳＥ)|(?:PM)|(?:プログラマ)|(?:システム開発者)@1)(.*@2 )たいが(.*@3)ので( (?:就職)|(?:入社)|(?:卒業)|(?:合格)@4 )(.*@5)((?:不安)|(?:心配)|(?:気がり)@6 )(.*@7)']
(Here, @n is an explanatory annotation that does not appear in the actual code, u denotes Unicode, and r denotes a regular expression.)
・Log sentences…
[u"うんうん%1%2たい[目的]が%3[障害]ので%4[結果]%5%6なんですね。もう少し詳しくお話しください。",
u"ふむふむ%1%2たい[目的]が%3[障害]ので%4[結果]%5%6%7ね。%4%5%6についてもっとお話しください。",
u"%1%2たい[目的]が%3[障害]ので%4[結果]%5%6%7ね。%3ので%4%5%6について具体的にお話しください。"]
(In the log sentences, the tags [目的], [障害], and [結果] mean [purpose], [obstacle], and [result].)
Furthermore, as in <Example 3> above, a confirmation-prompting response sentence such as 「ですね」, which makes it easier for the user to notice things for themselves, is used to respect the user as the conversation partner and to emphasize the user's words.
By extracting the phrases that express the gist from the input sentence through such omission processing and response-sentence addition processing, and composing an output response sentence that expresses the gist, the user can converse naturally and without stress, and the input sentence can be impressed on the user, or its confirmation reinforced, to heighten the effects of empathy, continuation, and awareness.
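As a hedged illustration, the condition-firing rule of <Example 3> can be sketched in Python 3 as follows. This is a minimal sketch under stated assumptions, not the implementation described in the patent: the names `RULE`, `TEMPLATE`, and `fire` are hypothetical, the pattern drops the @n annotations and stray spaces and uses non-greedy groups, and the template is the first log sentence without the leading 「うんうん」 and without the [目的]/[障害]/[結果] tags.

```python
import re

# Condition statement adapted from <Example 3> (hypothetical simplification).
# Groups: 1 role, 2 wish, 3 obstacle, 4 outcome, 5 filler, 6 emotion, 7 ending.
RULE = re.compile(
    r"自分は(ＳＥ|PM|プログラマ|システム開発者)(.*?)たいが(.*?)ので"
    r"(就職|入社|卒業|合格)(.*?)(不安|心配|気がり)(.*)"
)

# Log sentence with the %n placeholders rewritten as {n}.
TEMPLATE = "{1}{2}たいが{3}ので{4}{5}{6}なんですね。もう少し詳しくお話しください。"


def fire(utterance):
    """Return a response if the rule's condition matches, otherwise None."""
    match = RULE.search(utterance)
    if match is None:
        return None
    # Index 0 is a dummy so that {1}..{6} line up with the group numbers.
    return TEMPLATE.format(None, *match.groups())


print(fire("自分はＳＥの仕事をしたいがプログラムが苦手なので就職できるか不安です。"))
```

Because 「自分は」 lies outside every capture group, the first-person phrase is dropped from the response simply by not being copied into the template.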

As described above, an output response sentence that omits second-person expression phrases, modifiers such as adjectives and adverbs, predicates including verbs, and the like, and that carries a confirmation-prompting response sentence, is created not by flowchart-style procedural execution but by firing and executing the above condition-firing rules, for example through regular-expression matching. The output response sentence creation unit 60 creates output response sentences by building a so-called rule-based engine. Examples of such output response sentences are shown below.
<Example 4>
User voice (utterance): 「私はＳＥになりたいがプログラムができないので不安です。」 ("I want to become an SE, but I am anxious because I cannot program.")
Output response sentence: 「ＳＥになりたいがプログラムができないので不安なんだ。」／「ＳＥになりたいがプログラムができないので不安なのですね。」 (plain and polite forms of "So you want to become an SE but are anxious because you cannot program.")
<Example 5>
User voice (utterance): 「私は彼を嗜めた。」 ("I admonished him.")
Output response sentence: 「彼を嗜めたんですね。」 ("So you admonished him.")
<Example 6>
User voice (utterance): 「私はＳＥになりたいが、先輩に“君はプログラムができないから厳しい”と言われたので、悩んでいます。」 ("I want to become an SE, but I am troubled because a senior colleague told me, 'It will be tough because you cannot program.'")
Output response sentence: 「ＳＥになりたいが、先輩に“プログラムができないから厳しい”と言われたので、悩んでいるんだ。」／「ＳＥになりたいが、先輩に“プログラムができないから厳しい”と言われたので、悩んでいるのですね。」 (plain and polite forms of "So you want to become an SE, but you are troubled because a senior colleague told you it will be tough since you cannot program.")
<Example 7>
User voice (utterance): 「私はＳＥになりたいがプログラムができないので不安です。」 ("I want to become an SE, but I am anxious because I cannot program.")
Output response sentence: 「ふむふむ、ＳＥになりたいがプログラムができないので不安なのですね。もう少し詳しくお話しください。」 ("I see, I see. So you want to become an SE but are anxious because you cannot program. Please tell me a little more.")
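The basic and extended forms of the output response sentence, as seen in <Example 4>, <Example 5>, and <Example 7>, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `build_response`, its parameters, and the naive stripping of a sentence-final 「です。」 are hypothetical stand-ins for the rule-based processing described above, not the method of the patent.

```python
FIRST_PERSON = ("私は", "自分は", "俺は")  # first-person phrases to omit


def build_response(utterance, confirmation="なのですね。",
                   empathy=None, elaboration=None):
    """Compose an output response sentence from an input sentence."""
    core = utterance
    # Omit the phrase that refers to the user.
    for phrase in FIRST_PERSON:
        if core.startswith(phrase):
            core = core[len(phrase):]
            break
    # Naive assumption: drop a polite sentence ending so that the
    # confirmation-prompting ending can be attached in its place.
    for ending in ("です。", "です", "。"):
        if core.endswith(ending):
            core = core[:-len(ending)]
            break
    response = core + confirmation            # basic form
    if empathy:
        response = empathy + "、" + response   # empathy goes first
    if elaboration:
        response += elaboration               # extended form
    return response


print(build_response("私はＳＥになりたいがプログラムができないので不安です。"))
print(build_response("私は彼を嗜めた。", confirmation="んですね。"))
```

Passing `empathy="ふむふむ"` and `elaboration="もう少し詳しくお話しください。"` produces the combined form shown in <Example 7>.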

In addition to not using second-person expression phrases, the output response sentence creation unit 60 can create an output response sentence with redundant modifiers omitted. An example of such an output response sentence is shown below.
<Example 8>
User voice (utterance): 「Ａ社に入りたいがプログラムをあまり上手く書けないので不安です。」 ("I want to join Company A, but I am anxious because I cannot write programs very well.")
Output response sentence: 「Ａ社に入りたいがプログラムを書けないので不安なのですね。」 ("So you want to join Company A but are anxious because you cannot write programs.")
In <Example 8>, the user's voice (utterance) is analyzed and tagged in the input sentence storage unit 30 in the form 「Ａ社に入りたい［目的］がプログラムを上手く（書け）ないので［理由］不安です［帰結感情］。」 ("I want to join Company A [purpose], but I cannot write programs well [reason], so I am anxious [resulting emotion]."), and is stored as an input sentence whose gist is 「Ａ社に入りたいがプログラムを書けないので不安です」 ("I want to join Company A, but I am anxious because I cannot write programs").

A log storage unit 91 is connected to the input sentence storage unit 30 and the output response sentence creation unit 60. The log storage unit 91 can store, as a log, evaluation data including the content and word count of output response sentences and the number of responses, and the response state including the type of output response sentence. The log storage unit 91 includes a dictionary log that records, for each user, past conversations (that is, the natural language text contained in past voice input) and emotion words. The output response sentence creation unit 60 is also provided with a learning unit 92 that learns effective response methods based on the evaluation data and the response methods of the output response sentences.
The log storage unit 91 may also serve as the input sentence storage unit 30.

The output response sentence creation unit 60 refers to the dictionary log and the like in the log storage unit 91 and extracts, from the record of past conversations in the input sentence storage unit 30, changes over time in the content of the user's voice (utterance). If, within a predetermined number (N1) of past conversational exchanges, there is a change, including the first appearance of an emotion word or the appearance of a different one, the output response sentence creation unit 60 creates an output response sentence that presents the change and adds the output response sentence to the log in the log storage unit 91. That is, the input sentence storage unit 30 stores the natural language text contained in the user's voice (utterance) in past conversations in the log storage unit 91 as already-input sentences, and when the output response sentence creation unit 60 determines that the input sentence based on the user's voice (utterance) matches one of the already-input sentences, it creates an output response sentence containing a past response sentence indicating the past, the already-input sentence, and a confirmation-prompting response sentence. An example of such an output response sentence is shown below.
<Example 9>
User voice (utterance): 「私はＳＥになりたいがプログラムができないので不安です。」 ("I want to become an SE, but I am anxious because I cannot program.")
Output response sentence: 「前に“ＳＥになりたいが先輩にプログラムができないから厳しいと言われたので悩んでいる”と言っていましたね。」 ("Earlier you said, 'I want to become an SE, but I am troubled because a senior colleague told me it will be tough since I cannot program,' didn't you.")
If the gist of a past utterance is stated again within N3 exchanges, the unit creates an output response sentence that presents the gist of the current (that is, immediately preceding) utterance together with a conversation-continuing (topic-specifying) response, and adds the output response sentence to the log in the log storage unit 91.
In addition, when the output response sentence creation unit 60 refers to the log storage unit 91 and determines that the input sentence matches, for example, an already-input sentence expressing a purpose, it can also create an output response sentence that includes a related already-input sentence the user uttered even earlier than that one.
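One way to picture the check for a first or different emotion word within the last N1 exchanges is the following Python sketch. It is an assumption-laden illustration only: the emotion-word list, the substring matching, and the names `emotions_in` and `new_emotions` are hypothetical, and the patent does not specify how the dictionary log is searched.

```python
EMOTION_WORDS = ("不安", "心配", "悩")  # hypothetical emotion-word dictionary


def emotions_in(sentence):
    """Return the emotion words that occur in a sentence."""
    return {word for word in EMOTION_WORDS if word in sentence}


def new_emotions(history, current, n1=3):
    """Emotion words in `current` that are absent from the last n1 inputs."""
    recent = set().union(*(emotions_in(s) for s in history[-n1:]))
    return emotions_in(current) - recent


history = ["ＳＥになりたい"]
print(new_emotions(history, "プログラムができないので不安です"))
```

A non-empty result would be the trigger for creating an output response sentence that presents the change; an empty set means no new emotion word appeared.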

Even if the gist of the user's voice (utterance) has not changed, if it contains a desire (that is, a purpose or a task), the output response sentence creation unit 60 creates, after N2 or more conversational exchanges, an output response sentence that presents the relationship between the desire and the emotion, and adds it to the log in the log storage unit 91. If the content of the user's voice (utterance) includes an obstacle to, or a means of, fulfilling the desire, the output response sentence creation unit 60 creates an output response sentence that also presents the relationship between fulfilling the desire and the obstacle or means, and adds the output response sentence to the log in the log storage unit 91.
An example of such an output response sentence is shown below.
<Example 10>
User voice (utterance): 「Ａ社に入りたいがシステム開発ができないので不安です。」 ("I want to join Company A, but I am anxious because I cannot do system development.")
Output response sentence: 「ふむふむ、Ａ社に入りたいがシステム開発ができないので不安なのですね。もう少し詳しくお話しください。」 ("I see, I see. So you want to join Company A but are anxious because you cannot do system development. Please tell me a little more.")
User voice (utterance): 「プログラムを作るのが遅いのです」 ("I am slow at writing programs.")
Output response sentence: 「うんうん、プログラムを作るのが遅いのですね。もっと具体的に。」 ("Uh-huh, so you are slow at writing programs. More specifically?")
User voice (utterance): 「プログラム演習で提出が大抵ビリなんです」 ("In programming exercises I am usually the last to submit.")
Output response sentence: 「プログラム演習で提出がビリとシステム開発ができないが関係するのですね。それで。」 ("So being the last to submit in programming exercises is related to not being able to do system development. And so?")

In <Example 10>, the output response sentence creation unit 60 may also replace some phrases with 「それ」 ("that"), 「あれ」 ("that over there"), 「これ」 ("this"), or the like, in order to make the gist concise and emphasize it. For example, 「プログラム演習で提出がビリとシステム開発ができないが関係するのですね。それで。」 may be replaced with 「それとシステム開発ができないが関係するのですね。それで。」 ("So that is related to not being able to do system development. And so?").

A response sentence selection unit 94 may be provided between the response sentence storage unit 40 and the output response sentence creation unit 60. The response sentence selection unit 94 selects, based on a predetermined rule, the response sentence for empathy and the response sentence for specifying the conversation content to be used in the output response sentence from among the plurality of types of such response sentences stored in the response sentence storage unit 40, and outputs each selected response sentence to the output response sentence creation unit 60. The predetermined rule may be, for example, a fuzzy rule. Based on this rule, the type of response sentence for empathy, the presence or absence and type of the response sentence for specifying the conversation content, the backchannel, the gist, and so on are varied according to the position of the response within the conversation. The type of response sentence for empathy and the like can be controlled according to the order of the output response sentences by means of probability, fuzzy rules, or learning machines and learning dictionaries such as neural networks, deep learning, and genetic algorithms. An output response sentence may also be selected using the tag information attached to the natural language text.
Because the response sentence for empathy and the response sentence for specifying the conversation content are selected appropriately in this way, the same response sentences are less likely to be used repeatedly, the user is less likely to become bored or feel stress, and the conversation is encouraged.
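A very simple instance of such a predetermined rule is sketched below in Python. It is only a hypothetical stand-in: the patent mentions fuzzy rules, probability, and learned models, whereas this sketch uses a deterministic rotation keyed on the response's position in the conversation, and the name `select_responses` is an assumption.

```python
EMPATHY = ["ふーん", "ふむふむ", "うんうん"]
ELABORATION = ["それで", "もう少し詳しくお話しください",
               "具体的にお話しください", "もっと詳しく"]


def select_responses(turn):
    """Pick response sentences for the turn-th response (0-based)."""
    # Rotate through the empathy responses so neighbours never repeat.
    empathy = EMPATHY[turn % len(EMPATHY)]
    # Hypothetical rule: add an elaboration prompt on every second response.
    if turn % 2 == 1:
        elaboration = ELABORATION[(turn // 2) % len(ELABORATION)]
    else:
        elaboration = None
    return empathy, elaboration


for turn in range(4):
    print(turn, select_responses(turn))
```

Swapping the rotation for a weighted random choice or a learned policy would change only the body of `select_responses`, which is why the selection is kept separate from sentence composition.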

The voice generation unit 13 converts the output response sentence output from the output response sentence creation unit 60 into voice information and outputs the voice information to the output device 4.
The voice generation unit 13 may convert the output response sentence into voice information whose voice quality or intonation is varied according to the content or the number of responses. This strengthens the appeal to the user's emotions.

The output device 4 may also output visual information, such as text identical to the voice information, and may output the voice information and the visual information at the same time.
The output device 4 is a device that outputs voice information or visual information so that the user can perceive it.
The output device 4 outputs the voice information output from the voice generation unit 13 as voice. In this case, the output device 4 is, for example, a speaker. When the output response sentence output from the output response sentence creation unit 60 is output as visual information such as text, the output device 4 is, for example, a liquid crystal monitor.

(Conversation processing method and program)
The conversation processing method of the first embodiment is executed by the conversation processing device 1 and the conversation processing system 100 described above. The conversation processing device 1 and the conversation processing system 100 are implemented on a computer on which the conversation processing program of the first embodiment is installed, and function as a supporter (agent) for the user.

Next, the procedure by which the conversation processing device 1 and the conversation processing system 100 process a conversation is described. FIG. 2 is a flowchart showing the procedure of conversation processing by the conversation processing device 1 and the conversation processing system 100. FIG. 3 is a diagram illustrating an example conversation in the conversation processing method of the first embodiment.

First, when the user starts the conversation processing device 1, the conversation processing system 100, the conversation processing program, and so on, the process proceeds to step S11. In step S11, the user speaks to the input device 2 about their own worries and problems. In the example conversation shown in FIG. 3, the user utters sentence PH51: 「３年で博士号を取りたいが、論文を書けないので、不安です。」 ("I want to earn my doctorate in three years, but I am anxious because I cannot write papers."). However, if the user says nothing within a predetermined time at the start of the conversation, the output response sentence creation unit 60 generates an output response sentence such as 「何か気がかりなことをお話しください」 ("Please tell me about anything that is on your mind"), the voice generation unit 13 converts it into voice information, and it is output as voice to the user through the output device 4.

Next, the user's utterance is captured by the input device 2, input to the speech recognition unit 12, and recognized, and the natural language text is extracted. The determination unit 80 determines whether it has received, via the output response sentence creation unit 60, the natural language text sent from the speech recognition unit 12 (step S12). If it determines that the natural language text has not been received or that some other error has occurred (step S12: NO), the determination unit 80 asks for the user's utterance again, unless the user has given an instruction such as forced termination.

If it is determined in step S12 that the natural language text has been received (step S12: YES), the determination unit 80 attaches an appropriate log or the like to the natural language text extracted from the user's utterance, and the natural language text with the log attached is output as the input sentence from the speech recognition unit 12 to the input sentence storage unit 30 via the output response sentence creation unit 60 (step S14).

If it is determined in step S12 that the natural language text has been received (step S12: YES), it is also determined whether the natural language text received from the speech recognition unit 12 is free of closing fixed phrases, such as 「すっきりした」 ("I feel refreshed"), 「やる気が出てきた」 ("I feel motivated"), or 「できそうな気がする」 ("I feel like I can do it"), which indicate that the user has resolved the worry or problem (step S13).

If it is determined that the natural language text received from the speech recognition unit 12 contains no closing fixed phrase (step S13: YES), the input sentence is first stored in the input sentence storage unit 30 in step S14, and it is then determined whether the natural language text received from the speech recognition unit 12 is free of already-input sentences (step S15). If it is determined in step S15 that the natural language text contains no already-input sentence (step S15: YES), the input sentence storage unit 30 outputs the input sentence to the output response sentence creation unit 60. The response sentence storage unit 40, which stores in advance a plurality of response sentences for empathy, confirmation-prompting response sentences, and response sentences for specifying the conversation content, outputs the various response sentences to the output response sentence creation unit 60. The output response sentence creation unit 60 creates an input sentence with a confirmation-prompting response sentence, and creates an output response sentence using one or more of the input sentence with a confirmation-prompting response sentence, a response sentence for empathy, and a response sentence for specifying the conversation content (step S16).

In the example conversation shown in FIG. 3, output response sentence OH51, 「ふーん、ふーん。３年で博士号を取りたいが、論文を書けないので、不安なのですね。」 ("Hmm, hmm. So you want to earn your doctorate in three years but are anxious because you cannot write papers."), is created as the output response sentence to sentence PH51. Output response sentence OH51 contains a response sentence for empathy, an input sentence with a confirmation-prompting response sentence, and a response sentence for specifying the conversation content, in that order.

If it is determined that the natural language text received from the speech recognition unit 12 contains an already-input sentence that matches in its tags, or an already-input sentence that is related to it, for example as a purpose (desire) and a means (obstacle or reason) (step S15: NO), the output response sentence creation unit 60 creates an output response sentence containing a past response sentence indicating the past, such as 「前に」 ("earlier"), the already-input sentence, and a confirmation-prompting response sentence (step S17).

In either step S16 or step S17, the output response sentence creation unit 60 sends the created output response sentence to the voice generation unit 13, and the voice generation unit 13 converts the received output response sentence into voice information. The converted voice is spoken to the user from the output device 4 (step S18). After that, the user's next utterance is awaited, so the process returns to step S11. Steps S11 to S18 are then repeated as appropriate.

If, as steps S11 to S18 are repeated, the user comes to resolve their own worry or problem (step S13: NO), the process does not proceed to step S14; instead, the output response sentence creation unit 60 outputs a closing response sentence indicating the end of the conversation, such as 「よかったですね」 ("I'm glad to hear that") or 「お疲れ様でした」 ("Thank you for your effort"), to the voice generation unit 13 (step S19), and the conversation processing ends.
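The branching of steps S12 to S19 can be summarized in the following Python sketch. It is a simplified illustration under stated assumptions, not the flowchart of FIG. 2 itself: the function `process_turn`, the exact-match test for already-input sentences, and the naive attachment of 「んですね。」 are hypothetical, and speech recognition and voice output (S11, S18) are left out.

```python
END_PHRASES = ("すっきりした", "やる気が出てきた", "できそうな気がする")


def process_turn(text, history):
    """Handle one recognized utterance; return (response, finished)."""
    if not text:
        # S12: NO, nothing was recognized, so ask the user to speak again.
        return "何か気がかりなことをお話しください", False
    if any(phrase in text for phrase in END_PHRASES):
        # S13: NO, a closing phrase is present, so S19 ends the conversation.
        return "よかったですね", True
    already_input = text in history  # S15 test, evaluated before storing
    history.append(text)             # S14: store the input sentence
    if already_input:
        # S17: past marker + already-input sentence + confirmation ending.
        return "前に“" + text + "”と言っていましたね。", False
    # S16: basic form, input sentence with a confirmation-prompting ending.
    return text + "んですね。", False


history = []
for said in ["論文を書けない", "論文を書けない", "すっきりした"]:
    print(process_turn(said, history))
```

The loop at the end mimics the return to step S11 after each response; a real system would stop calling `process_turn` once `finished` is true.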

As described above, the conversation processing device 1 and the conversation processing system 100 according to the first embodiment create an output response sentence using one or more of an input sentence with a confirmation-prompting response sentence, a response sentence for empathy, and a response sentence for specifying the conversation content. Therefore, by respecting the user, staying close to the user's feelings, continuing the conversation without changing the subject, and drawing out the user's inner world, they can relieve the user's stress, loneliness, and anxiety while leading the user to resolve their own worries and problems spontaneously. In the example conversation shown in FIG. 3, after the user's first sentence (utterance) PH51, the agent (that is, the conversation processing device 1 and the conversation processing system 100) responds to the sentences PH52, PH53, PH54, PH55, PH56, and PH57 uttered by the user with output response sentences OH51, OH52, OH53, OH54, OH55, OH56, and OH57 that contain no response sentence whose gist differs from that of those sentences, and can thereby bring out the user's inner world.

In the conversation processing device 1 and the conversation processing system 100 according to the first embodiment, the output response sentence creation unit 60 creates the output response sentence while omitting at least one of a second-person expression referring to the user, a modifier, and a predicate. This simplifies the response sentence and makes the conversation more natural.

In the conversation processing device 1 and the conversation processing system 100 according to the first embodiment, the output response sentence creation unit 60 creates an output response sentence that includes the input sentence with a confirmation-prompting response sentence and the response sentence for specifying the conversation content. The response therefore introduces nothing beyond what the user said, does not change the subject, emphasizes what the user said, and draws out the user's inner thoughts further.
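The composition of an output response sentence described above can be sketched as follows. This is a minimal illustration in Python, not the implementation disclosed in the specification: the function names, the list of second-person expressions, and the simple concatenation are assumptions made for the example. The sample phrases 「ふーん、ふーん。」 and 「もっと詳しく」 come from the conversation example of FIG. 3, and the suffix 「なのですね。」 is an assumed split of the example response 「不安なのですね。」.

```python
# Hypothetical second-person expressions to omit (an assumption, not from the specification).
SECOND_PERSON_EXPRESSIONS = ("あなたは", "あなたが", "あなたの")


def omit_second_person(sentence: str) -> str:
    """Remove second-person expressions so the response stays short and natural."""
    for expression in SECOND_PERSON_EXPRESSIONS:
        sentence = sentence.replace(expression, "")
    return sentence


def build_output_response(
    input_sentence: str,
    empathy: str = "",
    confirmation_suffix: str = "なのですね。",  # assumed split of 「不安なのですね。」
    specifying: str = "",
) -> str:
    """Concatenate: empathy + (input sentence + confirmation prompt) + specifying response."""
    core = omit_second_person(input_sentence).rstrip("。")
    return f"{empathy}{core}{confirmation_suffix}{specifying}"


print(build_output_response("不安", empathy="ふーん、ふーん。"))
print(build_output_response("あなたは不安", specifying="もっと詳しく。"))
```

Because the response is assembled only from the user's own words plus fixed phrases, nothing outside the user's utterance enters the reply, which is the property the paragraph above relies on.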

The conversation processing device 1 and the conversation processing system 100 according to the first embodiment further include a response sentence selection unit 94 that selects the response sentence for empathy and the response sentence for specifying the conversation content to be used in the output response sentence from among a plurality of types of each, based on a predetermined rule. This configuration prevents the responses from repeating the same pattern and keeps the user from becoming bored.
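One way such a selection rule could look is sketched below in Python. The specification only says "a predetermined rule", so the round-robin rule, the class name `ResponseSelector`, and the candidate phrases are all assumptions made for illustration.

```python
from typing import Sequence


class ResponseSelector:
    """Cycle through candidate response sentences so the same one is not repeated back to back.

    Round-robin is an assumed rule; the specification does not state which rule is used.
    """

    def __init__(self, candidates: Sequence[str]) -> None:
        if not candidates:
            raise ValueError("at least one candidate response sentence is required")
        self._candidates = list(candidates)
        self._index = 0

    def next(self) -> str:
        choice = self._candidates[self._index % len(self._candidates)]
        self._index += 1
        return choice


# Hypothetical candidate empathy responses.
selector = ResponseSelector(["ふーん、ふーん。", "そうなんですね。", "なるほど。"])
print([selector.next() for _ in range(4)])
```

With two or more candidates, consecutive calls never return the same sentence, which is the simplest way to avoid the repeated pattern the paragraph describes.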

The conversation processing device 1 and the conversation processing system 100 according to the first embodiment further include a determination unit 80 that determines whether extraction of the natural language text from the user's utterance has failed and, when it determines that extraction has failed, asks the user to speak again. This prevents defective output response sentences, malfunctions of the conversation processing device 1 and the conversation processing system 100, and similar problems caused by failed natural language text extraction.
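The check performed by the determination unit can be sketched as below. This is only an illustration in Python: the failure criterion (missing or blank text), the function name `judge_extraction`, and the wording of the re-prompt are assumptions, since the specification does not define them.

```python
from typing import Optional, Tuple

# Hypothetical wording for asking the user to speak again.
REPROMPT = "もう一度お願いします。"


def judge_extraction(text: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Return (ok, reprompt).

    Extraction is treated as failed when no text was obtained or the text is blank
    (an assumed criterion). On failure, the second element is the re-prompt to output.
    """
    if text is None or not text.strip():
        return False, REPROMPT
    return True, None


print(judge_extraction("不安です"))
print(judge_extraction("   "))
```

Running the check before any response is composed keeps an empty or unusable input from ever reaching the output response sentence creation step.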

The conversation processing device 1 and the conversation processing system 100 according to the first embodiment include an input sentence storage unit 30 and a response sentence storage unit 40. The output response sentence creation unit 60 reads the input sentence from the input sentence storage unit 30, reads the response sentence for empathy, the confirmation-prompting response sentence, and the response sentence for specifying the conversation content from the response sentence storage unit 40, and creates the output response sentence. With this configuration, information on each user's input sentences is stored in the input sentence storage unit 30, while multiple variants of the response sentences for empathy, the confirmation-prompting response sentences, and the response sentences for specifying the conversation content are stored in the response sentence storage unit 40. The components of the output response sentence can then be read from the respective storage units, so that varied output response sentences suited to the situation, the user's state of mind, and so on can be created smoothly and the conversation can continue.

The conversation processing device 1 and the conversation processing system 100 according to the first embodiment further include the input device 2 and the output device 4. The form in which the user's utterance is input and the form in which the output response sentence is output to the user can therefore be freely set and changed, which increases the versatility of the conversation processing device 1 and the conversation processing system 100.

The conversation processing device 1 and the conversation processing system 100 according to the first embodiment acquire environment information on the input side, including the user. Using this environment information, the output response sentence creation unit 60 can estimate the user's state of mind accurately, create or select an appropriate output response sentence, and continue the conversation.

In the conversation processing device 1 and the conversation processing system 100 according to the first embodiment, the input sentence storage unit 30 stores the user's utterances from past conversations as already-input sentences. When the output response sentence creation unit 60 determines that the input sentence matches an already-input sentence or its tag, or is related to it, for example as a purpose (desire) is related to a means (obstacle or reason), it creates an output response sentence that includes a past response sentence indicating the past, the already-input sentence, and the confirmation-prompting response sentence. With this configuration, the system converses with the user in light of earlier conversations, brings out the user's inner thoughts from both past and present utterances, and encourages the user's own realizations based on both.
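The matching against already-input sentences can be illustrated with the following Python sketch. It covers only exact matches and shared tags; the purpose/means relation mentioned above is not modeled. The names `PastSentence`, `find_related`, and `past_aware_response`, the tag representation, and the phrases 「以前も、」 and 「とのことでしたね。」 are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Sequence


@dataclass(frozen=True)
class PastSentence:
    """An already-input sentence with its tags (the tag format is an assumption)."""
    text: str
    tags: FrozenSet[str] = frozenset()


def find_related(text: str, tags: Iterable[str], history: Sequence[PastSentence]) -> Optional[PastSentence]:
    """Return the first past sentence that matches the text exactly or shares a tag."""
    current_tags = set(tags)
    for past in history:
        if past.text == text or (past.tags & current_tags):
            return past
    return None


def past_aware_response(
    text: str,
    tags: Iterable[str],
    history: Sequence[PastSentence],
    past_phrase: str = "以前も、",               # hypothetical past response sentence
    confirmation_suffix: str = "とのことでしたね。",  # hypothetical confirmation prompt
) -> Optional[str]:
    """Build: past response sentence + already-input sentence + confirmation prompt."""
    past = find_related(text, tags, history)
    if past is None:
        return None
    return f"{past_phrase}{past.text}{confirmation_suffix}"


history = [PastSentence("仕事が不安", frozenset({"仕事"}))]
print(past_aware_response("仕事がつらい", {"仕事"}, history))
```

Returning `None` when nothing in the history is related lets the caller fall back to the ordinary response composition for the current utterance.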

[Second Embodiment]
Next, a second embodiment of the present invention will be described.

(Conversation processing device, conversation processing system)
FIG. 4 is a block diagram showing an example of the conversation processing system 110 according to the second embodiment. As shown in FIG. 4, the conversation processing system 110 includes at least the conversation processing device 1. The conversation processing device 1 and the conversation processing system 110 according to the second embodiment share part of their configuration with the conversation processing device 1 and the conversation processing system 100 according to the first embodiment. The following description therefore focuses on the parts of the second embodiment that differ from the first embodiment; components common to the first embodiment are given the same reference numerals, and their description is omitted.

As shown in FIG. 4, the conversation processing system 110 includes a robot apparatus 6 having an input unit 2X, an output unit 4X, and a robot control unit 5. The input unit 2X has the same functions as the input device 2 described in the first embodiment and is composed of a microphone, a speaker, and the like.
The output unit 4X has the same functions as the output device 4 described in the first embodiment, but in the second embodiment it includes a head (head part) 121, a neck 125, a torso (torso part) 131, hands (hand parts) 132, and legs (leg parts) 133 that move in accordance with the output response sentence. As shown in FIG. 5, the head 121 has movable eyes (eye parts) 122, ears (ear parts) 123, a nose (nose part) 124, and a mouth (mouth part) 126, and also has decorative features such as whiskers and hair. The conversation processing system 110 further includes servomotors (not shown) that move the eyes 122, the ears 123, the nose 124, the whiskers, and so on; input devices (not shown) such as a camera, input buttons, a pulse wave meter, a thermometer, and a gaze detection device, which serve as an environment input unit 2Z that acquires environment information; and a display device (not shown) for outputting the input voice data, pulse wave and pulse data, body temperature, and gaze data. In other words, the conversation processing system 110 is a so-called robot-type system. All of these components may be included, or only some of them.

The robot apparatus 6 further includes a robot input unit 141, such as a touch panel or keyboard, through which the user's utterance can be input by means other than voice, and a robot output unit 142 that can output the content of the output response sentence as visual information. The robot output unit 142 can be, for example, a display such as a liquid crystal display or plasma display, or a stereoscopic image display device. FIG. 5 shows an example in which the robot input unit 141 and the robot output unit 142 are provided in a touch-panel device on the front of the torso 131 (that is, the surface facing the user), but the robot input unit 141 and the robot output unit 142 may be separate devices.

As system software for robot control, the conversation processing system 110 includes a real-time OS such as Raspberry Pi, an Internet communication program, a browser program, and, running on top of these, JavaScript (registered trademark) or the like, which mainly processes interface-related data such as input and output. The main body (program) of the conversation processing device 1 may be located on the robot side, that is, inside the head 121 or the torso 131, or the main part of it other than the interface may be located on a server. A speech recognition server of Google (Google LLC) can also be used as the conversation processing device 1. Nonverbal information such as pulse wave and pulse data, body temperature, and gaze data is sent to a nonverbal information recognizer (or server), not shown, which identifies the user's emotion. The nonverbal information and the emotion are sent to the output response sentence creation unit 60 together with the natural language text recognized from the user's voice data input through the input unit 2X and the information input through the environment input unit 2Z. Note that the voice recognition unit 12 and the voice generation unit 13 may be located inside the head 121 or the torso 131 of the robot apparatus 6 instead of in the conversation processing device 1.

(Conversation processing method, program)
The conversation processing method of the second embodiment is basically the same as that of the first embodiment and follows the procedure shown in the flowchart of FIG. 2. The following description therefore focuses on the procedures and content that differ from the first embodiment, and the description of content common to the first embodiment is omitted.

In step S11, the user can speak toward the input unit 2X of the robot apparatus 6 and can also enter the content of the utterance via the hand 132, an input device (not shown), or the robot input unit 141.

In step S12, whether the output response sentence creation unit 60 has received the natural language text based on the user's voice can be checked using an output device with a text display function provided on the robot apparatus 6.

In step S18, the output response sentence output from the output response sentence creation unit 60 is sent to the voice generation unit 13, converted into voice information, and sent to the robot control unit 5. Based on the received voice information, the robot control unit 5 emits the voice from a speaker provided in the mouth 126 of the head 121 and expresses the various movements that accompany speech, such as back-channel responses, nodding, tilting the head, and gestures, with the head 121, the torso 131, the hands 132, and the legs 133. In addition, the eyes 122, ears 123, nose 124, and mouth 126 on the head 121 are moved to form emotional expressions that reassure the user, convey trustworthiness, or show concern, so that the robot apparatus 6 performs nonverbal communication with the user in step with the output response sentence.
If the mouth 126 has no speaker and only opens and closes, the voice may be emitted from a speaker or the like provided near the robot apparatus 6.

For example, in the conversation example shown in FIG. 3, for the agent's sentence OH51 the head 121 nods as “Hmm, hmm.” (「ふーん、ふーん。」) is spoken, and as “You are anxious, aren't you?” (「不安なのですね。」) is spoken the eyes take on a worried look and the hands are folded in front of the torso 131 or extended toward the user. For the agent's sentences OH52, OH53, ..., OH56, the eyes, mouth, and hand gestures show an increasingly gentle expression each time “Tell me more” (「もっと詳しく」) is output. For the agent's sentence OH57, happy-looking eyes and mouth and hand movements are shown as “That's good to hear.” (「よかったですね。」) is spoken. The output response sentence can also be output as text via the robot output unit 142.
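The pairing of spoken phrases and movements in this example can be written as a small lookup, sketched here in Python. The phrase keys come from the FIG. 3 example above; the command strings such as `"head:nod"`, the rule table, and the function name `plan_gestures` are hypothetical and do not correspond to any actual robot control interface.

```python
from typing import List, Tuple

# Phrase fragments from the FIG. 3 example mapped to hypothetical movement commands.
GESTURE_RULES: List[Tuple[str, List[str]]] = [
    ("ふーん", ["head:nod"]),
    ("不安なのですね", ["eyes:worried", "hands:fold_in_front_of_torso"]),
    ("もっと詳しく", ["face:gentle"]),
    ("よかったですね", ["eyes:happy", "mouth:smile", "hands:happy_motion"]),
]


def plan_gestures(output_response: str) -> List[str]:
    """Collect the movement commands for every known phrase found in the response."""
    commands: List[str] = []
    for phrase, actions in GESTURE_RULES:
        if phrase in output_response:
            commands.extend(actions)
    return commands


print(plan_gestures("ふーん、ふーん。不安なのですね。"))
```

A response that contains none of the listed phrases yields an empty list, so speech can still be output without any accompanying movement.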

The body of the robot apparatus 6, including the head 121, torso 131, hands 132, legs 133, eyes 122, ears 123, nose 124, neck 125, mouth 126, and so on, may exist physically, or all or part of that body may be represented virtually. For example, the head 121, torso 131, hands 132, legs 133, and so on may be represented as a character or avatar modeled on a robot or a person, in a form the user can see optically, such as a two-dimensional moving or still image, a hologram, augmented reality, or virtual reality.
In this case, the input unit 2X and the output unit 4X are each a microphone, speaker, or the like provided near the robot apparatus 6, and the various movements that accompany speech are expressed by the virtually represented head 121, torso 131, hands 132, legs 133, eyes 122, ears 123, nose 124, neck 125, and mouth 126 in step with the voice output from the output unit 4X.
By virtually representing at least part of the body, including the head 121, torso 131, hands 132, legs 133, and so on, in this way, the user can converse over a network even in an environment where no physical robot is close at hand.
In addition, the appearance of the robot apparatus 6 can easily be changed to suit the user's preferences so that the user feels an affinity with it.

As described above, the conversation processing device 1 and the conversation processing system 110 according to the second embodiment basically have the same configuration as the conversation processing device 1 and the conversation processing system 100 of the first embodiment, and therefore provide the same effects as the first embodiment.
In addition, the conversation processing device 1 and the conversation processing system 110 according to the second embodiment include the head 121, torso 131, hands 132, and legs 133 that move in accordance with the output response sentence. The head 121 has the eyes 122, ears 123, nose 124, neck 125, and mouth 126, and the conversation processing system 110 further includes the environment input unit 2Z. This configuration makes it possible to realize a conversation processing system that stays even closer to the user, offers a variety of input and output forms, and can express a wide range of responses.

In addition, the conversation processing system 110 according to the second embodiment includes the robot input unit 141 and the robot output unit 142, so it can stay close to the user through a body rendered as a two-dimensional image and continue the conversation.

Each component of the conversation processing device 1 described above contains a computer system. A program for implementing the functions of each component of the conversation processing device 1 may be recorded on a computer-readable recording medium, and the processing in each component may be performed by having a computer system read and execute the program recorded on that medium. Here, “having a computer system read and execute the program recorded on the recording medium” includes installing the program on the computer system. The “computer system” here includes an OS and hardware such as peripheral devices.

The “computer system” may also include a plurality of computer devices connected via a network that includes communication lines such as the Internet, a WAN, a LAN, or a dedicated line. The “computer-readable recording medium” refers to a portable medium such as a flexible disk, magneto-optical disk, ROM, or CD-ROM, or to a storage device such as a hard disk built into the computer system. The recording medium storing the program may thus be a non-transitory recording medium such as a CD-ROM.

The recording medium also includes an internal or external recording medium that a distribution server can access in order to distribute the program. The program may be divided into multiple parts that are downloaded at different times and then combined in the components of the conversation processing device 1, and the divided parts may be distributed by different distribution servers. The “computer-readable recording medium” also includes media that hold the program for a certain period of time, such as volatile memory (RAM) inside a computer system that acts as a server or client when the program is transmitted over a network. The program may implement only some of the functions described above. It may also be a so-called difference file (difference program), which implements those functions in combination with a program already recorded in the computer system.

Some or all of the functions described above may be implemented as an integrated circuit such as an LSI (Large Scale Integration). The functions may each be implemented as an individual processor, or some or all of them may be integrated into a single processor. The circuit integration method is not limited to LSI and may use a dedicated circuit or a general-purpose processor. If advances in semiconductor technology yield an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

Each component of the storage device 3 described above may be, for example, a computer-readable recording medium.

The conversation processing device, conversation processing system, conversation processing method, and program according to embodiments of the present invention have been described above. However, the conversation processing device, conversation processing system, conversation processing method, and program according to the present invention are not limited to the embodiments described above, and various modifications are possible within the gist of the present invention as set out in the claims.

For example, as noted above, the form in which the user's utterance is input is not limited as long as the natural language text of the utterance can be extracted for the output response sentence creation unit 60. The user's utterance may be entered as characters (text) using an input device 2 such as a keyboard or digital pen instead of by voice. Likewise, the output form of the output response sentence need not match the input form and is not particularly limited as long as the content of the output response sentence can be conveyed to the user. The output response sentence may be output as text on an output device 4 such as a display instead of by voice.

DESCRIPTION OF SYMBOLS
1 ... Conversation processing device
2 ... Input device
4 ... Output device
30 ... Input sentence storage unit
40 ... Response sentence storage unit
60 ... Output response sentence creation unit
80 ... Determination unit (natural language element extraction failure determination unit)
100, 110 ... Conversation processing system

Claims

A conversation processing apparatus comprising an output response sentence creating unit that uses at least part of the natural language elements included in a user's speech as an input sentence and creates an output response sentence based on a response sentence including any one or more of: an input sentence with a confirmation prompting response sentence, in which a confirmation prompting response sentence for promoting confirmation of the content of the speech is connected to the end of the input sentence; a response sentence for empathy, indicating empathy for the speech; and a response sentence for specifying the conversation content, which makes the content of the conversation with the user more specific.

The output response sentence creation unit creates the output response sentence by omitting any one or more of second-person expressions, modifiers, and predicates that indicate the user itself,
The output response sentence includes at least an input sentence with a confirmation prompting response sentence, and the conversation content instantiation response sentence,
The output response sentence creating unit creates the output response sentence including the input sentence with the confirmation prompting response sentence and the response sentence for instantiating the conversation content.
The conversation processing apparatus according to claim 1.

The conversation processing apparatus according to claim 1 or 2, further comprising a natural language element extraction failure determination unit that determines whether extraction of the natural language elements from the speech based on the user's input has failed and, when it determines that extraction has failed, requests the user to speak again.

A conversation processing system comprising:
the conversation processing device according to any one of claims 1 to 3;
an input sentence storage unit that extracts the natural language elements included in the speech and stores at least part of the natural language elements as the input sentence;
a log storage unit that records the user's past conversation contents including the input sentence; and
a response sentence storage unit that stores the response sentence for empathy, the input sentence and the confirmation prompting response sentence connected to the input sentence, and the response sentence for specifying the conversation content,
wherein the output response sentence creation unit reads the input sentence from the input sentence storage unit and the log storage unit, reads the response sentence for empathy, the input sentence and the confirmation prompting response sentence connected to the input sentence, and the response sentence for specifying the conversation content from the response sentence storage unit, and creates the output response sentence.

An input device for inputting the user's speech;
An output device for outputting the output response sentence as audio information or visual information;
Further comprising
The conversation processing system according to claim 4.

The input device acquires environment information on the input side including a user,
The conversation processing system according to claim 5, wherein the output response sentence creating unit creates the output response sentence based on the environment information.

The input sentence storage unit and the log storage unit store the user's remarks at the time of past conversation as already input sentences,
The output response sentence creation unit, when it is determined that the input sentence matches or relates to the already input sentence or the tag of the already input sentence, the past response sentence indicating the past, the already input sentence, the confirmation promoting Creating the output response sentence including a response sentence;
The conversation processing system according to any one of claims 4 to 6.

It further comprises any one or more of a head, a torso, a hand, and a leg that move according to the output response sentence.
The conversation processing system according to any one of claims 4 to 7.

Any one or more of the head, the body part, the hand part, and the leg part are virtually represented,
The conversation processing system according to claim 8.

A conversation processing method comprising:
an input sentence acquisition step of extracting natural language elements included in a user's utterance and acquiring an input sentence; and
an output response sentence creating step of creating an output response sentence using a response sentence including any one or more of: an input sentence with a confirmation prompting response sentence, in which a confirmation prompting response sentence that promotes confirmation of the content of the utterance is connected to the end of the input sentence; a response sentence for empathy, indicating empathy for the utterance; and a response sentence for specifying the conversation content, which makes the content of the conversation with the user more specific.

A program that causes the following steps to be executed:
an input sentence acquisition step of extracting natural language elements included in a user's utterance and acquiring an input sentence; and
an output response sentence creating step of creating an output response sentence using a response sentence including any one or more of: an input sentence with a confirmation prompting response sentence, in which a confirmation prompting response sentence that promotes confirmation of the content of the utterance is connected to the end of the input sentence; a response sentence for empathy, indicating empathy for the utterance; and a response sentence for specifying the conversation content, which makes the content of the conversation with the user more specific.