JP2020184342A

JP2020184342A - Dialog management server, dialog management method, and program

Info

Publication number: JP2020184342A
Application number: JP2020088881A
Authority: JP
Inventors: Yoshihisa Suzuki; Kenji Yoshida; Akira Igarashi; Fumiko Ideushi; Kohei Yasuda; Kimihiro Kinugasa; Ryosuke Tajima; Yoshihiro Ota
Original assignee: Arithmer Inc
Current assignee: Arithmer Inc
Priority date: 2019-04-26
Filing date: 2020-05-21
Publication date: 2020-11-12
Also published as: JP2020184294A; JP6710007B1

Abstract

To provide a dialog management server, a weight determination device, a validity determination device, a dialog management method, and a program that make effective use of manuals and the like while improving user convenience and operational efficiency within an organization.

SOLUTION: A dialog management server 1 includes: a recording unit 11 that records document data in a structured format having multiple pieces of unit information, each of which associates text data of an explanatory text with information identifying the one or more hierarchy levels leading to that explanatory text and with text data representing the headings of those levels; a reception unit 19 that receives text data of a question sentence from a user; and an answer generation unit 27 that matches the text data of the question sentence received by the reception unit 19 against the text data included in each piece of unit information recorded in the recording unit 11, extracts the unit information related to the question sentence, and generates answer output information based on the headings and explanatory text corresponding to the extracted unit information.

SELECTED DRAWING: Figure 17

Description

The present invention relates to a dialogue management server, a dialogue management method, and a program.

Conventionally, in a company, when a clerical worker receives a question about business operations from a salesperson, the clerical worker extracts the necessary information from a huge business manual and answers each question one by one. In this context, call systems are known that allow an operator to respond by telephone to inquiries from customers.

For example, Patent Document 1 describes a call system that manages the content of calls between customers and operators.

International Publication No. 2012/120624

However, in the conventional system described in Patent Document 1, when the number of telephone inquiries from customers is large, dedicated operators are needed to handle them, so operations within the organization cannot be made more efficient. In addition, depending on the content of the inquiry, the operator may not be able to answer immediately, so the service cannot be said to be highly convenient for customers.
Manuals and Q&A collections for customers are also sometimes prepared in order to respond to customer inquiries. However, such manuals can be enormous, so customers often do not know where to look in the manual and end up inquiring by telephone after all. Thus, even though manuals and the like exist, they cannot be said to be used effectively.

Some aspects of the present invention have been made in view of these circumstances, and an object thereof is to provide a dialogue management server, a dialogue management method, and a program that can improve operational efficiency within an organization, improve user convenience, and make effective use of manuals and the like.

A dialogue management server according to one aspect of the present invention includes: a recording unit that records data of a document in a structured format having a plurality of pieces of unit information, each of which associates text data of an explanatory text with information for identifying one or more hierarchy levels leading to the explanatory text and with text data representing the headings of those levels; a reception unit that receives text data of a question sentence from a user; and an answer generation unit that matches the text data of the question sentence received by the reception unit against the text data included in each piece of unit information recorded in the recording unit, extracts unit information related to the question sentence, and generates answer output information based on the headings and explanatory text corresponding to the extracted unit information.

A dialogue management method according to one aspect of the present invention includes: a step of recording data of a document in a structured format having a plurality of pieces of unit information, each of which associates text data of an explanatory text with information for identifying one or more hierarchy levels leading to the explanatory text and with text data representing the headings of those levels; a step of receiving text data of a question sentence from a user; and a step of matching the text data of the received question sentence against the text data included in each piece of recorded unit information, extracting unit information related to the question sentence, and generating answer output information based on the headings and explanatory text corresponding to the extracted unit information.

A program according to one aspect of the present invention causes a computer to function as: a recording unit that records data of a document in a structured format having a plurality of pieces of unit information, each of which associates text data of an explanatory text with information for identifying one or more hierarchy levels leading to the explanatory text and with text data representing the headings of those levels; a reception unit that receives text data of a question sentence from a user; and an answer generation unit that matches the text data of the question sentence received by the reception unit against the text data included in each piece of unit information recorded in the recording unit, extracts unit information related to the question sentence, and generates answer output information based on the headings and explanatory text corresponding to the extracted unit information.

In the present invention, a "unit" does not simply mean a physical means; it also covers cases where the function of the "unit" is realized by software. Furthermore, the function of one "unit" or device may be realized by two or more physical means or devices, and the functions of two or more "units" or devices may be realized by a single physical means or device.

According to the present invention, operational efficiency within an organization can be improved, user convenience can be improved, and manuals and the like can be used effectively.

A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the first embodiment.
A conceptual diagram showing an example of the process of converting a document such as business manual data created with general word-processing or spreadsheet software into a structured document such as OOXML (Office Open XML).
A conceptual diagram showing an example of the process of converting a structured document such as OOXML into a structured document in the specific format particular to this embodiment.
A flowchart showing an example of the dialogue management process according to the first embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the first embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the first embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the second embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the third embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the fourth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the fifth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the fifth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the sixth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the sixth embodiment.
A diagram showing an example of a screen of the display unit of the operator terminal device according to the sixth embodiment.
A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the seventh embodiment.
A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the eighth embodiment.
A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the ninth embodiment.
A schematic diagram showing an example of the subordinate concept word dictionary according to the ninth embodiment.
A flowchart showing an example of the dialogue management process according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to the ninth embodiment.
A diagram showing an example of a screen of the display unit of the user terminal device according to Modification 6 of the ninth embodiment.
A diagram showing an example of an icon according to Modification 6 of the ninth embodiment.
A diagram showing an example of an icon according to Modification 6 of the ninth embodiment.
A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the tenth embodiment.
A flowchart for explaining the processing of the weight determination device according to the tenth embodiment.
A flowchart for explaining the processing of the weight determination device according to the tenth embodiment.
A flowchart for explaining the processing of the weight determination device according to the tenth embodiment.
A diagram showing an example of the initial value list according to the tenth embodiment.
A diagram showing an example of the weight candidate list according to the tenth embodiment.
A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the eleventh embodiment.
A flowchart for explaining the processing of the validity determination device according to the eleventh embodiment.
A schematic configuration diagram (system configuration diagram) of the dialogue management system according to the twelfth embodiment.

Embodiments of the present invention will be described below with reference to the accompanying drawings. The following embodiments are examples for explaining the present invention and are not intended to limit the present invention to those embodiments alone. The present invention can also be modified in various ways without departing from its gist. In addition, the same components are given the same reference numerals across the drawings wherever possible, and duplicate descriptions are omitted.

<First Embodiment>
FIG. 1 is a schematic configuration diagram (system configuration diagram) of a dialogue management system according to an embodiment of the present invention. As shown in FIG. 1, the dialogue management system 100 illustratively includes a dialogue management server 1, a dialogue log database (DB) 4, a user terminal device 6, and an operator terminal device 8.

The dialogue management server 1 is a server computer that manages dialogue on a predetermined network N, and it provides its server functions by running a predetermined server program on that computer. "Dialogue on the network N" refers to dialogue between a user and the dialogue management server 1 via the network N, and to dialogue between an operator and the dialogue management server 1 via the network N. For example, "dialogue on the network N" includes a question input from the user terminal device 6 to the dialogue management server 1 via the network N, the answer to that question generated by the dialogue management server 1, and the user's response to that answer. Likewise, "dialogue on the network N" includes a question input from the operator terminal device 8 to the dialogue management server 1 via the network N, the answer to that question generated by the dialogue management server 1, and the operator's response to that answer.

The dialogue log DB 4 is a database that records the dialogue between the user and the dialogue management server 1 and the dialogue between the operator and the dialogue management server 1. For example, it records a dialogue log (dialogue result) that includes a question input from the user terminal device 6 to the dialogue management server 1 via the network N, the answer to that question generated by the dialogue management server 1, and the user's response to that answer. The dialogue log DB 4 may be included in the dialogue management server 1.

The user terminal device 6 and the operator terminal device 8 are devices on which the content of the dialogue is output, for example information output devices such as laptop or notebook computers. The user terminal device 6 and the operator terminal device 8 may also be information output devices including mobile phones such as smartphones, tablet terminals, and the like.

The network N is a communication line or communication network for information processing, including, for example, the Internet. Its specific configuration is not particularly limited as long as data can be sent and received among the dialogue management server 1, the dialogue log DB 4, the user terminal device 6, and the operator terminal device 8.

The dialogue management server 1 illustratively includes: an information processing unit 10 that executes the dialogue management process for managing dialogue on the predetermined network N; a recording unit 11 that records the information required for the dialogue management process and the information generated by it; a technical term dictionary DB 12 that stores the technical terms of the industry in which the documents are used; and a synonym dictionary DB 13 that stores synonyms of words.

The information processing unit 10 functionally includes, for example, an acquisition unit 15, a data conversion unit 17, a reception unit 19, a natural language processing unit 21, a weighting setting unit 23, an answer generation unit 27, a dialogue result management unit 29, an output control unit 31, a voice recognition processing unit 33, and a voice dialogue management unit 35.

Each of the above units of the information processing unit 10 can be realized, for example, by using a storage area such as a memory or hard disk, or by having a processor execute a program stored in the storage area. The DBs 12 and 13 of the dialogue management server 1 can likewise be realized through execution by the processor.

The acquisition unit 15 acquires, for example, the documents from which answers are to be drawn. Here, a "document" means, for example, a text document such as a business manual or Q&A collection used in business operations, a document created with word-processing software, a document created with presentation software, or the like.

The data conversion unit 17 (conversion unit) takes the data of a document whose content has a hierarchical structure of parts, chapters, sections, clauses, sub-items, and the like, and converts it into a document in a structured format in which each piece of text data contained in an item at the end of the hierarchical structure is associated with information that can identify the one or more levels leading to that item and with the heading of each of those levels. The predetermined unit may be, for example, the item located at the end (lowest level) of the hierarchical structure. In that case, each piece of text data contained in an item at the end of the hierarchical structure is associated with the level-identifying information and the level headings described above when the document is converted into the structured format.

With reference to FIGS. 2 and 3, the process of converting a document such as a business manual or Q&A collection into a structured document in the specific format particular to this embodiment will be described. The example described here first converts a document such as business manual data created with general word-processing or spreadsheet software into a structured document such as OOXML (Office Open XML), and then further converts it into a structured document in the specific format used by the dialogue management server 1 of this embodiment. However, converting a document such as a business manual into the specific format particular to this embodiment does not necessarily have to go through OOXML, and various conversion processes may be adopted.

FIG. 2 is a conceptual diagram showing an example of the process of converting a document such as business manual data created with general word-processing or spreadsheet software into a structured document such as OOXML (Office Open XML). As shown in FIG. 2, the data conversion unit 17 converts the business manual data of a Word file into a collection of files described in OOXML (Office Open XML).
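As a rough illustration of what a "collection of files described in OOXML" looks like, the sketch below assumes the Word file is a .docx, which is a ZIP package of XML parts. It builds a tiny stand-in package in memory and lists its parts. The function names and the sample content are hypothetical and are not taken from the patent.

```python
import io
import zipfile


def list_ooxml_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of the parts stored in a .docx (ZIP) package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return sorted(package.namelist())


def make_sample_package() -> bytes:
    """Build a minimal stand-in package (not a complete, valid Word file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


parts = list_ooxml_parts(make_sample_package())
print(parts)
```

In a real .docx, the body text and heading styles would typically be read from the `word/document.xml` part before the second conversion step.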

FIG. 3 is a conceptual diagram showing an example of the process of converting a structured document such as OOXML (Office Open XML) into a structured document in the specific format particular to this embodiment. As shown in FIG. 3, the data conversion unit 17 converts the business manual data, already converted into a collection of files described in OOXML, into a specific XML format. Specifically, the data conversion unit 17 extracts the hierarchically structured business manual data so that it can be referenced by each structural unit of the document (part, chapter, section, clause, sub-item, and so on). For each extracted structural unit, it associates each piece of text data contained in a structural unit (item) at the end of the hierarchy (in FIG. 3, "At the time of a transaction, … is performed.") with information on the hierarchy levels of that structural unit and information on the heading of each level, converts the result into a document in a structured format, and records that document in the recording unit 11 as document information BI (character information), described later.
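The conversion to unit information can be sketched as a tree walk that emits one record per terminal item, carrying the identifiers and headings of every level on the path to it. This is a minimal sketch under stated assumptions: the patent describes a specific XML format, whereas this example uses plain Python dictionaries and a dataclass, and the names `UnitInfo`, `flatten`, and the sample manual text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UnitInfo:
    level_ids: tuple   # identifies each level leading to the text, e.g. (1, 1, 2)
    headings: tuple    # heading text of each level on that path
    text: str          # explanatory text held by the terminal item


def flatten(node, level_ids=(), headings=()):
    """Walk the hierarchy and emit one UnitInfo per terminal (leaf) item."""
    level_ids = level_ids + (node["id"],)
    headings = headings + (node["heading"],)
    children = node.get("children", [])
    if not children:
        return [UnitInfo(level_ids, headings, node.get("text", ""))]
    units = []
    for child in children:
        units.extend(flatten(child, level_ids, headings))
    return units


# Hypothetical sample manual; the wording is invented for illustration.
manual = {
    "id": 1, "heading": "Part 1: Transactions", "children": [
        {"id": 1, "heading": "Chapter 1: Identity verification", "children": [
            {"id": 1, "heading": "Section 1: Accepted documents",
             "text": "A pension handbook may be presented."},
            {"id": 2, "heading": "Section 2: Missing address",
             "text": "Ask for a supplementary document."},
        ]},
    ],
}

units = flatten(manual)
for unit in units:
    print(unit.level_ids, " > ".join(unit.headings), "|", unit.text)
```

Keeping the full heading path on every record means a matched item can later be shown together with where it sits in the manual, without walking the tree again.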

Returning to FIG. 1, the reception unit 19 receives the input of a question from the user. The reception unit 19 also receives, from the user, the input of a response to the answer to the question.

The natural language processing unit 21, for example, reads the character information in the document and the character information contained in the input question, performs morphological analysis on it, and segments it into words. "Morphological analysis" is a form of natural language processing by computer: an analysis process that divides natural-language text data carrying no grammatical annotation into morphemes, based on information such as the grammar of the target language and the parts of speech of words, and determines the part of speech and other attributes of each morpheme. The natural language processing unit 21 may also perform dependency analysis (syntactic analysis), that is, examine the modification relationships within a sentence based on the part-of-speech information.

The weighting setting unit 23 calculates the frequency of occurrence of the words contained in the character information in the document and in the character information contained in the input question, and sets a weight for each word based on the calculated frequency.
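The text does not specify how a frequency is turned into a weight. As one possible reading, the sketch below assumes an inverse-document-frequency style rule, so that words appearing in fewer unit texts receive larger weights. The function name `word_weights`, the formula, and the pre-tokenized sample data are assumptions for illustration only.

```python
import math
from collections import Counter


def word_weights(tokenized_units):
    """Assign each word a weight from how many unit texts contain it.

    Assumed rule (not from the patent): weight = log(N / df) + 1, where N is
    the number of unit texts and df is the number of texts containing the word.
    """
    n_units = len(tokenized_units)
    doc_freq = Counter()
    for words in tokenized_units:
        doc_freq.update(set(words))  # count each word once per unit text
    return {word: math.log(n_units / df) + 1.0 for word, df in doc_freq.items()}


# Hypothetical unit texts, already segmented into words.
tokenized_units = [
    ["pension", "handbook", "address"],
    ["address", "change"],
    ["address", "fee"],
]

weights = word_weights(tokenized_units)
print(weights)
```

With this rule, a word found in every unit text ("address" here) gets the minimum weight of 1.0, while rarer words count for more in matching.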

The answer generation unit 27 matches the input question against the text data contained in each item of the recorded document information, extracts a plurality of items related to the question, generates an answer that includes the one or more hierarchy levels leading to each extracted item together with their headings, and returns it to the user as the answer to the question. As described later, the answer generation unit 27 may also learn the matching weights based on the dialogue results stored in the dialogue log database 4.
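One simple way to picture the matching step is a weighted word-overlap score between the question and each item, followed by returning the best items with their heading paths. The sketch below assumes that scoring rule; the patent does not state the matching formula, and the names `score`, `generate_answer`, the weights, and the sample items are hypothetical.

```python
def score(question_words, unit_words, weights):
    """Sum the weights of the words shared by the question and the unit text."""
    shared = set(question_words) & set(unit_words)
    return sum(weights.get(word, 1.0) for word in shared)


def generate_answer(question_words, units, weights, top_n=3):
    """Return heading path plus text for the best-matching units."""
    scored = [(score(question_words, u["words"], weights), u) for u in units]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [" > ".join(u["headings"]) + ": " + u["text"] for _, u in hits[:top_n]]


# Hypothetical unit information and weights, invented for illustration.
units = [
    {"headings": ("Chapter 1", "Identity documents"),
     "words": ["pension", "handbook", "address"],
     "text": "A pension handbook may be presented."},
    {"headings": ("Chapter 2", "Fees"),
     "words": ["fee", "transfer"],
     "text": "Transfer fees are listed here."},
    {"headings": ("Chapter 1", "Address change"),
     "words": ["address", "change"],
     "text": "Submit an address change form."},
]
weights = {"pension": 2.0, "handbook": 2.0, "address": 1.0}

answers = generate_answer(["pension", "handbook", "address", "valid"], units, weights)
for line in answers:
    print(line)
```

Items with no shared words are dropped, so the user sees only candidates that relate to the question, each labeled with the levels and headings that lead to it.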

The dialogue result management unit 29 stores in the dialogue log database 4 a dialogue result that includes the question input by the user, the answer to that question, and the user's response to that answer. The dialogue result management unit 29 also stores, for example, the voice information recognized by the voice recognition processing unit 33, described later, in the dialogue log database 4 as a dialogue result.

The output control unit 31 controls the output of the answer generated by the answer generation unit 27. Specifically, it performs control so that the generated answer is output on the display units (not shown) of the user terminal device 6 and the operator terminal device 8.

The voice recognition processing unit 33 recognizes the voice information of a voice dialogue between the user and the operator.

When questions and their answers have been exchanged with a given user a predetermined number of times, the voice dialogue management unit 35 performs management so that a voice dialogue with that user becomes possible.
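The rule above can be sketched as a per-user counter that enables voice dialogue once a threshold is reached. The class name `VoiceDialogueManager`, the threshold value of 3, and the user identifiers are hypothetical; the patent only says "a predetermined number of times".

```python
class VoiceDialogueManager:
    """Enable voice dialogue after a set number of question/answer exchanges."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed value for illustration
        self.exchanges = {}         # user id -> number of exchanges so far

    def record_exchange(self, user_id):
        """Count one question/answer exchange and report the voice status."""
        self.exchanges[user_id] = self.exchanges.get(user_id, 0) + 1
        return self.voice_enabled(user_id)

    def voice_enabled(self, user_id):
        return self.exchanges.get(user_id, 0) >= self.threshold


manager = VoiceDialogueManager(threshold=3)
results = [manager.record_exchange("user-1") for _ in range(3)]
print(results)
```

Counting per user keeps one user's repeated questions from opening a voice channel for anyone else.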

The recording unit 11 records the document converted into the structured format as document information BI. The recording unit 11 also records, in association with each user, dialogue result information TI (dialogue results) that includes the question input by the user, the answer to that question, and the user's response to that answer. In addition, the recording unit 11 may record stamp information SI, which is information that the user inputs via the network N.
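A minimal sketch of recording dialogue results per user is shown below, assuming an in-memory store in place of the recording unit or database. The names `DialogueResult` and `DialogueLog` and the sample strings are hypothetical and chosen only to mirror the three elements named in the text: question, answer, and response.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DialogueResult:
    question: str  # question input by the user
    answer: str    # answer generated for that question
    response: str  # user's response to the answer


class DialogueLog:
    """In-memory stand-in for storing dialogue results per user."""

    def __init__(self):
        self._by_user = defaultdict(list)

    def record(self, user_id, result):
        self._by_user[user_id].append(result)

    def history(self, user_id):
        return list(self._by_user.get(user_id, []))


log = DialogueLog()
log.record("user-1", DialogueResult(
    question="Is a pension handbook without an address valid?",
    answer="Chapter 1 > Identity documents",
    response="Helpful",
))
print(log.history("user-1"))
```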

A "stamp" is image information for dialogue that is input via the network N. A stamp is, for example, an illustration that expresses an emotion, an intention, or a message to be conveyed, and it may include text.

(Dialogue management process)
The dialogue management process according to the embodiment of the present invention will be described with reference to FIGS. 4 and 5. FIG. 4 is a flowchart showing an example of the dialogue management process according to the embodiment of the present invention.

(Step S1)
The acquisition unit 15 shown in FIG. 1 acquires documents such as business manuals and Q&A collections.

（ステップＳ３）
図１に示すデータ変換部１７は、図３に示すように、文書の内容が階層構造を有する文
書のデータに基づいて、例えば、階層構造の末端に位置する構成単位、すなわち最下層の
ノードに含まれるテキストデータのそれぞれを、当該各構成単位に至る一又は複数の階層
を特定可能な情報及び当該各構成単位に至る一又は複数の各階層の見出しと対応付けて構
造化した形式の文書に変換する。 (Step S3)
As shown in FIG. 3, the data conversion unit 17 shown in FIG. 1 takes the data of a document whose content has a hierarchical structure and converts it into a document in a structured format. For example, each piece of text data contained in a structural unit located at the end of the hierarchy, that is, in a lowest-layer node, is associated with information that identifies the one or more layers leading to that structural unit and with the heading of each of those layers.
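As a rough illustration of step S3, the sketch below flattens a nested document tree into unit records, each pairing a leaf text with the layer path and headings that lead to it. The `Unit` type, the `flatten` function, the dictionary layout (`heading`, `children`, `text`), and the sample manual are assumptions made for this example; the specification does not prescribe a data format.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    path: tuple      # child indices identifying each layer down to the leaf
    headings: tuple  # heading of every layer on the way to the leaf
    text: str        # descriptive text held by the lowest-layer node

def flatten(node, path=(), headings=()):
    """Yield one Unit per leaf of a nested {"heading", "children" or "text"} tree."""
    headings = headings + (node["heading"],)
    children = node.get("children", [])
    if not children:
        yield Unit(path, headings, node.get("text", ""))
        return
    for index, child in enumerate(children):
        yield from flatten(child, path + (index,), headings)

# Invented sample data, for illustration only.
manual = {
    "heading": "Management Part",
    "children": [
        {"heading": "Chapter 1", "children": [
            {"heading": "Identity verification", "text": "Sample descriptive text."},
        ]},
    ],
}
units = list(flatten(manual))
```

Each resulting `Unit` carries enough information to show the user where in the manual the text came from, which is what the later answer generation relies on.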

（ステップＳ５）
図１に示す記録部１１は、構造化した形式に変換された文書を文書情報ＢＩとして記録
する。 (Step S5)
The recording unit 11 shown in FIG. 1 records a document converted into a structured format as document information BI.

（ステップＳ７）
図１に示す受付部１９は、ユーザからの質問の入力を受け付ける。 (Step S7)
The reception unit 19 shown in FIG. 1 receives input of a question from the user.

図５は、本発明の実施形態に係るユーザ端末装置の表示部の画面の一例を示す図である
。図５に示すように、図１に示すユーザ端末装置６の画面Ｇに表示されているとおり、受
付部１９は、ユーザＵから入力された「本人確認書類で年金手帳を持参した。住所の記載
がない場合は有効か？」という質問Ｔ１を受け付ける。 FIG. 5 is a diagram showing an example of a screen of the display unit of the user terminal device according to the embodiment of the present invention. As displayed on the screen G of the user terminal device 6 shown in FIG. 1, the reception unit 19 accepts question T1 entered by user U: "I brought a pension handbook as an identity verification document. Is it valid if no address is stated?"

（ステップＳ９）
図１に示す回答生成部２７は、入力された質問と、記録された文書情報の各項目に含ま
れるテキストデータとをマッチングして、質問に関連する項目を複数抽出し、当該抽出さ
れた各項目に至る一又は複数の階層とその見出しとを含む回答出力情報を生成し、質問の
回答としてユーザに返す。 (Step S9)
The answer generation unit 27 shown in FIG. 1 matches the input question against the text data included in each item of the recorded document information and extracts a plurality of items related to the question. It then generates answer output information that includes the one or more layers leading to each extracted item together with their headings, and returns it to the user as the answer to the question.

図５に示すように、図１に示す回答生成部２７は、質問Ｔ１が入力された場合、入力さ
れた当該質問Ｔ１にマッチングする回答出力情報に基づく回答Ｔ３を生成し、画面Ｇにお
いて提示する。以下では、回答生成処理をより具体的に説明する。 As shown in FIG. 5, when question T1 is input, the answer generation unit 27 shown in FIG. 1 generates answer T3 based on the answer output information that matches question T1 and presents it on the screen G. The answer generation process is described more specifically below.

まず、図１に示す自然言語処理部２１は、例えば、質問Ｔ１「本人確認書類で年金手帳
を持参した。住所の記載がない場合は有効か？」を読み込んで形態素解析して単語ごとに
切り出す。自然言語処理部２１は、例えば「本人確認書類」、「本人」、「確認」、「書
類」、「確認書類」、「本人確認」、「年金手帳」、「年金」、「手帳」、「持参」、「
住所」、「記載」、及び「有効」等の少なくとも一以上の単語ごとに切り出す。この際に
、自然言語処理部２１は、必要に応じて、専門用語辞書ＤＢ１２及び同義語辞書ＤＢ１３
を参照してもよい。また、自然言語処理部２１は、形態素解析結果に基づく品詞情報を参
照して、構文解析を実行する。 First, the natural language processing unit 21 shown in FIG. 1 reads, for example, question T1 ("I brought a pension handbook as an identity verification document. Is it valid if no address is stated?"), performs morphological analysis, and segments it into words. The natural language processing unit 21 extracts one or more words such as "identity verification document", "identity", "verification", "document", "verification document", "identity verification", "pension handbook", "pension", "handbook", "bring", "address", "statement", and "valid". In doing so, the natural language processing unit 21 may refer to the technical term dictionary DB 12 and the synonym dictionary DB 13 as necessary. The natural language processing unit 21 also performs syntactic analysis by referring to the part-of-speech information obtained from the morphological analysis.

次に、重み付け設定部２３は、質問Ｔ１に含まれる単語の出現頻度を算出し、算出され
た単語の出現頻度に基づいて単語ごとの重み付けを設定する。重み付け設定部２３は、切
り出された、「本人確認書類」、「本人」、「確認」、「書類」、「確認書類」、「本人
確認」、「年金手帳」、「年金」、「手帳」、「持参」、「住所」、「記載」、及び「有
効」…等の重み付けを設定する。重み付け設定部２３は、例えば、一つ又は複数の文書に
おける、出現頻度の高い単語に対する重みをより大きく設定してもよい。また、重み付け
設定部２３は、一つ又は複数の文書において出現頻度が低い単語に対する重みをより大き
く設定してもよい。さらに、重み付け設定部２３は、より短い文書において出現回数が多
い単語に対する重みをより大きく設定してもよい。なお、設定された単語ごとの重み付け
を示す情報は、記録部１１において、文書情報ＢＩとして記録されてもよい。 Next, the weighting setting unit 23 calculates the appearance frequency of the words included in question T1 and sets a weight for each word based on the calculated frequency. The weighting setting unit 23 sets weights for the extracted words "identity verification document", "identity", "verification", "document", "verification document", "identity verification", "pension handbook", "pension", "handbook", "bring", "address", "statement", "valid", and so on. For example, the weighting setting unit 23 may assign a larger weight to words that appear frequently in one or more documents. Alternatively, it may assign a larger weight to words that appear infrequently in one or more documents. It may also assign a larger weight to words that appear many times in a shorter document. The information indicating the weight set for each word may be recorded in the recording unit 11 as the document information BI.
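One of the options above, giving rarer words a larger weight, resembles inverse document frequency. The following is a minimal sketch of that option only, assuming each document is already reduced to a set of words; the function name `idf_weights`, the smoothing formula, and the sample documents are illustrative choices, not the method stated in the specification.

```python
import math

def idf_weights(question_words, documents):
    """Weight each question word by how rare it is across the documents."""
    total = len(documents)
    weights = {}
    for word in set(question_words):
        # Number of documents that contain the word at least once.
        doc_freq = sum(1 for doc in documents if word in doc)
        # Smoothed inverse document frequency; rarer words score higher.
        weights[word] = math.log((total + 1) / (doc_freq + 1)) + 1.0
    return weights

# Invented sample data: each document is a set of words.
docs = [
    {"pension", "handbook", "address"},
    {"address", "account"},
    {"account", "opening"},
]
weights = idf_weights(["pension", "address"], docs)
```

Here "pension" occurs in one document and "address" in two, so "pension" receives the larger weight. The other two options in the text (favoring frequent words, or normalizing by document length) would need a different formula.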

回答生成部２７は、入力された当該質問Ｔ１「本人確認書類で年金手帳を持参した。住
所の記載がない場合は有効か？」に含まれる単語「本人確認書類」、「本人」、「確認」
、「書類」、「確認書類」、「本人確認」、「年金手帳」、「年金」、「手帳」、「持参
」、「住所」、「記載」、及び「有効」等に対応する、重み付け設定部２３により設定さ
れた重み付けに基づいて、入力された質問Ｔ１に対する回答として適合する文章を図１に
示す文書情報ＢＩから抽出する。回答生成部２７は、例えば、文書情報ＢＩに含まれる構
成単位（編、章、節、項、小項目等）ごとに検索し、回答として適合する文章を抽出して
もよい。 Based on the weights set by the weighting setting unit 23 for the words contained in the input question T1 ("I brought a pension handbook as an identity verification document. Is it valid if no address is stated?"), such as "identity verification document", "identity", "verification", "document", "verification document", "identity verification", "pension handbook", "pension", "handbook", "bring", "address", "statement", and "valid", the answer generation unit 27 extracts from the document information BI shown in FIG. 1 the text that fits as an answer to question T1. The answer generation unit 27 may, for example, search the document information BI by structural unit (part, chapter, section, subsection, sub-item, etc.) and extract the text that fits as an answer.

回答生成部２７は、例えば、質問Ｔ１に含まれる単語ごとに設定された重みに基づいて
、文書情報ＢＩとして記録されている、文書データの階層構造の末端に位置する構成単位
に含まれる各テキストデータのうち、より大きい重みが設定された一以上の単語をより多
く含むテキストデータを、質問Ｔ１により適合するテキストデータとして抽出する。なお
、マッチング処理は上記に限られず、種々のマッチング処理を採用し得る。 The answer generation unit 27, for example, has each text included in the structural unit located at the end of the hierarchical structure of the document data recorded as the document information BI based on the weight set for each word included in the question T1. Of the data, text data containing more words with a larger weight is extracted as text data more suitable for question T1. The matching process is not limited to the above, and various matching processes can be adopted.
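The matching described in this paragraph can be sketched as a weighted overlap score: each leaf unit is scored by the sum of the weights of the question words it contains. The names `score` and `rank_units`, the weights, and the word sets below are hypothetical; as the text notes, other matching schemes are equally possible.

```python
def score(unit_words, weights):
    """Sum the weights of the question words that occur in a unit's text."""
    return sum(weight for word, weight in weights.items() if word in unit_words)

def rank_units(units, weights, top_n=3):
    """Return up to top_n (name, score) pairs with a positive score, best first."""
    scored = [(name, score(words, weights)) for name, words in units.items()]
    scored = [item for item in scored if item[1] > 0]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]

# Invented weights and word sets, for illustration only.
weights = {"pension handbook": 2.0, "address": 1.0, "valid": 0.5}
units = {
    "Management Part, Chapter 5": {"valid"},
    "Management Part, Chapter 1": {"pension handbook", "address", "identity"},
    "Foreign Exchange Part, Chapter 6": {"address", "valid"},
    "Unrelated unit": {"interest"},
}
ranking = rank_units(units, weights)
```

Units that contain more of the heavily weighted words rise to the top, and units with no overlap are dropped.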

そして、回答生成部２７は、文書情報ＢＩから抽出した、質問Ｔ１により適合したテキ
ストデータ、例えば、図５に示す「管理編１章…」、「為替編６章…」、及び「管理編５
章…」等に、「上位候補を示します。どれを詳しく見ますか？」等のテンプレートを追加
した回答Ｔ３を生成する。 The answer generation unit 27 then generates answer T3 by adding a template such as "Here are the top candidates. Which one would you like to see in detail?" to the text data extracted from the document information BI as better matching question T1, for example "Management Part, Chapter 1 ...", "Foreign Exchange Part, Chapter 6 ...", and "Management Part, Chapter 5 ..." shown in FIG. 5.

この構成によれば、テンプレートを含む回答を生成することで、人間との会話に近い形
式で自然な対話が可能となる。 According to this configuration, by generating an answer including a template, a natural dialogue can be performed in a format similar to a conversation with a human being.

図５に示すように、図１に示す出力制御部３１は、入力された質問に対する回答が複数
生成された場合、当該質問に対する回答としてより適合する回答を優先して出力するよう
に制御する。出力制御部３１は、例えば、当該質問Ｔ１に対する回答としてより適合する
上位３つの回答、「管理編１章…」、「為替編６章…」、及び「管理編５章…」を、表示
部の画面Ｇにおいて上から順に出力するように制御してもよい。 As shown in FIG. 5, when a plurality of answers to the input question are generated, the output control unit 31 shown in FIG. 1 performs control so that answers that better fit the question are output preferentially. For example, the output control unit 31 may perform control so that the three best-fitting answers to question T1, "Management Part, Chapter 1 ...", "Foreign Exchange Part, Chapter 6 ...", and "Management Part, Chapter 5 ...", are output in order from the top of the screen G of the display unit.
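Combining the template with the ranked candidates might look like the sketch below. It assumes the candidates arrive already sorted best-first; `TEMPLATE`, `build_answer`, and the `limit` parameter are names invented for this example.

```python
TEMPLATE = "Here are the top candidates. Which one would you like to see in detail?"

def build_answer(ranked_headings, limit=3):
    """Prepend the template and number the best-fitting headings from the top."""
    lines = [TEMPLATE]
    for number, heading in enumerate(ranked_headings[:limit], start=1):
        lines.append(f"({number}) {heading}")
    return "\n".join(lines)

answer_t3 = build_answer([
    "Management Part, Chapter 1 ...",
    "Foreign Exchange Part, Chapter 6 ...",
    "Management Part, Chapter 5 ...",
])
```

The numbering gives the user something short to click or type when choosing which candidate to open.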

この構成によれば、ある質問に対する回答としてより適合する回答をユーザが容易に把
握することができるので、ユーザの利便性がより向上する。 According to this configuration, the user can easily grasp the answer that is more suitable as the answer to a certain question, so that the convenience of the user is further improved.

そして、図５の選択結果Ｔ５に示すように、画面Ｇにおいては、回答Ｔ３において「（
１）管理編１章…」が選択されると、出力制御部３１は、回答詳細Ｔ７に示すように、「
（１）管理編１章…」の具体的内容を出力するように構成される。具体的には、出力制御
部３１は、選択された番号に基づいて、当該番号に対応する構成単位（編、章、節、項、
小項目等）における小項目の説明文を、文書情報ＢＩから抽出して、出力する。なお、回
答Ｔ３における選択の態様に関しては、回答Ｔ３内の詳細を確認したい番号を画面Ｇ上に
おけるクリック動作によって選択してもよいし、詳細を確認したい番号を入力欄ＥＢにお
いてテキスト入力することで選択してもよい。 As shown by the selection result T5 in FIG. 5, when "(1) Management Part, Chapter 1 ..." is selected in answer T3 on the screen G, the output control unit 31 outputs the specific content of "(1) Management Part, Chapter 1 ...", as shown in the answer details T7. Specifically, based on the selected number, the output control unit 31 extracts from the document information BI the descriptive text of the sub-item in the structural unit (part, chapter, section, subsection, sub-item, etc.) corresponding to that number, and outputs it. The selection in answer T3 may be made by clicking, on the screen G, the number whose details the user wants to check, or by typing that number into the input field EB.

なお、回答生成部２７は、ホットワード、つまり予め設定された特定の単語を、回答Ｔ
３及び回答詳細Ｔ７の少なくとも一方において、他の単語とは異なる表示形態となるよう
に回答を生成してもよい。図５に示すように、回答生成部２７は、回答Ｔ３及び回答詳細
Ｔ７の少なくとも一方において、予め設定された特定の単語（例えば、本人、書類、本人
確認書類等）を赤字となるように、他の単語は黒字となるように回答を生成する。また、
回答生成部２７は、質問Ｔ１に含まれる単語を、回答Ｔ３及び回答詳細Ｔ７の少なくとも
一方において、他の単語とは異なる表示形態となるように回答を生成してもよい。 The answer generation unit 27 may generate the answer so that hot words, that is, specific words set in advance, are displayed differently from other words in at least one of answer T3 and answer details T7. As shown in FIG. 5, the answer generation unit 27 generates the answer so that, in at least one of answer T3 and answer details T7, the preset specific words (for example, "identity", "document", and "identity verification document") appear in red and the other words appear in black. The answer generation unit 27 may also generate the answer so that words included in question T1 are displayed differently from other words in at least one of answer T3 and answer details T7.
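A simple way to mark hot words for distinct display is to wrap them in a marker that the user interface later renders in another color. The sketch below is one such approach under stated assumptions: the `highlight` function and the `[[...]]` marker are invented for illustration, and the actual display mechanism is not specified in the source.

```python
import re

def highlight(text, hot_words, mark="[[{}]]"):
    """Wrap each hot word in a marker so the UI can render it differently."""
    if not hot_words:
        return text
    # Longest first, so a compound term wins over the shorter words inside it.
    ordered = sorted(hot_words, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(word) for word in ordered))
    return pattern.sub(lambda match: mark.format(match.group(0)), text)

shown = highlight(
    "Bring an identity verification document.",
    ["document", "identity verification document"],
)
```

Sorting the alternatives longest-first keeps "identity verification document" highlighted as a whole rather than only its final word. The same function could be called with the words of the question, or with synonyms, as the hot-word list.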

この構成によれば、ユーザが、特定の単語が回答においてどの部分で用いられているか
を容易に把握することができる。 With this configuration, the user can easily grasp where a specific word is used in the answer.

また、回答生成部２７は、回答に、質問に含まれる単語の同義語が含まれる場合、生成
される回答において、同義語をユーザが識別できるように回答を生成する。例えば、回答
生成部２７は、回答Ｔ３又は回答詳細Ｔ７において、質問Ｔ１に含まれる単語（住所）の
同義語が含まれる場合、生成される回答において、同義語（住居）を「住居」と強調表示
した上で回答を生成する。 When the answer contains a synonym of a word included in the question, the answer generation unit 27 generates the answer so that the user can identify the synonym. For example, when answer T3 or answer details T7 contains a synonym of the word "address" included in question T1, the answer generation unit 27 generates the answer with the synonym "residence" highlighted.

この構成によれば、ユーザが回答において、質問に含まれる単語の同義語を容易に識別
することができる。 According to this configuration, the user can easily identify synonyms of the words included in the question in the answer.

また、図１に示す対話結果管理部２９は、質問と、質問に対する回答と、当該回答に対
する応答とを含む対話結果を対話ログデータベース４に格納する。そして、回答生成部２
７は、対話ログデータベース４に格納された対話結果に基づいて、質問と回答とのマッチ
ングの重み付けを学習する。 The dialogue result management unit 29 shown in FIG. 1 stores, in the dialogue log database 4, dialogue results that include a question, the answer to the question, and the response to that answer. The answer generation unit 27 then learns the weighting for matching questions to answers based on the dialogue results stored in the dialogue log database 4.

図６は、ユーザが回答（１）を選択した後に回答（２）を選択し直した場合のユーザ端
末装置６の表示部の画面の一例を示す図である。図６に示すように、回答Ｔ９のテキスト
情報が入力された後に、お気に入りボタンＨＩが選択されると、対話結果管理部２９は、
対話の開始から終了まで、つまり、図６に示すＴ１〜Ｔ９のテキスト情報を対話結果とし
て対話ログデータベース４に格納する。ここで、ユーザＵにとって、最後の回答（２）（
回答Ｔ９）は、質問Ｔ１により適合した回答である。したがって、対話結果管理部２９は
、質問Ｔ１から最初の回答（１）及び最後の回答である回答Ｔ９までを纏めて管理し、回
答生成部２７は、お気に入りボタンＨＩが選択されたことによって対話ログデータベース
４に格納された対話結果は、ユーザＵにとって、質問Ｔ１には、回答（２）がより適合す
る回答であることを学習する。このように、回答生成部２７は、対話ごとに管理された対
話結果に基づいて繰り返し学習することで、入力される質問と回答とのマッチングの重み
付けを学習する。 FIG. 6 is a diagram showing an example of a screen of the display unit of the user terminal device 6 when the user selects answer (1) and then reselects answer (2). As shown in FIG. 6, when the favorite button HI is selected after the text information of answer T9 has been input, the dialogue result management unit 29 stores the dialogue from start to end, that is, the text information T1 to T9 shown in FIG. 6, in the dialogue log database 4 as a dialogue result. Here, for user U, the final answer (2) (answer T9) is the answer that better fits question T1. The dialogue result management unit 29 therefore manages question T1, the first answer (1), and the final answer T9 together, and the answer generation unit 27 learns, from the dialogue result stored in the dialogue log database 4 upon selection of the favorite button HI, that for user U answer (2) is the better fit for question T1. By learning repeatedly from the dialogue results managed for each dialogue in this way, the answer generation unit 27 learns the weighting for matching input questions to answers.
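The source does not say how the matching weights are learned. As one very simple possibility, the sketch below counts how often an answer was finally accepted for a question and adds a small boost to its matching score. The class `MatchLearner`, the `step` parameter, and the additive update are all assumptions for illustration.

```python
from collections import Counter

class MatchLearner:
    """Count which answer users finally accepted for a question and boost it later."""

    def __init__(self, step=0.1):
        self.step = step           # boost added per accepted dialogue (assumed value)
        self.accepted = Counter()  # (question, answer) -> number of acceptances

    def record(self, question, final_answer):
        # Called when a dialogue result is stored, e.g. after the favorite button.
        self.accepted[(question, final_answer)] += 1

    def adjusted(self, question, answer, base_score):
        """Return the base matching score plus the learned boost."""
        return base_score + self.step * self.accepted[(question, answer)]

learner = MatchLearner()
learner.record("T1", "answer (2)")
learner.record("T1", "answer (2)")
boosted = learner.adjusted("T1", "answer (2)", base_score=1.0)
unchanged = learner.adjusted("T1", "answer (1)", base_score=1.0)
```

After two recorded dialogues, answer (2) outranks answer (1) for the same question even though their base scores are equal.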

この構成によれば、質問と回答とのマッチングの重み付けを学習することができるので
、回答の精度を向上させることができる。 According to this configuration, it is possible to learn the weighting of matching between the question and the answer, so that the accuracy of the answer can be improved.

以上、本発明の第１実施形態によれば、回答生成部２７は、質問が入力された場合、入
力された当該質問に含まれる単語に対応する設定された重み付けに基づいて、入力された
当該質問に対する回答を生成する。よって、組織内の業務効率化を図ることができ、且つ
、ユーザの利便性を向上できることに加え、マニュアル等を有効活用することができる。 As described above, according to the first embodiment of the present invention, when a question is input, the answer generation unit 27 generates an answer to the question based on the weights set for the words included in it. This makes it possible to improve work efficiency within the organization and user convenience, and also to make effective use of manuals and the like.

（変形例）
なお、上記説明では、回答生成部２７は、入力された質問と、記録された文書情報の各
項目に含まれるテキストデータとをマッチングして、質問に関連する項目を複数抽出する
としたが、回答生成部２７の処理はこれに限定されるものではない。回答生成部２７は、
入力された質問と、記録された文書情報の各項目に含まれるテキストデータ及び見出しと
をマッチングして、質問に関連する項目を複数抽出するものでもよい。 (Modification example)
In the above description, the answer generation unit 27 matches the input question against the text data included in each item of the recorded document information and extracts a plurality of items related to the question, but the processing of the answer generation unit 27 is not limited to this. The answer generation unit 27 may match the input question against both the text data and the headings included in each item of the recorded document information and extract a plurality of items related to the question.

さらに、この場合、重み付け設定部２３が、見出しとマッチングして抽出した質問に関
連する項目と、記録された文書情報の各項目に含まれるテキストデータとマッチングして
抽出した質問に関連する項目との重要度を変えてもよい。例えば、重み付け設定部２３は
、見出しに基づいて抽出された項目の重要度を、階層構造の末端に位置する構成単位のテ
キストデータに基づいて抽出された項目の重要度よりも高くなるように設定するものでも
よい。 Further, in this case, the weighting setting unit 23 may assign different importance to items extracted by matching against headings and to items extracted by matching against the text data included in each item of the recorded document information. For example, the weighting setting unit 23 may set the importance of items extracted based on headings higher than the importance of items extracted based on the text data of the structural units located at the end of the hierarchy.
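The idea of giving heading matches more importance than body-text matches can be sketched as a score with two multipliers. The function `combined_score` and the values 2.0 and 1.0 are placeholders chosen for this example; the specification only says the heading importance may be set higher.

```python
def combined_score(question_words, heading_words, body_words,
                   heading_weight=2.0, body_weight=1.0):
    """Score a unit, counting heading hits more heavily than body-text hits."""
    question = set(question_words)
    heading_hits = len(question & set(heading_words))
    body_hits = len(question & set(body_words))
    return heading_weight * heading_hits + body_weight * body_hits

# The same single-word match counts for more when it occurs in the heading.
via_heading = combined_score(["account"], heading_words=["account"], body_words=[])
via_body = combined_score(["account"], heading_words=[], body_words=["account"])
```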

補足すると、質問に対する回答として、詳細な説明よりも簡易な説明の方がユーザの利
便性に資することがある。それゆえ、質問によっては、各項目に含まれるテキストデータ
よりも見出しの方が適切な回答を誘導する場合がある。変形例の回答生成部２７では、テ
キストデータのみならず、見出しも含めてマッチングをすることで質問に関連する項目を
適切に抽出することができる。 Supplementally, as an answer to the question, a simple explanation may contribute to the convenience of the user rather than a detailed explanation. Therefore, depending on the question, the heading may lead to a more appropriate answer than the text data contained in each item. In the answer generation unit 27 of the modified example, items related to the question can be appropriately extracted by matching not only the text data but also the headline.

＜第２実施形態＞
図７は、入力された質問の内容が不明確である場合の、ユーザ端末装置６の表示部の画
面の一例を示す図である。図７に示すように、図１に示す回答生成部２７は、入力された
質問Ｔ１１「口座」の内容が不明確である場合、例えば、口座の開設について質問したい
のか、口座の解約について質問したいのか等が判然とせず不明確である場合、入力された
当該質問に対して、より具体的な質問を促すように、予め定められた回答Ｔ１３「質問の
詳細を教えてください。」を生成してもよい。具体的に、回答生成部２７は、入力された
質問の文字数、当該質問に対する検索スコア、及び当該質問に対して抽出される文書数な
どを総合的に評価することで、入力された質問の内容が不明確であるか否かを判断する。 <Second Embodiment>
FIG. 7 is a diagram showing an example of a screen of the display unit of the user terminal device 6 when the content of the input question is unclear. As shown in FIG. 7, when the content of the input question T11 "account" is unclear, for example when it cannot be determined whether the user wants to ask about opening an account or closing one, the answer generation unit 27 shown in FIG. 1 may generate a predetermined answer T13, "Please tell me the details of your question.", to prompt a more specific question. Specifically, the answer generation unit 27 determines whether the content of the input question is unclear by comprehensively evaluating the number of characters in the question, the search score for the question, the number of documents extracted for the question, and the like.
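The "comprehensive evaluation" is not spelled out in the source. One plausible reading, sketched below, is to treat the three quantities as weak signals and call the question unclear when at least two of them fire. The function `is_unclear`, all thresholds, and the two-of-three rule are assumptions.

```python
PROMPT = "Please tell me the details of your question."

def is_unclear(question, top_score, hit_count,
               min_chars=4, min_score=1.0, max_hits=20):
    """Flag a question as unclear when at least two of three signals look weak."""
    signals = [
        len(question) < min_chars,  # very short question
        top_score < min_score,      # best match scores poorly
        hit_count > max_hits,       # too many candidate documents
    ]
    return sum(signals) >= 2

# A two-character question with a low score and many hits triggers the prompt.
reply = PROMPT if is_unclear("口座", top_score=0.4, hit_count=57) else None
```

In practice the thresholds would have to be tuned against real dialogue logs, and a weighted sum could replace the simple vote.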

このように、ユーザＵがより具体的な質問をするように促された結果、追加で質問Ｔ１
５「口座開設」と入力されたことで、出力制御部３１は、回答Ｔ１７に示すように、文書
情報ＢＩに含まれる構成単位のうち、「口座開設」により適合する、複数の「章」を画面
Ｇにおいて提示し、提示された複数の「章」の中で、より適合する一以上の「節」を提示
する。そして、出力制御部３１は、最終的に、この「節」の中から抽出された、より適合
する文章を画面Ｇにおいて提示するように構成されてもよい。 When user U, having been prompted in this way to ask a more specific question, additionally enters question T15 "account opening", the output control unit 31 presents on the screen G, as shown in answer T17, a plurality of "chapters" among the structural units in the document information BI that better fit "account opening", and presents one or more better-fitting "sections" within those chapters. The output control unit 31 may be configured to finally present on the screen G the better-fitting text extracted from those "sections".

この構成によれば、ユーザが文書情報ＢＩの検索箇所をより具体的に絞り込めるため、
抽出精度が向上できる。 According to this configuration, the user can narrow down the part of the document information BI to be searched more specifically, so extraction accuracy can be improved.

以上、本発明の第２実施形態によれば、回答生成部２７は、入力された質問の内容が不
明確である場合、入力された当該質問に対して、予め定められた回答を生成する。よって
、ユーザに対して、より具体的な質問を促すことができるので、質問の曖昧性を回避する
ことができ、質問に対する回答の精度を向上させることができる。 As described above, according to the second embodiment of the present invention, when the content of the input question is unclear, the answer generation unit 27 generates a predetermined answer to the input question. Therefore, since it is possible to prompt the user for a more specific question, ambiguity of the question can be avoided, and the accuracy of the answer to the question can be improved.

＜第３実施形態＞
図８は、質問及び当該質問に対する回答が所定の回数繰り返された場合の、端末装置の
表示部の画面の一例を示す図である。図８に示すように、図１に示す音声対話管理部３５
は、あるユーザＵとの間で質問及び当該質問に対する回答が所定の回数繰り返された場合
、当該ユーザＵとの間で音声対話が可能となるように管理する。例えば、音声対話管理部
３５は、図８のテキストＴ２１〜Ｔ２４に示すように、ユーザＵとの間で質問及び回答が
例えば２回実行された場合、図１に示すオペレータと電話を介して対話可能となるように
電話回線にアクセスできるように管理する。なお、所定の回数は、２回に限定されない。
１回であってもよいし、３回以上であってもよい。 <Third Embodiment>
FIG. 8 is a diagram showing an example of a screen of the display unit of the terminal device when a question and its answer have been repeated a predetermined number of times. As shown in FIG. 8, when questions and answers have been exchanged with a user U a predetermined number of times, the voice dialogue management unit 35 shown in FIG. 1 enables voice dialogue with that user U. For example, as shown in texts T21 to T24 of FIG. 8, when a question and answer have been exchanged with user U twice, the voice dialogue management unit 35 provides access to a telephone line so that the user can talk with the operator shown in FIG. 1 by telephone. The predetermined number of times is not limited to two; it may be one, or three or more.
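The gating rule of this embodiment reduces to a per-user counter compared against a threshold. The sketch below assumes a hypothetical `VoiceDialogueGate` class with an in-memory dictionary; how the server actually tracks exchanges is not described in the source.

```python
class VoiceDialogueGate:
    """Allow a phone call only after enough question/answer exchanges."""

    def __init__(self, required_exchanges=2):
        self.required = required_exchanges  # the "predetermined number of times"
        self.counts = {}                    # user id -> completed exchanges

    def record_exchange(self, user_id):
        """Register one completed question-and-answer pair for the user."""
        self.counts[user_id] = self.counts.get(user_id, 0) + 1

    def call_allowed(self, user_id):
        return self.counts.get(user_id, 0) >= self.required

gate = VoiceDialogueGate(required_exchanges=2)
gate.record_exchange("user-U")
allowed_after_one = gate.call_allowed("user-U")
gate.record_exchange("user-U")
allowed_after_two = gate.call_allowed("user-U")
```

Setting `required_exchanges` to 1, or to 3 or more, covers the other cases the text mentions.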

ユーザＵとの間で質問及び回答が例えば２回実行された場合であって、ユーザＵから電
話のリクエストＴ２５「電話をお願いします。」があったとき、図１に示す出力制御部３
１は、電話を許容するためのメッセージＴ２６を画面Ｇにおいて表示するように構成され
てもよい。 When a question and answer have been exchanged with user U twice, for example, and user U then makes a telephone request T25, "Please call me.", the output control unit 31 shown in FIG. 1 may be configured to display on the screen G a message T26 permitting the call.

電話による対話は、図１に示すユーザ端末装置６とオペレータのオペレータ端末装置８
との間で実行もよいし、ユーザ端末装置６やオペレータ端末装置８とは異なる他の通話機
器を介して実行されてもよい。ユーザとオペレータの音声情報は、ユーザ端末装置６、オ
ペレータ端末装置８、当該他の通話機器が収集し、対話管理サーバ１に送信する。なお、
画面Ｇは、ユーザ端末装置６の表示部とオペレータ端末装置８の表示部で共通して表示さ
れてもよい。オペレータは、電話による対話までの画面Ｇ上での対話を踏まえて、電話に
よる応答が可能となる。 The telephone dialogue may be carried out between the user terminal device 6 shown in FIG. 1 and the operator terminal device 8 of the operator, or via another call device different from the user terminal device 6 and the operator terminal device 8. The voice information of the user and the operator is collected by the user terminal device 6, the operator terminal device 8, or the other call device and transmitted to the dialogue management server 1. The screen G may be displayed in common on the display unit of the user terminal device 6 and the display unit of the operator terminal device 8. The operator can then respond by telephone with knowledge of the dialogue that took place on the screen G before the call.

ここで、図１に示す、対話管理サーバ１の音声認識処理部３３は、ユーザとオペレータ
との音声対話の音声情報を認識し、対話結果管理部２９は、対話結果としてユーザごとに
、認識された音声情報の登録又は更新を管理してもよい。 Here, the voice recognition processing unit 33 of the dialogue management server 1 shown in FIG. 1 may recognize the voice information of the voice dialogue between the user and the operator, and the dialogue result management unit 29 may manage the registration or update of the recognized voice information for each user as a dialogue result.

以上、本発明の第３実施形態によれば、音声対話管理部３５は、あるユーザとの間で質
問及び当該質問に対する回答が所定の回数繰り返された場合、当該ユーザとの間で音声対
話が可能となるように管理する。よって、ユーザが簡単にオペレータに対して電話するこ
とを防ぐことでオペレータに対する電話の回数が減少するので、組織内の業務の効率化が
図れる。 As described above, according to the third embodiment of the present invention, when the question and the answer to the question are repeated a predetermined number of times with the user, the voice dialogue management unit 35 has a voice dialogue with the user. Manage to be possible. Therefore, by preventing the user from easily calling the operator, the number of calls to the operator is reduced, and the efficiency of work in the organization can be improved.

ただし、音声対話管理部３５の動作はこれに限定されるものではなく、対話開始前から
ユーザの電話を受け付けるものでもよいし、電話による音声対話と文字入力による質問と
の両方を受け付けるものでもよい。 However, the operation of the voice dialogue management unit 35 is not limited to this, and the user's telephone may be accepted before the start of the dialogue, or both the voice dialogue by telephone and the question by character input may be accepted. ..

＜第４実施形態＞
図９は、入力された質問に対する回答として一以上の関連キーワードを提示する場合の
、ユーザ端末装置６の表示部の画面の一例を示す図である。図９に示すように、図１に示
す回答生成部２７は、入力された質問Ｔ３１「口座開設」に対する回答として、例えば、
記録部１１において、あらかじめ、「口座開設」という単語に関連づけて記録された、一
以上の関連キーワード「手続き」、「必要書類」、「代理」及び「未成年」を含む回答Ｔ
３３を生成する。ここで、ある単語の関連キーワードは、当該単語の共起情報を取得し、
当該共起情報に重みづけを行うことで生成される。具体的に、「口座開設」の関連キーワ
ードとしての「手続き」、「必要書類」、「代理」及び「未成年」等は、「口座開設」の
共起情報に重みづけを行うことで生成されたものである。そして、「口座開設」という単
語を含む質問、並びに、当該質問に対する、「手続き」、「必要書類」、「代理」及び「
未成年」等を含む回答を含む対話結果を管理することで、関連キーワードの生成の精度を
向上させることができるので、質問に対する回答の精度を向上させることができる。なお
、一の単語と他の単語との共起情報は、構造化した形式の文書の全テキストデータから求
められる。 <Fourth Embodiment>
FIG. 9 is a diagram showing an example of a screen of the display unit of the user terminal device 6 when one or more related keywords are presented as the answer to an input question. As shown in FIG. 9, in response to the input question T31 "account opening", the answer generation unit 27 shown in FIG. 1 generates answer T33, which includes, for example, one or more related keywords ("procedure", "required documents", "proxy", and "minor") recorded in advance in the recording unit 11 in association with the word "account opening". The related keywords of a word are generated by acquiring the co-occurrence information of that word and weighting it. Specifically, "procedure", "required documents", "proxy", "minor", and so on were generated as related keywords of "account opening" by weighting the co-occurrence information of "account opening". By managing dialogue results that include questions containing the word "account opening" and answers to those questions containing "procedure", "required documents", "proxy", "minor", and so on, the accuracy of related-keyword generation can be improved, which in turn improves the accuracy of answers to questions. The co-occurrence information between one word and other words is obtained from all the text data of the document in the structured format.
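A bare-bones version of deriving related keywords from co-occurrence is sketched below. It counts, over all units of the structured document, how many units each word shares with the target word. The raw count stands in for the unspecified weighting, and `related_keywords` and the sample units are hypothetical.

```python
from collections import Counter

def related_keywords(target, units, top_n=4):
    """Rank words by the number of units in which they co-occur with the target."""
    counts = Counter()
    for words in units:
        if target in words:
            # Count each co-occurring word once per unit.
            counts.update(word for word in set(words) if word != target)
    return [word for word, _ in counts.most_common(top_n)]

# Invented sample data: each unit is the word list of one leaf text.
units = [
    ["account opening", "procedure", "required documents"],
    ["account opening", "procedure"],
    ["transfer", "procedure"],
]
best = related_keywords("account opening", units, top_n=1)
```

"procedure" shares two units with "account opening" and "required documents" only one, so "procedure" ranks first. The third unit is ignored because it does not contain the target word.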

以上、本発明の第４実施形態によれば、質問に含まれる単語にあらかじめ関連づけられ
たキーワードを提示し、ユーザに選択させることで、質問の曖昧性を回避することができ
、質問に対する回答の精度を向上させることができる。 As described above, according to the fourth embodiment of the present invention, the ambiguity of the question can be avoided by presenting the keyword associated with the word included in the question in advance and letting the user select it, and the answer to the question can be answered. The accuracy can be improved.

＜第５実施形態＞
第５実施形態においては、ユーザから入力された質問の内容が不明確である場合に、よ
り具体的な質問を促すように、例えば編、章、節、項、小項目等からなる階層構造を有す
る文書データに基づいて、段階的に回答を出力する。例えば、第５実施形態における対話
管理システムにおいては、入力された当該質問に対して編階層（第１階層）における項目
に基づいて回答出力情報を生成し、質問の第１回答としてユーザＵに返す。その後、さら
に入力された質問に対して編階層よりも下位の階層である章、節、項、又は小項目等の階
層（第２階層）における項目に基づいて回答出力情報を生成し、当該質問の第２回答とし
てユーザＵに返す。 <Fifth Embodiment>
In the fifth embodiment, when the content of a question input by the user is unclear, answers are output step by step based on document data having a hierarchical structure of, for example, parts, chapters, sections, subsections, and sub-items, so as to prompt a more specific question. For example, the dialogue management system of the fifth embodiment generates answer output information for the input question based on the items at the part level (first level) and returns it to user U as the first answer. Then, for a further input question, it generates answer output information based on the items at a level below the part level (second level), such as the chapter, section, subsection, or sub-item level, and returns it to user U as the second answer.
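The step-by-step narrowing can be pictured as walking down the document tree one level per exchange. The sketch below uses a simplified two-level tree; the helper names `headings_at` and `child_named`, the dictionary layout, and the placement of the sample headings directly under a part are assumptions for illustration.

```python
def headings_at(node, depth):
    """List the headings found `depth` levels below the given node."""
    if depth == 0:
        return [node["heading"]]
    found = []
    for child in node.get("children", []):
        found.extend(headings_at(child, depth - 1))
    return found

def child_named(node, heading):
    """Return the child the user picked, or None if no child has that heading."""
    for child in node.get("children", []):
        if child["heading"] == heading:
            return child
    return None

# Simplified sample tree: parts at the first level, lower-level items below.
manual = {"heading": "Manual", "children": [
    {"heading": "Deposit Part", "children": [
        {"heading": "Basic matters", "children": []},
        {"heading": "Handling procedures", "children": []},
    ]},
    {"heading": "Common Part", "children": []},
]}
first_answer = headings_at(manual, 1)           # offered at the part level
picked = child_named(manual, "Deposit Part")    # the user's follow-up choice
second_answer = headings_at(picked, 1)          # offered at the next level down
```

Each exchange shrinks the candidate set to the subtree the user chose, which is what lets the final matching run over far fewer units.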

第５実施形態を図１０及び図１１を参照して説明する。図１０及び図１１は、入力され
た質問の内容が不明確である場合の、ユーザ端末装置６の表示部の画面Ｇの一例を示す図
である。図１０に示すように、図１に示す回答生成部２７は、入力された質問Ｔ４１「口
座を統一する」の文字数、当該質問に対する検索スコア、及び当該質問に対して抽出され
る文書数などを総合的に評価することで、入力された質問Ｔ４１「口座を統一する」の内
容が不明確であるか否かを判定する。回答生成部２７は、どのような内容の口座の統一に
ついて質問したいのか等が判然とせず不明確である場合、入力された当該質問に対して、
より具体的な質問を促すように、入力された当該質問に対して編階層における、「預金編
」、「共通編」…等に基づいて回答出力情報を生成し、質問の回答Ｔ４３としてユーザＵ
に返す。なお、回答生成部２７は、例えば、回答Ｔ４３の出力の際に、予め定められた回
答Ｔ４２「『〇〇〇〇』はどの編に該当しますか？もう少し詳しく教えてください。」を
生成し返答してもよい。 The fifth embodiment will be described with reference to FIGS. 10 and 11. FIGS. 10 and 11 are diagrams showing examples of the screen G of the display unit of the user terminal device 6 when the content of the input question is unclear. As shown in FIG. 10, the answer generation unit 27 shown in FIG. 1 determines whether the content of the input question T41 "unify accounts" is unclear by comprehensively evaluating the number of characters in the question, the search score for the question, the number of documents extracted for the question, and the like. When the question is unclear, for example when it cannot be determined what kind of account unification the user wants to ask about, the answer generation unit 27 generates answer output information based on items at the part level, such as "Deposit Part" and "Common Part", so as to prompt a more specific question, and returns it to user U as answer T43. When outputting answer T43, the answer generation unit 27 may also generate and return a predetermined answer T42: "Which part does '〇〇〇〇' fall under? Please tell me a little more."

回答生成部２７は、その後、さらに入力された質問Ｔ４４に対して前編よりも下位の階
層である節階層における、「基本事項」、「取扱手続」…等に基づいて回答出力情報を生
成し、当該質問の回答Ｔ４６としてユーザＵに返す。なお、回答生成部２７は、例えば、
回答Ｔ４６の出力の際に、予め定められた回答Ｔ４５「『〇〇〇〇』について複数の候補
があります。次のうちどの項目に近いですか？」を生成し返答してもよい。 The answer generation unit 27 then generates, for the further input question T44, answer output information based on items at the section level, which is below the part level, such as "Basic matters" and "Handling procedures", and returns it to user U as answer T46. When outputting answer T46, the answer generation unit 27 may also generate and return a predetermined answer T45: "There are several candidates for '〇〇〇〇'. Which of the following items is closest?"

図１１に示すように、回答生成部２７は、さらに入力された質問Ｔ４７に対して、編、
章、節、項の各階層における各項目に基づいて回答出力情報を生成し、当該質問の回答Ｔ
４９としてユーザに返す。ここで、回答生成部２７は、例えば、図１０に示す質問Ｔ４１
及びＴ４４、並びにＴ４７に対する最終的な回答としてより適合する上位３つの回答を画
面Ｇにおいて上から順に出力可能な回答出力情報を生成してもよい。なお、回答生成部２
７は、例えば、回答Ｔ４９の出力の際に、最初に入力された、図１０における質問Ｔ４１
「口座を統一する」を生成し返答してもよい。 As shown in FIG. 11, for the further input question T47, the answer generation unit 27 generates answer output information based on the items at each of the part, chapter, section, and subsection levels, and returns it to the user as answer T49. Here, the answer generation unit 27 may generate answer output information that allows, for example, the three answers that best fit as the final answer to questions T41 and T44 shown in FIG. 10 and question T47 to be output in order from the top of the screen G. When outputting answer T49, the answer generation unit 27 may also generate and return the originally input question T41 "unify accounts" shown in FIG. 10.

以上、本発明の第５実施形態によれば、ユーザからの質問内容に応じて段階的に回答を
提示する。よって、回答候補を効率的に絞り込むことができるので、回答の精度を向上さ
せることができ、且つ、迅速に所望の回答を画面上において提示することができる。 As described above, according to the fifth embodiment of the present invention, answers are presented step by step according to the content of the question from the user. Therefore, since the answer candidates can be efficiently narrowed down, the accuracy of the answer can be improved, and the desired answer can be quickly presented on the screen.

＜第６実施形態＞
第６実施形態は、ユーザからの質問と、質問に対する回答と、回答に対する評価とを対
話結果として関連付けて格納し、格納された対話結果に基づいて、質問と回答とのマッチ
ングの重み付けを学習する。第６実施形態を図１２から図１４を参照して説明する。 <Sixth Embodiment>
In the sixth embodiment, the question from the user, the answer to the question, and the evaluation for the answer are stored in association with each other as the dialogue result, and the weighting of matching between the question and the answer is learned based on the stored dialogue result. .. A sixth embodiment will be described with reference to FIGS. 12 to 14.

図１２は、ユーザ端末装置６の表示部における、ユーザＵの質問と回答とを含む対話の
一例を示す画面例である。図１２に示すように、回答生成部２７は、入力された質問Ｔ５
１に対して、編、章、節、項の各階層における各項目に基づいて回答出力情報を生成し、
当該質問の回答Ｔ５３としてユーザＵに返す。また、回答Ｔ５３の出力の際に、予め定め
られた回答Ｔ５２「『〇〇〇〇』についてこの中に答えはありますか？」を生成し返答し
てもよい。 FIG. 12 is an example screen showing a dialogue, including a question and an answer of user U, on the display unit of the user terminal device 6. As shown in FIG. 12, for the input question T51, the answer generation unit 27 generates answer output information based on the items at each of the part, chapter, section, and subsection levels, and returns it to user U as answer T53. When outputting answer T53, it may also generate and return a predetermined answer T52: "Is the answer to '〇〇〇〇' among these?"

FIG. 13 is an example of a screen showing the dialogue that follows after the user U selects the first answer, "(1): Calculation part...", from the answer T53 on the screen G shown in FIG. 12. To make a selection in the answer T53 of FIG. 12, the user may click, on the screen G, the number whose details are to be checked, or may type that number as text into a user input field (not shown).

As shown in FIG. 13, when the user U selects the first answer, "(1): Calculation part...", from the answer T53 on the screen G shown in FIG. 12, the answer generation unit 27 generates answer output information so that the specific content of "(1): Calculation part..." can be output, as shown in the answer details T55. Specifically, based on the selected number, the answer generation unit 27 extracts the explanatory text of the sub-item in the structural unit (part, chapter, section, subsection, sub-item, etc.) corresponding to that number from the document information BI shown in FIG. 1, and generates the answer output information. The answer generation unit 27 may generate the answer output information so that a hot word, that is, a specific word set in advance, is displayed in the answer details T55 in a form different from that of other words. As shown in FIG. 13, the answer generation unit 27 may generate the answer so that the words contained in the question T51 shown in FIG. 12 (for example, "court cost advance account" and "opening") are displayed in the answer details T55 in a form different from that of other words. The answer generation unit 27 may also generate the answer output information so that specific preset words (for example, "court cost advance account" and "opening") appear in red text and all other words appear in black text.
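
The hot-word display described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the embodiment: the function name `highlight_hot_words`, the HTML-like `<span class="hot">` marker, and the sample sentence are all hypothetical. The source says only that hot words are shown in a different display form, such as red text.

```python
import re


def highlight_hot_words(text: str, hot_words: list[str]) -> str:
    """Wrap every occurrence of a hot word in a marker a client could render in red."""
    words = [w for w in hot_words if w]  # ignore empty strings
    if not words:
        return text
    # Try longer hot words first so a longer word wins over a shorter one it contains.
    words.sort(key=len, reverse=True)
    pattern = "|".join(re.escape(w) for w in words)
    return re.sub(pattern, lambda m: f'<span class="hot">{m.group(0)}</span>', text)


# Hypothetical answer detail and hot words, for illustration only.
detail = "Opening a new account requires identity verification."
print(highlight_hot_words(detail, ["account", "Opening"]))
```

Sorting the hot words by length is a design choice made here so that overlapping entries resolve to the longest match; the source does not discuss overlaps.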

The answer generation unit 27 also outputs, for example after the answer details T55, a reply T56 (evaluation request) that prompts the user U to evaluate the answer to the question that the user U input. The reply T56 consists of, for example, the text "How would you rate this answer?" together with a stamp ST1 associated with the specific meaning "Good" and a stamp ST2 associated with the specific meaning "Bad".

A "stamp" is image information for dialogue that is input via the network N. A stamp is, for example, an illustration expressing the user's emotion, intention, or a message to be conveyed, and may contain text in addition to image information. The recording unit 11 shown in FIG. 1 records each stamp in association with a specific meaning. For example, stamps are recorded in association with the specific meanings "Good" and "Bad". A stamp may also contain an illustration associated with text information indicating a score for the accuracy of the answer, for example 100 points or 75 points. Evaluation is not limited to selecting a stamp. For example, if the reply T56 contains selectable "Good" and "Bad" buttons, the user may enter an evaluation by selecting one of the buttons. The user may also enter an evaluation by typing the text "Good" or "Bad" into a text input field (not shown) after the answer details T55 are output. The evaluation is not limited to the two patterns "Good" and "Bad"; it may include three or more patterns, such as very good, good, normal, bad, and very bad.

Next, the dialogue result management unit 29 shown in FIG. 1 stores, in the dialogue log database 4, a dialogue result that includes the question from the user U, the answer to the question, and the input stamp (evaluation). The processing of the dialogue result management unit 29 is described in detail below.

As shown in FIG. 13, when the user U selects the stamp ST1 containing "Good", which indicates a favorable evaluation of the answer, or when a dialogue end button (not shown) is selected after the stamp ST1 has been selected, the dialogue result management unit 29 stores the dialogue from start to end, that is, the text information T51 to T56 shown in FIGS. 12 and 13 and the stamp ST1, in the dialogue log database 4 as a dialogue result. By managing the dialogue as a whole (in association) from start to end, the dialogue result management unit 29 collectively manages the answers that the dialogue management server 1 judged to be better matches for the input question and presented, and the answer that the user actually selected (the answer the user wanted).

FIG. 14 is a diagram showing an example of the dialogue log management screen on the display unit of the operator terminal device 8 according to the embodiment of the present invention. For example, the information displayed on the dialogue log management screen LG corresponds to the information that the dialogue result management unit 29 stores in the dialogue log database 4. As shown in FIG. 14, the dialogue log database 4 records, in association with one another, a dialogue log ID, a user ID, the date and time at which the dialogue log was stored, the question first input in the dialogue, the final answer to that question, the number of that answer (for example, the number (1), (2), or (3) of the answer T53 in FIG. 12), and the evaluation of that answer. The dialogue log database 4 may store not only the answer to the user's first question but also one or more answers for each question input by the user, and the dialogue log management screen LG may display one or more answers for each question input by the user.
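
One way to picture a row of the dialogue log is the record type below. It is a sketch only: the class name `DialogueLogEntry`, the field names, and the field types are assumptions chosen to mirror the columns listed above, and every value marked hypothetical is invented for illustration. Only the log ID "7350" and the question text come from the description of FIG. 14.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DialogueLogEntry:
    """One row of the dialogue log; names and types are illustrative assumptions."""
    log_id: str
    user_id: str
    stored_at: datetime
    first_question: str
    final_answer: str
    answer_number: int
    evaluation: str  # for example "Good" or "Bad"


entry = DialogueLogEntry(
    log_id="7350",
    user_id="user-001",                       # hypothetical
    stored_at=datetime(2020, 1, 1, 9, 0),     # hypothetical
    first_question="About opening a new account",
    final_answer="(answer text shown to the user)",  # placeholder
    answer_number=1,                          # hypothetical
    evaluation="Good",                        # hypothetical
)
print(entry)
```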

The dialogue result management unit 29 stores dialogue results including stamps, and adjusts the search score for each dialogue result based on the evaluation of the answer. That is, the dialogue result management unit 29 repeatedly records dialogue results together with the stamp reflecting the input evaluation of the answer, and raises the search score for a question across multiple dialogue results (question and answer sets) according to the evaluation of the answer. For example, in FIG. 14, the dialogue result with dialogue log ID "7350", that is, the set consisting of the question "About opening a new account" and the answer "Card part, Chapter 1...", is currently ranked second from the top. If, on another occasion, a user inputs a question about "opening a new account" and enters a "Good" evaluation for the same answer "Card part, Chapter 1...", the display position of that set on the dialogue log management screen LG is controlled to move higher (here, to the top). Conversely, when a question and answer set is evaluated as "Bad", that set is hidden on the dialogue log management screen LG or its display position is controlled to move lower.
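
The ranking behaviour above can be sketched as an accumulated score per question and answer set. The numeric deltas, the function name `rank_question_answer_sets`, and the rule that a negative total hides a set are all assumptions; the source gives no numbers and says only that "Good" moves a set up and "Bad" hides it or moves it down.

```python
from collections import defaultdict

# Assumed score changes per evaluation; the source does not specify values.
EVALUATION_DELTA = {"Good": 1.0, "Bad": -1.0}


def rank_question_answer_sets(results):
    """Return (question, answer) sets ordered by accumulated evaluation score.

    `results` is an iterable of (question, answer, evaluation) tuples.
    Sets whose total is negative are dropped, mirroring the "hidden" option.
    """
    scores = defaultdict(float)
    for question, answer, evaluation in results:
        scores[(question, answer)] += EVALUATION_DELTA.get(evaluation, 0.0)
    visible = [(pair, score) for pair, score in scores.items() if score >= 0]
    # sorted() is stable, so sets with equal scores keep their first-seen order.
    ordered = sorted(visible, key=lambda item: item[1], reverse=True)
    return [pair for pair, _ in ordered]


log = [
    ("q1", "a1", "Good"),
    ("q2", "a2", "Good"),
    ("q2", "a2", "Good"),  # a second "Good" lifts this set above q1/a1
    ("q3", "a3", "Bad"),   # negative total, so this set is hidden
]
print(rank_question_answer_sets(log))
```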

Based on the evaluation associated with the stamp stored in the dialogue log database 4 as part of the dialogue result, the answer generation unit 27 learns the weighting used to match an input question with the answer to that question.

For example, the answer generation unit 27 determines that, for the user U, the first answer, "(1): Calculation part...", among the answers contained in the answer T53 is the better match for the question T51 shown in FIG. 12, and raises the weight of the answer "(1): Calculation part..." relative to the other answers. For example, the answer generation unit 27 may raise the weight of the answer "(1): Calculation part..." and lower the weights of the other answers contained in the answer T53, such as "(2): Calculation part..." and "(3): Calculation part...". By learning repeatedly from the dialogue results managed for each dialogue in this way, the answer generation unit 27 learns the weighting used to match input questions with answers.
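
A minimal sketch of such a weight update follows. The additive rule, the step size, the default weight of 1.0, and the floor at 0.0 are assumptions made for illustration; the source states only that the selected answer's weight goes up and the other presented answers' weights may go down.

```python
def update_match_weights(weights, question, presented, selected, step=0.1):
    """Raise the weight of the selected answer and lower the other presented ones.

    `weights` maps (question, answer) to a float and is updated in place.
    """
    for answer in presented:
        key = (question, answer)
        current = weights.get(key, 1.0)  # assumed starting weight
        if answer == selected:
            weights[key] = current + step
        else:
            weights[key] = max(0.0, current - step)  # never below zero
    return weights


weights = {}
update_match_weights(weights, "T51", ["(1)", "(2)", "(3)"], selected="(1)", step=0.5)
print(weights)
```

Calling the function once per stored dialogue result corresponds to the repeated learning described above.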

As described above, according to the sixth embodiment of the present invention, a question from the user, the answer to the question, and the evaluation of the answer are stored in association with one another as a dialogue result, and the weighting used to match questions with answers is learned from the stored dialogue results. Because this weighting can be learned, the accuracy of the answers can be improved.

<Seventh Embodiment>
The dialogue management system according to the seventh embodiment is described with reference to FIG. 15. FIG. 15 is a schematic configuration diagram (system configuration diagram) of the dialogue management system according to the seventh embodiment. In the seventh embodiment, as shown in FIG. 15, the dialogue management server 1 further includes an excluded-term dictionary DB 14. The excluded-term dictionary DB 14 is a database that stores "excluded terms", whose influence is excluded when the input question is matched against the text data contained in each item of the document information recorded in the recording unit 11. Specifically, excluded terms include particles, conjunctions, predetermined modifiers, and predetermined frequent words. Here, a predetermined modifier is a term such as "various" that does not substantially restrict the meaning of the word it modifies. A predetermined frequent word is a term that appears frequently among all the terms used in a structured document of the specific format; in a Japanese document of several tens of thousands of words, roughly the top ten terms fall into this category. For example, terms such as "confirmation" are often frequent words.

When matching the input question against the text data contained in each item of the recorded document information, the answer generation unit 27 generates answer output information from which the influence of the excluded terms has been removed.

Because the dialogue management server 1 according to the seventh embodiment has the configuration described above, it can increase the validity of answers to questions. To elaborate, when the text data contained in an item of the recorded document information contains many of the terms that appear frequently in the input question, that item is generally likely to be a valid answer to the question. However, terms that appear far more frequently than others among all the terms in the document information may be used in any passage and can instead lower the validity of the answer. Excluding the influence of such terms during matching can therefore increase the validity of the answer to the question in some cases.
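
The effect of excluded terms on matching can be illustrated with a simple word-overlap count. This is a sketch under assumptions: the function name `match_count` is hypothetical, the inputs are taken to be already split into words (Japanese text would first need morphological analysis, which is not shown), and plain overlap counting stands in for whatever matching the server actually performs.

```python
def match_count(question_words, item_words, excluded_terms):
    """Count distinct question words found in an item, ignoring excluded terms."""
    excluded = set(excluded_terms)
    question = {w for w in question_words if w not in excluded}
    item = {w for w in item_words if w not in excluded}
    return len(question & item)


question = ["account", "opening", "confirmation"]
item = ["confirmation", "of", "account"]

# Without exclusion, the frequent word "confirmation" inflates the overlap.
print(match_count(question, item, excluded_terms=[]))
# With "confirmation" and "of" excluded, only "account" still counts.
print(match_count(question, item, excluded_terms=["confirmation", "of"]))
```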

<Eighth Embodiment>
In the eighth embodiment, for the document in the structured format according to the first to seventh embodiments, "unit information" is defined as the text data contained in an item located at the end of the hierarchical structure, associated with information that can identify the one or more levels leading to that item and with the heading of each of those levels. Specifically, data such as that shown in FIG. 3 is defined as one piece of unit information, and a set of such pieces is defined as the structured document of the specific format described above.

The dialogue management system according to the eighth embodiment is described with reference to FIG. 16. FIG. 16 is a schematic configuration diagram (system configuration diagram) of the dialogue management system according to the eighth embodiment. The dialogue management server 1 according to the eighth embodiment further includes an inverse document frequency DB 24. The inverse document frequency DB 24 is a database that stores the inverse document frequency of the words used in the document in the structured format. Here, the "inverse document frequency" is an index of how rare a word is in the document as a whole; it is the value obtained by dividing maxD, the total number of pieces of unit information in the document in the structured format, by DFreqs, the number of pieces of unit information that contain the word.
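
The definition above (maxD divided by DFreqs) can be computed directly, as in the sketch below. The function name `inverse_document_frequency`, the representation of each piece of unit information as a list of words, and the choice to return 0.0 for a word that appears nowhere are assumptions. Note that many search libraries apply a logarithm to this ratio; the plain ratio is used here because that is how the text defines it.

```python
def inverse_document_frequency(units, word):
    """Return maxD / DFreqs for `word`, or 0.0 if no unit contains it.

    `units` is a list of word lists, one per piece of unit information.
    """
    max_d = len(units)                                       # maxD
    d_freqs = sum(1 for tokens in units if word in tokens)   # DFreqs
    return max_d / d_freqs if d_freqs else 0.0


units = [
    ["account", "opening"],
    ["account", "closing"],
    ["card", "reissue"],
    ["card", "account"],
]
print(inverse_document_frequency(units, "reissue"))  # rare word, large value
print(inverse_document_frequency(units, "account"))  # common word, small value
```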

The weighting setting unit 23 calculates the frequency of occurrence of the words contained in the character information in the unit information, and sets a weight for each word based on the calculated frequency. The weighting setting unit 23 further corrects the weights based on the inverse document frequency.

The answer generation unit 27 extracts, as candidate answers, the pieces of unit information that contain at least a predetermined number of the words contained in the input question, and generates answer output information based on those candidate answers. Here, the answer generation unit 27 calculates the similarity between each candidate answer and the question based on the weights set by the weighting setting unit 23. The answer generation unit 27 then generates the candidate answers with high similarity as the answer output information. Algorithms such as TFIDFSimilarity and BM25Similarity can be used to calculate the similarity.
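
The two stages described above, candidate extraction by a minimum number of shared words followed by ranking on a weighted similarity, can be sketched as follows. This is a simplified stand-in and not TFIDFSimilarity or BM25Similarity: the function name `rank_candidates`, the parameter `min_shared`, the default weight of 1.0, and the "sum of shared word weights" similarity are all assumptions.

```python
def rank_candidates(question_words, units, word_weights, min_shared=2):
    """Return unit indices ordered by weighted overlap with the question.

    A unit becomes a candidate only if it shares at least `min_shared`
    distinct words with the question. Its similarity is the sum of the
    weights of the shared words.
    """
    question = set(question_words)
    scored = []
    for index, tokens in enumerate(units):
        shared = question & set(tokens)
        if len(shared) >= min_shared:
            similarity = sum(word_weights.get(w, 1.0) for w in shared)
            scored.append((similarity, index))
    # Highest similarity first; ties fall back to the lower unit index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


units = [
    ["account", "opening", "fee"],
    ["account", "closing"],
    ["opening", "hours", "account", "branch"],
]
ranking = rank_candidates(["account", "opening", "fee"], units, {"fee": 3.0})
print(ranking)
```

In this example the second unit shares only one word with the question, so it is never considered as a candidate.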

With this configuration, the dialogue management server 1 according to the eighth embodiment generates answers that contain many of the words in the question, and can therefore provide highly valid answers.

The dialogue management server 1 according to the eighth embodiment can also calculate the similarity after correcting the word weights based on the inverse document frequency. This allows the rarity of each word in the document as a whole to be reflected. A word that is rare in the document as a whole tends to be characteristic, and in text data search it is likely to point to the correct answer. By using the similarity corrected with the inverse document frequency, the answer generation unit 27 therefore generates valid answer output information for the input question.

When the excluded-term dictionary DB 14 shown in FIG. 15 is combined with the dialogue management server 1 according to the eighth embodiment, pieces of unit information are no longer counted for the excluded terms when candidate answers are extracted.

<Ninth Embodiment>
In the ninth embodiment, for the document in the structured format according to the first to seventh embodiments, "unit information" may be defined as the text data of an explanatory text associated with information for identifying the one or more levels leading to that explanatory text and with text data representing the headings of those levels. For example, data such as that shown in FIG. 3 (the portion enclosed under "extracted in a specific XML format") is defined as one piece of unit information, and a set containing multiple pieces of unit information is defined as the structured document of the specific format described above. The answer generation unit 27 matches the text data of the question sentence accepted by the reception unit 19 against the text data contained in each piece of unit information, extracts the unit information related to the question sentence, and generates answer candidates (answer output information) based on the heading and explanatory text corresponding to the extracted unit information. For some questions, a heading leads to a more appropriate answer than the text data contained in a structural unit (item) of the document (for example, a part, chapter, section, subsection, or sub-item). The dialogue management server 1 according to the ninth embodiment therefore matches against the headings as well as the text data of each item, which makes it possible to extract the items related to the question appropriately.

The dialogue management system according to the ninth embodiment is described in detail with reference to FIGS. 17 to 27. FIG. 17 is a schematic configuration diagram of the dialogue management system according to the ninth embodiment. The answer generation unit 27 of the dialogue management server 1 calculates a search score using an index such as TF (Term Frequency), which indicates how often a word appears, or IDF (Inverse Document Frequency), which indicates the inverse document frequency. TF indicates, for example, how often a specific word appears in each structured document. That is, TF is the value obtained by dividing the number of occurrences of the word in the structured document by the sum of the numbers of occurrences of all words in the structured document. Specifically, as a method using TF, the answer generation unit 27 calculates a search score based on the frequency of occurrence (TF) of each extracted word in each piece of unit information, and generates answer output information using the calculated search score.

As described in the eighth embodiment, IDF is, for example, an index of how rare a word is in the document as a whole; it is the value obtained by dividing the total number of pieces of unit information in the structured document by the number of pieces of unit information that contain the word. As a method using TF-IDF, the answer generation unit 27 calculates a search score based on the frequency of occurrence (TF) of each extracted word in each piece of unit information and on the IDF, which relates the number of pieces of unit information containing the extracted word to the total number of pieces of unit information. For example, the answer generation unit 27 calculates the search score by multiplying TF by IDF. The answer generation unit 27 generates answer output information using the calculated search score.
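
A TF-IDF search score along these lines can be sketched as below. The function names `term_frequency` and `tf_idf_score` are hypothetical, TF is computed per piece of unit information, IDF uses the plain ratio defined in the text (no logarithm), and summing the TF x IDF products over the distinct question words is an assumption, since the source says only that TF and IDF are multiplied.

```python
from collections import Counter


def term_frequency(tokens, word):
    """Occurrences of `word` divided by the total number of words in the unit."""
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0


def tf_idf_score(question_words, unit_tokens, all_units):
    """Sum TF x IDF over the distinct question words for one unit."""
    score = 0.0
    for word in set(question_words):
        containing = sum(1 for tokens in all_units if word in tokens)
        if containing == 0:
            continue  # the word appears nowhere, so it contributes nothing
        idf = len(all_units) / containing
        score += term_frequency(unit_tokens, word) * idf
    return score


units = [["account", "opening", "account"], ["card", "reissue"]]
# TF of "account" in the first unit is 2/3 and its IDF is 2/1.
print(tf_idf_score(["account"], units[0], units))
```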

As shown in FIG. 17, the dialogue management server 1 according to the ninth embodiment further includes a first compound word dictionary DB 12A, a second compound word dictionary DB 12B, a subordinate concept word dictionary DB 44, and a short talk DB 46. The first compound word dictionary DB 12A and the second compound word dictionary DB 12B are one form of the technical term dictionary DB 12 shown in FIGS. 1, 15, and 16.

The first compound word dictionary DB 12A is a database in which first compound words, each formed by combining a predetermined noun with another word, are registered. For example, the first compound word dictionary DB 12A contains "confirmation at the time of transaction" (a first compound word), formed by combining "at the time of transaction" (the predetermined noun) with "confirmation" (the other word), or by combining "confirmation" (the predetermined noun) with "at the time of transaction" (the other word). In the first compound word dictionary DB 12A, "confirmation at the time of transaction" is not associated with "at the time of transaction" or with "confirmation".

An example of a method for generating answer output information using the first compound word dictionary DB 12A is described below. For example, suppose the answer generation unit 27 determines that, among the words extracted from a question sentence (for example, "I want to find out about confirmation at the time of transaction"), such as "confirmation at the time of transaction", "at the time of transaction", and "confirmation", the compound word "confirmation at the time of transaction" is registered in the first compound word dictionary DB 12A. In that case, it calculates a first search score based on the frequency of occurrence, in each piece of unit information, of each word extracted from the question sentence, and also calculates a second search score based on the frequency of occurrence, in each piece of unit information, of the compound word (for example, "confirmation at the time of transaction"). The answer generation unit 27 then generates answer output information using both search scores.
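
The two scores can be sketched as raw occurrence counts, as below. The function name `two_search_scores` is hypothetical, the unit is assumed to be tokenized so that a compound word appears as its own token alongside its parts, and raw counts stand in for "frequency of occurrence". How the two scores are combined is not stated in the source, so the sketch simply returns both.

```python
def two_search_scores(question_words, unit_tokens, first_compound_dict):
    """Return (first_score, second_score) for one piece of unit information.

    first_score counts occurrences of every distinct question word in the unit.
    second_score counts only the question words that are registered as
    compound words in `first_compound_dict`.
    """
    words = set(question_words)
    first_score = sum(unit_tokens.count(w) for w in words)
    compounds = words & set(first_compound_dict)
    second_score = sum(unit_tokens.count(w) for w in compounds)
    return first_score, second_score


compound = "confirmation at the time of transaction"
question_words = [compound, "at the time of transaction", "confirmation"]
unit_tokens = [compound, "at the time of transaction", "confirmation", "confirmation"]
print(two_search_scores(question_words, unit_tokens, {compound}))
```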

The second compound word dictionary DB 12B is a database in which a predetermined noun and a second compound word, formed by combining that noun with another word, are registered in association with each other. Unlike in the first compound word dictionary DB 12A, in the second compound word dictionary DB 12B "confirmation at the time of transaction", for example, is registered in association with both "at the time of transaction" and "confirmation".

An example of a method for generating answer output information using the second compound word dictionary DB 12B is described below. For example, when the answer generation unit 27 determines that a word (for example, "at the time of transaction") extracted from a question sentence (for example, "I want to find out about confirmation at the time of transaction") is registered in the second compound word dictionary DB 12B, it calculates a search score based on the frequency of occurrence, in each piece of unit information, of each of the following: the word extracted from the question sentence (for example, "at the time of transaction"), the other word combined with that noun (for example, "confirmation"), and the compound word (for example, "confirmation at the time of transaction"). The answer generation unit 27 then generates answer output information using the calculated search score.
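
The expansion step above can be sketched as follows. The dictionary layout (a noun mapped to a list of `(other_word, compound_word)` pairs), the function name `expanded_search_score`, and the use of raw occurrence counts as the score are assumptions made for illustration.

```python
def expanded_search_score(question_words, unit_tokens, second_compound_dict):
    """Score a unit using question words plus their registered partner and compound words.

    `second_compound_dict` maps a noun to a list of (other_word, compound_word) pairs.
    """
    terms = set(question_words)
    for word in question_words:
        for other_word, compound_word in second_compound_dict.get(word, []):
            terms.update((other_word, compound_word))
    return sum(unit_tokens.count(term) for term in terms)


second_dict = {
    "at the time of transaction": [
        ("confirmation", "confirmation at the time of transaction"),
    ],
}
unit_tokens = ["confirmation", "confirmation at the time of transaction", "branch"]
# The unit never contains the question word itself, but the expansion finds it.
print(expanded_search_score(["at the time of transaction"], unit_tokens, second_dict))
```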

The subordinate concept word dictionary DB 44 is a database in which a predetermined word and its "subordinate concept words", that is, words expressing narrower concepts of that word, are registered in association with each other. FIG. 18 is a diagram showing an example of subordinate concept words. As shown in FIG. 18, the subordinate concept word dictionary DB 44 contains, for example, key words such as "account", "confirmation", "processing", "transaction", and "fixed term", each registered in association with its subordinate concept words.

The short talk DB 46 is a database in which words and sentences for carrying out short talk between the user and the dialogue management server 1 are registered; for example, a predetermined word is registered in association with a predetermined answer sentence containing that word. As described later with reference to FIG. 20(B), the short talk DB 46 contains, for example, the word "curry" registered in association with the answer sentence "I like curry.", which contains "curry". The short talk DB 46 may also contain a sentence template (for example, "... is nice, isn't it?") such that, when the word "△△△" is input, the answer sentence "△△△ is nice, isn't it?" is generated. Furthermore, the short talk DB 46 may contain a predetermined word registered in association with a predetermined stamp, which is image information for dialogue.
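
A lookup against such a database can be sketched with a plain dictionary. The function name `short_talk_reply`, the `{word}` placeholder convention for templates, and the sample entry "spring" are assumptions; only the "curry" entry and the template wording come from the description.

```python
def short_talk_reply(word, short_talk_db):
    """Return the registered reply for `word`, or None if the word is not registered."""
    entry = short_talk_db.get(word)
    if entry is None:
        return None
    # A fixed sentence has no "{word}" slot, so format() returns it unchanged.
    return entry.format(word=word)


short_talk_db = {
    "curry": "I like curry.",                 # fixed answer sentence
    "spring": "{word} is nice, isn't it?",    # hypothetical template entry
}
print(short_talk_reply("curry", short_talk_db))
print(short_talk_reply("spring", short_talk_db))
print(short_talk_reply("ahhh", short_talk_db))
```

Storing fixed sentences and templates in one mapping keeps the lookup to a single step; a real database would also need to escape literal braces in fixed sentences.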

The short talk DB 46 may also be a database in which multiple pieces of unit information prepared for short talk are registered. In this case, unit information is extracted based on, for example, the frequency of occurrence of the input word, and an answer sentence is generated based on the extracted unit information.

FIG. 19 is a flowchart showing an example of the dialogue management process according to the ninth embodiment. An example of this process is described with reference to FIG. 19 and FIGS. 20 to 27. FIGS. 20 to 27 show examples of screens on the display unit of the user terminal device 6. As shown in FIG. 19, the dialogue management server 1 first accepts the input of a question sentence from the user (step S10). Next, in step S12, the dialogue management server 1 matches the text data of the accepted question sentence against the text data contained in each piece of unit information in the document information BI, extracts the unit information related to the question sentence, and searches for the documents corresponding to the extracted unit information. For example, the dialogue management server 1 matches the words extracted from the question sentence against the words in each document; if there is a matching document (No), the process proceeds to step S20. If there is no matching document (Yes), the process proceeds to step S14.

ステップＳ１４において、対話管理サーバ１は、例えば質問文の少なくとも一部に含ま
れる単語がショートトークＤＢ４６に登録されているか否か判定する。例えば、図２０（
Ａ）に示すように、質問文Ｔ６１が「ああああ」の場合に、対話管理サーバ１が、「ああ
ああ」がショートトークＤＢ４６に登録されていないと判定すると（ステップＳ１４にお
いてＮｏ）、「ごめんね。分からないよ。」という所定の回答Ｔ６２を生成する。他方で
、図２０（Ｂ）に示すように、質問文Ｔ６４が「カレー」の場合に、対話管理サーバ１が
、「カレー」がショートトークＤＢ４６に登録されていると判定すると（ステップＳ１４
においてＹｅｓ）、「『カレー』が好きです。」という所定の回答Ｔ６５を生成する。こ
のように、例えば質問文の少なくとも一部に含まれる単語がショートトークＤＢ４６に登
録されている場合、ユーザと対話管理サーバ１との間でショートトークが実施される。よ
って、業務マニュアルに関する質問以外の質問がユーザからあった場合であっても、比較
的違和感を生じさせることなく、ユーザと対話管理サーバ１との間の対話を継続させるこ
とができる。 In step S14, the dialogue management server 1 determines, for example, whether a word included in at least a part of the question sentence is registered in the short talk DB 46. For example, as shown in FIG. 20(A), when the question sentence T61 is "ahhh" and the dialogue management server 1 determines that "ahhh" is not registered in the short talk DB 46 (No in step S14), it generates the predetermined answer T62, "I'm sorry. I don't understand." On the other hand, as shown in FIG. 20(B), when the question sentence T64 is "curry" and the dialogue management server 1 determines that "curry" is registered in the short talk DB 46 (Yes in step S14), it generates the predetermined answer T65, "I like 'curry'." In this way, when a word included in at least a part of the question sentence is registered in the short talk DB 46, a short talk is carried out between the user and the dialogue management server 1. Therefore, even if the user asks a question unrelated to the business manual, the dialogue between the user and the dialogue management server 1 can be continued with relatively little sense of unnaturalness.
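The short talk branch of step S14 can be pictured as a simple dictionary lookup. The following is only a minimal sketch under assumptions: the names `SHORT_TALK_DB`, `FALLBACK_ANSWER`, and `short_talk_reply` are hypothetical, and the short talk DB 46 is modeled as an in-memory mapping from a word to a registered answer sentence.

```python
# Hypothetical stand-in for the short talk DB 46: word -> registered answer sentence.
SHORT_TALK_DB = {"curry": "I like 'curry'."}

# Predetermined answer used when no word of the question is registered.
FALLBACK_ANSWER = "I'm sorry. I don't understand."


def short_talk_reply(question_words, db=SHORT_TALK_DB):
    """Sketch of step S14: reply with a registered sentence if any word is in the DB."""
    for word in question_words:
        if word in db:
            return db[word]  # Yes in step S14
    return FALLBACK_ANSWER  # No in step S14
```

With this sketch, a question containing "curry" yields the registered sentence, while an unregistered input such as "ahhh" falls back to the predetermined answer.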

ステップＳ２０において、対話管理サーバ１は、例えば、複数算出された検索スコアの
うち最大検索スコアが所定スコア（例えば、検索スコアが０〜１０の範囲で設定されてい
る場合、検索スコアが８や９等の比較的高いスコア）以上、かつ、マッチングする文書の
数が所定数（例えば、数ファイル〜数十ファイル等）以下であるか否かを判定する。例え
ば、対話管理サーバ１は、最大検索スコアが高いスコアであり、かつ、マッチングする文
書の数が少ない場合（Ｙｅｓの場合）は、ステップＳ２２に進む。つまり、最大検索スコ
アが高く、マッチングする文書の数が少ない場合は、対話管理サーバ１は、質問文に対す
る回答の精度が比較的高いと判定し、回答候補を表示する（ステップＳ２２）。図２１（
Ａ）に示すように、対話管理サーバ１は、質問文Ｔ６７「ダブルストライプ口座について
」に対する回答候補の精度が比較的高いと判定すると、質問文Ｔ６７に対応する回答候補
Ｔ６８を生成し出力する。 In step S20, the dialogue management server 1 determines, for example, whether the maximum search score among the plurality of calculated search scores is equal to or higher than a predetermined score (for example, a relatively high score such as 8 or 9 when the search score is set in the range of 0 to 10) and the number of matching documents is equal to or less than a predetermined number (for example, several files to several tens of files). For example, if the maximum search score is high and the number of matching documents is small (in the case of Yes), the dialogue management server 1 proceeds to step S22. That is, when the maximum search score is high and the number of matching documents is small, the dialogue management server 1 determines that the accuracy of the answer to the question sentence is relatively high and displays the answer candidates (step S22). As shown in FIG. 21(A), when the dialogue management server 1 determines that the accuracy of the answer candidate for the question sentence T67 "about the double stripe account" is relatively high, it generates and outputs the answer candidate T68 corresponding to the question sentence T67.

他方で、対話管理サーバ１は、最大検索スコアが比較的低いスコアの場合、又は、マッ
チングする文書の数が多い場合（Ｎｏの場合）は、ステップＳ２４に進む。つまり、回答
候補としてマッチングした文書のマッチング精度（検索スコア）が比較的低いものが多数
ある場合は、対話管理サーバ１は、さらに他の条件で回答候補としての文書を絞り込む必
要があるためである。 On the other hand, the dialogue management server 1 proceeds to step S24 when the maximum search score is relatively low or when the number of matching documents is large (in the case of No). This is because, when many of the documents matched as answer candidates have a relatively low matching accuracy (search score), the dialogue management server 1 needs to further narrow down the candidate documents under other conditions.
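The check in step S20 combines two conditions, which can be sketched as follows. This is an illustration only: the function name `answer_is_confident` and the default values (a threshold of 8 on a 0 to 10 scale, and at most 20 matching documents) are assumptions chosen from within the example ranges given in the text.

```python
def answer_is_confident(search_scores, score_threshold=8.0, max_matching_docs=20):
    """Sketch of step S20.

    search_scores holds one search score per matching document.
    Returns True (proceed to step S22) only when the best score is high
    AND the number of matching documents is small; otherwise False (step S24).
    """
    if not search_scores:
        return False
    top_score_is_high = max(search_scores) >= score_threshold
    few_documents = len(search_scores) <= max_matching_docs
    return top_score_is_high and few_documents
```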

ステップＳ２４において、対話管理サーバ１は、質問文に対してマッチングした複数の
文書のうち、例えば、マッチング精度の高い文書の上位十ファイルにおいて文書の構成単
位「〇〇編」が同一であり、かつ、聞き返し回数（例えば、後述するステップＳ３６又は
ステップＳ３８の処理が実行された後、ステップＳ１０「質問文を入力」に戻った回数）
が例えば一回以上であるかを判定する。ステップＳ２４においてＹｅｓの場合は、ステッ
プＳ４２に進む。図２１（Ｂ）に示すように、対話管理サーバ１は、例えば、質問文Ｔ７
０「『基本口座を開設したい場合の処理方法は』について教えて」が入力されると、文書
の構成単位の一つである「編」を絞り込むため、他の確からしい編を構成単位として含む
文書を提案する。対話管理サーバ１は、例えば「（１）融資支援システム編」を提案する
ような回答候補Ｔ７１を生成する。なお、上記形態では、文書の構成単位の一つである「
編」に基づいて回答候補の絞り込みを実施しているが、これに限られず、他の文書の構成
単位である、「章」、「節」、「項」、又は「小項目」等で回答候補の絞り込みを実施し
てもよいし、構成単位の二以上の構成単位を組み合わせて回答候補の組み合わせを実施し
てもよい。 In step S24, the dialogue management server 1 determines whether, among the plurality of documents matched to the question sentence, the document structural unit "○○ edition" is the same across, for example, the top ten files with the highest matching accuracy, and whether the re-ask count (for example, the number of times the process has returned to step S10 "input question sentence" after the processing of step S36 or step S38 described later was executed) is, for example, one or more. If Yes in step S24, the process proceeds to step S42. As shown in FIG. 21(B), when, for example, the question sentence T70 "Tell me about 'How to process when you want to open a basic account'" is input, the dialogue management server 1 proposes documents that include another probable edition as a structural unit, in order to narrow down the "edition", which is one of the structural units of a document. The dialogue management server 1 generates, for example, an answer candidate T71 that proposes "(1) Loan support system edition". In the above form, the answer candidates are narrowed down based on the "edition", which is one of the structural units of a document; however, the narrowing is not limited to this and may be performed using other structural units of a document, such as "chapter", "section", "item", or "sub-item", or answer candidates may be combined using a combination of two or more structural units.

他方で、ステップＳ２４においてＮｏの場合は、ステップＳ２６に進む。ステップＳ２
６において、対話管理サーバ１は、質問文に対してマッチングした複数の文書のうち、例
えば、最大検索スコアが所定スコア以上であるか否かを判定する。ステップＳ２４におけ
る「所定スコア」は、ステップＳ２０における「所定スコア」と同一であってもよいし、
ステップＳ２０における「所定スコア」よりも高いスコアであってもよいし、低いスコア
であってもよい。ステップＳ２６においてＹｅｓの場合、対話管理サーバ１は、ステップ
Ｓ４０に進み、例えば、質問文に対してマッチングした複数の文書のうち、検索スコアが
高い文書を優先して回答候補として表示する。 On the other hand, if No in step S24, the process proceeds to step S26. In step S26, the dialogue management server 1 determines, for example, whether the maximum search score among the plurality of documents matched to the question sentence is equal to or higher than a predetermined score. The "predetermined score" in step S24 may be the same as the "predetermined score" in step S20, or may be higher or lower than the "predetermined score" in step S20. If Yes in step S26, the dialogue management server 1 proceeds to step S40 and, for example, preferentially displays documents with high search scores as answer candidates from among the plurality of documents matched to the question sentence.

他方で、ステップＳ２６においてＮｏの場合は、ステップＳ２８に進む。ステップＳ２
８において、対話管理サーバ１は、聞き返し回数が、例えば三回以上であるか否かを判定
する。聞き返し回数が三回以上である場合（Ｙｅｓの場合）、対話管理サーバ１は、ステ
ップＳ４０に進み、回答候補を表示する。他方で、聞き返し回数が三回を下回る場合（Ｎ
ｏの場合）、対話管理サーバ１は、ステップＳ３２に進む。なお、聞き返し回数に関する
設定回数「三回」はあくまで例示であって、設定回数は、他の回数であってもよい。 On the other hand, if No in step S26, the process proceeds to step S28. In step S28, the dialogue management server 1 determines whether the re-ask count is, for example, three or more. If the re-ask count is three or more (in the case of Yes), the dialogue management server 1 proceeds to step S40 and displays the answer candidates. On the other hand, if the re-ask count is less than three (in the case of No), the dialogue management server 1 proceeds to step S32. The set value "three" for the re-ask count is merely an example and may be another number.

ステップＳ３２において、対話管理サーバ１は、例えば、質問文から自然言語処理によ
って抽出された１又は複数の単語のそれぞれが、図１７に示す下位概念語辞書ＤＢ４４に
登録されているか否かを判定する。対話管理サーバ１は、Ｙｅｓの場合は、ステップＳ３
８に進み、Ｎｏの場合は、ステップＳ３４に進む。ステップＳ３４において、対話管理サ
ーバ１は、過去の聞き返しで同じ質問文に対して提案（例えば、回答候補の表示）を実施
したか否かを判定する。Ｙｅｓの場合は、ステップＳ４０に進み、Ｎｏの場合は、ステッ
プＳ３６に進む。 In step S32, the dialogue management server 1 determines, for example, whether each of the one or more words extracted from the question sentence by natural language processing is registered in the subordinate concept word dictionary DB 44 shown in FIG. 17. If Yes, the dialogue management server 1 proceeds to step S38; if No, it proceeds to step S34. In step S34, the dialogue management server 1 determines whether a proposal (for example, display of answer candidates) has already been made for the same question sentence in a past re-ask. If Yes, the process proceeds to step S40; if No, the process proceeds to step S36.
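The branching from step S24 through step S34 can be summarized as a single routing function. The sketch below is an assumption-laden illustration: the function name `next_step`, its boolean inputs, and the default thresholds are hypothetical, and each input is taken to be the already-evaluated result of the corresponding check described above.

```python
def next_step(same_edition_in_top10, reask_count, max_score,
              has_subordinate_word, already_proposed,
              score_threshold=8.0, reask_limit=3):
    """Return the label of the step that follows the checks S24 to S34."""
    # S24: top-ranked documents share an edition and the server has re-asked at least once.
    if same_edition_in_top10 and reask_count >= 1:
        return "S42"
    # S26: the best search score reaches the predetermined score.
    if max_score >= score_threshold:
        return "S40"
    # S28: the re-ask count has reached its limit (three is only an example).
    if reask_count >= reask_limit:
        return "S40"
    # S32: a question word is registered in the subordinate concept word dictionary.
    if has_subordinate_word:
        return "S38"
    # S34: a proposal was already made for the same question sentence.
    if already_proposed:
        return "S40"
    # Otherwise propose words taken from the headings.
    return "S36"
```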

ステップＳ３６において、対話管理サーバ１は、各見出しに含まれるテキストデータに
おける質問文から抽出された単語の出現頻度に基づいてスコアを算出し、算出したスコア
を用いた回答出力情報を生成する。例えば、対話管理サーバ１は、ＴＦ又はＴＦ−ＩＤＦ
等の手法を用いて、提案語を算出する。「提案語」とは、例えば、質問文のテキストデー
タを形態素解析して抽出された単語を含む単位情報における各見出しに含まれるテキスト
データから抽出された単語のことをいう。より具体的には、対話管理サーバ１は、まず、
例えば、図２２（Ａ）に示す質問文Ｔ７３を形態素解析して抽出した単語を含む文書の文
書ＩＤを取得する。文書ＩＤは全説明文を構成する全文書を一意に特定する識別番号であ
る。そして、対話管理サーバ１は、図２２（Ｂ）に示すような「見出し語リスト」を参照
し文書ＩＤに基づいて、当該文書の見出しに使用される各単語を特定し、特定した各単語
に対してＴＦ又はＴＦ−ＩＤＦを算出する。次に、対話管理サーバ１は、ＴＦ又はＴＦ−
ＩＤＦに基づくスコア順に単語を並び替え、質問文に含まれている単語を除外した後、例
えば上位１５語を提案語として出力する。そして、対話管理サーバ１は、図２２（Ａ）に
示すように、出力した提案語を含む回答候補Ｔ７４を生成する。なお、提案語が存在しな
い場合、ステップＳ４０に進む。 In step S36, the dialogue management server 1 calculates a score based on the frequency with which the words extracted from the question sentence appear in the text data included in each heading, and generates answer output information using the calculated score. For example, the dialogue management server 1 calculates proposal words using a method such as TF or TF-IDF. A "proposal word" is, for example, a word extracted from the text data included in each heading of the unit information that contains a word extracted by morphological analysis of the text data of the question sentence. More specifically, the dialogue management server 1 first acquires, for example, the document IDs of the documents that contain the words extracted by morphological analysis of the question sentence T73 shown in FIG. 22(A). A document ID is an identification number that uniquely identifies each of the documents that make up the entire set of descriptive texts. The dialogue management server 1 then refers to a "headword list" such as that shown in FIG. 22(B), identifies, based on the document IDs, each word used in the headings of those documents, and calculates TF or TF-IDF for each identified word. Next, the dialogue management server 1 sorts the words in order of the score based on TF or TF-IDF, excludes the words included in the question sentence, and outputs, for example, the top 15 words as proposal words. Then, as shown in FIG. 22(A), the dialogue management server 1 generates an answer candidate T74 including the output proposal words. If no proposal word exists, the process proceeds to step S40.
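The proposal-word calculation of step S36 can be sketched with plain term frequency. This is a minimal illustration, not the patented implementation: the function name `propose_words` and the input shape (a mapping from document ID to the words used in that document's headings, already restricted to documents that matched the question) are assumptions, and raw counts stand in for the TF or TF-IDF score.

```python
from collections import Counter


def propose_words(question_words, matched_heading_words, top_n=15):
    """Rank heading words of matched documents by frequency and return proposal words.

    matched_heading_words: document ID -> list of words used in its headings.
    Words that already appear in the question sentence are excluded.
    """
    tf = Counter(word for words in matched_heading_words.values() for word in words)
    # Sort by descending frequency; break ties alphabetically for a stable order.
    ranked = sorted(tf.items(), key=lambda item: (-item[1], item[0]))
    excluded = set(question_words)
    return [word for word, _ in ranked if word not in excluded][:top_n]
```

An empty result corresponds to the case where no proposal word exists and the process moves on to step S40.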

図２２（Ａ）に示すように、回答候補Ｔ７４に含まれる「関係しそうなものがない」と
いう選択肢Ｓ１が選択される場合、ステップＳ１０に戻り、今回の質問文Ｔ７３と同一の
質問文が入力された状態となる。つまり、ステップＳ３６から戻ったステップＳ１０にお
いて再度、質問文Ｔ７３と同一の質問文が入力された状態となる。なお、この場合、聞き
返し回数は一回増加する。 As shown in FIG. 22(A), when the option S1 "there is nothing that seems to be related" included in the answer candidate T74 is selected, the process returns to step S10 in a state where the same question sentence as the current question sentence T73 has been input. That is, in step S10 reached from step S36, the same question sentence as the question sentence T73 is treated as input again. In this case, the re-ask count is incremented by one.

ステップＳ３８において、対話管理サーバ１は、質問文のテキストデータを形態素解析
して抽出された１又は複数の単語に関連付けられた下位概念語を含む回答出力情報を生成
する。対話管理サーバ１は、例えば、図２３に示す質問文Ｔ７６のテキストデータを形態
素解析して抽出された単語（例えば、「口座」）に関して、図１８に示す下位概念語辞書
ＤＢ４４を参照して、「口座」という単語に関連付けられた下位概念語（例えば、「預金
口座」、「総合口座」等）を含む回答出力情報を生成する。 In step S38, the dialogue management server 1 generates answer output information including subordinate concept words associated with the one or more words extracted by morphological analysis of the text data of the question sentence. For example, for a word (for example, "account") extracted by morphological analysis of the text data of the question sentence T76 shown in FIG. 23, the dialogue management server 1 refers to the subordinate concept word dictionary DB 44 shown in FIG. 18 and generates answer output information including the subordinate concept words associated with the word "account" (for example, "deposit account", "general account", and the like).
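The lookup in step S38 can be sketched as follows. The names `SUBORDINATE_DB` and `subordinate_candidates` are hypothetical, and the subordinate concept word dictionary DB 44 is modeled, as an assumption, by a mapping from a broader word to its narrower terms.

```python
# Hypothetical stand-in for the subordinate concept word dictionary DB 44.
SUBORDINATE_DB = {"account": ["deposit account", "general account"]}


def subordinate_candidates(question_words, db=SUBORDINATE_DB):
    """Collect the narrower terms registered for each word of the question."""
    candidates = []
    for word in question_words:
        for term in db.get(word, []):
            if term not in candidates:  # keep first occurrence, preserve order
                candidates.append(term)
    return candidates
```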

図２３に示すように、回答候補Ｔ７７に含まれる「候補にない」という選択肢Ｓ３が選
択される場合、ステップＳ１０に戻り、新たな質問文が入力された状態となる。その後、
対話管理サーバ１は、質問文のテキストデータと、各単位情報に含まれるテキストデータ
とをマッチングして、回答候補を生成する際に、過去に生成された回答候補内に含まれた
単語（例えば、「預金口座」、「総合口座」等）に対応する文書を除外した上で回答候補
を生成する。この場合、聞き返し回数は一回増加する。また、回答候補Ｔ７７に含まれる
「わからない」という選択肢Ｓ５が選択される場合、ステップＳ１０に戻り、今回の質問
文Ｔ７６と同一の質問文が入力された状態となる。つまり、ステップＳ３８から戻ったス
テップＳ１０において再度、質問文Ｔ７６と同一の質問文を入力された状態となる。なお
、この場合、聞き返し回数は一回増加する。また、通常はステップＳ３４を経て、ステッ
プＳ４０に進む。 As shown in FIG. 23, when the option S3 "not among the candidates" included in the answer candidate T77 is selected, the process returns to step S10 in a state where a new question sentence has been input. After that, when the dialogue management server 1 matches the text data of the question sentence with the text data included in each piece of unit information and generates answer candidates, it generates them after excluding the documents corresponding to the words included in previously generated answer candidates (for example, "deposit account", "general account", and the like). In this case, the re-ask count is incremented by one. Further, when the option S5 "I don't know" included in the answer candidate T77 is selected, the process returns to step S10 in a state where the same question sentence as the current question sentence T76 has been input. That is, in step S10 reached from step S38, the same question sentence as the question sentence T76 is treated as input again. In this case as well, the re-ask count is incremented by one. The process then normally proceeds to step S40 via step S34.
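The exclusion applied after option S3 is selected can be sketched as a filter over the candidate documents. This is an illustration under assumptions: the function name `exclude_rejected` is hypothetical, and "documents corresponding to a word" is approximated here by a simple substring test on the document text, which the source does not specify.

```python
def exclude_rejected(documents, rejected_terms):
    """Drop documents that correspond to previously proposed, rejected terms.

    documents: document ID -> text of the document.
    rejected_terms: terms shown in earlier answer candidates (e.g. "deposit account").
    """
    return {
        doc_id: text
        for doc_id, text in documents.items()
        if not any(term in text for term in rejected_terms)
    }
```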

（特徴）
以上説明したように、本発明の第９実施形態に係る対話管理サーバ１は、記録部１１と
受付部１９と回答生成部２７とを備える。記録部１１は、説明文のテキストデータと、説
明文に至る一又は複数の階層を識別するための情報及び階層の見出しを表すテキストデー
タとを対応付けた単位情報を複数有する構造化した形式の文書のデータ（文書情報ＢＩ）
を記録する。受付部１９は、ユーザからの質問文のテキストデータを受け付ける。回答生
成部２７は、受付部１９により受け付けられた質問文のテキストデータと、記録部１１に
記録された各単位情報に含まれるテキストデータとをマッチングして、質問文に関連する
単位情報を抽出し、抽出された単位情報に対応する見出し及び説明文に基づく回答出力情
報を生成する。 (Features)
As described above, the dialogue management server 1 according to the ninth embodiment of the present invention includes a recording unit 11, a reception unit 19, and an answer generation unit 27. The recording unit 11 records document data in a structured format (document information BI) having a plurality of pieces of unit information, each of which associates the text data of a descriptive text with information for identifying one or more hierarchy levels leading to the descriptive text and text data representing the headings of those levels. The reception unit 19 receives the text data of a question sentence from the user. The answer generation unit 27 matches the text data of the question sentence received by the reception unit 19 with the text data included in each piece of unit information recorded in the recording unit 11, extracts the unit information related to the question sentence, and generates answer output information based on the headings and descriptive text corresponding to the extracted unit information.

ここで、対話管理サーバ１は、回答生成部２７が、質問文のテキストデータを形態素解
析して抽出された単語の、各単位情報における出現頻度に基づいてスコアを算出する。そ
して、回答生成部２７が、算出したスコアを用いて回答出力情報を生成する。したがって
、質問文に関連する可能性が高い説明文を出力するものとなっている。さらに詳しくは、
対話管理サーバ１は、回答生成部２７が、質問文のテキストデータを形態素解析して抽出
された単語の、各単位情報における出現頻度と、抽出された単語が含まれる単位情報の、
単位情報の総数に対する割合とに基づいてスコアを算出する。したがって、質問文に含ま
れる単語が特徴的なものである場合、これに関連する可能性が高い説明文を出力するもの
となっている。 Here, in the dialogue management server 1, the answer generation unit 27 calculates a score based on the frequency with which the words extracted by morphological analysis of the text data of the question sentence appear in each piece of unit information. The answer generation unit 27 then generates the answer output information using the calculated score. Therefore, descriptive text that is highly likely to be related to the question sentence is output. More specifically, the answer generation unit 27 calculates the score based on the frequency with which the extracted words appear in each piece of unit information and on the ratio of the number of pieces of unit information containing the extracted words to the total number of pieces of unit information. Therefore, when a word included in the question sentence is distinctive, descriptive text that is highly likely to be related to that word is output.
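The two quantities named above, the frequency of a word within a piece of unit information and the share of unit information containing it, are the ingredients of a TF-IDF style score. The following is a minimal sketch under assumptions: the function name `tf_idf_scores` is hypothetical, each piece of unit information is modeled as a list of words, and the logarithmic weighting `log(N / df)` is one common choice rather than a formula stated in the source.

```python
import math


def tf_idf_scores(question_words, units):
    """Score each piece of unit information against the question words.

    units: unit ID -> list of words in that unit.
    A word that is frequent in a unit but rare across units contributes most.
    """
    n_units = len(units)
    scores = {}
    for unit_id, words in units.items():
        score = 0.0
        for q in set(question_words):
            tf = words.count(q)  # frequency within this unit
            df = sum(1 for other in units.values() if q in other)  # units containing q
            if tf and df:
                score += tf * math.log(n_units / df)
        scores[unit_id] = score
    return scores
```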

したがって、単位情報のコンテンツにマニュアル等を反映させることで、組織内の業務
効率化を図ることができ、且つ、ユーザの利便性を向上できる。また、単位情報に基づい
て回答出力情報が生成されるので、当該業務マニュアルの活用を促進できる。 Therefore, by reflecting the manual or the like in the content of the unit information, it is possible to improve the work efficiency in the organization and improve the convenience of the user. Moreover, since the answer output information is generated based on the unit information, the utilization of the business manual can be promoted.

第９実施形態に係る対話管理サーバ１は、第１合成語辞書ＤＢ１２Ａ及び／又は第２合
成語辞書ＤＢ１２Ｂを備えるものである。したがって、ユーザが予め合成語を登録するこ
とで、登録された合成語に対する検索スコアを高くするように構成することができる。換
言すると、業務効率化を誘導するように合成語を登録しておくことで可能である。 The dialogue management server 1 according to the ninth embodiment includes a first synthetic word dictionary DB 12A and/or a second synthetic word dictionary DB 12B. Therefore, by registering compound (synthetic) words in advance, the user can configure the server so that the search score for the registered compound words becomes higher. In other words, this can be achieved by registering compound words in a way that promotes operational efficiency.

本実施形態に係る対話管理サーバ１は、下位概念語辞書ＤＢ４４を備える。したがって
、上述したように、質問文に含まれる上位概念の単語から下位概念の情報へと絞りこんで
いくことができ、業務効率化を誘導することができる。 The dialogue management server 1 according to the present embodiment includes a subordinate concept word dictionary DB44. Therefore, as described above, it is possible to narrow down the words of the upper concept included in the question sentence to the information of the lower concept, and it is possible to induce the improvement of work efficiency.

第９実施形態に係る対話管理サーバ１は、回答生成部２７が、各見出しに含まれるテキ
ストデータにおける質問文から抽出された単語を含む回答出力情報を生成する機能を有す
る。具体的には、回答生成部２７は、質問文のテキストデータを形態素解析して抽出され
た単語が下位概念語辞書に登録されていないと判定した場合、形態素解析して抽出された
単語であって、各見出しに含まれるテキストデータにおける質問文から抽出された単語を
含む回答出力情報を生成する。このような機能により、ユーザの利便性に資する対話を実
現できる。 In the dialogue management server 1 according to the ninth embodiment, the answer generation unit 27 has a function of generating answer output information including words extracted from the question sentence in the text data included in each heading. Specifically, when the answer generation unit 27 determines that a word extracted by morphological analysis of the text data of the question sentence is not registered in the subordinate concept word dictionary, it generates answer output information including words that were extracted by the morphological analysis and that were extracted from the question sentence in the text data included in each heading. Such a function makes it possible to realize a dialogue that contributes to user convenience.

第９実施形態に係る対話管理サーバ１は、質問文のテキストデータと、各単位情報に含
まれるテキストデータとをマッチングして、質問文に関連する単位情報を抽出し、抽出さ
れた単位情報に対応する見出し及び説明文を含む回答候補（回答出力情報）を生成する。
ここで、質問によっては、文書の構成単位に含まれるテキストデータよりも見出しの方が
適切な回答を誘導する場合がある。よって、対話管理サーバ１は、各項目のテキストデー
タのみならず、見出しも含めてマッチングをすることで質問に関連する項目を適切に抽出
することができる。 The dialogue management server 1 according to the ninth embodiment matches the text data of the question sentence with the text data included in each piece of unit information, extracts the unit information related to the question sentence, and generates answer candidates (answer output information) including the headings and descriptive text corresponding to the extracted unit information. Here, depending on the question, a heading may lead to a more appropriate answer than the text data contained in a structural unit of the document. Therefore, the dialogue management server 1 can appropriately extract the items related to the question by matching not only the text data of each item but also the headings.

なお、第９実施形態に係る対話管理サーバ１は、単語の出現頻度を示すＴＦ（Term Fre
quency）、又は、逆文書頻度を示すＩＤＦ（Inverse Document Frequency）に加えて、単
位情報の長さ等の指標を用いて検索スコアを算出するものでもよい。換言すると、本実施
形態に係る回答生成部２７が、質問文のテキストデータを形態素解析して抽出された単語
の、各単位情報における出現頻度と、抽出された単語が含まれる単位情報の、単位情報の
総数に対する割合と、単位情報の長さとに基づいてスコアを算出し、算出したスコアを用
いて回答出力情報を生成するものでもよい。すなわち、本実施形態のスコア算出方法はBM
25Similarityなどを含むものである。 The dialogue management server 1 according to the ninth embodiment may calculate the search score using, in addition to TF (Term Frequency), which indicates the frequency of appearance of a word, or IDF (Inverse Document Frequency), which indicates the inverse document frequency, an index such as the length of the unit information. In other words, the answer generation unit 27 according to the present embodiment may calculate the score based on the frequency with which the words extracted by morphological analysis of the text data of the question sentence appear in each piece of unit information, the ratio of the number of pieces of unit information containing the extracted words to the total number of pieces of unit information, and the length of the unit information, and may generate the answer output information using the calculated score. That is, the score calculation method of the present embodiment includes BM25Similarity and the like.
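A score that combines term frequency, the share of units containing the word, and unit length can be illustrated with a BM25-style formula. The sketch below is an assumption, not the scoring the source specifies: the function name `bm25_scores` is hypothetical, the formula is the commonly cited Okapi BM25 form with a smoothed IDF, and the parameter values `k1=1.2` and `b=0.75` are conventional defaults chosen for illustration.

```python
import math


def bm25_scores(question_words, units, k1=1.2, b=0.75):
    """BM25-style score per piece of unit information.

    units: unit ID -> list of words. Longer units are penalized via b.
    """
    if not units:
        return {}
    n_units = len(units)
    avg_len = sum(len(words) for words in units.values()) / n_units
    scores = {}
    for unit_id, words in units.items():
        score = 0.0
        for q in set(question_words):
            tf = words.count(q)
            if tf == 0:
                continue
            df = sum(1 for other in units.values() if q in other)
            idf = math.log(1 + (n_units - df + 0.5) / (df + 0.5))
            length_norm = 1 - b + b * len(words) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores[unit_id] = score
    return scores
```

In this sketch, two units that contain a question word equally often are ranked so that the shorter unit scores higher, which is the effect of adding unit length as an index.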

（変形例１）
本実施形態においては、図１７に示すユーザ端末装置６の表示部に表示する画面Ｇは様
々な表示形態を採り得る。 (Modification 1)
In the present embodiment, the screen G displayed on the display unit of the user terminal device 6 shown in FIG. 17 can take various display forms.

例えば、図２４に示すように、ユーザがユーザ端末装置６の不図示の入力装置（例えば
、マウス）を操作することによって、複数の回答候補の一つのタイトル「（４０４．８６
）管理編…１項：対象取引冒頭」部分にカーソルＣを合わせた後、クリック操作（例え
ば、右クリック）を実行する。その場合、カーソルＣを合わせたときに、当該タイトル「
（４０４．８６）管理編…１項：対象取引冒頭」に関連する文書のうち、例えば、当該
タイトルの「編」、「章」及び「節」までが一致する他の文書のリストＬを画面Ｇ上に表
示するように構成してもよい。リストＬの内容はこれに限定されず、リストＬは、例えば
、当該タイトルの「編」及び「章」までが一致する他の文書のリストＬを画面Ｇ上に表示
してもよい。 For example, as shown in FIG. 24, the user operates an input device (not shown; for example, a mouse) of the user terminal device 6 to place the cursor C on the title of one of a plurality of answer candidates, "(404.86) Management ... Item 1: Beginning of target transaction", and then performs a click operation (for example, a right-click). In that case, the server may be configured so that, when the cursor C is placed on the title, a list L of other documents related to the title "(404.86) Management ... Item 1: Beginning of target transaction", for example, documents whose "edition", "chapter", and "section" all match those of the title, is displayed on the screen G. The content of the list L is not limited to this; for example, a list L of other documents whose "edition" and "chapter" match those of the title may be displayed on the screen G.

また、ユーザが、リストＬ内の複数の項目Ｉのうち、例えば、項目「１項：対象取引
２．保険等」を、マウスを用いてクリック操作（例えば、左クリック）を実行して選択す
ると、タイトル「管理編２４章：ＣＲＳ制度２節：ＣＲＳ新規顧客確認の概要１項
：対象取引２．保険等」に対応する文書（例えば、業務マニュアルのｐｄｆファイル等
）を確認できるように構成してもよい。なお、画面Ｇにおいて回答候補に画像を含む場合
、図２４に示すように、他の回答候補の表示内容を視認するための障害にならないように
、当該画像は、例えば所定サイズのサムネイル画像ＴＪの形式で表示されてもよい。 Further, the server may be configured so that, when the user selects, from among the plurality of items I in the list L, for example, the item "Item 1: Target transactions 2. Insurance, etc." by a click operation with the mouse (for example, a left-click), the user can check the document (for example, a PDF file of a business manual) corresponding to the title "Management edition, Chapter 24: CRS system, Section 2: Overview of CRS new customer confirmation, Item 1: Target transactions 2. Insurance, etc." When an answer candidate on the screen G includes an image, the image may be displayed, for example, as a thumbnail image TJ of a predetermined size, as shown in FIG. 24, so that it does not obstruct the view of the other answer candidates.

また、例えば、図２５（Ａ）及び（Ｂ）に示すように、ユーザが、ユーザ端末装置６と
接続されているマウスを操作することによって、回答候補の一つのタイトル「管理編…１
項：概要冒頭」部分にカーソルＣを合わせた後、マウスを用いてクリック操作（例えば
、左クリック）を実行して選択すると、タイトル「管理編…１項：概要冒頭」に関連付
けて、当該タイトルに対応する文書内容ＤＣが画面Ｇ上に表示されるように構成してもよ
い。この構成によれば、ユーザは、特定のタイトルに対応する文書の内容を容易に把握す
ることが可能である。 Further, for example, as shown in FIGS. 25(A) and 25(B), the server may be configured so that, when the user operates a mouse connected to the user terminal device 6 to place the cursor C on the title of one of the answer candidates, "Management ... Item 1: Beginning of overview", and then selects it by a click operation with the mouse (for example, a left-click), the document content DC corresponding to that title is displayed on the screen G in association with the title "Management ... Item 1: Beginning of overview". With this configuration, the user can easily grasp the content of the document corresponding to a specific title.

また、例えば、図２６に示すように、ユーザが、ユーザ端末装置６と接続されているマ
ウスを操作することによって、複数の回答候補の一つのタイトル「（６２１．０４）預金
編…３節：入金１．受付」部分にカーソルＣを合わせると（いわゆるマウスオーバーを
実施すると）、当該タイトル「（６２１．０４）預金編…３節：入金１．受付」に対応
する文書の内容の少なくとも一部分を含むスニペット画像ＳＰが画面Ｇ上に表示するよう
に構成してもよい。この構成によれば、ユーザは、特定のタイトルに対応する文書の内容
の少なくとも一部を容易に把握することが可能である。 Further, for example, as shown in FIG. 26, the server may be configured so that, when the user operates a mouse connected to the user terminal device 6 to place the cursor C on the title of one of a plurality of answer candidates, "(621.04) Deposits ... Section 3: Deposit 1. Reception" (that is, performs a so-called mouse-over), a snippet image SP including at least a part of the content of the document corresponding to that title is displayed on the screen G. With this configuration, the user can easily grasp at least a part of the content of the document corresponding to a specific title.

（変形例２）
図２７に示すように、ユーザ端末装置６の表示部は、図１、１５、１６及び１７に示す
各辞書の内容を変更可能な画面Ｇを表示可能である。図２７に示すように、ユーザは、ま
ず、例えば、辞書検索欄Ｓ５において変更を希望する辞書の名称を入力し検索する。そし
て、空欄Ｓ１に追加したい単語を漢字表記等で入力し、空欄Ｓ３に追加したい単語の読み
（例えば、カタカナ表記）を入力した後、追加ボタンＢ１が押下されると、辞書検索欄Ｓ
５で検索された辞書に新たな単語が追加される。他方で、ユーザが、例えばユーザ端末装
置６と接続されているマウスを操作することによって、削除ボタンＢ３を押下すると、削
除ボタンＢ３に対応する、予め登録されている単語が辞書から削除される。また、ユーザ
により変更ボタンＢ５が押下されると、変更ボタンＢ５に対応する、予め登録されている
単語の内容を変更することが可能である。 (Modification 2)
As shown in FIG. 27, the display unit of the user terminal device 6 can display a screen G on which the contents of the dictionaries shown in FIGS. 1, 15, 16, and 17 can be changed. As shown in FIG. 27, the user first enters, for example, the name of the dictionary to be changed in the dictionary search field S5 and searches for it. Then, when the user enters the word to be added in the blank S1 (for example, in kanji notation), enters the reading of that word in the blank S3 (for example, in katakana notation), and presses the add button B1, the new word is added to the dictionary found via the dictionary search field S5. On the other hand, when the user presses the delete button B3, for example, by operating the mouse connected to the user terminal device 6, the pre-registered word corresponding to the delete button B3 is deleted from the dictionary. Further, when the user presses the change button B5, the content of the pre-registered word corresponding to the change button B5 can be changed.

ユーザ端末装置６の表示部に表示される画面Ｇは、各辞書の内容を変更可能な画面に限
られず、構造化文書の内容を変更可能な画面を含む。例えば、構造化文書の内容を変更可
能な画面においては、図３に示すように、構造化文書の特定のＸＭＬ形式データに関して
、文書の構成単位（編、章、節、項、小項目等）ごとに内容を変更可能である。 The screen G displayed on the display unit of the user terminal device 6 is not limited to a screen on which the contents of each dictionary can be changed, and includes a screen on which the contents of a structured document can be changed. For example, on a screen on which the contents of a structured document can be changed, the content of specific XML-format data of the structured document can be changed for each structural unit of the document (edition, chapter, section, item, sub-item, and the like), as shown in FIG. 3.

（変形例３）
また、対話管理サーバ１は、質問文に対する回答候補（回答出力情報）をユーザ端末装
置６の表示部に表示する際のタイミングを、質問文のテキストデータと、各単位情報に含
まれるテキストデータとのマッチングの精度（検索スコア）に応じて変更するように構成
してもよい。例えば、対話管理サーバ１の回答生成部２７は、所定の範囲内の検索スコア
が算出された１又は複数の単位情報を抽出し、出力制御部３１は、抽出された単位情報の
数に応じて、回答出力候補の出力時間を調整する。出力制御部３１は、検索スコアの類似
度が近しい説明文が比較的多くマッチングした場合は、マッチングする文書の数が少ない
場合に比べて、マッチングした説明文に対応する回答候補の表示のタイミング（質問に対
する応答タイミング）を遅らせる。応答タイミングは例えば以下の数式（０）で表現され
る。 (Modification 3)
Further, the dialogue management server 1 may be configured to change the timing at which the answer candidates (answer output information) for the question sentence are displayed on the display unit of the user terminal device 6 according to the accuracy (search score) of the matching between the text data of the question sentence and the text data included in each piece of unit information. For example, the answer generation unit 27 of the dialogue management server 1 extracts one or more pieces of unit information for which a search score within a predetermined range has been calculated, and the output control unit 31 adjusts the output time of the answer candidates according to the number of pieces of unit information extracted. When a relatively large number of descriptive texts with similar search scores are matched, the output control unit 31 delays the timing of displaying the answer candidates corresponding to the matched descriptive texts (the response timing to the question) compared with the case where the number of matching documents is small. The response timing is expressed, for example, by the following equation (0).

（数０）
応答タイミング＝１[秒]＋０．２[秒]×（検索スコアの類似度が近しい単位情報の数）
応答タイミングは、例えば、質問文が入力されてから最長で２．５[秒]とする。また、
応答タイミングが２．０〜２．５[秒]の場合は、出力制御部３１は、回答候補を画面Ｇ上
に表示する前に、フィラー表現（例えば、「えーと」「うーん」等）を表示してもよい。 (Equation 0)
Response timing = 1 [s] + 0.2 [s] × (number of pieces of unit information with similar search scores)
The response timing is, for example, at most 2.5 [s] after the question sentence is input. When the response timing is 2.0 to 2.5 [s], the output control unit 31 may display a filler expression (for example, "um" or "hmm") before displaying the answer candidates on the screen G.
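Equation (0), together with the 2.5 s cap and the filler condition, can be worked through in a few lines. The function names `response_delay` and `needs_filler` are hypothetical; the constants are the example values given in the text.

```python
def response_delay(num_similar_units, base=1.0, per_unit=0.2, cap=2.5):
    """Equation (0): 1 s plus 0.2 s per similarly scored unit, capped at 2.5 s."""
    return min(base + per_unit * num_similar_units, cap)


def needs_filler(delay, lower=2.0, upper=2.5):
    """A filler expression may be shown when the delay is between 2.0 s and 2.5 s."""
    return lower <= delay <= upper
```

For example, two similarly scored units give a delay of about 1.4 s with no filler, while six give about 2.2 s, which falls in the filler range; any count of eight or more is capped at 2.5 s.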

この構成によれば、対話管理サーバ１は、質問文に対する回答候補をユーザ端末装置６
の表示部に表示する際のタイミングを、質問文のテキストデータと、各単位情報に含まれ
るテキストデータとのマッチングの精度（検索スコア）に応じて変更する。よって、ユー
ザと対話管理サーバ１との対話がより自然に実行される。換言すると、対話管理サーバ１
による出力を人間の反応に近づけることができる。また、出力制御部３１は、回答候補を
画面Ｇ上に表示する前に、フィラー表現を表示することによって、回答候補の表示の遅れ
が、本実施形態における対話管理システムの異常によるものでないことを報知することが
可能である。 According to this configuration, the dialogue management server 1 changes the timing at which the answer candidates for the question sentence are displayed on the display unit of the user terminal device 6 according to the accuracy (search score) of the matching between the text data of the question sentence and the text data included in each piece of unit information. Therefore, the dialogue between the user and the dialogue management server 1 proceeds more naturally. In other words, the output of the dialogue management server 1 can be brought closer to a human response. Further, by displaying a filler expression before displaying the answer candidates on the screen G, the output control unit 31 can inform the user that the delay in displaying the answer candidates is not due to an abnormality of the dialogue management system of the present embodiment.

(Modification 4)
Further, the dialogue management server 1 may be configured to include a stop word dictionary in addition to the dictionaries shown in FIGS. 1, 15, 16, and 17. A stop word is a word that is excluded from processing when natural language processing is executed. Stop words include, for example, particles and conjunctions, as well as words such as "various" and "to some extent". By referring to the stop word dictionary and excluding specific words from natural language processing, the dialogue management server 1 can generate answer candidates with high matching accuracy for a question sentence.
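The stop word exclusion described above can be sketched as a simple filter over already-tokenized text. The word list and function name below are hypothetical examples, and the tokens are assumed to come from a separate morphological analysis step that is not shown.

```python
# Hypothetical stop word dictionary: a few particles, a conjunction,
# and the two example words mentioned in the text.
STOP_WORDS = {"は", "が", "を", "に", "の", "そして", "いろいろ", "ある程度"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not registered in the stop word dictionary."""
    return [token for token in tokens if token not in stop_words]


if __name__ == "__main__":
    print(remove_stop_words(["経費", "の", "精算", "を", "いろいろ", "知りたい"]))
```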

(Modification 5)
Further, the answer generation unit 27 of the dialogue management server 1 may be configured to change the display form of at least one of the icon image corresponding to the questioner and the icon image corresponding to the respondent (the dialogue management server 1) on screen G, according to the matching accuracy of the documents used for the answer candidates to the question sentence. For example, the answer generation unit 27 changes the display form of the icon image corresponding to the respondent associated with the answer candidates according to the difference between a first average score, taken over one or more pieces of unit information with relatively high search scores, and a second average score, taken over one or more pieces of unit information ranked below those used for the first average score. Specifically, when the answer generation unit 27 generates answer candidates based on the top 10 pieces of unit information in descending order of search score, it calculates the average score of the 1st- to 5th-ranked pieces of unit information as the first average score and the average score of the 6th- to 10th-ranked pieces as the second average score, and changes the display form of the icon image according to the difference between them.
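The difference between the first and second average scores in the top-10 example can be computed as in the sketch below. The function name is an assumption, and how the resulting difference maps to a particular icon appearance is not specified in the text, so that step is left out.

```python
def average_score_gap(scores):
    """Difference between the mean of ranks 1-5 and the mean of ranks 6-10."""
    ranked = sorted(scores, reverse=True)[:10]
    if len(ranked) < 10:
        raise ValueError("at least 10 search scores are required")
    first_average = sum(ranked[:5]) / 5   # 1st to 5th place
    second_average = sum(ranked[5:]) / 5  # 6th to 10th place
    return first_average - second_average


if __name__ == "__main__":
    print(average_score_gap([10, 9, 8, 7, 6, 5, 4, 3, 2, 1]))
```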

Further, when the icon image corresponding to the respondent is a face image, the answer generation unit 27 may change the display as the quality of the questions improves over the course of a continuing dialogue of questions and answers. For example, in the example shown in FIG. 28, the character icon K changes from an expressionless face image (see FIG. 29) to a smiling face image (see FIG. 30). Instead of, or in combination with, the change of the face image, the degree of improvement in the quality of the question sentences may be expressed using a color scale or the like.

(Modification 6)
Further, the answer generation unit 27 of the dialogue management server 1 may be configured to change the display form of the icon image corresponding to the respondent associated with the answer candidates according to the number of compound words contained in the text data of the question sentence that are registered in, for example, the first compound word dictionary DB12A or the second compound word dictionary DB12B. The input of a question sentence containing compound words registered in advance in the compound word dictionaries DB12A and DB12B indicates that the quality of the question has improved. Therefore, as above, when the icon image corresponding to the respondent is a face image, for example, the answer generation unit 27 may change the display as the quality of the questions improves over the course of a continuing dialogue of questions and answers. As before, the character icon K may change from an expressionless face image (see FIG. 29) to a smiling face image (see FIG. 30), as shown in FIG. 28. Instead of, or in combination with, the change of the face image, the degree of improvement in the quality of the question sentences may be expressed using a color scale or the like.

(Modification 7)
Further, the dialogue management server 1 according to the ninth embodiment calculates the search score of unit information by, for example, a method using TF-IDF. In calculating the search score, the sum of the scores calculated for each item may be used, or the sum of the per-item scores multiplied by their weights may be used. Alternatively, the largest of the per-item scores may be adopted as the search score, or the largest of the per-item scores after multiplication by their weights may be adopted. For example, the dialogue management server 1 may generate answer output information using the larger of the score calculated from the explanatory text and the score calculated from one heading.
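The four ways of combining per-item scores described above (sum, weighted sum, maximum, weighted maximum) can be sketched as follows. The item names, the `mode` parameter, and the dictionary-based interface are assumptions for illustration only.

```python
def combine_scores(item_scores, weights=None, mode="sum"):
    """Combine per-item scores into one search score.

    item_scores: mapping of item name to score (e.g. heading, explanatory text).
    weights: optional mapping of item name to weight; defaults to 1.0 per item.
    mode: "sum" for the total, "max" for the largest (weighted) score.
    """
    if weights is None:
        weights = {name: 1.0 for name in item_scores}
    weighted = [score * weights[name] for name, score in item_scores.items()]
    if mode == "sum":
        return sum(weighted)
    if mode == "max":
        return max(weighted)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    scores = {"heading": 2.0, "explanatory_text": 3.0}
    weights = {"heading": 2.0, "explanatory_text": 1.0}
    print(combine_scores(scores), combine_scores(scores, weights))
    print(combine_scores(scores, mode="max"), combine_scores(scores, weights, "max"))
```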

(Modification 8)
Further, when an explanatory text is composed of a plurality of sentences, the dialogue management server 1 according to the ninth embodiment may correct the search score by the following methods. The corrections by these methods are executed by the information processing unit 10.

(A) Score correction using inter-word distance
First, the input question sentence is morphologically analyzed to extract words. Next, the extracted words are matched against the words of each sentence constituting the explanatory text. When all of the extracted words match, the distance between the matched words is measured. An "inter-word correction coefficient ha" is then calculated from the inter-word distance, and the search score of the unit information is corrected by multiplying it by this coefficient.

For example, when the extracted words are the three words A, B, and C, each sentence is split into words and the positions at which word A and word B appear are obtained. The minimum of the absolute difference between an appearance position of word A and an appearance position of word B is then obtained. The same processing is performed for words B and C and for words C and A. Next, the average of these minimum values is calculated as the "inter-word distance", and the inter-word correction coefficient ha is calculated from it, for example by the following formula (1).

(Formula 1)
Inter-word correction coefficient ha = (11 - inter-word distance) x 0.01 x a + 1
The search score of the unit information is then corrected based on the inter-word distance. Here, the inter-word distance is capped at 11. For example, when the inter-word distance is 12, it is replaced with 11 before the inter-word correction coefficient is calculated. The constant a is a value that can be adjusted as appropriate.

For example, when the inter-word distance for the three extracted words A, B, and C is 3 and the constant a = 1, the inter-word correction coefficient ha is (11 - 3) x 0.01 x 1 + 1 = 1.08. The search score of the unit information is corrected by multiplying it by ha = 1.08.
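The inter-word distance and formula (1) can be sketched for a single tokenized sentence as below. Measuring positions as token indices, returning `None` when a word is missing, and all function names are assumptions made for this sketch; the text does not state how results from several sentences are combined.

```python
from itertools import combinations

MAX_DISTANCE = 11  # the inter-word distance is capped at 11


def interword_distance(sentence_tokens, query_words):
    """Average, over word pairs, of the minimum position difference."""
    words = list(dict.fromkeys(query_words))  # unique words, order kept
    positions = {
        word: [i for i, token in enumerate(sentence_tokens) if token == word]
        for word in words
    }
    # The correction applies only when every extracted word is matched.
    if len(words) < 2 or any(not found for found in positions.values()):
        return None
    minima = [
        min(abs(i - j) for i in positions[a] for j in positions[b])
        for a, b in combinations(words, 2)
    ]
    return min(sum(minima) / len(minima), MAX_DISTANCE)


def interword_coefficient(distance, a=1.0):
    """Formula (1): ha = (11 - distance) x 0.01 x a + 1."""
    return (MAX_DISTANCE - distance) * 0.01 * a + 1


if __name__ == "__main__":
    distance = interword_distance(["A", "x", "B", "y", "C"], ["A", "B", "C"])
    print(distance, interword_coefficient(distance))
```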

(B) Score correction using the number of word occurrences in each sentence
First, the input question sentence is morphologically analyzed to extract words. Next, the extracted words are matched against the words of each sentence constituting the explanatory text. Here, the maximum number of matched words, taken over the sentences of the explanatory text in one piece of unit information, is set as the "word occurrence count", and the number of sentences that attain this maximum is set as the "maximum sentence count". An "occurrence count correction coefficient hb" is then calculated from the word occurrence count and the maximum sentence count, and the search score of the unit information is corrected by multiplying it by this coefficient. The occurrence count correction coefficient hb is calculated by the following formula (2), where the constants b1 and b2 are values that can be adjusted as appropriate.

(Formula 2)
Occurrence count correction coefficient hb = (word occurrence count x 0.01 x b1) + ((maximum sentence count - 1) x 0.01 x b2) + 1
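A possible reading of formula (2) is sketched below. It assumes that matches are counted as distinct query words per sentence and that no correction (a coefficient of 1) applies when nothing matches; neither point is stated in the text, and the function name is hypothetical.

```python
def occurrence_coefficient(sentences, query_words, b1=1.0, b2=1.0):
    """Formula (2): hb from the word occurrence count and maximum sentence count.

    sentences: list of token lists, one per sentence of the explanatory text.
    """
    query = set(query_words)
    # Assumption: count distinct query words matched in each sentence.
    matches = [len(query & set(tokens)) for tokens in sentences]
    word_count = max(matches, default=0)
    if word_count == 0:
        return 1.0  # assumption: leave the score unchanged when nothing matches
    max_sentences = sum(1 for m in matches if m == word_count)
    return word_count * 0.01 * b1 + (max_sentences - 1) * 0.01 * b2 + 1


if __name__ == "__main__":
    text = [["A", "B", "x"], ["A", "y"], ["B", "A", "z"]]
    print(occurrence_coefficient(text, ["A", "B", "C"]))
```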

(C) Score correction using similarity to each sentence
(C1) Preprocessing
First, preprocessing will be described. A sentence ID is assigned to each sentence constituting each explanatory text. Next, each sentence is morphologically analyzed to extract words. Then, a fixed-length vector is obtained for each sentence ID and each word using a predetermined neural network.

The neural network here predicts, for example, the next word w(t) from the consecutive words w(t-2) and w(t-1) contained in the sentence with sentence ID = u (the u-th sentence).

Specifically, the information d(u), w(t-2), and w(t-1) is input to the input layer. d(u) is a one-hot vector representing the sentence with sentence ID = u and has (U + 1) dimensions. w(t) is a one-hot vector representing the t-th word and has (N + 1) dimensions. Here, U is the total number of sentences, N is the total number of words, and S is the number of vector dimensions.

Next, in the intermediate layer, the information sent from the input layer is converted into an S-dimensional vector using the following formula (3).
(Formula 3)
h = average(d(u)D + w(t-2)W + w(t-1)W)
Here, "average" denotes the mean. The symbol D denotes the sentence weight matrix, which has (U + 1) rows and S columns. The symbol W denotes the word weight matrix, which has (N + 1) rows and S columns.

Subsequently, in the output layer, the information sent from the intermediate layer is converted into an (N + 1)-dimensional vector using the following formula (4).
(Formula 4)
h = softmax(hW')
Here, "softmax" denotes the softmax function, and the symbol W' denotes the transpose of the word weight matrix W. The h on the right-hand side is the intermediate-layer vector of formula (3), and the h on the left-hand side is the output of the output layer.
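The forward pass of formulas (3) and (4) can be sketched in plain Python as follows. Because the inputs are one-hot vectors, multiplying by D or W reduces to selecting a row. The random initialization, the 0-based indices, and the function names are assumptions for this sketch, and the weight update step is not shown.

```python
import math
import random


def init_matrix(rows, cols, seed):
    """Small random weight matrix (initialization scheme is an assumption)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def hidden_vector(D, W, u, prev2, prev1):
    """Formula (3): mean of row u of D and rows prev2, prev1 of W."""
    rows = (D[u], W[prev2], W[prev1])
    return [sum(column) / len(rows) for column in zip(*rows)]


def output_vector(W, h):
    """Formula (4): softmax(h W'), a distribution over the (N + 1) words."""
    logits = [sum(hi * wi for hi, wi in zip(h, row)) for row in W]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    U, N, S = 3, 5, 4
    D = init_matrix(U + 1, S, seed=0)  # (U + 1) rows, S columns
    W = init_matrix(N + 1, S, seed=1)  # (N + 1) rows, S columns
    h = hidden_vector(D, W, u=1, prev2=2, prev1=3)
    print(len(h), len(output_vector(W, h)))
```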

The information output from the output layer is then compared with the word w(t) to be predicted, and the elements of the sentence weight matrix D and the word weight matrix W are updated so as to reduce the difference between them. The same processing is performed for all sentences and words, and the updating of the sentence weight matrix D and the word weight matrix W is repeated. As a result, the u-th row of the sentence weight matrix D corresponds to the fixed-length vector of the sentence with sentence ID = u, and the t-th row of the word weight matrix W corresponds to the fixed-length vector of the t-th word.

(C2) Inference processing
Next, inference processing will be described. When a question sentence is input, it is morphologically analyzed to extract words. Next, a fixed-length vector of the question sentence is obtained using the same neural network as in the preprocessing. Here, the task of predicting the next word w(t) from the consecutive words w(t-2) and w(t-1) contained in the question sentence is performed. The information d(U+1), w(t-2), and w(t-1) is input to the input layer. d(U+1) is a one-hot vector representing the input sentence (the question sentence) and has (U + 1) dimensions. w(t) is a one-hot vector representing the t-th word and has (N + 1) dimensions. When the question sentence contains an unknown word, the information is stored in w(N+1). In the intermediate layer, the information sent from the input layer is converted into an S-dimensional vector using the following formula (5).

(Formula 5)
h = average(d(U+1)D + w(t-2)W + w(t-1)W)
Subsequently, the information sent from the intermediate layer is input to the output layer and converted into an (N + 1)-dimensional vector using the following formula (6).

(Formula 6)
h = softmax(hW')
The information output from the output layer is compared with the word w(t), and the elements of the sentence weight matrix D are updated so as to reduce the difference between them. The same processing is performed for all sentences and words in the question sentence. The last row of the sentence weight matrix D is then taken as the fixed-length vector corresponding to the input question sentence.

Next, the cosine similarity between the fixed-length vector of the input question sentence and the fixed-length vector corresponding to each sentence ID is calculated. The maximum cosine similarity among the sentences of the explanatory text in one piece of unit information is taken as the maximum similarity. A "similarity correction coefficient hc" is then calculated from the maximum similarity, and the search score of the unit information is corrected by multiplying it by this coefficient. The similarity correction coefficient hc is calculated by the following formula (7), where the constant c is a value that can be adjusted as appropriate.

(Formula 7)
Similarity correction coefficient hc = (maximum similarity x c) + 1
For example, when the maximum similarity ms = 0.8 and the constant c = 0.1, the similarity correction coefficient is hc = 0.8 x 0.1 + 1 = 1.08.
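The maximum cosine similarity and formula (7) can be sketched as below. The vectors are assumed to be plain lists of floats already obtained from the sentence model, treating a zero vector as similarity 0 is an assumption, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_similarity(query_vector, sentence_vectors):
    """Largest cosine similarity over the sentences of one explanatory text."""
    return max(cosine_similarity(query_vector, v) for v in sentence_vectors)


def similarity_coefficient(maximum_similarity, c=0.1):
    """Formula (7): hc = (maximum similarity x c) + 1."""
    return maximum_similarity * c + 1


if __name__ == "__main__":
    ms = max_similarity([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]])
    print(ms, similarity_coefficient(ms))
```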

In the above description, the neural network described above is used to obtain the fixed-length vector of the question sentence, but any sentence model may be adopted as long as it can generate a fixed-length vector for a sentence.

According to the present inventors, it was confirmed, as one example, that when the number of explanatory texts in the unit information is about 500, the total number U of sentences constituting the explanatory texts is about 5,000, and the number of words N used in them is about 100,000 in total with about 5,000 distinct words, setting the number of vector dimensions S to about 100 to 500 allows the fixed-length vector of the question sentence to carry meaningful information.

<10th Embodiment>
In the tenth embodiment, the dialogue management server 1 according to the ninth embodiment handles unit information having a plurality of headings, and calculates the search score after setting priorities among the headings and the explanatory text. That is, in the dialogue management server 1 according to the tenth embodiment, the unit information contains a plurality of headings (layers), the recording unit 11 holds a weight for each layer, and the answer generation unit 27 calculates the score using the weights. The other configurations of the tenth embodiment are the same as those of the ninth embodiment. The dialogue management server 1 according to the tenth embodiment calculates the search score by a method such as TF-IDF. The sum of the scores calculated for the items corresponding to the respective layers may be used as the search score, or the sum of those per-item scores multiplied by their weights may be used. Alternatively, the largest of the per-item scores may be adopted as the search score, or the largest of the per-item scores after multiplication by their weights may be adopted.

Further, in the dialogue management server 1 according to the tenth embodiment, the weight for the item corresponding to each layer may be determined automatically by a weight determination device 150. The weight determination device 150 may be incorporated in the same device as the dialogue management server 1, or may be connected via a network N as shown in FIG. 31. Specifically, the weight determination device 150 executes the processing shown in FIGS. 32, 33, and 34. The weight determination device 150 is constituted by a computer, and its information processing unit 155 executes the various processes.

(A) Creation of the weight candidate list
First, the weight determination device 150 selects an arbitrary n-th layer from the plurality of layers as the "first layer" (step ST11). Here, it is assumed that there are nine layers and, as an example, that the layer numbered 1 (n = 1) is selected as the first layer. Next, the weight of the first layer (here, p(1)) is selected from an initial value list (step ST12). The initial value list stores the candidate weight values that each layer can take. Specifically, as shown in FIG. 35, the initial value list stores the values from 100 to 10000 in steps of 100. Here, it is assumed that the first weight value, p(1) = 100, is selected as the weight of the first layer.

Subsequently, the information processing unit 155 calculates the weights p(i ≠ n) of the layers other than the first layer, that is, the weights p(2) to p(9), from the first weight calculation formula defined by the following formula (8) (step ST13). Here, i denotes the layer number, and the "initial maximum value of the weight" is set to 10000.

Then, using a first weight set consisting of the weight p(1) of the first layer selected in step ST11 and the weights p(2) to p(9) of the other layers calculated in step ST13, the rank by search score of the explanatory text that matches the sample answer A corresponding to a sample question sentence Q is calculated (step ST14). That is, the weight determination device 150 sets the weight set p(1) to p(9) in the dialogue management server 1 and inputs the sample question sentence Q (steps ST14a to ST14c). Answers are then extracted in order of search score (steps ST14d and ST14e). It is determined whether any of the output answers matches the sample answer A, and if one does, its rank is extracted (steps ST14f-Yes and ST14g). Here, each sample question sentence Q corresponds one-to-one to a sample answer A, and a plurality of pairs of sample question sentences and sample answers are input to the dialogue management server 1 in sequence. That is, a rank is extracted individually for each of the pairs of sample question sentences and sample answers (steps ST14h and ST14i).
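The rank extraction of step ST14 can be sketched as a lookup over a ranked answer list. The `search` callable below is a hypothetical stand-in for querying the dialogue management server 1 with the current weight set, and returning `None` when no answer matches is an assumption.

```python
def answer_rank(ranked_answers, sample_answer):
    """Return the 1-based rank of the sample answer, or None if it is absent."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer == sample_answer:
            return rank
    return None


def ranks_for_samples(search, samples):
    """Collect one rank per (sample question, sample answer) pair.

    search: hypothetical callable mapping a question to answers
            sorted by descending search score.
    """
    return [answer_rank(search(question), answer) for question, answer in samples]


if __name__ == "__main__":
    fake_index = {"Q1": ["a", "b", "c"], "Q2": ["c", "a"]}
    print(ranks_for_samples(fake_index.get, [("Q1", "b"), ("Q2", "z")]))
```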

Subsequently, the information processing unit 155 calculates the points for the first weight set based on the ranks calculated in step ST14 and the point calculation formulas represented by the following formulas (9) and (10) (step ST15).

Here, S is the total number of sample question sentences. When the rank value calculated in step ST14 is 11 or more, g(s) takes an arbitrary value of 0 or less (for example, -10). For example, when there are seven sample question sentences and the search ranks of the answers matching the corresponding sample answers are 3, 1, 1, 2, 1, 13, and 4, the points are calculated as 62.9 after rounding to one decimal place.

Next, the process returns to step ST12, another weight (other than p(1) = 100) is selected for the first layer, and steps ST13, ST14, and ST15 are repeated until every weight that the first layer can take has been selected (step ST16).

Subsequently, the maximum of the points obtained in step ST15 is extracted, and a point threshold for the first layer selected in step ST11 is calculated from this maximum (step ST17). Specifically, the value obtained by multiplying the maximum points by a predetermined constant (such as 0.9) is set as the threshold.

Next, the weights of the first layer for which the points calculated in step ST15 are equal to or greater than the threshold calculated in step ST17 are identified, and the initial value list is updated by removing all other weights (step ST18).
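Steps ST17 and ST18 amount to keeping only the candidate weights whose points reach a fraction of the best points. The sketch below assumes the points for each candidate weight have already been computed and are held in a dictionary; the function name and data layout are hypothetical.

```python
def prune_candidates(points_by_weight, factor=0.9):
    """Keep the weights whose points are at least factor x the maximum points.

    points_by_weight: mapping of candidate weight to the points it obtained.
    """
    threshold = max(points_by_weight.values()) * factor  # step ST17
    return [
        weight
        for weight, points in sorted(points_by_weight.items())
        if points >= threshold  # step ST18
    ]


if __name__ == "__main__":
    points = {100: 50.0, 200: 62.9, 300: 58.0, 400: 40.0}
    print(prune_candidates(points))
```

Note that with a multiplicative threshold, a negative maximum would leave no surviving candidate; the text does not say how such a case is handled.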

Then, the process returns to step ST11, and steps ST12 to ST18 are repeated until every other layer has been selected as the first layer (step ST19). The initial value list is updated in this way.

Next, an arbitrary m-th layer is selected from the plurality of layers as the "second layer" (step ST21). Here, it is assumed that the fifth layer (m = 5) is selected as the second layer. Subsequently, the maximum value of the weight p(5) of the second layer is calculated from the sum of the minimum values of the weights p(1) to p(4) and p(6) to p(9) of the layers other than the second layer (step ST22).

Then, in step ST21, another layer (here, a layer other than the fifth) is selected as the second layer, and step ST22 is repeated (step ST23). The same processing is executed until all layers have been selected, and a "weight candidate list" such as that shown in FIG. 36 is generated from the initial value list (step ST24). In other words, the weight candidate list is generated by excluding from the initial value list any weight larger than the maximum value of the weight p(m) of the second layer obtained in step ST22.

(B) Setting the weights
First, the information processing unit 155 of the weight determination device 150 selects an arbitrary o-th layer from the plurality of layers as the "third layer" (step ST31). Here, it is assumed that the ninth layer is selected. Next, the information processing unit 155 randomly extracts the weights p(1) to p(8) of the layers other than the selected layer from the weight candidate list (step ST32). Then, using the weights p(1) to p(8) extracted in step ST32, the information processing unit 155 calculates the weight p(9) of the layer selected in step ST31 from the weight calculation formula (11) below (step ST33).

Here, the second term on the right-hand side represents the sum over the layers other than the third layer. The "initial maximum value of the weight" is again set to 10000 (see FIG. 35).

If the calculated weight p(9) of the third layer does not exist in the weight candidate list, the calculation result is discarded and the process returns to step ST32 (step ST34).

Next, using a weight set consisting of the weight p(9) calculated in step ST33 and the weights p(1) to p(8) extracted in step ST32, the information processing unit 155 calculates the rank by score of the explanatory text that matches the sample answer A corresponding to a sample question sentence Q (step ST35). That is, the weight determination device 150 sets the weight set p(1) to p(9) in the dialogue management server 1 and inputs the sample question sentence Q. Answers are then output in order of search score. It is determined whether any of the output answers matches the sample answer A, and if one does, its rank is output. Each sample question sentence Q corresponds one-to-one to a sample answer A. A plurality of pairs of sample question sentences and sample answers are input to the dialogue management server 1 in sequence. That is, a rank is calculated individually for each of the pairs of sample question sentences and sample answers. The processing of step ST35 is the same as that of step ST14 described above (see FIG. 33).

Next, the information processing unit 155 calculates the points for the weight set based on the calculated ranks and formulas (9) and (10) described above (step ST36). Steps ST32 to ST36 are then executed a predetermined number of times (step ST37). Here, the processing is repeated 100,000 times. The weight of each layer is then determined based on the points obtained over the 100,000 runs (step ST38). Specifically, of the 100,000 point calculations, the weight set that yielded the maximum points is adopted as the weights of the layers.
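The loop of steps ST32 to ST38 is essentially a random search that keeps the best-scoring weight set. The sketch below shows only that control flow. `derive_weight` is a hypothetical placeholder for formula (11), which is not reproduced in this text, and `evaluate` is a hypothetical placeholder for the ranking and point calculation of steps ST35 and ST36. Counting discarded draws toward the number of trials is also an assumption.

```python
import random


def random_weight_search(candidates, target, derive_weight, evaluate, trials, seed=0):
    """Randomly sample weight sets and keep the one with the highest points.

    candidates: mapping of layer number to its list of candidate weights.
    target: the layer whose weight is derived rather than sampled (third layer).
    derive_weight: placeholder for formula (11); maps sampled weights to a value.
    evaluate: placeholder for steps ST35-ST36; maps a weight set to points.
    """
    rng = random.Random(seed)
    other_layers = [layer for layer in candidates if layer != target]
    best_weights, best_points = None, None
    for _ in range(trials):
        weights = {layer: rng.choice(candidates[layer]) for layer in other_layers}  # ST32
        derived = derive_weight(weights)  # ST33
        if derived not in candidates[target]:
            continue  # ST34: discard and sample again
        weights[target] = derived
        points = evaluate(weights)  # ST35-ST36
        if best_points is None or points > best_points:
            best_weights, best_points = weights, points
    return best_weights, best_points  # ST38
```

Fixing the seed makes a run reproducible, which can help when comparing candidate lists.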

<11th Embodiment>
In the eleventh embodiment, the validity of the compound words registered in the first compound word dictionary DB 12A or the second compound word dictionary DB 12B of the dialogue management server 1 according to the ninth embodiment is determined. For convenience, the following describes determining the validity of the compound words registered in the first compound word dictionary DB 12A, but the validity of the compound words registered in the second compound word dictionary DB 12B can be determined by the same processing.

Specifically, the validity of the first compound word dictionary DB 12A of the dialogue management server 1 according to the ninth embodiment is determined by a validity determination device 200. The validity determination device 200 may be incorporated in the same device as the dialogue management server 1, or may be connected to it via the network N as shown in FIG. 37.

The validity determination device 200 executes the processing shown in FIG. 38. The validity determination device 200 is a computer, and its information processing unit 205 executes the various processes.

First, the information processing unit 205 of the validity determination device 200 generates a "duplicate dictionary" that duplicates the contents of the first compound word dictionary DB 12A (step ST51). Next, the information processing unit 205 initializes the first compound word dictionary DB 12A of the dialogue management server 1 (step ST52). If the processing described below is interrupted, the first compound word dictionary DB 12A is restored to its original state by overwriting it with the duplicate dictionary.

Using the first compound word dictionary DB 12A as initialized in step ST52, the rank of the score of the explanatory text that matches the sample answer A corresponding to the sample question sentence Q is calculated (step ST53). Specifically, the sample question sentence Q is input, and the answers are output in order of search score. It is determined whether any output answer matches the sample answer A, and if so, its rank is output. The sample question sentence Q and the sample answer A correspond one-to-one. A plurality of pairs of sample question sentences and sample answers are input sequentially to the dialogue management server 1; that is, a rank is calculated individually for each pair. The processing of step ST53 is the same as that of step ST14 described above (see FIG. 33).
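The rank calculation of step ST53 can be illustrated with a short Python sketch. It assumes a hypothetical `search` callable that stands in for the dialogue management server and returns explanatory texts in descending order of search score; the function names and the toy data are illustrative and do not come from the specification.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def rank_of_sample_answer(
    question: str,
    sample_answer: str,
    search: Callable[[str], Sequence[str]],
) -> Optional[int]:
    """Return the 1-based rank of `sample_answer`, or None if it is not output.

    `search` is a hypothetical stand-in for the server: it must return
    explanatory texts ordered from highest to lowest search score.
    """
    for rank, answer in enumerate(search(question), start=1):
        if answer == sample_answer:
            return rank
    return None


def ranks_for_pairs(
    pairs: Iterable[Tuple[str, str]],
    search: Callable[[str], Sequence[str]],
) -> List[Optional[int]]:
    """Calculate one rank per (sample question, sample answer) pair."""
    return [rank_of_sample_answer(q, a, search) for q, a in pairs]


# Toy usage: a made-up lookup table plays the role of the server.
TOY_RESULTS = {
    "How do I reset my password?": ["Password reset", "Account lock", "Login"],
    "Where is the invoice?": ["Billing overview", "Invoice download"],
}


def toy_search(question: str) -> Sequence[str]:
    return TOY_RESULTS.get(question, [])


ranks = ranks_for_pairs(
    [
        ("How do I reset my password?", "Password reset"),
        ("Where is the invoice?", "Invoice download"),
        ("Where is the invoice?", "Refund policy"),
    ],
    toy_search,
)
```

Returning `None` when no output matches the sample answer is a design choice of this sketch; the text only states that the rank is output when a match exists.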

Then, the information processing unit 205 calculates points based on the ranks calculated in step ST53 and a predetermined point calculation formula expressed by the following formulas (12) and (13) (step ST54).

Next, the information processing unit 205 extracts an arbitrary compound word from the compound words registered in the duplicate dictionary and registers it in the initialized first compound word dictionary DB 12A (step ST55).

Subsequently, the information processing unit 205 uses the first compound word dictionary DB 12A as updated by the registration in step ST55 to calculate the rank of the score of the explanatory text that matches the sample answer corresponding to each sample question sentence (step ST56).

Subsequently, the information processing unit 205 calculates points for the compound word extracted in step ST55, based on the ranks calculated in step ST56 and the predetermined point calculation formula (step ST57).

Then, the information processing unit 205 determines the validity of the compound word by comparing the points calculated in step ST54 with the points calculated in step ST57 (step ST58). If the points have risen, the compound word registered in step ST55 is determined to have high validity. If the points have fallen, the compound word registered in step ST55 is determined to have low validity. Words determined to have low validity are recorded as deletion candidates.

Thereafter, the compound words registered in the duplicate dictionary are registered one at a time, and the same processing is repeated until all the compound words have been registered (step ST59). In these repetitions, however, the information processing unit 205 determines validity in step ST58 by comparing the points calculated in the current step ST57 with the points calculated in the previous step ST57 (that is, before the new compound word was registered), instead of comparing the points of step ST54 with those of step ST57.

As described above, the validity determination device 200 according to the present embodiment registers the compound words from the duplicate dictionary in the first compound word dictionary after the initialization step and compares the points before and after each registration. This makes it possible to determine whether each compound word registered in the first compound word dictionary DB 12A is valid. The validity of the compound words registered in the second compound word dictionary DB 12B can be determined by executing the same processing.
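The flow of steps ST51 to ST59 can be sketched in Python as follows. This is a minimal illustration, not the device's implementation: `evaluate_points` is a hypothetical callable that stands in for the rank and point calculation (formulas (12) and (13) are not reproduced here), and the toy point function and word lists are invented. The sketch treats an unchanged point value as neither a rise nor a fall, which the text does not specify.

```python
from typing import Callable, FrozenSet, Iterable, List, Set


def find_deletion_candidates(
    registered_words: Iterable[str],
    evaluate_points: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Return compound words whose registration lowers the points.

    `evaluate_points` is a hypothetical placeholder: given the current
    dictionary contents, it returns the points for the sample questions.
    """
    duplicate = list(registered_words)                  # ST51: duplicate dictionary
    dictionary: Set[str] = set()                        # ST52: initialized dictionary
    previous = evaluate_points(frozenset(dictionary))   # ST53-ST54: baseline points
    candidates: List[str] = []
    for word in duplicate:                              # ST59: one word at a time
        dictionary.add(word)                            # ST55: register the word
        current = evaluate_points(frozenset(dictionary))  # ST56-ST57
        if current < previous:                          # ST58: points fell
            candidates.append(word)
        previous = current                              # compare with the previous ST57
    return candidates


# Toy usage with a made-up point function.
HELPFUL = {"ETC card", "credit card"}
HARMFUL = {"card game"}


def toy_points(dictionary: FrozenSet[str]) -> float:
    return len(dictionary & HELPFUL) - len(dictionary & HARMFUL)


deletion_candidates = find_deletion_candidates(
    ["ETC card", "card game", "credit card"], toy_points
)
```

As in the text, a word judged to have low validity is only recorded as a deletion candidate; the sketch does not remove it from the dictionary before the next registration.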

<12th Embodiment>
FIG. 39 is a schematic diagram showing the configuration of the dialogue management system according to the twelfth embodiment of the present invention. The dialogue management server 1 according to the twelfth embodiment further includes a word distributed representation database (DB) 201.

The word distributed representation DB 201 is a database that stores distributed representations of arbitrary words. Here, a "distributed representation of a word" is a fixed-length vector that stores the features of the word; each word corresponds one-to-one to a numerical representation in vector form.

The answer generation unit 27 according to the present embodiment operates as follows. First, the answer generation unit 27 selects a word, according to a predetermined rule, from the words extracted by morphological analysis of the text data of the question sentence. Here, when the morphological analysis extracts a plurality of words, the longest word is selected. Subsequently, the answer generation unit 27 extracts from the word distributed representation DB 201, as "related words", the words that lie within a certain distance of the selected word in the distributed representation.
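The word selection and related-word lookup described above can be sketched in Python. The sketch makes several assumptions that the text leaves open: the distance is taken to be Euclidean, the word distributed representation DB is modeled as an in-memory dictionary, and the two-dimensional toy vectors are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def select_longest_word(words: Sequence[str]) -> str:
    """Pick the word with the most characters (the first one wins on ties).

    Raises ValueError if `words` is empty.
    """
    return max(words, key=len)


def related_words(
    selected: str,
    embeddings: Dict[str, Sequence[float]],
    max_distance: float,
) -> List[str]:
    """Return words whose vectors lie within `max_distance` of `selected`.

    Euclidean distance is an assumption; the text only says "a certain
    distance". An unknown word yields no related words.
    """
    if selected not in embeddings:
        return []
    base = embeddings[selected]
    return [
        word
        for word, vector in embeddings.items()
        if word != selected and math.dist(base, vector) <= max_distance
    ]


# Toy usage: invented fixed-length (2-dimensional) vectors.
TOY_EMBEDDINGS = {
    "ＥＴＣカード": (1.0, 0.0),
    "クレジットカード": (0.9, 0.1),
    "車": (0.8, 0.3),
    "天気": (-1.0, 0.5),
}

selected = select_longest_word(["ＥＴＣカード", "清算"])
related = related_words(selected, TOY_EMBEDDINGS, max_distance=0.5)
```

A production system would more likely query an approximate nearest-neighbor index than scan every stored vector; the linear scan is used here only to keep the example self-contained.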

Next, the answer generation unit 27 calculates a score based on the frequency with which the words extracted by analyzing the text data of the question sentence, together with the related words, appear in each piece of unit information. The answer generation unit 27 then generates answer output information using the calculated score.

For example, when the question sentence includes an expression such as "ETC card settlement" (ＥＴＣカードの清算), the answer generation unit 27 morphologically analyzes the expression into the two words "ETC card" (ＥＴＣカード) and "settlement" (清算). It is assumed here that "ETC card" is registered in the first compound word dictionary DB 12A or the second compound word dictionary DB 12B. Next, of the two words, the answer generation unit 27 selects "ETC card", which has the larger number of characters in the original Japanese. Subsequently, the answer generation unit 27 extracts, as related words, the words that lie within a certain distance of "ETC card" in the word distributed representation. Here, for example, words such as "car" and "credit card" are extracted as related words. The answer generation unit 27 then calculates the search score including these related words and generates answer output information from the unit information in descending order of search score.

If no related words are extracted, the answer output information is generated based on the morphologically analyzed words alone.
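The frequency-based scoring with related words, including the fallback when no related word is found, can be illustrated with the Python sketch below. It is a simplification under stated assumptions: each piece of unit information is assumed to be already tokenized, the score is a plain sum of term counts rather than the server's actual search score, and the unit identifiers and tokens are invented.

```python
from typing import Dict, List, Sequence, Tuple


def rank_units(
    units: Dict[str, Sequence[str]],
    query_words: Sequence[str],
    related: Sequence[str] = (),
) -> List[Tuple[str, int]]:
    """Score each unit by how often the query and related words appear in it.

    `units` maps a hypothetical unit identifier to its token list. With an
    empty `related`, the score falls back to the query words alone.
    """
    # Remove duplicate terms while keeping their first-seen order.
    terms = list(dict.fromkeys([*query_words, *related]))
    scores = {
        unit_id: sum(tokens.count(term) for term in terms)
        for unit_id, tokens in units.items()
    }
    # Highest score first, matching "descending order of search score".
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy usage with invented, pre-tokenized unit information.
TOY_UNITS = {
    "unit-1": ["ETC card", "settlement", "ETC card"],
    "unit-2": ["credit card", "car", "settlement"],
    "unit-3": ["weather"],
}
QUERY = ["ETC card", "settlement"]

with_related = rank_units(TOY_UNITS, QUERY, related=["car", "credit card"])
without_related = rank_units(TOY_UNITS, QUERY)
```

In this toy data the related words raise the score of `unit-2`, which mentions neither query word more than once, showing how related words can surface additional unit information.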

As described above, the dialogue management server 1 according to the present embodiment searches the document information BI using the related words as well, which raises the probability of presenting the information the user needs.

The above embodiments are intended to facilitate understanding of the present invention and are not to be construed as limiting it. The present invention may be modified or improved without departing from its spirit, and the present invention includes equivalents thereof. Various disclosures can also be formed by appropriately combining the plurality of components disclosed in the above embodiments. For example, some components may be deleted from the full set of components shown in an embodiment. Furthermore, components of different embodiments may be combined as appropriate.

The steps of the flowcharts need not necessarily be executed in the order described above. For example, steps S1, S3, and S5 shown in FIG. 4 may be executed after step S7.

1 ... Dialogue management server, 4 ... Dialogue log DB, 6 ... User terminal device, 8 ... Operator terminal device, 10 ... Information processing unit, 11 ... Recording unit, 12 ... Technical term dictionary DB, 12A ... First compound word dictionary DB, 12B ... Second compound word dictionary DB, 13 ... Synonym dictionary DB, 14 ... Excluded term dictionary DB, 15 ... Acquisition unit, 17 ... Data conversion unit, 19 ... Reception unit, 21 ... Natural language processing unit, 23 ... Weighting setting unit, 24 ... Inverse document frequency DB, 27 ... Answer generation unit, 29 ... Dialogue result management unit, 31 ... Output control unit, 33 ... Voice recognition processing unit, 35 ... Voice dialogue management unit, 44 ... Subordinate concept word dictionary DB, 46 ... Short talk DB, 100 ... Dialogue management system, 150 ... Weight determination device, 155 ... Information processing unit, 200 ... Validity determination device, 201 ... Word distributed representation DB, 205 ... Information processing unit

Claims

A dialogue management server comprising:
a recording unit that records data of a document in a structured format having a plurality of pieces of unit information, each associating text data of an explanatory text, information identifying one or more layers leading to the explanatory text, and text data representing a heading of each layer;
a reception unit that receives text data of a question sentence from a user; and
an answer generation unit that matches the text data of the question sentence received by the reception unit against the text data included in each piece of unit information recorded in the recording unit, extracts the unit information related to the question sentence, and generates answer output information based on the heading and the explanatory text corresponding to the extracted unit information.

The dialogue management server according to claim 1, wherein the answer generation unit
calculates a score for each piece of unit information based on the frequency with which the words extracted by morphological analysis of the text data of the question sentence appear in that unit information, and
generates the answer output information using the calculated score.

The dialogue management server according to claim 1 or 2, wherein the answer generation unit
calculates a score for each piece of unit information based on the frequency with which the words extracted by morphological analysis of the text data of the question sentence appear in that unit information and on the ratio of the number of pieces of unit information containing the extracted words to the total number of pieces of unit information, and
generates the answer output information using the calculated score.

The dialogue management server according to any one of claims 1 to 3, further comprising a first compound word dictionary storage unit that stores a first compound word dictionary in which compound words, each formed by combining another word with a predetermined noun, are registered,
wherein, when a word extracted by morphological analysis of the text data of the question sentence is determined to be registered in the first compound word dictionary, the answer generation unit calculates a first score based on the frequency with which the extracted word appears in each piece of unit information, calculates a second score based on the frequency with which the compound word appears in each piece of unit information, and
generates the answer output information using the first score and the second score.

The dialogue management server according to any one of claims 1 to 3, further comprising a second compound word dictionary storage unit that stores a second compound word dictionary in which a predetermined noun is registered in association with a compound word formed by combining another word with that noun,
wherein, when a word extracted by morphological analysis of the text data of the question sentence is determined to be registered in the second compound word dictionary, the answer generation unit calculates a score based on the frequency of appearance, in each piece of unit information, of each of the extracted word, the other word combined with the noun, and the compound word, and generates the answer output information using the score.

The dialogue management server according to any one of claims 1 to 5, further comprising a subordinate concept word dictionary storage unit that stores a subordinate concept word dictionary in which a predetermined word is registered in association with subordinate concept words that are subordinate concepts of that word,
wherein, when a word extracted by morphological analysis of the text data of the question sentence is determined to be registered in the subordinate concept word dictionary, the answer generation unit generates answer output information including the subordinate concept words associated with that word.

The dialogue management server according to any one of claims 1 to 6, wherein the answer generation unit generates the answer output information based on the text data included in each heading of the unit information that contains words extracted by morphological analysis of the text data of the question sentence.

The dialogue management server according to any one of claims 1 to 7, wherein the answer generation unit calculates scores of the unit information related to the question sentence based on morphological analysis of the text data of the question sentence,
the dialogue management server further comprising an output control unit that adjusts the output time of the answer output information according to the number of pieces of unit information whose calculated score falls within a predetermined range.

The dialogue management server according to any one of claims 1 to 8, wherein, when unit information related to the question sentence cannot be extracted, the answer generation unit generates answer output information corresponding to a predetermined answer sentence based on the words extracted by morphological analysis of the question sentence.

The dialogue management server according to any one of claims 1 to 9, wherein the answer generation unit
calculates scores of the unit information related to the question sentence based on morphological analysis of the text data of the question sentence, and
changes the display form of a user image that represents the user and is associated with the answer output information, according to the difference between a first average score of one or more higher-scoring pieces of unit information and a second average score of one or more pieces of unit information scoring lower than the first average score.

The dialogue management server according to any one of claims 1 to 10, further comprising a compound word dictionary storage unit that stores a compound word dictionary in which compound words, each formed by combining another word with a predetermined noun, are registered,
wherein the answer generation unit changes the display form of a user image that represents the user and is associated with the answer output information, according to the number of compound words included in the text data of the question sentence.

The dialogue management server according to any one of claims 1 to 11, wherein the answer generation unit
calculates a score for each piece of unit information based on the frequency with which the words extracted by morphological analysis of the text data of the question sentence appear in that unit information, the ratio of the number of pieces of unit information containing the extracted words to the total number of pieces of unit information, and the length of the unit information, and
generates the answer output information using the calculated score.

The dialogue management server according to any one of claims 1 to 12, wherein the explanatory text includes one or more sentences, and the answer generation unit
calculates scores of the unit information related to the question sentence based on morphological analysis of the text data of the question sentence,
compares the words extracted by the morphological analysis with the words constituting the sentences included in the explanatory text and, when a first word and a second word that differ from each other are identified, corrects the score of the unit information based on the inter-word distance between the appearance position of the first word and the appearance position of the second word, and
generates the answer output information using the score.

The dialogue management server according to any one of claims 1 to 13, wherein the answer generation unit
calculates scores of the unit information related to the question sentence based on morphological analysis of the text data of the question sentence,
compares the words extracted by the morphological analysis with the words constituting the sentences included in the explanatory text and corrects the score of the unit information based on the number of matching word types within one sentence, and
generates the answer output information using the score.

The dialogue management server according to claim 14, wherein the answer generation unit corrects the score based on the number of sentences in which the number of matching word types is equal to or greater than a predetermined value.

The dialogue management server according to any one of claims 1 to 15, wherein the answer generation unit
calculates scores of the unit information related to the question sentence based on morphological analysis of the text data of the question sentence,
calculates a fixed-length vector from the question sentence using a sentence model that converts each sentence included in the explanatory text into an individual fixed-length vector,
corrects the score based on the degree of similarity between the fixed-length vector of the question sentence and the fixed-length vector of a sentence included in the explanatory text, and
generates the answer output information using the score.

The dialogue management server according to any one of claims 1 to 16, wherein there are a plurality of the layers,
the storage unit holds a weight for each layer, and
the answer generation unit
calculates scores of the unit information related to the question sentence based on morphological analysis of the text data of the question sentence and on the weights, and
generates the answer output information using the score.

A weight determination device for determining the weights in the dialogue management server according to claim 17, the device executing:
a selection step of selecting an arbitrary layer from the plurality of layers;
an extraction step of randomly extracting, from a predetermined weight candidate list, weights for the layers other than the layer selected in the selection step;
a weight calculation step of calculating the weight of the layer selected in the selection step from a predetermined weight calculation formula using the weights extracted in the extraction step;
a score rank calculation step of calculating, using a weight set including the weight calculated in the weight calculation step and the weights extracted in the extraction step, the rank of the score of the explanatory text that matches the sample answer corresponding to a sample question sentence;
a point calculation step of calculating points for the weight set based on the rank calculated in the score rank calculation step and a predetermined point calculation formula; and
a determination step of determining the weight of each layer based on the points obtained after executing the extraction step, the weight calculation step, the score rank calculation step, and the point calculation step a predetermined number of times.

The weight determination device according to claim 18, further executing:
a first layer selection step of selecting an arbitrary layer from the plurality of layers as a first layer;
a first weight selection step of selecting a weight of the first layer from an initial value list;
a first weight calculation step of calculating the weights of the layers other than the first layer from a predetermined first weight calculation formula;
a first score rank calculation step of calculating, using a first weight set including the weight of the first layer selected in the first weight selection step and the weights of the layers other than the first layer calculated in the first weight calculation step, the rank of the score of the explanatory text that matches the sample answer corresponding to a sample question sentence;
a first point calculation step of calculating points for the first weight set based on the rank calculated in the first score rank calculation step and a predetermined point calculation formula;
a deletion step of deleting the weight of the first layer from the initial value list when the points calculated in the first point calculation step are less than a predetermined value;
a first update step of updating the initial value list by selecting another weight in the first weight selection step and repeatedly executing the first weight calculation step, the first score rank calculation step, the first point calculation step, and the deletion step;
a second update step of updating the initial value list by selecting another layer in the first layer selection step and repeatedly executing the first weight selection step, the first weight calculation step, the first score rank calculation step, the first point calculation step, the deletion step, and the first update step;
a second layer selection step of selecting an arbitrary layer from the plurality of layers as a second layer;
a second weight calculation step of calculating the weight of the second layer from the total of the minimum values of the weights of the layers other than the second layer; and
a weight candidate list creation step of creating the weight candidate list from the initial value list by selecting another layer in the second layer selection step and repeatedly executing the second weight calculation step.

A validity determination device for determining the validity of the compound words registered in the first compound word dictionary according to claim 4, the device executing:
a duplicate dictionary generation step of generating a duplicate dictionary that duplicates the contents of the first compound word dictionary;
an initialization step of initializing the first compound word dictionary;
a registration step of extracting an arbitrary compound word from the compound words registered in the duplicate dictionary and registering it in the first compound word dictionary after the initialization step;
a score rank calculation step of calculating, using the first compound word dictionary after the registration step, the rank of the score of the explanatory text that matches the sample answer corresponding to a sample question sentence;
a point calculation step of calculating points for the compound word extracted in the registration step based on the rank calculated in the score rank calculation step and a predetermined point calculation formula; and
a validity determination step of determining the validity of each registered compound word by repeating the registration step, the score rank calculation step, and the point calculation step and comparing the points before and after the compound word is registered in the first compound word dictionary after the initialization step.

A validity determination device for determining the validity of the compound words registered in the second compound word dictionary according to claim 5, the device executing:
a duplicate dictionary generation step of generating a duplicate dictionary that duplicates the contents of the second compound word dictionary;
an initialization step of initializing the second compound word dictionary;
a registration step of extracting an arbitrary compound word from the compound words registered in the duplicate dictionary and registering it in the second compound word dictionary after the initialization step;
a score rank calculation step of calculating, using the second compound word dictionary after the registration step, the rank of the score of the explanatory text that matches the sample answer corresponding to a sample question sentence;
a point calculation step of calculating points for the compound word extracted in the registration step based on the rank calculated in the score rank calculation step and a predetermined point calculation formula; and
a validity determination step of determining the validity of each registered compound word by repeating the registration step, the score rank calculation step, and the point calculation step and comparing the points before and after the compound word is registered in the second compound word dictionary after the initialization step.

The dialogue management server according to any one of claims 1 to 17, further comprising a word distributed representation storage unit that stores distributed representations of arbitrary words, wherein the answer generation unit
selects a word, according to a predetermined rule, from the words extracted by morphological analysis of the text data of the question sentence,
extracts, as related words, the words that lie within a certain distance of the selected word in the distributed representation,
calculates a score based on the frequency with which the words extracted by analyzing the text data of the question sentence, and the related words, appear in each piece of unit information, and
generates the answer output information using the score.

A dialogue management method comprising:
a step of recording data of a document in a structured format having a plurality of pieces of unit information, each associating text data of an explanatory text, information identifying one or more layers leading to the explanatory text, and text data representing a heading of each layer;
a step of receiving text data of a question sentence from a user; and
a step of matching the text data of the received question sentence against the text data included in each piece of recorded unit information, extracting the unit information related to the question sentence, and generating answer output information based on the heading and the explanatory text corresponding to the extracted unit information.

A program that causes a computer to function as:
a recording unit that records data of a document in a structured format having a plurality of pieces of unit information, each associating text data of an explanatory text, information identifying one or more layers leading to the explanatory text, and text data representing a heading of each layer;
a reception unit that receives text data of a question sentence from a user; and
an answer generation unit that matches the text data of the question sentence received by the reception unit against the text data included in each piece of unit information recorded in the recording unit, extracts the unit information related to the question sentence, and generates answer output information based on the heading and the explanatory text corresponding to the extracted unit information.