JP2020177360A

JP2020177360A - Q&A extraction device, method, program, and answering system

Info

Publication number: JP2020177360A
Application number: JP2019078072A
Authority: JP
Inventors: Ryuichi Takano (高野　隆一); Tomoyuki Tatsuki (田附　朋之); Kiyoshi Watanabe (渡辺　潔)
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2019-04-16
Filing date: 2019-04-16
Publication date: 2020-10-29
Anticipated expiration: 2039-04-16
Also published as: JP7099397B2

Abstract

PROBLEM TO BE SOLVED: To provide a Q&A extraction device, method, and program, and an answering system, that reduce the load of creating training data or scenarios for automatic answering.

SOLUTION: A Q&A extraction device includes: a setting unit that sets indexes for identifying the start and the end of a question or an answer, based on a conversation between a questioner and a respondent and on web page text, in order to identify at least one of the question and the answer; and an extraction unit that extracts, from data, the text from the start to the end of at least one of the question and the answer on the basis of the indexes set by the setting unit.

SELECTED DRAWING: Figure 5

Description

The present invention relates to a Q&A extraction device, method, and program, and to a response system.

Conventionally, automatic response systems (also called chatbots), in which a computer answers a question posed by a human through voice or text, are known (Patent Document 1 and the like).

Such an automatic response system requires data on a large number of questions and answers (hereinafter also referred to as Q&A) to be collected in advance. For example, in a machine-learning-type automatic response system, training data (teacher data) is created manually from the collected question and answer data, and machine learning is performed to generate a trained model for automatic response. In a rule-based automatic response system, scenarios for automatic response are created manually from the collected question and answer data.

However, collecting a large number of questions and answers and manually creating training data or scenarios for automatic response takes considerable time and effort.

Therefore, an object of one embodiment of the present invention is to reduce the load of creating training data or scenarios for automatic response.

To solve the above problem, one embodiment of the present invention includes a setting unit that sets an index for identifying at least one of a question and an answer, and an extraction unit that extracts at least one of the question and the answer from data based on the index.

According to one embodiment of the present invention, the load of creating training data or scenarios for automatic response can be reduced.

FIG. 1 is an overall configuration diagram of a response system including a Q&A extraction device according to an embodiment of the present invention.
FIG. 2 is a hardware configuration diagram of the Q&A extraction device and a response device according to an embodiment of the present invention.
FIG. 3 is a hardware configuration diagram of a questioning device according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining an example of collecting questions and answers according to an embodiment of the present invention.
FIG. 5 is a functional block diagram of the Q&A extraction device according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining an index in conversation audio according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining an index by format in a web page according to an embodiment of the present invention.
FIG. 8 is an example of data stored in a Q&A storage unit according to an embodiment of the present invention.
FIG. 9 is a flowchart of Q&A extraction processing according to an embodiment of the present invention.
FIG. 10 is a flowchart of Q&A extraction processing according to an embodiment of the present invention.

Hereinafter, embodiments will be described with reference to the accompanying drawings. In the present specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.

<System configuration>
FIG. 1 is an overall configuration diagram of a response system 1 including a Q&A extraction device 10 (an example of an information processing device) according to an embodiment of the present invention. As shown in FIG. 1, the response system 1 includes the Q&A extraction device (hereinafter also simply referred to as the extraction device) 10, a response device 20, and a questioning device 30. The response device 20 is communicably connected to the questioning device 30 and the extraction device 10 via an arbitrary network 40. Each is described below.

Although FIG. 1 shows the extraction device 10 and the response device 20 as separate devices, they may be implemented as a single device (for example, the extraction device 10 may be installed inside an existing response device 20).

The Q&A extraction device 10 is a device that extracts at least one of question data and answer data used for the automatic response service provided by the response device 20. Specifically, the extraction device 10 can extract at least one of question data and answer data from voice data acquired by a microphone 51 (described later with reference to FIG. 4). The extraction device 10 can also extract at least one of question data and answer data from a web page. The Q&A extraction device 10 is described in detail later with reference to FIG. 5.

The response device 20 is a device that responds to a question from the questioning device 30. Specifically, the response device 20 receives a question from the questioning device 30 and transmits an answer to the questioning device 30.

The response device 20 may be configured to accept questions by voice or by text. Likewise, the response device 20 may be configured to respond by voice or by text.

The response device 20 can be configured to respond using the answer that a trained model, generated by machine learning, outputs when a question is input to it. Alternatively, the response device 20 can be configured to answer according to a predetermined scenario. That is, the response device 20 can perform machine learning using the question and answer data extracted by the extraction device 10 as training data to generate a trained model for automatic response (machine-learning type), or can generate a scenario for automatic response based on the question and answer data extracted by the extraction device 10 (rule-based type).

The questioning device 30 is a device used by a person who wants to put a question to the automatic response service provided by the response device 20. The questioning device 30 is composed of, for example, a digital signage 31, a controller 32, a microphone 33, and a speaker 34, as described with reference to FIG. 3. The questioning device 30 may instead be any computer such as a personal computer, a tablet, or a smartphone. For example, the questioning device 30 is a guidance device installed at a tourist spot, and accepts questions from visitors to that tourist spot.

<Hardware configuration>
FIG. 2 is a hardware configuration diagram of the Q&A extraction device 10 and the response device 20 according to an embodiment of the present invention. Each of the extraction device 10 and the response device 20 consists of one or more computers.

The extraction device 10 and the response device 20 each include a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, and a RAM (Random Access Memory) 13. The CPU 11, the ROM 12, and the RAM 13 form a so-called computer.

The extraction device 10 and the response device 20 each further include an auxiliary storage device 14, a display device 15, an operation device 16, an I/F (Interface) device 17, and a drive device 18. The hardware components of the extraction device 10 and the response device 20 are connected to one another via a bus 19.

The CPU 11 is an arithmetic device that executes the various programs installed in the auxiliary storage device 14.

The ROM 12 is a non-volatile memory. The ROM 12 functions as a main storage device that stores the programs, data, and the like that the CPU 11 needs in order to execute the various programs installed in the auxiliary storage device 14. Specifically, the ROM 12 stores boot programs such as a BIOS (Basic Input/Output System) or EFI (Extensible Firmware Interface).

The RAM 13 is a volatile memory such as a DRAM (Dynamic Random Access Memory) or an SRAM (Static Random Access Memory). The RAM 13 functions as a main storage device that provides the work area into which the various programs installed in the auxiliary storage device 14 are loaded when executed by the CPU 11.

The auxiliary storage device 14 stores the various programs and the information used when those programs are executed.

The display device 15 is a display device that displays the internal state and the like of the extraction device 10 and the response device 20.

The operation device 16 is an input device with which an administrator of the extraction device 10 and the response device 20 inputs various instructions to those devices.

The I/F device 17 is a communication device for connecting to the network 40 and communicating with the extraction device 10, the response device 20, and the questioning device 30.

The drive device 18 is a device into which a storage medium 21 is set. The storage medium 21 here includes media that record information optically, electrically, or magnetically, such as a CD-ROM, a flexible disk, or a magneto-optical disk. The storage medium 21 may also include semiconductor memory that records information electrically, such as an EPROM (Erasable Programmable Read Only Memory) or a flash memory.

The various programs are installed in the auxiliary storage device 14 by, for example, setting a distributed storage medium 21 in the drive device 18 and having the drive device 18 read the programs recorded on that storage medium 21. Alternatively, the programs may be installed by being downloaded, via the I/F device 17, from a network other than the network 40.

FIG. 3 is a hardware configuration diagram of the questioning device 30 according to an embodiment of the present invention. As shown in FIG. 3, the questioning device 30 can include the digital signage 31, the controller 32, the microphone 33, and the speaker 34.

The digital signage 31 is, for example, a touch-panel signage. The digital signage 31 can provide the automatic response service to a user via, for example, a web browser. Specifically, the digital signage 31 can display a screen prompting the user to speak a question into the microphone 33. The digital signage 31 can also display a screen prompting the user to enter a question using the touch panel. In addition, the digital signage 31 can display the answer transmitted from the response device 20.

The controller 32 is a device for controlling the digital signage 31.

The microphone 33 captures the voice (question) uttered by a person who wants to put a question to the automatic response service provided by the response device 20.

The speaker 34 plays back the voice data (answer) transmitted from the response device 20.

FIG. 4 is a diagram for explaining an example of collecting questions and answers according to an embodiment of the present invention. As shown in FIG. 4, a conversation between a questioner 60 (for example, a visitor to a tourist spot) and a respondent 50 (for example, a guide at the tourist spot) is recorded by the microphone 51 worn by the respondent 50. The Q&A extraction device 10 can extract at least one of question data and answer data from the voice data of the conversation acquired in this way.

<Functional blocks>
FIG. 5 is a functional block diagram of the Q&A extraction device 10 according to an embodiment of the present invention. As shown in FIG. 5, the extraction device 10 includes a setting unit 101, a voice acquisition unit 102, a web page search unit 103, an extraction unit 104, and a Q&A storage unit 105. By executing a program, the extraction device 10 functions as the setting unit 101, the voice acquisition unit 102, the web page search unit 103, and the extraction unit 104. Each is described below.

The setting unit 101 sets an index for identifying a question sentence or an answer sentence. Specifically, the setting unit 101 accepts settings entered through the operation device 16 of the extraction device 10 or from another computer or the like. The setting unit 101 then stores the accepted settings in a memory, for example inside the extraction device 10, so that the extraction unit 104 can refer to them.
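The role of the setting unit 101 described above can be pictured with a minimal sketch. The class and field names (`Delimiters`, `IndexSettings`, `SettingUnit`, `set_index`, `get_index`) and the placeholder marker strings are assumptions made for illustration; the specification does not define a data layout.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Delimiters:
    """Start and end markers for one kind of sentence."""
    start: str
    end: str


@dataclass(frozen=True)
class IndexSettings:
    """Hypothetical index: how to recognize questions and answers."""
    question: Delimiters
    answer: Delimiters


class SettingUnit:
    """Accepts index settings and keeps them for the extraction unit."""

    def __init__(self) -> None:
        self._store: Dict[str, IndexSettings] = {}

    def set_index(self, source: str, settings: IndexSettings) -> None:
        # "source" names the kind of data, e.g. "conversation" or "web_page".
        self._store[source] = settings

    def get_index(self, source: str) -> IndexSettings:
        return self._store[source]


setting_unit = SettingUnit()
setting_unit.set_index(
    "conversation",
    IndexSettings(
        question=Delimiters("QUESTION START", "QUESTION END"),
        answer=Delimiters("ANSWER START", "ANSWER END"),
    ),
)
```

Keeping the settings behind a small accessor mirrors the description that the extraction unit 104 only needs to be able to refer to whatever the setting unit 101 stored.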

The indexes for identifying a question sentence or an answer sentence are now described using three examples: <Index in conversation audio>, <Index by format in web page>, and <Index by natural language analysis in web page>.

<Index in conversation audio>
The setting unit 101 can set, as indexes for identifying a question sentence or an answer sentence, the phrases (hereinafter also referred to as keywords) that are to be spoken at the start and end of a question sentence and at the start and end of an answer sentence within a conversation between a questioner and a respondent (for example, a conversation between a guide at a tourist spot and a visitor). The <Index in conversation audio> is described in detail below with reference to FIG. 6.

FIG. 6 is a diagram for explaining an index in conversation audio according to an embodiment of the present invention. In FIG. 6, time passes from left to right. For example, suppose that the keyword for the start of a question sentence is "Yes, you have a question" (はい、ご質問ですね), the keyword for the end of a question sentence is "That is all of your question, correct?" (あなたのご質問は以上ですね), the keyword for the start of an answer is "The answer to that is" (それに対する答えは), and the keyword for the end of an answer is "That is all" (以上です). When the respondent (or the questioner) speaks these keywords, the speech uttered between the question-start keyword and the question-end keyword is identified as a question sentence. Likewise, the speech uttered between the answer-start keyword and the answer-end keyword is identified as an answer sentence.

A keyword for the end of a conversation may also be set so that the end of the conversation can be identified. Alternatively, keywords for both the start and the end of a conversation may be set so that the speech uttered between the conversation-start keyword and the conversation-end keyword can be identified as one conversation.

<Index by format in web page>
The setting unit 101 can set a predetermined format within a web page as an index for identifying a question sentence or an answer sentence. The <Index by format in web page> is described in detail below with reference to FIG. 7.

FIG. 7 is a diagram for explaining an index by format in a web page according to an embodiment of the present invention. FIG. 7 shows an FAQ (frequently asked questions and their answers) web page. When the FAQ web page is created in the predetermined format, text written in the question format is identified as a question sentence, and text written in the answer format is identified as an answer sentence. Two format examples are described below. <<Format example 1>> and <<Format example 2>> may be combined.

<<Format example 1>>
For example, the setting unit 101 can set a predetermined HTML (HyperText Markup Language) attribute (for example, the hidden attribute <hidden>) as an index for identifying a question sentence or an answer sentence. The creator of the FAQ web page can then use the hidden attribute <hidden> to mark the start and end of a question sentence and the start and end of an answer sentence. Because these start and end markers use the hidden attribute <hidden>, they are not displayed in the user's web browser, as in FIG. 7.

A hidden attribute <hidden> marking the end of a series of sentences may also be set so that the end of the series can be identified. Alternatively, hidden attributes <hidden> marking both the start and the end of a series of sentences may be set so that the text written between the start marker and the end marker can be identified as one FAQ pair.

<<Format example 2>>
For example, the setting unit 101 can set the position at which text is placed within a web page as an index for identifying a question sentence or an answer sentence. The creator of the FAQ web page can then designate, for example, text placed in the left column as a question sentence and text placed in the right column as an answer sentence, as in FIG. 7.

<Index by natural language analysis in web page>
The setting unit 101 can set a question, or part of the wording of a question, as an index for identifying an answer sentence. For example, the setting unit 101 can set as the index a question (or part of its wording) that a questioner entered into the questioning device 30, or a question (or part of its wording) specified by a system administrator or the like of the response system 1.

Returning to the description of FIG. 5, the voice acquisition unit 102 acquires voice data of a conversation between a questioner and a respondent. For example, the voice acquisition unit 102 acquires the voice data of the conversation picked up by the microphone 51 worn by the respondent (see FIG. 4). The voice acquisition unit 102 converts the acquired voice data into text to generate document data, and stores the generated document data in a memory, for example inside the extraction device 10, so that the extraction unit 104 can refer to it.

In one embodiment of the present invention, the Q&A extraction device 10 may instead be configured to process the voice data as it is, without converting it into text to generate document data (that is, to identify and extract question sentences and answer sentences directly from the voice data).

The web page search unit 103 acquires web pages (HTML). For example, the web page search unit 103 collects (crawls) information from a specified range of web pages or from all web pages. The web page search unit 103 converts the collected information into text to generate document data, and stores the generated document data in a memory, for example inside the extraction device 10, so that the extraction unit 104 can refer to it.

The extraction unit 104 extracts at least one of a question and an answer from the document data generated by the voice acquisition unit 102 and the document data generated by the web page search unit 103, based on the index set by the setting unit 101. The extraction unit 104 stores the extracted questions and answers in the Q&A storage unit 105. The description below is divided into three examples: <Extraction based on index in conversation audio>, <Extraction based on index by format in web page>, and <Extraction based on index by natural language analysis in web page>.

<Extraction based on index in conversation audio>
The extraction unit 104 searches the document data generated by the voice acquisition unit 102 for the keywords, set by the setting unit 101, that are to be spoken at the start and end of a question sentence and at the start and end of an answer sentence. The extraction unit 104 extracts the text between the question-start keyword and the question-end keyword as a question, and extracts the text between the answer-start keyword and the answer-end keyword as an answer.
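The keyword search just described can be sketched as a plain substring scan over the transcribed conversation. The function names and the English keyword strings below are assumptions for illustration (the specification gives Japanese example keywords and no implementation). Searching for each end keyword only after its start keyword matters when one keyword is contained in another, as can happen with the example keywords of FIG. 6.

```python
from typing import List, Optional, Tuple


def extract_between(
    text: str, start_kw: str, end_kw: str, pos: int = 0
) -> Tuple[Optional[str], int]:
    """Return the text between start_kw and end_kw found at or after pos,
    plus the position just past end_kw. Returns (None, pos) if either
    keyword is missing."""
    s = text.find(start_kw, pos)
    if s == -1:
        return None, pos
    s += len(start_kw)
    e = text.find(end_kw, s)  # look for the end only after the start
    if e == -1:
        return None, pos
    return text[s:e].strip(), e + len(end_kw)


def extract_qa_pairs(
    transcript: str, q_start: str, q_end: str, a_start: str, a_end: str
) -> List[Tuple[str, str]]:
    """Collect (question, answer) pairs in order of appearance."""
    pairs = []
    pos = 0
    while True:
        question, after_q = extract_between(transcript, q_start, q_end, pos)
        if question is None:
            break
        answer, after_a = extract_between(transcript, a_start, a_end, after_q)
        if answer is None:
            break
        pairs.append((question, answer))
        pos = after_a
    return pairs


transcript = (
    "Nice weather today. Yes, you have a question. "
    "Where is the nearest station? That is all of your question. "
    "The answer to that is: go straight for 200 meters. That is all. "
    "Have a nice day."
)
pairs = extract_qa_pairs(
    transcript,
    q_start="Yes, you have a question.",
    q_end="That is all of your question.",
    a_start="The answer to that is:",
    a_end="That is all.",
)
```

Speech outside the keyword pairs ("Nice weather today.", "Have a nice day.") never reaches `pairs`, which corresponds to the point that conversation not bracketed by keywords is not registered.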

Thus, with <Extraction based on index in conversation audio>, the respondent (or the questioner) can register the questions and answers in a conversation as Q&A simply by speaking the predetermined keywords. Moreover, because questions and answers are not registered unless the predetermined keywords are spoken, unnecessary conversation (for example, information that the response device 20 does not need) is not registered.

<Extraction based on index by format in web page>
The extraction unit 104 extracts, from the document data generated by the web page search unit 103, text written in the format set by the setting unit 101. The two format examples described above are explained separately below.

<<Extraction from a web page of format example 1>>
For example, the extraction unit 104 searches the document data generated by the web page search unit 103 for the predetermined HTML attribute (for example, the hidden attribute <hidden>) set by the setting unit 101. The extraction unit 104 locates the start and end of a question sentence and the start and end of an answer sentence that were marked using that attribute. The extraction unit 104 then extracts the text between the hidden attribute <hidden> marking the start of the question sentence and the one marking its end as a question, and extracts the text between the hidden attribute <hidden> marking the start of the answer sentence and the one marking its end as an answer.
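One way to picture extraction for format example 1 is with Python's standard `html.parser`. The specification does not say how a hidden marker distinguishes question from answer or start from end, so the `data-qa` attribute and its values (`q-start`, `q-end`, `a-start`, `a-end`) are purely hypothetical, as is the class name. The sketch also assumes the document data still contains the HTML markup.

```python
from html.parser import HTMLParser


class HiddenMarkerParser(HTMLParser):
    """Collects text between hidden start/end marker elements.

    Assumed markup: <span hidden data-qa="q-start"></span> ... and so on.
    """

    def __init__(self):
        super().__init__()
        self.current = None   # "q" or "a" while inside a marked region
        self.buffer = []      # text fragments of the current region
        self.questions = []
        self.answers = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)   # a bare attribute such as hidden maps to None
        if "hidden" not in attrs:
            return
        marker = attrs.get("data-qa")
        if marker in ("q-start", "a-start"):
            self.current = marker[0]
            self.buffer = []
        elif marker in ("q-end", "a-end") and self.current == marker[0]:
            text = " ".join("".join(self.buffer).split())
            target = self.questions if self.current == "q" else self.answers
            target.append(text)
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)


page = """
<div>
  <span hidden data-qa="q-start"></span>What are the opening hours?<span hidden data-qa="q-end"></span>
  <span hidden data-qa="a-start"></span>We are open from <b>9:00</b> to 17:00.<span hidden data-qa="a-end"></span>
</div>
"""
parser = HiddenMarkerParser()
parser.feed(page)
parser.close()
```

After parsing, `parser.questions` and `parser.answers` hold the marked text with inline tags such as `<b>` flattened away, while text outside any marked region is ignored.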

<<Extraction from a web page of format example 2>>
For example, the extraction unit 104 searches the document data generated by the web page search unit 103 for text placed at the positions set by the setting unit 101. The extraction unit 104 extracts, as a question, the text at the position that the setting unit 101 designated for question sentences, and extracts, as an answer, the text at the position that the setting unit 101 designated for answer sentences.
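For the left-column/right-column layout of FIG. 7, position-based extraction might look like the following sketch. It assumes, hypothetically, that the FAQ is an HTML table in which each row holds the question in one `<td>` cell and the answer in another; the class name and the `question_col`/`answer_col` parameters are illustrative and stand in for the positions set by the setting unit 101.

```python
from html.parser import HTMLParser


class TwoColumnFaqParser(HTMLParser):
    """Pairs the text of two table columns, row by row."""

    def __init__(self, question_col=0, answer_col=1):
        super().__init__()
        self.question_col = question_col
        self.answer_col = answer_col
        self.row = None    # cell texts of the row being read
        self.cell = None   # text fragments of the cell being read
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag == "td" and self.row is not None:
            self.cell = []

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self.cell is not None:
            self.row.append(" ".join("".join(self.cell).split()))
            self.cell = None
        elif tag == "tr" and self.row is not None:
            # Rows without enough <td> cells (e.g. header rows) are skipped.
            if len(self.row) > max(self.question_col, self.answer_col):
                self.pairs.append(
                    (self.row[self.question_col], self.row[self.answer_col])
                )
            self.row = None


faq_page = """
<table>
<tr><th>Question</th><th>Answer</th></tr>
<tr><td>Is parking available?</td><td>Yes, there are 20 free spaces.</td></tr>
<tr><td>Is there an admission fee?</td><td>No, admission is free.</td></tr>
</table>
"""
faq_parser = TwoColumnFaqParser(question_col=0, answer_col=1)
faq_parser.feed(faq_page)
faq_parser.close()
```

A page laid out with CSS columns rather than a table would need a different notion of "position"; the table case is chosen here only because it can be handled with the standard library.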

Thus, with <Extraction based on index by format in web page>, the creator of a web page can register the questions and answers in an FAQ as Q&A simply by writing the FAQ in the predetermined format. In addition, when the FAQ in the web page is updated, the questions and answers in the updated FAQ are automatically registered as Q&A.

<Extraction based on an index by natural language analysis in a web page>
The extraction unit 104 performs natural language analysis on the document data generated by the web page search unit 103 and extracts an answer to the question, or to part of the wording of the question, set by the setting unit 101.

Here, examples of the timing at which <extraction based on an index by natural language analysis in a web page> is performed will be described.

<Timing 1>
Assume that the response device 20 is operating using questions and answers collected in advance by <extraction based on an index in the voice of a conversation> or <extraction based on an index by format in a web page>. The extraction unit 104 can be configured so that, when the response device 20 cannot find an answer to a question from the question device 30, the extraction unit 104 finds the answer by <extraction based on an index by natural language analysis in a web page>.
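The fallback behavior of Timing 1 can be sketched as below. The text does not specify the natural language analysis, so `naive_overlap_extract` is only a stand-in that picks the sentence sharing the most words with the question. The exact-match lookup in `qa_store` and all function names are likewise assumptions made for illustration.

```python
import re


def naive_overlap_extract(question, documents):
    """Stand-in for natural language analysis (an assumption, not the
    method of the embodiment): return the sentence with the largest
    word overlap with the question, or None if nothing overlaps."""
    q_words = set(re.findall(r"\w+", question.lower()))
    best, best_score = None, 0
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
            score = len(q_words & set(re.findall(r"\w+", sentence.lower())))
            if score > best_score:
                best, best_score = sentence, score
    return best


def answer_with_fallback(question, qa_store, documents):
    """Answer from the stored Q&A first; fall back to web page text."""
    stored = qa_store.get(question)
    if stored is not None:
        return stored, "stored"
    found = naive_overlap_extract(question, documents)
    if found is not None:
        # Store the question that served as the index together with the answer.
        qa_store[question] = found
        return found, "web"
    return None, "none"
```

Storing the newly found pair mirrors the description of FIG. 8, where the question that served as the index is stored in association with the extracted answer.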

<Timing 2>
The extraction unit 104 can be configured to collect answers by <extraction based on an index by natural language analysis in a web page> in the same manner as by <extraction based on an index in the voice of a conversation> and <extraction based on an index by format in a web page>. The response device 20 can then operate using the questions and answers collected by <extraction based on an index in the voice of a conversation>, <extraction based on an index by format in a web page>, and <extraction based on an index by natural language analysis in a web page>.

The Q&A storage unit 105 stores the questions and answers extracted by the extraction unit 104. The data stored in the Q&A storage unit 105 is described in detail below with reference to FIG. 8.

FIG. 8 shows an example of the data stored in the Q&A storage unit 105 according to an embodiment of the present invention. As shown in FIG. 8, the Q&A storage unit 105 stores question data and answer data in association with each other. For <extraction based on an index in the voice of a conversation> and <extraction based on an index by format in a web page> described above, the question and the answer extracted by the extraction unit 104 are stored in association with each other. For <extraction based on an index by natural language analysis in a web page> described above, the question that served as the index and the answer extracted by the extraction unit 104 are stored in association with each other.

As shown in FIG. 8, the question data may include not only the question sentence but also search keywords contained in the question sentence (keywords expected to be specified at the question device 30). Likewise, the answer data may include not only the answer sentence but also the voice and action of the character and the URL to transition to when the response device 20 responds.
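One possible shape for the records described for FIG. 8 is sketched below. The field and class names (`QuestionData`, `AnswerData`, `QARecord`, `QAStore`, `find_by_keyword`) are hypothetical; the text only states which kinds of information may be stored, not how.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QuestionData:
    text: str                                          # question sentence
    keywords: List[str] = field(default_factory=list)  # search keywords in the question


@dataclass
class AnswerData:
    text: str                     # answer sentence
    voice: Optional[str] = None   # character voice used when responding
    action: Optional[str] = None  # character action used when responding
    url: Optional[str] = None     # URL to transition to


@dataclass
class QARecord:
    question: QuestionData
    answer: AnswerData


class QAStore:
    """In-memory stand-in for the Q&A storage unit 105."""

    def __init__(self):
        self._records: List[QARecord] = []

    def add(self, record: QARecord) -> None:
        self._records.append(record)

    def find_by_keyword(self, keyword: str) -> List[QARecord]:
        # Return every record whose question lists the given search keyword.
        return [r for r in self._records if keyword in r.question.keywords]
```

Keeping the question and answer in one record is what preserves the association between them that the storage unit is described as maintaining.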

FIG. 9 is a flowchart of the Q&A extraction processing <extraction based on an index in the voice of a conversation> according to an embodiment of the present invention.

In step 11 (S11), the voice acquisition unit 102 acquires voice data of the conversation between the questioner and the respondent.

In step 12 (S12), the voice acquisition unit 102 analyzes the voice data acquired in S11. Specifically, the voice acquisition unit 102 converts the voice data acquired in S11 into text to generate document data. The voice acquisition unit 102 then stores the generated document data in a memory, for example within the extraction device 10, so that the extraction unit 104 can refer to it.

In step 13 (S13), the extraction unit 104 searches the document data generated in S12, in chronological order, for the end-of-conversation keyword set by the setting unit 101. If the end-of-conversation keyword is detected, the processing ends; otherwise, the processing proceeds to step 14.

In step 14 (S14), the extraction unit 104 searches the document data generated in S12, in chronological order, for the keyword set by the setting unit 101 that marks the start of a question sentence. If the keyword is detected, the processing proceeds to step 15; otherwise, the processing proceeds to step 16.

In step 15 (S15), the extraction unit 104 searches the document data generated in S12, in chronological order, for the keyword set by the setting unit 101 that marks the end of the question sentence. The extraction unit 104 then extracts the text between the start keyword and the end keyword of the question sentence as a question and stores it in the Q&A storage unit 105.

In step 16 (S16), the extraction unit 104 searches the document data generated in S12, in chronological order, for the keyword set by the setting unit 101 that marks the start of an answer sentence. If the keyword is detected, the processing proceeds to step 17; otherwise, the processing returns to step 11.

In step 17 (S17), the extraction unit 104 searches the document data generated in S12, in chronological order, for the keyword set by the setting unit 101 that marks the end of the answer sentence. The extraction unit 104 then extracts the text between the start keyword and the end keyword of the answer sentence as an answer and stores it in the Q&A storage unit 105.
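The keyword search of S13 to S17 can be sketched as a single chronological pass over an already transcribed conversation. This is a simplification made for illustration: it does not reproduce the loop back to S11, and the keyword phrases in the usage note are invented examples. `KeywordIndex` and `extract_from_transcript` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class KeywordIndex:
    """Keywords set by the setting unit (the values are chosen by the operator)."""
    question_start: str
    question_end: str
    answer_start: str
    answer_end: str
    conversation_end: str


def extract_from_transcript(text, idx):
    """Scan the transcript in order and return (questions, answers)."""
    questions, answers = [], []
    pos = 0
    while True:
        # Find the earliest upcoming keyword of each kind from the cursor.
        candidates = []
        for kind, kw in (("end", idx.conversation_end),
                         ("q", idx.question_start),
                         ("a", idx.answer_start)):
            i = text.find(kw, pos)
            if i != -1:
                candidates.append((i, kind, kw))
        if not candidates:
            break
        i, kind, kw = min(candidates)
        if kind == "end":            # S13: end of conversation reached
            break
        end_kw = idx.question_end if kind == "q" else idx.answer_end
        body_start = i + len(kw)
        j = text.find(end_kw, body_start)
        if j == -1:                  # start keyword without a matching end
            break
        body = text[body_start:j].strip()
        (questions if kind == "q" else answers).append(body)  # S15 / S17
        pos = j + len(end_kw)
    return questions, answers
```

For example, with `question_start="Your question is:"` and `question_end="Is that correct?"`, the text spoken between those two phrases is collected as a question. Anything said after the end-of-conversation keyword is ignored.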

FIG. 10 is a flowchart of the Q&A extraction processing <extraction based on an index by format in a web page> according to an embodiment of the present invention.

In step 21 (S21), the web page search unit 103 acquires a web page (HTML). Specifically, the web page search unit 103 collects (crawls) information from a specified range of web pages or from all web pages.

In step 22 (S22), the web page search unit 103 analyzes the web page acquired in S21. Specifically, the web page search unit 103 converts the collected information into text to generate document data. The web page search unit 103 then stores the generated document data in a memory, for example within the extraction device 10, so that the extraction unit 104 can refer to it.

In step 23 (S23), the extraction unit 104 searches the document data generated in S22, in order from the beginning, for the hidden attribute <hidden> set by the setting unit 101 that marks the end of a series of sentences. If this hidden attribute <hidden> is detected, the processing ends; otherwise, the processing proceeds to step 24.

In step 24 (S24), the extraction unit 104 searches the document data generated in S22, in order from the beginning, for the hidden attribute <hidden> set by the setting unit 101 that marks the start of a question sentence. If this hidden attribute <hidden> is detected, the processing proceeds to step 25; otherwise, the processing proceeds to step 26.

In step 25 (S25), the extraction unit 104 searches the document data generated in S22, in order from the beginning, for the hidden attribute <hidden> set by the setting unit 101 that marks the end of the question sentence. The extraction unit 104 then extracts the text between the hidden attribute <hidden> at the start of the question sentence and the hidden attribute <hidden> at the end of the question sentence as a question and stores it in the Q&A storage unit 105.

In step 26 (S26), the extraction unit 104 searches the document data generated in S22, in order from the beginning, for the hidden attribute <hidden> set by the setting unit 101 that marks the start of an answer sentence. If this hidden attribute <hidden> is detected, the processing proceeds to step 27; otherwise, the processing returns to step 21.

In step 27 (S27), the extraction unit 104 searches the document data generated in S22, in order from the beginning, for the hidden attribute <hidden> set by the setting unit 101 that marks the end of the answer sentence. The extraction unit 104 then extracts the text between the hidden attribute <hidden> at the start of the answer sentence and the hidden attribute <hidden> at the end of the answer sentence as an answer and stores it in the Q&A storage unit 105.
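The marker search of S23 to S27 can be sketched as one pass over the HTML. The concrete markup behind "hidden attribute <hidden>" is not given in this passage, so the sketch assumes each marker is an element carrying the HTML `hidden` attribute whose text is a label. The labels `Q_START`, `Q_END`, `A_START`, `A_END`, and `DOC_END`, and the class name `HiddenMarkerExtractor`, are hypothetical. As a further simplification, the loop back to S21 is not reproduced.

```python
from html.parser import HTMLParser


class HiddenMarkerExtractor(HTMLParser):
    """Collects the text between start and end markers (assumed markup)."""

    START = {"Q_START": "question", "A_START": "answer"}
    END = {"question": "Q_END", "answer": "A_END"}

    def __init__(self):
        super().__init__()
        self.in_marker = False   # inside an element that has the hidden attribute
        self.capturing = None    # "question", "answer", or None
        self.buffer = []
        self.questions = []
        self.answers = []
        self.finished = False    # set once the end-of-series marker is seen

    def handle_starttag(self, tag, attrs):
        if any(name == "hidden" for name, _ in attrs):
            self.in_marker = True

    def handle_endtag(self, tag):
        # Markers are assumed to contain no nested elements.
        self.in_marker = False

    def handle_data(self, data):
        if self.finished:
            return
        if self.in_marker:
            self._on_marker(data.strip())
        elif self.capturing:
            self.buffer.append(data)

    def _on_marker(self, label):
        if label == "DOC_END":                       # S23
            self.finished = True
        elif label in self.START:                    # S24 / S26
            self.capturing = self.START[label]
            self.buffer = []
        elif self.capturing and label == self.END[self.capturing]:  # S25 / S27
            text = " ".join("".join(self.buffer).split())
            target = self.questions if self.capturing == "question" else self.answers
            target.append(text)
            self.capturing = None
```

Because the markers are hidden, they would not change how the FAQ is displayed to a visitor, which fits the idea that the page author only needs to follow the agreed format.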

Each function of the embodiments described above can be implemented by one or more processing circuits. Here, a "processing circuit" in this specification includes a processor programmed to execute each function by software, such as a processor implemented by an electronic circuit, as well as devices designed to execute each function described above, such as an ASIC (application specific integrated circuit), a DSP (digital signal processor), an FPGA (field programmable gate array), and conventional circuit modules.

The present invention is not limited to the configurations shown here, including the configurations described in the above embodiments and their combinations with other elements. These points can be changed without departing from the spirit of the present invention and can be determined appropriately according to the form of application.

1 Response system
10 Q&A extraction device
20 Response device
30 Question device
40 Network
31 Digital signage
32 Controller
33 Microphone
34 Speaker
50 Respondent
51 Microphone
60 Questioner
101 Setting unit
102 Voice acquisition unit
103 Web page search unit
104 Extraction unit
105 Q&A storage unit

Japanese Unexamined Patent Publication No. 2001-256036

Claims

An information processing device comprising:
a setting unit that sets an index for identifying at least one of a question and an answer; and
an extraction unit that extracts at least one of the question and the answer from data based on the index.

The information processing device according to claim 1, further comprising a voice acquisition unit that acquires voice data of a conversation between a questioner and a respondent and converts the voice data into text to generate the data,
wherein the index is a keyword uttered at the beginning and end of the question and at the beginning and end of the answer.

The information processing device according to claim 1, further comprising a web page search unit that collects information from a specified range of web pages or from all web pages and converts the information into text to generate the data,
wherein the index is a predetermined format in the web page.

The information processing device according to claim 3, wherein the predetermined format is a hidden attribute, and
the text between the hidden attribute at the beginning of the question and the hidden attribute at the end of the question is extracted as the question, and the text between the hidden attribute at the beginning of the answer and the hidden attribute at the end of the answer is extracted as the answer.

The information processing device according to claim 3, wherein the predetermined format is a position in the web page at which text is arranged, and
the text at the position where the question should be arranged is extracted as the question, and the text at the position where the answer should be arranged is extracted as the answer.

The information processing device according to claim 1, further comprising a web page search unit that collects information from a specified range of web pages or from all web pages and converts the information into text to generate the data,
wherein the index is a question or part of the wording of the question, and
the extraction unit performs natural language analysis on the data and extracts an answer to the question or to the part of the wording of the question.

The information processing device according to claim 2 or 3, wherein the extraction unit
acquires a question or part of the wording of the question, and
performs natural language analysis on a specified range of web pages or on all web pages and extracts an answer to the question or to the part of the wording of the question.

A method executed by a computer, the method comprising:
a step of setting an index for identifying at least one of a question and an answer; and
a step of extracting at least one of the question and the answer from data based on the index.

A program for causing a computer to function as:
a setting unit that sets an index for identifying at least one of a question and an answer; and
an extraction unit that extracts at least one of the question and the answer from data based on the index.

A response system comprising an information processing device, a response device, and a question device, wherein
the information processing device comprises:
a setting unit that sets an index for identifying at least one of a question and an answer; and
an extraction unit that extracts at least one of the question and the answer from data based on the index,
the response device transmits, to the question device, an answer to a question received from the question device, based on the question and the answer extracted by the extraction unit, and
the question device transmits the question to the response device and receives the answer from the response device.