JP2015102957A

JP2015102957A - Information search system, information search device, information search method, and program

Info

Publication number: JP2015102957A
Application number: JP2013242046A
Authority: JP
Inventors: Atsushi Fujimoto (富士本 淳); Masayuki Nonaka (野中 誠之); Yutaka Katsukura (勝倉 裕)
Original assignee: Universal Entertainment Corp
Current assignee: Universal Entertainment Corp
Priority date: 2013-11-22
Filing date: 2013-11-22
Publication date: 2015-06-04
Anticipated expiration: 2033-11-22
Also published as: JP5911839B2

Abstract

PROBLEM TO BE SOLVED: To provide an information search system that can provide a user with keywords (character strings) through which the user can obtain the latest, previously unknown topics. SOLUTION: An information search system 100 includes sentence information acquisition means for acquiring pieces of sentence information from text data collected by a search based on a keyword, character string selection means for selecting one or more character strings satisfying a predetermined condition from each piece of sentence information and storing the selected character strings in character string storage means for each corresponding piece of sentence information, and information output means for outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.

Description

The present invention relates to an information search system that provides a user with an information search function.

Some information search systems store, as event words, search words whose use in internet searches and the like increased sharply during a predetermined period, and words whose appearance in news and the like increased sharply during a predetermined period. Such a system further stores, as event-word-related terms, words entered together with an event word in internet searches and the like (for example, words used in AND searches) and words obtained by analyzing in advance, on the basis of appearance frequency and the like, news containing the event word. When one of the event words is entered as a search word for an internet search or the like, the system displays a list of these event-word-related terms (see Patent Document 1).

Patent Document 1: JP 2007-34466 A

However, such an information search system is expected to have the following problems. When the event-word-related terms are obtained from AND searches performed together with the event word in internet searches, the terms listed in response to the user's entry of an event word originate from the people who entered them. Such terms are therefore often biased information already known to internet search users, and the association between an event word and an event-word-related term is merely the association that the people who entered them had in mind. It is therefore difficult to obtain new, previously unknown topics from event-word-related terms obtained in this way.

Likewise, when the event-word-related terms are obtained by analyzing in advance, on the basis of appearance frequency and the like, news containing the event word, the terms reflect an analysis of past news, and it is difficult to obtain the latest topics from them.

An object of the present invention is to provide an information search system capable of solving the problems described above.

An embodiment of the present invention is an information search system (for example, the information search system 100 shown in FIG. 20) comprising:
sentence information acquisition means (for example, the input information analysis unit 41 shown in FIG. 20) for acquiring sentence information relating to a keyword (for example, a question sentence or the like that is part of the text data hit by the keyword search) from text data collected by a search based on the keyword (for example, the external log 502 (text data) acquired from the collected web pages shown in FIG. 20);
character string selection means (for example, the sentence analysis unit 43 shown in FIG. 20) for selecting, from each piece of the sentence information, one or more character strings satisfying a predetermined condition (for example, related terms, which are character strings whose meaning can be identified) and storing the character strings in character string storage means (for example, the related term dictionary 50 shown in FIG. 20) for each corresponding piece of sentence information; and
information output means (for example, the input information analysis unit 41 shown in FIG. 20) for outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information (for example, input specifying information including the related term / co-occurrence word data 52 shown in FIG. 20, used to display to the user a related term / co-occurrence word list display screen 650 as shown in FIG. 37).

With this configuration, the selected character strings are obtained as information originating from the senders of the information, so that new, previously unknown topics can also be obtained. In addition, because the selected character strings (related terms) are the latest information obtained from a search based on the keyword, the latest information can be obtained.
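By way of illustration only, the three means described above (sentence information acquisition, character string selection, and information output) can be pictured as a small pipeline. The following Python sketch is not the implementation of the embodiment: the function names, the punctuation-based sentence splitting, and the length-based "predetermined condition" are all assumptions introduced for this example.

```python
import re


def acquire_sentences(text_data, keyword):
    """Sentence information acquisition: keep the sentences that mention the keyword."""
    # Splitting on sentence-final punctuation is an assumption made for this sketch.
    pieces = re.findall(r"[^.?!。？！]+[.?!。？！]?", text_data)
    sentences = [p.strip() for p in pieces if p.strip()]
    return [s for s in sentences if keyword in s]


def select_strings(sentence, keyword, min_len=5):
    """Character string selection under a placeholder 'predetermined condition'.

    Here the condition is simply 'a word of at least min_len characters that
    is not the keyword itself'; the specification does not define it this way.
    """
    selected = []
    for word in re.findall(r"\w+", sentence):
        if len(word) >= min_len and word.lower() != keyword.lower() and word not in selected:
            selected.append(word)
    return selected


def build_output(text_data, keyword):
    """Store the selected strings per sentence and return them for display."""
    store = {}  # stands in for the character string storage means
    for sentence in acquire_sentences(text_data, keyword):
        store[sentence] = select_strings(sentence, keyword)
    return store


if __name__ == "__main__":
    log = ("How do I reset the camera password? The weather is nice. "
           "Does the camera support wireless charging?")
    for sentence, strings in build_output(log, "camera").items():
        print(sentence, "->", strings)
```

The point of the sketch is the shape of the data: the selected strings are kept per sentence, which is what allows the output to present them grouped by the sentence they came from.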

A further feature of the embodiment of the present invention is that
the character string selection means
is configured to select the character strings without collation against character string data stored in advance (for example, a dictionary containing morpheme data or the like).

With this configuration, the effort of creating and maintaining in advance a dictionary containing morpheme data or the like becomes unnecessary, and the information search system can be constructed easily.

A further feature of the embodiment of the present invention is that
the character string selection means further comprises:
character string search means (for example, the character string search processing unit 43b shown in FIG. 21) for searching the text data for occurrences of the same character string;
degree-of-difference determination means (for example, the degree-of-difference determination processing unit 43c shown in FIG. 21) for determining, for the same character string, the degree of difference of the preceding adjacent character (for example, an index indicating how much the characters appearing immediately before the retrieved "same character string" differ, that is, how much variation they show, based on the number of patterns of characters appearing as the preceding adjacent character) and the degree of difference of the following adjacent character (for example, an index indicating how much the characters appearing immediately after the retrieved "same character string" differ, that is, how much variation they show, based on the number of patterns of characters appearing as the following adjacent character); and
specific character string determination means (for example, the related term determination processing unit 43d shown in FIG. 21) for determining whether the same character string is a specific character string (for example, a related term) on the basis of the degree of difference of the preceding adjacent character and the degree of difference of the following adjacent character,
and that the character string selection means is configured to select the character strings from among the determined specific character strings.

With this configuration, related terms are determined according to the degree of difference of the adjacent characters, so there is no need to perform comparison processing against a dictionary containing morpheme data or the like one entry at a time. Processing can therefore be sped up, and display processing of related terms and the like can be performed in real time.
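The determination based on adjacent characters can be illustrated with a short Python sketch. It is offered only as one possible reading under stated assumptions: the function names and the threshold are hypothetical, the degree of difference is taken to be the plain count of distinct neighboring characters, and occurrences are located with `str.find` rather than with the suffix array and binary search mentioned in the description of the drawings.

```python
def adjacent_variety(text, candidate):
    """Count distinct characters seen immediately before and after each occurrence.

    Returns (number of distinct preceding characters,
             number of distinct following characters).
    """
    before, after = set(), set()
    start = text.find(candidate)
    while start != -1:
        end = start + len(candidate)
        # Assumption: the start or end of the text counts as one more kind of
        # neighbor, represented here by None.
        before.add(text[start - 1] if start > 0 else None)
        after.add(text[end] if end < len(text) else None)
        start = text.find(candidate, start + 1)
    return len(before), len(after)


def is_related_term(text, candidate, threshold=2):
    """Treat the candidate as a related term when both neighbors vary enough.

    The threshold value is illustrative; the specification does not give one.
    """
    n_before, n_after = adjacent_variety(text, candidate)
    return n_before >= threshold and n_after >= threshold


if __name__ == "__main__":
    # Unspaced text, standing in for a language written without word breaks.
    sample = "redcamerabag/newcamerasale/oldcameracase"
    for cand in ("camera", "amera", "camer"):
        print(cand, adjacent_variety(sample, cand), is_related_term(sample, cand))
```

In the sample, "camera" is preceded and followed by varying characters, whereas "amera" is always preceded by "c" and "camer" is always followed by "a". A fragment cut inside a word shows no variation on the cut side, which is why no pre-built morpheme dictionary is needed to reject it.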

A further feature of the embodiment of the present invention is that the information search system further comprises:
scenario data storage means (for example, the scenario data 28 and the scenario data 55 shown in FIG. 20) for storing scenario data (for example, data consisting of statements as shown in FIG. 14) for defining response information relating to topics;
response information determination means (for example, the response information determination unit 25 shown in FIG. 20) for determining, on the basis of the scenario data, the response information including the selected character strings; and
response information output means (for example, the output control unit 26 shown in FIG. 20) for outputting the response information determined by the response information determination means.

With this configuration, the manner in which the selected character strings (related terms) are displayed to the user can be diversified, and editing of the display can be carried out and managed easily.

A further feature of the embodiment of the present invention is that
the information search system further comprises dictionary comparison means (for example, the dictionary comparison processing unit 46c shown in FIG. 22),
the character string selection means, when storing the character strings in the character string storage means, stores them in the corresponding dictionary (for example, the related term dictionary 50 shown in FIG. 22) according to the collection condition of the text data,
the dictionary comparison means performs comparison processing for comparing a plurality of the dictionaries and stores the comparison result in comparison result storage means (for example, the comparison result data 54 shown in FIG. 22),
the response information determination means determines, on the basis of the scenario data, the response information including the comparison result,
the response information output means outputs the response information determined by the response information determination means, and
the dictionary comparison means is further configured, when at least one of the plurality of dictionaries is updated, to perform the comparison processing and automatically update the comparison result stored in the comparison result storage means.

With this configuration, the selected character strings (related terms) can be updated automatically, and changes in how the related terms appear can be grasped from the comparison result at each update timing and displayed, which diversifies the manner in which the related terms are displayed to the user.
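As a rough illustration of the dictionary comparison, the following Python sketch keeps one related term dictionary per collection condition and recomputes the stored comparison results whenever a dictionary is updated. The class and function names, the use of collection dates as the collection condition, and the three result categories are assumptions made for this example and are not taken from the specification.

```python
def compare_dictionaries(terms_a, terms_b):
    """Compare two related term dictionaries (given here as lists of terms)."""
    set_a, set_b = set(terms_a), set(terms_b)
    return {
        "appeared": sorted(set_b - set_a),      # only in the second dictionary
        "disappeared": sorted(set_a - set_b),   # only in the first dictionary
        "retained": sorted(set_a & set_b),      # present in both
    }


class DictionaryStore:
    """Holds one dictionary per collection condition plus comparison results."""

    def __init__(self):
        self.dictionaries = {}  # collection condition -> list of related terms
        self.comparison = {}    # (condition_a, condition_b) -> comparison result

    def update(self, condition, terms):
        """Store or replace a dictionary, then refresh every affected comparison."""
        self.dictionaries[condition] = list(terms)
        for other in self.dictionaries:
            if other == condition:
                continue
            # Keys are ordered so that each pair is stored exactly once.
            a, b = sorted((other, condition))
            self.comparison[(a, b)] = compare_dictionaries(
                self.dictionaries[a], self.dictionaries[b]
            )


if __name__ == "__main__":
    store = DictionaryStore()
    store.update("2013-11-01", ["battery", "lens"])
    store.update("2013-11-08", ["lens", "firmware"])
    print(store.comparison[("2013-11-01", "2013-11-08")])
```

Because the comparison is refreshed inside `update`, the stored result never lags behind the dictionaries, which corresponds to the automatic update described above.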

A further feature of the embodiment of the present invention is that
the information output means
is configured, when one of the character strings corresponding to one piece of the sentence information is the same as one of the character strings corresponding to another, different piece of sentence information, to output information for displaying the set of character strings corresponding to the one piece of sentence information in association with the set of character strings corresponding to the other piece of sentence information.
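One way to picture this association is as a search for pairs of sentences whose sets of character strings overlap. The Python sketch below is illustrative only; the function name and the dictionary-of-lists input format are assumptions, not part of the specification.

```python
from itertools import combinations


def link_sets(strings_by_sentence):
    """Return (sentence_a, sentence_b, shared strings) for every overlapping pair.

    strings_by_sentence maps a piece of sentence information to the character
    strings selected from it (a hypothetical in-memory representation).
    """
    links = []
    for (s1, strs1), (s2, strs2) in combinations(strings_by_sentence.items(), 2):
        shared = sorted(set(strs1) & set(strs2))
        if shared:
            links.append((s1, s2, shared))
    return links


if __name__ == "__main__":
    data = {
        "Q1": ["lens", "battery"],
        "Q2": ["battery", "charger"],
        "Q3": ["tripod"],
    }
    print(link_sets(data))
```

A display layer could use each returned pair to draw the two sets side by side, joined at the shared string.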

A further feature of the embodiment of the present invention is that
the information output means
outputs information for displaying all of the sets of character strings corresponding to one or more predetermined pieces of the sentence information, and
the display order of the character strings is determined according to the manner in which the user uses the character strings.
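The specification does not fix how the manner of use is measured. As one hedged possibility, the Python sketch below orders the character strings by how often the user has selected each one; the function name and the selection log are hypothetical.

```python
from collections import Counter


def order_by_usage(strings, usage_log):
    """Order strings by how often each appears in the user's usage log.

    usage_log is a hypothetical list of strings the user selected in the past.
    Strings with equal counts keep their original relative order because
    sorted() is stable.
    """
    counts = Counter(usage_log)
    return sorted(strings, key=lambda s: -counts[s])


if __name__ == "__main__":
    print(order_by_usage(["lens", "battery", "charger"],
                         ["battery", "charger", "battery"]))
```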

A further embodiment of the present invention is an information search device (for example, the conversation control terminal device 2'' shown in FIG. 20) comprising:
keyword input means (for example, the input control unit 21 shown in FIG. 20) for inputting a keyword serving as a collection condition for text data; and
information output means (for example, the output control unit 26 shown in FIG. 20) for outputting, when one or more character strings satisfying a predetermined condition have been selected from each piece of sentence information relating to the keyword acquired from the text data collected on the basis of the keyword, information for displaying the selected character strings to a user for each corresponding piece of sentence information.

A further embodiment of the present invention is an information search method comprising:
a sentence information acquisition step of acquiring sentence information relating to a keyword from text data collected by a search based on the keyword;
a character string selection step of selecting, from each piece of the sentence information, one or more character strings satisfying a predetermined condition and storing the character strings in character string storage means for each corresponding piece of sentence information; and
an information output step of outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.

A further embodiment of the present invention is a program for causing
a computer to function as:
sentence information acquisition means for acquiring sentence information relating to a keyword from text data collected by a search based on the keyword;
character string selection means for selecting, from each piece of the sentence information, one or more character strings satisfying a predetermined condition and storing the character strings in character string storage means for each corresponding piece of sentence information; and
information output means for outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.

An effect of the embodiment of the present invention is that a user of the information search system can be provided with keywords (character strings) through which the latest, previously unknown topics can be obtained.

FIG. 1 is a diagram showing an overview of the topic providing system.
FIG. 2 is a diagram showing an overview of the conversation control terminal device.
FIG. 3 is a diagram showing an overview of the maintenance device.
FIG. 4 is a block diagram showing an outline of the system configuration of the topic providing system 1.
FIGS. 5 to 12 are flowcharts showing specific processing of scenario data statements.
FIG. 13 is a diagram showing an example of output to the output unit 220.
FIG. 14 is a diagram showing a specific example of scenario data statements.
FIG. 15 is a diagram showing the process of generating response information on the basis of topic analysis and outputting the response information to the output unit.
FIG. 16 is a diagram showing an example of the structure of the topic introduction list.
FIG. 17 is a diagram showing the process of extracting topics, generating the related term dictionary, and generating the preference dictionary.
FIG. 18 is a diagram showing the flow of processing from when a topic is entered on the topic setting screen of the maintenance device 3 until a topic introduction list is generated and output on the basis of the related term dictionary and the topic list.
FIG. 19 is a diagram for explaining the technical idea of the information search system.
FIG. 20 is a diagram showing an overview of the information search system.
FIG. 21 is a diagram showing an overview of the sentence analysis unit of the information search system.
FIG. 22 is a diagram showing an overview of the information update unit of the information search system.
FIG. 23 is a diagram showing the screen transitions of the FAQ search system.
FIG. 24 is a flowchart showing the display processing of the FAQ candidate display screen.
FIG. 25 is a diagram showing an overview of the sentence analysis processing of the information search system.
FIG. 26 is a diagram showing an example of an external log.
FIG. 27 is a diagram showing an example of the related term dictionary.
FIG. 28 is a flowchart showing the character string search processing of the information search system.
FIGS. 29 and 30 are diagrams showing the mechanism of character string search using a suffix array and binary search.
FIG. 31 is a flowchart showing the procedure of the degree-of-difference determination processing of the information search system.
FIG. 32 is a diagram showing the mechanism for determining the degree of difference of the preceding and following adjacent characters.
FIG. 33 is a flowchart showing the display processing of the FAQ display screen.
FIG. 34 is a flowchart showing the display processing of the related term / co-occurrence word list screen.
FIGS. 35 to 37 are diagrams showing examples of screens of the FAQ search system.
FIG. 38 is a diagram showing an example of preference data and an example of a screen of the FAQ search system.
FIG. 39 is a diagram showing an overview of the processing of the information update unit of the information search system.
FIG. 40 is a flowchart showing the procedure of the character string extraction processing of the information update unit of the information search system.
FIG. 41 is a flowchart showing the procedure of the dictionary comparison processing of the information update unit of the information search system.
FIGS. 42 to 44 are diagrams showing the character string extraction processing and the dictionary comparison processing of the information update unit of the information search system.
FIG. 45 is a diagram showing the character string extraction processing and the dictionary comparison processing of the information update unit of the information search system together with the contents of the related term dictionary.
FIG. 46 is a diagram showing the contents of the comparison result data stored by the dictionary comparison processing of the information update unit of the information search system.
FIGS. 47 to 49 are diagrams showing the character string extraction processing and the dictionary comparison processing of the information update unit of the information search system together with the contents of the related term dictionary.
FIG. 50 is a diagram showing an example of the hardware configuration of a computer constituting the topic providing server included in the information search system of the present invention.
FIG. 51 is a block diagram showing an outline of another system configuration of the topic providing system.

The present embodiment is described below with reference to the drawings.

FIG. 1 is a diagram showing an overview of the topic providing system 1. FIG. 2 is a diagram showing an overview of the conversation control terminal device 2. FIG. 3 is a diagram showing an overview of the maintenance device 3.

<<< Overview of the topic providing system >>>
As shown in FIG. 1, the topic providing system 1 according to the present embodiment is characterized by comprising:
an input unit through which a user inputs input information;
an input information analysis unit that analyzes the input information and generates input specifying information;
a scenario data storage unit that extracts scenario data for defining response information relating to topics;
a response information determination unit that determines the response information on the basis of the scenario data and the input specifying information; and
an output unit that outputs the response information determined by the response information determination unit.

As shown in FIG. 1, the topic providing system 1 according to the present embodiment mainly comprises an input unit, an input information analysis unit, a scenario data storage unit, a response information determination unit, and an output unit. In FIG. 1, these components are drawn as solid-line boxes. The transmission unit, the reception unit, and the switching input information input unit, which are drawn as dotted-line boxes, are described later. The portion enclosed by the broken line in FIG. 1 is the configuration of the conversation control terminal device 2 shown in FIG. 2, described later.

The input unit is a member or part through which the user inputs input information. Any input unit may be used as long as it allows the user to enter the desired information as input information. Examples include a keyboard, a touch panel, a microphone, and a camera. Through the input unit, the user can enter text data, voice data, image data, and the like. The input information entered at the input unit is supplied to the input information analysis unit described next, preferably via the transmission unit described later.

The input information analysis unit analyzes the input information and generates input specifying information. The input specifying information is information generated as a result of analyzing the various kinds of information contained in the input information. One example is statistical analysis of the number of times or the frequency with which specific keywords (such as the related terms described later) appear in the input information. Analysis of the input information also makes it possible to infer the user's intentions and preferences from, for example, questions the user has entered. Relative analysis results can be obtained by comparison with other users. Furthermore, data such as an analysis dictionary can be generated in advance and used to analyze the input information. The input information analysis unit supplies the generated input specifying information to the response information determination unit described later.
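The statistical analysis mentioned above (the number of times and the frequency with which specific keywords appear in the input information) can be sketched as follows. This is a minimal illustration under assumptions: the function name, the substring-based counting, and the shape of the returned data are hypothetical and are not described in the specification.

```python
from collections import Counter


def analyze_input(inputs, related_terms):
    """Count how often each related term occurs across the user's inputs.

    Returns, per term, its occurrence count and its share of all counted
    occurrences (0.0 when no term occurs at all).
    """
    counts = Counter()
    for text in inputs:
        for term in related_terms:
            counts[term] += text.count(term)  # simple substring count
    total = sum(counts.values())
    return {
        term: {
            "count": counts[term],
            "frequency": counts[term] / total if total else 0.0,
        }
        for term in related_terms
    }


if __name__ == "__main__":
    inputs = ["battery life of camera", "camera battery charger", "lens cap"]
    print(analyze_input(inputs, ["battery", "lens", "tripod"]))
```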

The scenario data storage unit is a member or part for extracting scenario data. The scenario data is data stored in advance in a scenario data storage unit (a plurality of scenario data) described later. Scenario data judged to be necessary on the basis of the input specifying information generated by the input information analysis unit is extracted, and the extracted scenario data is stored in the scenario data storage unit. The scenario data extracted from the scenario data storage unit (the plurality of scenario data) is stored in the scenario data storage unit via the reception unit and the response information determination unit described later.

Scenario data is data for defining response information relating to the topics provided to the user. That is, the scenario data contains information on topics to be provided to the user.

The scenario data also includes a topic introduction list made up of information on a plurality of topics. The topic introduction list is provided to the user. By selecting a topic contained in the provided list, the user can use the selection in place of entering input information. Because the user can advance the conversation by selection operations, the user's input operation is simpler than when typing characters, and the conversation can proceed smoothly.

In addition, the topic introduction list allows a plurality of topics to be provided to the user, so that the user can learn of various topics and broaden his or her range of interests.

The scenario data preferably also includes information for greeting the user. Including information for greetings in the scenario data, and not only information for providing topics, makes it possible to exchange greetings with the user and makes the conversation more natural.

Scenario data is created in advance by a subscriber of the topic providing system 1 and contains the information that the subscriber wishes to provide to users. The scenario data thus defines the information to be provided to users.

さらに、シナリオデータは複数のステートメントからなる。ステートメントには、出力用情報や出力用コマンドや制御コマンドなどが含まれる。 Furthermore, scenario data consists of a plurality of statements. The statement includes output information, output commands, control commands, and the like.

出力用情報は、出力部で出力される情報である。出力用情報には、話題の情報や挨拶の情報などが含まれる。話題の情報や挨拶の情報は、ユーザに提供してユーザとの会話を進めるための情報である。 The output information is information output by the output unit. The output information includes topic information, greeting information, and the like. Topic information and greeting information are provided to the user in order to advance the conversation with the user.

出力用コマンドは、話題の情報や挨拶の情報を出力部に出力するとき、出力の仕様を制御するためのコマンドである。たとえば、画面を消去したり、改行したり、出力している時間を制御したり、所定の画像を表示したりするためのコマンドである。 The output command is a command for controlling output specifications when topic information or greeting information is output to the output unit. For example, it is a command for erasing the screen, making a line break, controlling the output time, or displaying a predetermined image.
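
As a rough illustration of how output commands could shape the way output information is presented, the following sketch converts a piece of output information and its output commands into a list of display operations. The command names (`clear_screen`, `newline`) and the operation strings are hypothetical.

```python
from typing import List


def render(output_info: str, output_commands: List[str]) -> List[str]:
    """Turn output information plus output commands into display operations."""
    operations = []
    if "clear_screen" in output_commands:
        operations.append("CLEAR")          # erase the screen before printing
    operations.append("PRINT " + output_info)
    if "newline" in output_commands:
        operations.append("NEWLINE")        # break the line after printing
    return operations
```

For example, `render("Hello", ["clear_screen", "newline"])` yields the three operations `CLEAR`, `PRINT Hello`, and `NEWLINE`, in that order.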

制御コマンドは、ステートメントを制御するための判断や、話題名（たとえば、テーマなど）を切り替えたり、状態制御指標を変更したりするためのコマンドである。特に、判断は、時間や時刻によって分岐させるための判断や、状態制御指標の内容によって、分岐させるための判断などがある。判断によって分岐させることによって、所定のステートメントに遷移させることができる。 The control command is a command for making decisions that control statements, for switching the topic name (for example, a theme), or for changing the state control index. In particular, the decisions include decisions that branch according to elapsed time or time of day, and decisions that branch according to the contents of the state control index. Branching on such a decision makes it possible to transition to a predetermined statement.
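
The branching performed by a control command might look like the following minimal sketch. The statement labels, the `mood` key of the state control index, and the 12:00 boundary are all assumptions made only for this illustration.

```python
from datetime import time


def next_statement(now: time, state_index: dict) -> str:
    """Pick the label of the statement to transition to."""
    # Branch on the contents of the state control index first.
    if state_index.get("mood") == "negative":
        return "cheer_up"
    # Otherwise branch on the time of day.
    if now < time(12, 0):
        return "morning_greeting"
    return "afternoon_greeting"
```

With these assumed rules, a call at 9:00 with an empty index selects `morning_greeting`, while the same call with `{"mood": "negative"}` selects `cheer_up`.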

なお、本実施の形態では、ステートメントの各々を区別する必要がない場合には、ステートメントはシナリオデータと同義である。 In the present embodiment, when there is no need to distinguish each statement, the statement is synonymous with scenario data.

なお、単一のステートメントに、出力用情報と出力用コマンドと制御コマンドとの全てを含める必要はない。たとえば、所定のステートメントを出力用情報のみで構成したり、出力用コマンドのみで構成したり、制御コマンドのみで構成したりすることができる。 Note that it is not necessary to include all of the output information, the output command, and the control command in a single statement. For example, a predetermined statement can be composed of only output information, can be composed of only output commands, or can be composed of only control commands.
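
The statement structure described above can be pictured as a record whose three components are each optional. The following is a sketch under that reading; the class name `Statement`, its field names, and the sample command strings are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    """One statement of scenario data (hypothetical field names)."""
    output_info: Optional[str] = None                           # topic or greeting text
    output_commands: List[str] = field(default_factory=list)    # e.g. "clear_screen"
    control_commands: List[str] = field(default_factory=list)   # e.g. "switch_topic:weather"

    def is_empty(self) -> bool:
        # A statement may carry any subset of the three components.
        return (self.output_info is None
                and not self.output_commands
                and not self.control_commands)


# Scenario data is modeled here as an ordered list of statements.
scenario_data = [
    Statement(output_info="Good morning!"),                  # output information only
    Statement(output_commands=["clear_screen"]),             # output command only
    Statement(control_commands=["switch_topic:weather"]),    # control command only
    Statement(output_info="It looks sunny today.", output_commands=["newline"]),
]
```

Keeping every component optional mirrors the remark that a single statement need not contain all three kinds of content.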

このように、ステートメントには、単に、ユーザに提供する情報のみならず、各種のコマンドも含まれる。これにより、出力部において、話題の情報や挨拶の情報をさまざまな仕様で出力することができ、会話を自然にかつ円滑に進めることができる。 Thus, the statement includes not only information provided to the user but also various commands. As a result, topic information and greeting information can be output with various specifications in the output unit, and conversation can be naturally and smoothly advanced.

上述したように、シナリオデータ（ステートメント）には、出力用情報や出力用コマンドや制御コマンドなどが含まれている。出力用情報や出力用コマンドや制御コマンドを適宜規定することによって、話題制御ルールを構築することができる。特に、話題提供システム１の契約者がユーザに提供したい各種の情報（話題）をシナリオデータに含めることにより、契約者が所望する話題制御ルールを構築することができる。 As described above, scenario data (statements) includes output information, output commands, control commands, and the like. A topic control rule can be constructed by appropriately defining output information, output commands, and control commands. In particular, the topic control rules desired by the contractor can be constructed by including in the scenario data various information (topics) that the contractor of the topic providing system 1 wants to provide to the user.

シナリオデータは、契約者が提供したい情報について適宜規定すればよいので、話題提供システム１の全般に亘る高度かつ専門的な知識や技術に依存することなく、話題制御ルールの変更、追加、修正などの保守作業を別個に行うことができる。 Since the scenario data only needs to define, as appropriate, the information that the contractor wants to provide, maintenance work such as changing, adding, or correcting topic control rules can be performed separately, without depending on advanced and specialized knowledge or technology covering the topic providing system 1 as a whole.

応答情報決定部は、応答情報を決定する。応答情報は、上述した入力情報分析部から供給されたシナリオデータと入力特定情報とに基づいて決定される。すなわち、ユーザが入力した入力情報を分析して得られた入力特定情報を用いて応答情報を決定する。したがって、シナリオデータによって話題提供システム１の契約者の意思を応答情報に反映させることができるとともに、入力特定情報によってユーザの意思を応答情報に反映させることができ、会話の主体の双方の意思を反省させて応答情報を生成することができる。話題提供システム１の契約者とユーザとのバランスを図って会話を円滑に進めることで自然な応答を実現することができる。 The response information determination unit determines response information. The response information is determined based on the scenario data and the input specifying information supplied from the input information analysis unit described above. That is, the response information is determined using the input specifying information obtained by analyzing the input information entered by the user. Therefore, the intention of the contractor of the topic providing system 1 can be reflected in the response information through the scenario data, and the intention of the user can be reflected in the response information through the input specifying information, so the response information can be generated in a way that reflects the intentions of both parties to the conversation. A natural response can be realized by balancing the contractor of the topic providing system 1 and the user so that the conversation proceeds smoothly.

応答情報には、シナリオデータのステートメントが含められる。ステートメントに含まれるユーザに提供する出力用情報のみならず、出力用コマンドなどの各種のコマンドも応答情報に含めることができる。このようにすることで、話題の情報や挨拶の情報をさまざまな仕様で出力部から出力することができる。 The response information includes a statement of scenario data. In addition to the output information provided to the user included in the statement, various commands such as an output command can be included in the response information. In this way, topic information and greeting information can be output from the output unit with various specifications.

出力部は、応答情報決定部によって決定された応答情報を出力する。ユーザは、出力部から出力された応答情報を認識することによって、話題が提供される。 The output unit outputs the response information determined by the response information determination unit. The user is provided with the topic by recognizing the response information output from the output unit.

このように、本実施の形態による話題提供システム１は、出力部に出力される応答情報によってユーザに各種の話題を提供することができる。すなわち、本実施の形態による話題提供システム１は、入力情報から入力特定情報を生成し、シナリオデータと入力特定情報とから応答情報を決定し、応答情報の出力によってユーザに各種の話題を提供する。 As described above, the topic providing system 1 according to the present embodiment can provide various topics to the user through the response information output to the output unit. That is, the topic providing system 1 according to the present embodiment generates input specifying information from the input information, determines response information from the scenario data and the input specifying information, and provides various topics to the user by outputting the response information.
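
The flow summarized above (input information, then input specifying information, then response information) could be sketched as three small functions. The keyword-spotting analysis and the dictionary layout of the scenario data are toy assumptions standing in for the actual analysis, which is not specified here.

```python
def analyze_input(input_information: str) -> dict:
    """Toy stand-in for the input information analysis unit (keyword spotting)."""
    words = input_information.lower().split()
    topic = "weather" if "weather" in words else "greeting"
    return {"topic": topic}


def determine_response(scenario_data: dict, input_specifying_info: dict) -> str:
    """Toy stand-in for the response information determination unit."""
    return scenario_data[input_specifying_info["topic"]]


def provide_topic(input_information: str, scenario_data: dict) -> str:
    """Run the whole flow and return the response information to output."""
    input_specifying_info = analyze_input(input_information)
    return determine_response(scenario_data, input_specifying_info)


# Hypothetical scenario data: one response per topic name.
scenario_data = {
    "weather": "It should be sunny this afternoon.",
    "greeting": "Hello! What shall we talk about?",
}
```

Calling `provide_topic("How is the weather", scenario_data)` would select the weather response, while input without that keyword falls back to the greeting.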

本実施の形態による話題提供システム１は、シナリオデータ（たとえば、後述するステートメントや話題紹介リストなど）と、入力情報分析部によって分析された入力特定情報とによって応答情報を決定するので、話題と会話の流れとに合わせた自然な応答を実現することができる。 The topic providing system 1 according to the present embodiment determines response information based on scenario data (for example, statements and a topic introduction list described later) and the input specifying information analyzed by the input information analysis unit, so it can realize a natural response that matches the topic and the flow of the conversation.

さらに、シナリオデータに基づいて話題に関する応答情報を規定することができるので、話題制御システムの全般に亘る高度かつ専門的な知識や技術に依存することなく、話題制御ルールの変更、追加、修正などの保守作業を別個に行うことができる。 Furthermore, since response information on topics can be defined based on scenario data, maintenance work such as changing, adding, or correcting topic control rules can be performed separately, without depending on advanced and specialized knowledge or technology covering the topic control system as a whole.

さらに、図１に示すように、本実施の形態による話題提供システム１の特徴は、
前記入力情報及び前記応答情報に関する状態制御指標を記憶する状態制御指標記憶部をさらに備え、
前記応答情報決定部は、前記シナリオデータ及び前記入力特定情報のほかに前記状態制御指標を加えて前記応答情報を決定することである。 Further, as shown in FIG. 1, the feature of the topic providing system 1 according to the present embodiment is as follows.
A state control index storage unit that stores a state control index related to the input information and the response information is further provided.
The response information determination unit determines the response information using the state control index in addition to the scenario data and the input specifying information.

話題提供システム１は、状態制御指標記憶部を備える。状態制御指標記憶部は、状態制御指標を記憶する。状態制御指標は、入力情報と応答情報とに関する指標である。状態制御指標は、主に会話の履歴に基づく指標であり、さらには、会話の履歴に基づいて判別できるユーザの感情や性格なども示す指標である。たとえば、ユーザが過去に入力した入力情報に基づいて定められる指標や、過去にユーザに提供した応答情報に基づいて定められる指標などがある。さらに、過去にユーザに提供した応答情報に対するユーザの入力から得られるユーザの感情や性格などを示す指標などもある。 The topic providing system 1 includes a state control index storage unit. The state control index storage unit stores a state control index. The state control index is an index related to input information and response information. The state control index is an index mainly based on the conversation history, and further indicates the user's emotion and personality that can be determined based on the conversation history. For example, there are an index defined based on input information input by the user in the past and an index defined based on response information provided to the user in the past. In addition, there is an index or the like indicating the user's emotion or personality obtained from the user's input to response information provided to the user in the past.

応答情報決定部は、シナリオデータ及び入力特定情報のほかに状態制御指標を加えて応答情報を決定する。このように、状態制御指標も用いて応答情報を決定するので、ユーザとの過去の会話や、会話から得られたユーザの感情や性格なども踏まえて話題を提供したり会話を進めたりすることができる。したがって、同じ話題を重複してユーザに提供したり、飛躍した話題をユーザに提供したりすることを防止でき、ユーザの感情や性格などにあわせた円滑な会話を進めることができる。 The response information determination unit determines the response information using the state control index in addition to the scenario data and the input specifying information. Because the state control index is also used, topics can be provided and the conversation can be advanced in light of past conversations with the user and the user's emotions and personality inferred from those conversations. It is therefore possible to avoid offering the user the same topic twice or jumping to an abruptly unrelated topic, and the conversation can proceed smoothly in a way that suits the user's emotions and personality.
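
One way the state control index could prevent the same topic from being offered twice is to keep a history of offered topics inside it. The sketch below assumes the index is a dictionary with a hypothetical `offered_topics` list; the embodiment does not prescribe this layout.

```python
from typing import List, Optional


def pick_new_topic(candidates: List[str], state_index: dict) -> Optional[str]:
    """Return the first candidate not yet offered, and record it in the history."""
    offered = state_index.setdefault("offered_topics", [])
    for topic in candidates:
        if topic not in offered:
            offered.append(topic)
            return topic
    return None  # every candidate has already been offered
```

Because the history lives in the state control index rather than in the scenario data, the same scenario data can serve users with different conversation histories.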

本実施の形態において、状態制御指標記憶部は、会話制御端末装置２に備えられている。状態制御指標は、応答情報決定部によって決定されたり変更されたりするのが好ましい。状態制御指標は、シナリオデータと入力特定情報とに基づいて応答情報決定部によって決定されるのが好ましい。 In the present embodiment, the state control index storage unit is provided in the conversation control terminal device 2. The state control index is preferably determined or changed by the response information determination unit. The state control index is preferably determined by the response information determination unit based on the scenario data and the input specifying information.

本実施の形態による話題提供システム１は、シナリオデータ及び入力特定情報のほかに状態制御指標を使って応答情報を決定するので、より話題と会話の流れとに合わせた自然な応答を実現することができる。 The topic providing system 1 according to the present embodiment determines the response information using the state control index in addition to the scenario data and the input specifying information, so it can realize a natural response that matches the topic and the flow of the conversation even more closely.

さらにまた、図１に示すように、本実施の形態による話題提供システム１の特徴は、
前記シナリオデータは、異なる話題への遷移を規定する情報（たとえば、後述する話題切替情報など）を含み、
前記異なる話題への遷移を規定する情報に応じて、現在の話題に関する応答情報を規定するためのシナリオデータから、異なる話題に関する応答情報を規定するためのシナリオデータへ、シナリオデータを切り替える話題切替部を、さらに備えることである。 Furthermore, as shown in FIG. 1, the feature of the topic providing system 1 according to this embodiment is as follows:
The scenario data includes information defining a transition to a different topic (for example, topic switching information described later).
A topic switching unit is further provided that, in accordance with the information defining the transition to the different topic, switches the scenario data from scenario data for defining response information on the current topic to scenario data for defining response information on the different topic.

本実施の形態では、話題切替部は、切替入力情報入力部と入力情報分析部とを含む。話題切替部は、シナリオデータ及び入力特定情報に基づいて、シナリオデータを切り替えるか否かを判断する。具体的には、状態制御指標に基づいてシナリオデータを切り替えるのが好ましい。切替入力情報入力部は、話題切替入力情報が後述する送信部に送信され、入力情報分析部に供給される。入力情報分析部は、入力情報のほかに話題切替入力情報にも基づいて、シナリオデータ記憶部（複数のシナリオデータ）に記憶されているシナリオデータを抽出する。 In the present embodiment, the topic switching unit includes a switching input information input unit and an input information analysis unit. The topic switching unit determines whether to switch scenario data based on the scenario data and the input specifying information. Specifically, it is preferable to switch the scenario data based on the state control index. From the switching input information input unit, the topic switching input information is sent to a transmission unit (described later) and supplied to the input information analysis unit. The input information analysis unit extracts scenario data stored in the scenario data storage unit (a plurality of scenario data) based on the topic switching input information in addition to the input information.

シナリオデータは、異なる話題（話題名）への遷移を規定する情報を含む。話題切替部は、シナリオデータを切り替える。この切り替えは、現在の話題に関する応答情報を規定するためのシナリオデータから、異なる話題に関する応答情報を規定するためのシナリオデータへ切り替えるものである。 The scenario data includes information defining transition to a different topic (topic name). The topic switching unit switches scenario data. This switching is to switch from scenario data for defining response information on the current topic to scenario data for defining response information on a different topic.

たとえば、サーバなどに全てのシナリオデータを記憶させておき、異なる話題への遷移を規定する情報に基づいてサーバの全てのシナリオデータから組み替え直したシナリオデータを生成し、組み替え直されたシナリオデータがシナリオデータ記憶部に記憶される。このシナリオデータの組み替えは、複数のステートメントの組み合せを話題名に応じて定めることで実行できる。 For example, all scenario data is stored on a server or the like, recombined scenario data is generated from all the scenario data on the server based on the information defining the transition to a different topic, and the recombined scenario data is stored in the scenario data storage unit. This recombination of scenario data can be carried out by defining, for each topic name, a combination of a plurality of statements.
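
The recombination described above, in which a combination of statements is defined per topic name, might be sketched as a lookup over the full statement set. The statement IDs, sample texts, and the mapping table are all hypothetical.

```python
from typing import Dict, List

# All statements held on the server, keyed by a hypothetical statement ID.
ALL_STATEMENTS: Dict[str, str] = {
    "s1": "Hello!",
    "s2": "It may rain tonight.",
    "s3": "The match starts at seven.",
}

# Which statements make up the scenario data for each topic name (assumed mapping).
TOPIC_TO_STATEMENTS: Dict[str, List[str]] = {
    "weather": ["s1", "s2"],
    "sports": ["s1", "s3"],
}


def build_scenario_data(topic_name: str) -> List[str]:
    """Assemble the scenario data for one topic from the full statement set."""
    return [ALL_STATEMENTS[sid] for sid in TOPIC_TO_STATEMENTS[topic_name]]
```

In this sketch a statement such as the greeting `s1` is stored once and reused across topics, so a correction to it carries over to every topic that includes it.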

したがって、ユーザとの会話で一の話題から他の話題に移った場合も、異なる話題への遷移を規定する情報にしたがって他の話題に遷移させることができ、あらゆる話題に対応することができ、様々なユーザに対応することができる。組み替えたシナリオデータによって他の話題をユーザに提供することができる。 Therefore, even when the conversation with the user moves from one topic to another, the system can transition to the other topic in accordance with the information defining the transition to a different topic, so it can handle any topic and accommodate a wide variety of users. The other topic can be provided to the user through the recombined scenario data.

また、話題ごと（話題名）にシナリオデータを準備すればよいので、シナリオデータの保守が容易になる。具体的には、シナリオデータに変更が必要になった場合には、そのシナリオデータのみを修正すればよい。また、新たな話題が必要になった場合には、そのシナリオデータのみを追加すればよい。さらに、古い話題となって必要でなくなった場合には、そのシナリオデータのみを削除すればよい。 Further, since scenario data only needs to be prepared for each topic (topic name), maintenance of the scenario data becomes easy. Specifically, when scenario data needs to be changed, only that scenario data needs to be corrected. When a new topic is required, only the scenario data for that topic needs to be added. Furthermore, when a topic becomes outdated and is no longer needed, only the scenario data for that topic needs to be deleted.

シナリオデータは、契約者が提供したい情報について、話題ごと（話題名）に適宜規定すればよいので、話題が増えた場合であっても、話題提供システム１の全般に亘る高度かつ専門的な知識や技術に依存することなく、話題制御ルールの変更、追加、修正などの保守作業を話題ごとに行うことができる。 Since the scenario data only needs to define, topic by topic (topic name), the information that the contractor wants to provide, maintenance work such as changing, adding, or correcting topic control rules can be performed for each topic even when the number of topics grows, without depending on advanced and specialized knowledge or technology covering the topic providing system 1 as a whole.

異なる話題への遷移を規定する情報を有するので、このようなシナリオデータの更新があった場合でも、シナリオデータの遷移の整合を容易に図ることができる。 Since it has information defining transition to a different topic, it is possible to easily match the transition of scenario data even when such scenario data is updated.

たとえば、状態制御指標のうち、後述する性格指標に基づいて話題を切り替えるか否かを判断するのが好ましい。性格指標は、ユーザが、話題に対して積極的であるのか又は消極的であるのかを示す指標である。 For example, it is preferable to determine whether or not to switch topics based on a personality index described later among the state control indices. The personality index is an index indicating whether the user is active or reluctant to the topic.
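
A decision based on the personality index might reduce to a simple threshold test, as in the sketch below. The numeric encoding (positive for active, negative for reluctant), the default threshold, and the choice to switch when the user appears reluctant are assumptions; the text here only states that the index indicates whether the user is active or reluctant toward the topic.

```python
def should_switch_topic(personality_index: float, threshold: float = 0.0) -> bool:
    """Suggest switching when the user appears reluctant toward the current topic.

    Assumed convention: positive values mean active, negative values mean reluctant.
    """
    return personality_index < threshold
```

Under this convention a value of -0.5 suggests a switch, while 0.7 suggests staying on the current topic.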

本実施の形態による話題提供システム１は、シナリオデータを使って話題を切り替えることができるので、話題と会話との流れに合わせた自然な応答を実現することができる。 Since the topic providing system 1 according to this embodiment can switch topics using scenario data, it can realize a natural response according to the flow of topics and conversation.

図１において点線の四角で示した送信部と受信部と切替入力情報入力部とについて説明する。本実施の形態による話題提供システム１は、これらの送信部と受信部と切替入力情報入力部とを備えることができる。 A transmission unit, a reception unit, and a switching input information input unit indicated by dotted squares in FIG. 1 will be described. The topic providing system 1 according to the present embodiment can include these transmission unit, reception unit, and switching input information input unit.

また、図１に示した送信部は、入力情報を外部に送信するための装置や部材である。入力情報を外部に送信するものであればよい。外部は、たとえば、サーバや、会話制御端末装置２などにすることができる。 Further, the transmission unit shown in FIG. 1 is a device or member for transmitting input information to the outside. Any device that transmits input information to the outside may be used. The outside can be, for example, a server, the conversation control terminal device 2, or the like.

さらに、図１に示した受信部は入力特定情報を受信するための装置や部材である。入力特定情報は、外部で生成される。すなわち、受信部は、外部で生成された入力特定情報を受信する装置や部材である。外部では、送信部から送信された入力情報を分析して入力特定情報を生成し、生成された入力特定情報は受信部に送信される。 Further, the receiving unit shown in FIG. 1 is a device or member for receiving input specifying information. The input specifying information is generated externally. That is, the receiving unit is a device or member that receives input specifying information generated externally. Outside, the input information transmitted from the transmitting unit is analyzed to generate input specifying information, and the generated input specifying information is transmitted to the receiving unit.

さらにまた、図１に示した切替入力情報入力部は、異なる話題への遷移を規定する情報に応じて話題切替入力情報（たとえば、後述する性格指標など）を生成する。異なる話題への遷移を規定する情報は、シナリオデータに含まれる情報であり、たとえば、後述する話題切替情報などがある。 Furthermore, the switching input information input unit shown in FIG. 1 generates topic switching input information (for example, a personality index, which will be described later) according to information defining transition to a different topic. Information defining the transition to a different topic is information included in the scenario data, for example, topic switching information described later.

上述した話題切替部は、入力情報分析部と切替入力情報入力部とを含むのが好ましい。入力情報に基づいて、話題切替入力情報を生成するので、ユーザの意思を反映させた話題に遷移させることができる。 The topic switching unit described above preferably includes an input information analysis unit and a switching input information input unit. Since topic switching input information is generated based on the input information, it is possible to transition to a topic reflecting the user's intention.

また、図１に示したシナリオデータ記憶部（複数のシナリオデータ）は、複数のシナリオデータを記憶する。ここで、複数のシナリオデータは、ユーザと会話をするために必要な話題名に対応する全てのシナリオデータである。全てのシナリオデータのうち、入力特定情報に基づいて必要であると判断されたシナリオデータが抽出される。 Further, the scenario data storage unit (a plurality of scenario data) shown in FIG. 1 stores a plurality of scenario data. Here, the plurality of scenario data are all scenario data corresponding to the topic names necessary for talking with the user. Of all scenario data, scenario data determined to be necessary based on the input identification information is extracted.

＜＜＜会話制御端末装置２の概要＞＞＞
図２に示すように、本実施の形態による会話制御端末装置２の特徴は、
ユーザが入力情報を入力するための入力部と、
話題に関する応答情報を規定するためのシナリオデータを記憶するシナリオデータ記憶部と、
前記入力情報及び前記応答情報に関する状態制御指標を記憶する状態制御指標記憶部と、
前記シナリオデータと前記状態制御指標とに基づいて前記応答情報を決定する応答情報決定部と、
前記応答情報決定部によって決定された応答情報を出力する出力部と、を備えることである。 <<< Overview of Conversation Control Terminal Device 2 >>>
As shown in FIG. 2, the feature of the conversation control terminal device 2 according to this embodiment is as follows.
An input unit for a user to input input information;
A scenario data storage unit for storing scenario data for defining response information on the topic;
A state control index storage unit that stores a state control index related to the input information and the response information;
A response information determination unit that determines the response information based on the scenario data and the state control index;
An output unit that outputs the response information determined by the response information determination unit.

本実施の形態による会話制御端末装置２は、図２に示すように、主に、入力部とシナリオデータ記憶部と状態制御指標記憶部と応答情報決定部と出力部とを備える。図２においては、これらの構成を実線の四角で示した。図２において大きく破線で囲んだ部分が、会話制御端末装置２の構成である。なお、点線の四角で示したシナリオデータ記憶部（複数のシナリオデータ）と入力情報分析部とは、話題提供サーバ４に含まれる構成である。 As shown in FIG. 2, conversation control terminal apparatus 2 according to the present embodiment mainly includes an input unit, a scenario data storage unit, a state control index storage unit, a response information determination unit, and an output unit. In FIG. 2, these configurations are indicated by solid squares. In FIG. 2, a portion surrounded by a broken line is the configuration of the conversation control terminal device 2. Note that the scenario data storage unit (a plurality of scenario data) and the input information analysis unit indicated by dotted-line boxes are included in the topic providing server 4.

入力部は、本実施の形態による話題提供システム１の入力部と同様に、ユーザが入力情報を入力するための部材や部位である。入力部は、ユーザが所望する情報を入力情報として入力できるものであればよい。たとえば、入力部は、キーボードやタッチパネルやマイクロフォンやカメラなどがある。ユーザは、入力部からテキストデータや音声データや画像データなどを入力できる。入力部に入力された入力情報は、次に説明する入力情報分析部に供給される。入力情報は、後述する送信部を介して入力情報分析部に供給されるのが好ましい。 The input unit is a member or part with which the user enters input information, like the input unit of the topic providing system 1 according to the present embodiment. Any input unit may be used as long as the user can enter desired information as input information. Examples of the input unit include a keyboard, a touch panel, a microphone, and a camera. The user can enter text data, voice data, image data, and the like from the input unit. The input information entered at the input unit is supplied to the input information analysis unit described next. The input information is preferably supplied to the input information analysis unit via a transmission unit described later.

シナリオデータ記憶部は、シナリオデータを記憶するための部材や部位である。シナリオデータは、図２に示す話題提供サーバ４のシナリオデータ記憶部（複数のシナリオデータ）に予め記憶されているデータである。入力情報分析部によって生成された入力特定情報に基づいて必要であると判断されたシナリオデータが抽出されて、抽出されたシナリオデータがシナリオデータ記憶部に記憶される。シナリオデータ記憶部（複数のシナリオデータ）から抽出されたシナリオデータは、後述する受信部と応答情報決定部とを経てシナリオデータ記憶部に記憶される。 The scenario data storage unit is a member or part for storing scenario data. The scenario data is data stored in advance in a scenario data storage unit (a plurality of scenario data) of the topic providing server 4 shown in FIG. Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit is extracted, and the extracted scenario data is stored in the scenario data storage unit. The scenario data extracted from the scenario data storage unit (a plurality of scenario data) is stored in the scenario data storage unit via a receiving unit and a response information determination unit described later.

シナリオデータは、ユーザに提供する話題に関する応答情報を規定するためデータである。シナリオデータには、ユーザに提供するための話題の情報が含まれている。なお、会話制御端末装置２で用いるシナリオデータの構成や機能などについては、本実施の形態による話題提供システム１のシナリオデータと同じである。 The scenario data is data for defining response information related to the topic provided to the user. The scenario data includes topic information to be provided to the user. The configuration and functions of scenario data used in the conversation control terminal device 2 are the same as the scenario data of the topic providing system 1 according to this embodiment.

本実施の形態による会話制御端末装置２は、状態制御指標記憶部を備える。状態制御指標記憶部は、状態制御指標を記憶する。状態制御指標は、主に会話の履歴に基づく指標であり、さらには、会話の履歴に基づいて導くことができるユーザの感情や性格なども示す指標である。たとえば、ユーザが過去に入力した入力情報に基づいて定められる指標や、過去にユーザに提供した応答情報に基づいて定められる指標などがある。さらに、過去にユーザに提供した応答情報に対するユーザの入力から導くことができるユーザの感情や性格なども示す指標などもある。 The conversation control terminal device 2 according to the present embodiment includes a state control index storage unit. The state control index storage unit stores a state control index. The state control index is an index mainly based on the conversation history, and further indicates the user's emotion and personality that can be derived based on the conversation history. For example, there are an index defined based on input information input by the user in the past and an index defined based on response information provided to the user in the past. In addition, there is an index indicating the user's emotion and personality that can be derived from the user's input for response information provided to the user in the past.

このように、本実施の形態においては、会話制御端末装置２が状態制御指標記憶部を備える。すなわち、会話制御端末装置２の外部、たとえば、話題提供サーバ４などが状態制御指標記憶部を備える構成ではない。したがって、本実施の形態では、話題提供サーバ４などの外部の装置によって、ユーザとの会話が制御されるのではなく、会話制御端末装置２によってユーザとの会話が制御される。 Thus, in the present embodiment, conversation control terminal device 2 includes a state control index storage unit. That is, it is not a configuration in which, for example, the topic providing server 4 or the like outside the conversation control terminal device 2 includes a state control index storage unit. Therefore, in the present embodiment, the conversation with the user is not controlled by an external device such as the topic providing server 4 but the conversation with the user is controlled by the conversation control terminal device 2.

応答情報決定部は、話題提供サーバ４の入力情報分析部から供給されたシナリオデータと状態制御指標とに基づいて応答情報を決定する。シナリオデータによって話題提供システム１の契約者の意思を応答情報に反映させることができる。 The response information determination unit determines response information based on the state control index and the scenario data supplied from the input information analysis unit of the topic providing server 4. The intention of the contractor of the topic providing system 1 can be reflected in the response information through the scenario data.

さらに、状態制御指標を用いて応答情報を決定するので、ユーザとの過去の会話や、会話から得られたユーザの感情や性格なども踏まえて話題を提供したり会話を進めたりすることができる。したがって、同じ話題を重複してユーザに提供したり、飛躍した話題をユーザに提供したりすることを防止でき、より円滑な会話を進めることで自然な応答を実現することができる。 In addition, since response information is determined using a state control index, topics can be provided and conversations can be promoted based on past conversations with the user and the user's emotions and personality obtained from the conversation. . Therefore, it is possible to prevent the same topic from being provided to the user in duplicate or to provide the user with a leap topic, and a natural response can be realized by promoting a smoother conversation.

状態制御指標は、応答情報決定部によって決定されたり変更されたりするのが好ましい。状態制御指標は、シナリオデータと入力特定情報とに基づいて応答情報決定部によって決定されるのが好ましい。 The state control index is preferably determined or changed by the response information determination unit. The state control index is preferably determined by the response information determination unit based on the scenario data and the input specifying information.

本実施の形態による会話制御端末装置２は、会話制御端末装置２にシナリオデータ記憶部と状態制御指標記憶部との双方を設けて応答情報を決定するので、ユーザとの会話が可能であるか否かを会話制御端末装置２で判断して制御でき、会話制御端末装置２における処理の負担を著しく増加させることなく、かつ、サーバの負担も増加させることなく、さらに、ネットワークのトラフィックも増加させることなく、ユーザとの会話の流れに合わせて円滑に会話を進めることができる。 Because the conversation control terminal device 2 according to the present embodiment has both the scenario data storage unit and the state control index storage unit and determines the response information itself, the conversation control terminal device 2 can judge whether a conversation with the user is possible and control it accordingly. The conversation can therefore proceed smoothly in line with the flow of the conversation with the user, without significantly increasing the processing load on the conversation control terminal device 2, without increasing the load on the server, and without increasing network traffic.

さらに、本実施の形態による会話制御端末装置２は、シナリオデータと状態制御指標とに基づいて応答情報を決定するので、ユーザとの会話の進行状態に応じて応答情報を決定でき、サーバの負担を増大させることなく、ユーザの状態に応じて的確に応答することができる。 Furthermore, since the conversation control terminal device 2 according to the present embodiment determines the response information based on the scenario data and the state control index, the response information can be determined according to the progress of the conversation with the user, and the device can respond accurately according to the state of the user without increasing the load on the server.

さらに、図２に示すように、本実施の形態による会話制御端末装置２の特徴は、
前記入力情報を外部に送信する送信部と、
送信された入力情報を分析して生成された入力特定情報を受信する受信部と、をさらに備え、
前記応答情報決定部は、前記シナリオデータ及び前記状態制御指標に前記入力特定情報を加えて前記応答情報を決定することである。 Furthermore, as shown in FIG. 2, the features of the conversation control terminal device 2 according to the present embodiment are as follows:
A transmission unit for transmitting the input information to the outside;
A receiving unit that receives the input specifying information generated by analyzing the transmitted input information, and
The response information determination unit determines the response information by adding the input specifying information to the scenario data and the state control index.

本実施の形態による会話制御端末装置２は、送信部と受信部とをさらに備える。 The conversation control terminal device 2 according to the present embodiment further includes a transmission unit and a reception unit.

送信部は、入力情報を外部に送信する。入力情報を会話制御端末装置２の外部に送信するものであればよい。外部は、たとえば、サーバや、他の会話制御端末装置２などにすることができる。 The transmission unit transmits input information to the outside. Any transmission unit may be used as long as it transmits the input information to the outside of the conversation control terminal device 2. The outside can be, for example, a server or another conversation control terminal device 2.

受信部は、入力特定情報を受信する。入力特定情報は、会話制御端末装置２の外部で生成される。すなわち、受信部は、会話制御端末装置２の外部で生成された入力特定情報を受信する装置や部材である。会話制御端末装置２の外部では、送信部から送信された入力情報を分析して入力特定情報を生成し、生成された入力特定情報は、会話制御端末装置２の受信部に送信される。 The receiving unit receives the input specifying information. The input specifying information is generated outside the conversation control terminal device 2. That is, the receiving unit is a device or member that receives input specifying information generated outside the conversation control terminal device 2. Outside the conversation control terminal device 2, the input information transmitted from the transmission unit is analyzed to generate input specification information, and the generated input specification information is transmitted to the reception unit of the conversation control terminal device 2.
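
The division of work described above, with analysis performed externally and the state control index kept on the terminal, could be sketched with two in-process classes. A direct method call stands in for the transmission and reception; the class names, the `is_question` key, and the canned replies are hypothetical.

```python
class TopicServer:
    """Stand-in for the external side that analyzes input information."""

    def analyze(self, input_information: str) -> dict:
        # Toy analysis: report whether the utterance looks like a question.
        return {"is_question": input_information.strip().endswith("?")}


class ConversationTerminal:
    """Stand-in for the conversation control terminal device."""

    def __init__(self, server: TopicServer):
        self.server = server
        self.state_index = {"turns": 0}  # kept on the terminal, not on the server

    def handle(self, input_information: str) -> str:
        # Send the input out and receive the input specifying information back.
        info = self.server.analyze(input_information)
        self.state_index["turns"] += 1
        if info["is_question"]:
            return "Let me think about that."
        return "I see."
```

In this sketch the server holds no per-user state at all, which is one way to read the statement that the conversation is controlled by the terminal rather than by an external device.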

応答情報決定部は、シナリオデータ及び状態制御指標に入力特定情報を加えて応答情報を決定する。ユーザが入力した入力情報を分析して得られた入力特定情報も含めて応答情報を決定する。したがって、シナリオデータによって話題提供システム１の契約者の意思を応答情報に反映させることができるとともに、入力特定情報によってユーザの意思を応答情報に反映させることができ、会話の主体の双方の意思を反省させて応答情報を生成することができる。話題提供システム１の契約者とユーザとのバランスを図って会話を円滑に進めることで自然な応答を実現することができる。 The response information determination unit determines the response information using the input specifying information in addition to the scenario data and the state control index. In other words, the response information is determined with the input specifying information obtained by analyzing the user's input information also taken into account. Therefore, the intention of the contractor of the topic providing system 1 can be reflected in the response information through the scenario data, and the intention of the user can be reflected in the response information through the input specifying information, so the response information can be generated in a way that reflects the intentions of both parties to the conversation. A natural response can be realized by balancing the contractor of the topic providing system 1 and the user so that the conversation proceeds smoothly.

本実施の形態による会話制御端末装置２は、入力特定情報を加えて応答情報を決定するので、サーバなどの外部で分析した結果である入力特定情報を含めて応答情報を決定でき、ユーザとの会話の流れに合わせて円滑に会話を進めることができる。 Since the conversation control terminal device 2 according to the present embodiment also uses the input specifying information to determine the response information, it can take into account the result of analysis performed externally, for example on a server, and can advance the conversation smoothly in line with the flow of the conversation with the user.

さらにまた、図２に示すように、本実施の形態による会話制御端末装置２の特徴は、
前記受信部は前記入力情報に基づいて抽出されたシナリオデータを受信し、
前記シナリオデータ記憶部は受信したシナリオデータを記憶することである。 Furthermore, as shown in FIG. 2, the features of the conversation control terminal device 2 according to the present embodiment are as follows:
The receiving unit receives scenario data extracted based on the input information,
The scenario data storage unit stores received scenario data.

本実施の形態による会話制御端末装置２の受信部は、入力情報に基づいて抽出されたシナリオデータを受信する。すなわち、シナリオデータは、会話制御端末装置２の外部で抽出される。会話制御端末装置２の外部に送信された入力情報に基づき、会話制御端末装置２の外部において、入力情報に基づいてシナリオデータは抽出される。抽出されたシナリオデータは、会話制御端末装置２の受信部に送信される。 The receiving unit of the conversation control terminal device 2 according to the present embodiment receives the scenario data extracted based on the input information. That is, the scenario data is extracted outside the conversation control terminal device 2. Based on the input information transmitted to the outside of the conversation control terminal device 2, the scenario data is extracted based on the input information outside the conversation control terminal device 2. The extracted scenario data is transmitted to the reception unit of the conversation control terminal device 2.

シナリオデータ記憶部は、会話制御端末装置２の受信部で受信したシナリオデータを記憶する。上述したように、会話制御端末装置２で用いるシナリオデータの構成や機能などについては、本実施の形態による話題提供システム１のシナリオデータと同じである。 The scenario data storage unit stores scenario data received by the receiving unit of the conversation control terminal device 2. As described above, the configuration and functions of the scenario data used in the conversation control terminal device 2 are the same as the scenario data of the topic providing system 1 according to this embodiment.

本実施の形態による会話制御端末装置２は、入力情報に基づいて抽出されたシナリオデータを受信するので、ユーザが入力した入力情報に基づいてシナリオデータを切り替えていくことができ、ユーザとの会話の流れに合わせて円滑に会話を進めることができる。 Since the conversation control terminal device 2 according to the present embodiment receives the scenario data extracted based on the input information, the scenario data can be switched based on the input information entered by the user, and the conversation can proceed smoothly in step with the flow of the conversation with the user.

さらに、図２に示すように、本実施の形態による会話制御端末装置２の特徴は、
前記シナリオデータは、異なる話題への遷移を規定する情報（たとえば、後述する話題切替情報など）を含み、
前記異なる話題への遷移を規定する情報に応じて話題切替入力情報（たとえば、後述する性格指標など）を生成する切替入力情報入力部を備え、
前記送信部は、前記話題切替入力情報を外部に送信し、
前記受信部は、前記話題切替入力情報に基づいたシナリオデータを受信することである。 Furthermore, as shown in FIG. 2, the features of the conversation control terminal device 2 according to the present embodiment are as follows:
The scenario data includes information defining transition to a different topic (for example, topic switching information described later),
A switching input information input unit that generates topic switching input information (for example, a personality index described later) according to information defining transition to the different topic,
The transmission unit transmits the topic switching input information to the outside,
The receiving unit is to receive scenario data based on the topic switching input information.

本実施の形態による会話制御端末装置２で用いるシナリオデータは、異なる話題への遷移を規定する情報を含む。 The scenario data used in the conversation control terminal device 2 according to the present embodiment includes information defining transition to a different topic.

本実施の形態による会話制御端末装置２は、切替入力情報入力部を備える。切替入力情報入力部は、異なる話題への遷移を規定する情報に応じて話題切替入力情報を生成する。異なる話題への遷移を規定する情報は、たとえば、後述するステートメントの要素である話題切替情報などがある。また、話題切替入力情報は、たとえば、後述する性格指標などがある。 The conversation control terminal device 2 according to the present embodiment includes a switching input information input unit. The switching input information input unit generates topic switching input information according to information defining transition to a different topic. Information defining the transition to a different topic includes, for example, topic switching information that is an element of a statement to be described later. The topic switching input information includes, for example, a personality index described later.

切替入力情報入力部は、シナリオデータ及び入力特定情報に基づいて、シナリオデータを切り替えるか否かを判断する。具体的には、状態制御指標に基づいてシナリオデータを切り替えるのが好ましい。切替入力情報入力部は、話題切替入力情報が送信部に送信され、入力情報分析部に供給される。入力情報分析部は、入力情報のほかに話題切替入力情報にも基づいて、話題提供サーバ４のシナリオデータ記憶部（複数のシナリオデータ）に記憶されているシナリオデータを抽出する。 The switching input information input unit determines whether to switch the scenario data based on the scenario data and the input specifying information. Specifically, it is preferable to switch the scenario data based on the state control index. The topic switching input information generated by the switching input information input unit is sent to the transmission unit and from there supplied to the input information analysis unit. The input information analysis unit extracts scenario data stored in the scenario data storage unit (a plurality of scenario data) of the topic providing server 4 based on the topic switching input information in addition to the input information.

さらに、本実施の形態による会話制御端末装置２の送信部は、話題切替入力情報を外部に送信する。また、受信部は、話題切替入力情報に基づいたシナリオデータを受信する。 Furthermore, the transmission part of the conversation control terminal device 2 according to the present embodiment transmits the topic switching input information to the outside. The receiving unit receives scenario data based on the topic switching input information.

現在の話題に関する応答情報を規定するためのシナリオデータから、異なる話題に関する応答情報を規定するためのシナリオデータへ切り替えることができる。 It is possible to switch from scenario data for defining response information on the current topic to scenario data for defining response information on a different topic.

たとえば、サーバなどに全てのシナリオデータを記憶させておき、話題切替入力情報に基づいてサーバの全てのシナリオデータから組み替え直したシナリオデータを生成し、組み替え直されたシナリオデータが会話制御端末装置２のシナリオデータ記憶部に記憶される。 For example, all scenario data is stored in a server or the like, recombined scenario data is generated from all the scenario data on the server based on the topic switching input information, and the recombined scenario data is stored in the scenario data storage unit of the conversation control terminal device 2.

したがって、ユーザとの会話で一の話題から他の話題に移った場合も、異なる話題への遷移を規定する情報にしたがって他の話題に遷移させることができ、あらゆる話題に対応することができ、様々なユーザに対応することができる。 Therefore, even when the conversation with the user moves from one topic to another, the transition to the other topic can be made according to the information defining transition to a different topic, so that any topic and a wide variety of users can be accommodated.

状態制御指標のうち、後述する性格指標に基づいて話題を切り替えるか否かを判断するのが好ましい。性格指標は、ユーザが、話題に対して積極的であるのか又は消極的であるのかを示す指標である。 It is preferable to determine whether or not to switch topics based on a personality index described later among the state control indices. The personality index is an index indicating whether the user is active or reluctant to the topic.

本実施の形態による会話制御端末装置２によれば、状態制御指標とシナリオデータとに基づいて話題を切り替えることができるので、ユーザとの会話の状態を見ながら話題を切り替ることができ、ユーザとの会話の流れに合わせて円滑に会話を進めることができる。 According to the conversation control terminal device 2 according to the present embodiment, since the topic can be switched based on the state control index and the scenario data, the topic can be switched while viewing the state of the conversation with the user. The conversation can proceed smoothly according to the conversation flow.
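The switching decision described above can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: the `Statement` class, the `switch_to` field (standing in for the topic switching information), the numeric range of the personality index, and the `threshold` value are all hypothetical and are not specified in this description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    """One element of scenario data (hypothetical layout)."""
    response: str
    switch_to: Optional[str] = None  # assumed form of the topic switching information


def next_topic(current_topic, statement, personality_index, threshold=0.3):
    """Return the topic the conversation should continue with.

    Assumption: personality_index lies in [0, 1], and a low value means the
    user is reluctant about the current topic. The threshold is arbitrary.
    """
    if statement.switch_to is not None and personality_index < threshold:
        # The scenario permits a transition and the user seems reluctant.
        return statement.switch_to
    return current_topic
```

In this sketch a switch happens only when both conditions hold: the scenario data names a destination topic, and the personality index indicates that the user is reluctant about the current one.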

図２に示した入力情報分析部は、入力情報を分析して入力特定情報を生成する。入力特定情報は、入力情報に含まれる各種の情報を分析した結果、生成される情報である。たとえば、特定のキーワード（後述する関連詞など）が入力情報に含まれる数や頻度などの統計的な分析などがある。 The input information analysis unit shown in FIG. 2 analyzes the input information and generates input specifying information. The input specifying information is information generated as a result of analyzing various types of information included in the input information. For example, there is a statistical analysis of the number and frequency of specific keywords (such as related terms described later) included in the input information.
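As a minimal sketch of the statistical analysis mentioned above, the following Python counts how often each related word occurs in an input text. The function name `analyze_input`, the returned dictionary layout, and the use of simple substring matching are illustrative assumptions, not details taken from the embodiment.

```python
from collections import Counter


def analyze_input(input_text, related_words):
    """Build hypothetical input specifying information from an input text.

    Returns per-word occurrence counts and each word's share of all matches.
    """
    counts = Counter()
    for word in related_words:
        # str.count gives the number of non-overlapping occurrences.
        counts[word] = input_text.count(word)
    total = sum(counts.values())
    frequencies = {w: (c / total if total else 0.0) for w, c in counts.items()}
    return {"counts": dict(counts), "frequencies": frequencies}
```

A real analysis unit would likely need tokenization suited to the input language; substring counting is used here only to keep the sketch self-contained.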

また、図２に示したシナリオデータ記憶部（複数のシナリオデータ）は、複数のシナリオデータを記憶する。ここで、複数のシナリオデータは、ユーザと会話をするために必要な話題名に対応する全てのシナリオデータである。全てのシナリオデータのうち、入力特定情報に基づいて必要であると判断されたシナリオデータが抽出される。 Further, the scenario data storage unit (a plurality of scenario data) shown in FIG. 2 stores a plurality of scenario data. Here, the plurality of scenario data are all scenario data corresponding to the topic names necessary for talking with the user. Of all scenario data, scenario data determined to be necessary based on the input identification information is extracted.

＜＜＜保守装置３の概要＞＞＞
図３に示すように、本実施の形態による保守装置３の特徴は、
ユーザが入力する入力情報を分析することによって生成された入力特定情報に基づいて話題に関わる応答情報を規定するためのシナリオデータを記憶するシナリオデータ記憶部と、
前記入力特定情報を受信する受信部と、
前記シナリオデータを編集可能にするためのシナリオデータ編集部と、
前記受信部で受信した入力特定情報に基づいて編集したシナリオデータの応答を検証可能にするシナリオデータ検証部と、
編集したシナリオデータを外部に送信するシナリオデータ送信部と、を備えることである。 <<< Overview of Maintenance Device 3 >>>
As shown in FIG. 3, the features of the maintenance device 3 according to the present embodiment are as follows:
A scenario data storage unit that stores scenario data for defining response information related to a topic based on input specifying information generated by analyzing input information input by a user;
A receiving unit for receiving the input identification information;
A scenario data editing section for making the scenario data editable;
A scenario data verification unit that enables verification of a response of scenario data edited based on the input specific information received by the reception unit;
And a scenario data transmission unit that transmits the edited scenario data to the outside.

本実施の形態による保守装置３は、図３に示すように、主に、シナリオデータ記憶部と受信部とシナリオデータ編集部とシナリオデータ検証部とシナリオデータ送信部とを備える。図３においては、これらの構成を実線の四角で示した。なお、点線の四角で示したシナリオデータ記憶部（複数のシナリオデータ）と入力情報分析部とは、話題提供サーバ４（図２参照）に含まれる構成である。図３に示すように、シナリオデータ検証部に状態制御指標記憶部を加えたものが、後述する端末装置仮想構築部を構成する。これらの構成の間で授受される情報は、上述した話題提供システム１や会話制御端末装置２と同様である。 As shown in FIG. 3, the maintenance device 3 according to the present embodiment mainly includes a scenario data storage unit, a reception unit, a scenario data editing unit, a scenario data verification unit, and a scenario data transmission unit. In FIG. 3, these configurations are indicated by solid-line squares. Note that the scenario data storage unit (a plurality of scenario data) and the input information analysis unit indicated by dotted-line boxes are included in the topic providing server 4 (see FIG. 2). As shown in FIG. 3, a scenario data verification unit plus a state control index storage unit constitutes a terminal device virtual construction unit to be described later. Information exchanged between these components is the same as that of the topic providing system 1 and the conversation control terminal device 2 described above.

上述した話題提供システム１や会話制御端末装置２は、主として、一般のユーザが会話制御端末装置２と会話をするためのものである。これに対して、保守装置３は、主として、話題提供システム１の契約者が使用するものであり、一般のユーザに話題を提供するためのシナリオデータの保守を話題提供システム１の契約者が行うための装置である。保守装置３は、このような相違があるが、図３において、話題提供システム１や会話制御端末装置２と同様の機能を有し同様のデータを用いる構成には、同じ名称を付した。 The topic providing system 1 and the conversation control terminal device 2 described above are mainly for a general user to have a conversation with the conversation control terminal device 2. On the other hand, the maintenance device 3 is mainly used by a contractor of the topic providing system 1, and the contractor of the topic providing system 1 performs maintenance of scenario data for providing a topic to a general user. It is a device for. Although the maintenance device 3 has such a difference, in FIG. 3, the same name is given to the configuration having the same function as the topic providing system 1 and the conversation control terminal device 2 and using the same data.

この保守装置３におけるシナリオデータ記憶部及び受信部は、上述した話題提供システム１や会話制御端末装置２におけるシナリオデータ記憶部及び受信部と機能的に実質的に同じものである。話題提供システム１や会話制御端末装置２と同じものにすることができる。たとえば、ユーザが使用すると想定される会話制御端末装置２に保守装置３のシナリオデータ記憶部及び受信部を実装してもよい。さらに、後述するように、保守装置３において会話制御端末装置２を仮想的に構築し、仮想的な会話制御端末装置２のシナリオデータ記憶部及び受信部としてもよい。 The scenario data storage unit and the reception unit in the maintenance device 3 are functionally substantially the same as the scenario data storage unit and the reception unit in the topic providing system 1 and the conversation control terminal device 2 described above. It can be the same as the topic providing system 1 or the conversation control terminal device 2. For example, the scenario data storage unit and the reception unit of the maintenance device 3 may be mounted on the conversation control terminal device 2 assumed to be used by the user. Furthermore, as will be described later, the conversation control terminal device 2 may be virtually constructed in the maintenance device 3 and used as a scenario data storage unit and a reception unit of the virtual conversation control terminal device 2.

シナリオデータ記憶部は、シナリオデータを記憶するための部材や部位である。シナリオデータは、話題提供サーバ４（図２参照）のシナリオデータ記憶部（複数のシナリオデータ）に予め記憶されているデータである。入力情報分析部によって生成された入力特定情報に基づいて必要であると判断されたシナリオデータが抽出されて、抽出されたシナリオデータがシナリオデータ記憶部に記憶される。シナリオデータ記憶部（複数のシナリオデータ）から抽出されたシナリオデータは、後述する受信部と応答情報決定部とを経てシナリオデータ記憶部に記憶される。 The scenario data storage unit is a member or part for storing scenario data. The scenario data is data stored in advance in a scenario data storage unit (a plurality of scenario data) of the topic providing server 4 (see FIG. 2). Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit is extracted, and the extracted scenario data is stored in the scenario data storage unit. The scenario data extracted from the scenario data storage unit (a plurality of scenario data) is stored in the scenario data storage unit via a receiving unit and a response information determination unit described later.

シナリオデータは、入力特定情報に基づいて話題に関わる応答情報を規定するためのデータである。入力特定情報は、入力情報を分析することによって生成された情報である。入力情報は、たとえば、会話制御端末装置２においてユーザが入力する情報である。 Scenario data is data for defining response information related to a topic based on input specifying information. The input specifying information is information generated by analyzing the input information. The input information is, for example, information input by the user in the conversation control terminal device 2.

なお、上述したように、保守装置３は、主として、話題提供システム１の契約者が使用するものである。この保守装置３においては、入力情報は、仮想的にユーザが入力した情報とすることができる。保守装置３は、ユーザに対してシナリオデータを利用可能にする前にシナリオデータを検証するためのものである。したがって、ここでのユーザは、仮想的なユーザでよく、また、実際のユーザが入力するであろうと想定される情報を入力情報とすればよい。したがって、想定される様々な入力情報を用いて入力特定情報を生成し、後述するシナリオデータ検証部によってシナリオデータの応答を検証することができる。 As described above, the maintenance device 3 is mainly used by the contractor of the topic providing system 1. In the maintenance device 3, the input information can be information virtually input by the user. The maintenance device 3 is for verifying scenario data before making the scenario data available to the user. Therefore, the user here may be a virtual user, and information assumed to be input by an actual user may be input information. Therefore, it is possible to generate input specifying information using various assumed input information, and to verify the response of the scenario data by the scenario data verification unit described later.

シナリオデータには、ユーザに提供するための話題の情報が含まれている。保守装置３で用いるシナリオデータの構成や機能などについては、上述した本実施の形態による話題提供システム１のシナリオデータや、会話制御端末装置２のシナリオデータと同じである。 The scenario data includes topic information to be provided to the user. The configuration and functions of scenario data used in the maintenance device 3 are the same as the scenario data of the topic providing system 1 according to the present embodiment and the scenario data of the conversation control terminal device 2 described above.

なお、シナリオデータの構成や機能などについては、話題提供システム１や会話制御端末装置２におけるシナリオデータと同じではあるが、上述したように、保守装置３は、ユーザに対してシナリオデータを利用可能にする前にシナリオデータを検証するためのものである。したがって、保守装置３が対象とするシナリオデータは、検証するためのシナリオデータであり、ユーザに対してシナリオデータを利用可能にする前のデータである。 The configuration and functions of the scenario data are the same as the scenario data in the topic providing system 1 and the conversation control terminal device 2, but as described above, the maintenance device 3 can use the scenario data for the user. It is for verifying scenario data before making it. Therefore, the scenario data targeted by the maintenance device 3 is scenario data for verification, and is data before the scenario data is made available to the user.

受信部は、入力特定情報を受信する。入力特定情報は、保守装置３の外部で生成される。すなわち、受信部は、保守装置３の外部で生成された入力特定情報を受信する装置や部材である。保守装置３の外部で、入力情報を分析して入力特定情報を生成し、保守装置３の外部で生成された入力特定情報が、保守装置３の受信部に送信される。 The receiving unit receives the input specifying information. The input specifying information is generated outside the maintenance device 3. In other words, the receiving unit is a device or member that receives input specifying information generated outside the maintenance device 3. The input information is analyzed outside the maintenance device 3 to generate the input specification information, and the input specification information generated outside the maintenance device 3 is transmitted to the receiving unit of the maintenance device 3.

上述したように、入力情報は、仮想的にユーザが入力した情報とすることができる。したがって、ここでの入力特定情報は、実際のユーザが入力するであろうと想定される情報を入力情報として、保守装置３の外部で生成されたものにすることができる。このように、実際のユーザが入力するであろうと想定される情報を入力情報にすることで、様々な入力情報に基づいてシナリオデータを検証することができる。 As described above, the input information may be information virtually input by the user. Therefore, the input specifying information here can be information generated outside the maintenance apparatus 3 by using information that is assumed to be input by an actual user as input information. Thus, scenario data can be verified based on various types of input information by using information that is assumed to be input by an actual user as input information.

シナリオデータ編集部は、シナリオデータを編集可能にするための装置や部材である。シナリオデータは、話題提供システム１の契約者の担当者がキーボードなどを操作することによって、編集することができる。編集は、シナリオデータの追加、削除、変更などである。具体的には、編集は、シナリオデータを構成するステートメントを追加したり、削除したり、変更したりする工程である。シナリオデータの編集により、複数のユーザの各々に対してシナリオデータのカスタムを施すことができる。 The scenario data editing unit is a device or member for enabling editing of scenario data. The scenario data can be edited by the person in charge of the contractor of the topic providing system 1 operating the keyboard or the like. Editing includes adding, deleting, and changing scenario data. Specifically, editing is a process of adding, deleting, or changing a statement that constitutes scenario data. By editing the scenario data, the scenario data can be customized for each of a plurality of users.
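The three editing operations named above (adding, deleting, and changing statements) might be sketched as follows. The dictionary layout with a `statements` list and the `trigger` and `response` fields are hypothetical; the description does not define a concrete data format at this point.

```python
import copy


def add_statement(scenario, statement):
    """Return a copy of the scenario with one statement appended."""
    edited = copy.deepcopy(scenario)
    edited["statements"].append(statement)
    return edited


def delete_statement(scenario, index):
    """Return a copy of the scenario without the statement at index."""
    edited = copy.deepcopy(scenario)
    del edited["statements"][index]
    return edited


def change_statement(scenario, index, **fields):
    """Return a copy of the scenario with fields of one statement replaced."""
    edited = copy.deepcopy(scenario)
    edited["statements"][index].update(fields)
    return edited
```

Each function returns an edited copy and leaves the original untouched, so that the edited scenario data can be verified before it replaces the data in use. That is a design choice of this sketch, not a requirement stated in the text.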

シナリオデータ検証部は、シナリオデータの応答を検証可能にするための装置や部材である。ここでのシナリオデータは、編集したシナリオデータである。シナリオデータ編集部で編集したシナリオデータの応答が適切であるか否かを検証するための装置や部材である。 The scenario data verification unit is a device or member for enabling verification of a scenario data response. The scenario data here is edited scenario data. It is a device or member for verifying whether the response of scenario data edited by the scenario data editing unit is appropriate.

このように、シナリオデータ検証部は、想定される様々な入力情報を用いて入力特定情報を生成し、シナリオデータ編集部によって編集されたシナリオデータの応答を検証することができる。このため、あらゆるユーザに対してシナリオデータの応答が適切であるか否かを検証できるので、シナリオデータ編集部により複数のユーザの各々に対して施したシナリオデータのカスタムを検証することができる。 As described above, the scenario data verification unit can generate input specifying information from a variety of assumed input information and verify the response of the scenario data edited by the scenario data editing unit. Because it can thus be verified whether the response of the scenario data is appropriate for every kind of user, the customization of the scenario data applied by the scenario data editing unit for each of the plurality of users can be verified.

シナリオデータ送信部は、編集したシナリオデータを外部に送信する。外部は、たとえば、サーバや、他の会話制御端末装置２などにすることができる。このように、編集したシナリオデータを外部に送信することによって、検証済みのシナリオデータをユーザに対して利用可能にすることができる。 The scenario data transmission unit transmits the edited scenario data to the outside. The outside can be, for example, a server or another conversation control terminal device 2. Thus, the verified scenario data can be made available to the user by transmitting the edited scenario data to the outside.

なお、上述したように、仮想的にユーザが入力した情報を入力情報として、シナリオデータ編集部でシナリオデータを編集したり、シナリオデータ検証部がシナリオデータを検証する例を示したが、後述するように、話題解析部によって話題リストを生成し、話題リストに基づくシナリオデータを編集したり検証したりすることができる。話題解析部については後述する。 The above describes an example in which the scenario data editing unit edits the scenario data, and the scenario data verification unit verifies it, using information virtually entered by a user as the input information. As will be described later, a topic list can also be generated by the topic analysis unit, and scenario data based on the topic list can be edited and verified. The topic analysis unit will be described later.

シナリオデータ検証部によって、シナリオデータ編集部で編集したシナリオデータの応答が適切であるか否かを検証する。このようにしたことにより、サーバなどの外部にシナリオデータを送信する前に、シナリオデータの内容や整合性を確認することができる。サーバなどの外部に送信された検証済みのシナリオデータは、最終的には、少なくともその一部が会話制御端末装置２に送信され、ユーザとの会話に用いられる。 The scenario data verification unit verifies whether the response of the scenario data edited by the scenario data editing unit is appropriate. By doing so, it is possible to confirm the contents and consistency of the scenario data before transmitting the scenario data to the outside such as a server. At least a part of the verified scenario data transmitted to the outside such as the server is finally transmitted to the conversation control terminal device 2 and used for conversation with the user.
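The verification step described above can be illustrated with a small Python sketch that feeds assumed user inputs to edited scenario data and reports every case where the response differs from what the editor expected. The scenario layout (`statements` with `trigger` and `response`), the substring matching rule, and the function names are assumptions made for illustration only.

```python
def respond(scenario, input_text):
    """Return the response of the first statement whose trigger occurs in the input."""
    for statement in scenario["statements"]:
        if statement["trigger"] in input_text:
            return statement["response"]
    return None  # no statement matched


def verify_scenario(scenario, test_cases):
    """Check assumed inputs against expected responses.

    test_cases is a list of (input_text, expected_response) pairs.
    Returns a list of (input_text, expected, actual) tuples for failed cases.
    """
    failures = []
    for input_text, expected in test_cases:
        actual = respond(scenario, input_text)
        if actual != expected:
            failures.append((input_text, expected, actual))
    return failures
```

An empty result would indicate that the edited scenario data answered every assumed input as intended, at which point it could be transmitted to the outside.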

会話制御システムにおけるデータは、ユーザが入力した入力情報を分析して入力特定情報を生成するためのデータと、この入力特定情報に基づいて応答情報を決定するためのシナリオデータとの双方がある。このシナリオデータは、ユーザへの回答である応答情報を多様化することができるデータである。上述した構成によれば、シナリオデータ編集部によって、シナリオデータを編集でき、シナリオデータ検証部によって、編集したシナリオデータの応答を検証できる。このようにすることで、会話制御システムの全般に亘る高度に専門的な知識や技術がなくても、シナリオデータについて、会話制御システムを利用する複数のユーザの各々に対してカスタムを施すことができる。 Data in the conversation control system includes both data for analyzing the input information entered by a user to generate input specifying information and scenario data for determining response information based on that input specifying information. The scenario data is data that can diversify the response information given as an answer to the user. According to the configuration described above, the scenario data editing unit can edit the scenario data, and the scenario data verification unit can verify the response of the edited scenario data. In this way, the scenario data can be customized for each of a plurality of users of the conversation control system without requiring highly specialized knowledge or skills covering the conversation control system as a whole.

さらに、図３に示すように、本実施の形態による保守装置３の特徴は、
ユーザが入力情報を入力するための入力部と、
前記入力情報及び前記応答情報に関する状態制御指標を記憶する状態制御指標記憶部と、
前記シナリオデータと前記状態制御指標とに基づいて前記応答情報を決定する応答情報決定部と、
前記応答情報決定部によって決定された応答情報を出力する出力部と、を備える会話制御端末装置を仮想的に構築する端末装置仮想構築部を有することである。 Furthermore, as shown in FIG. 3, the features of the maintenance device 3 according to the present embodiment are as follows:
An input unit for a user to input input information;
A state control index storage unit that stores a state control index related to the input information and the response information;
A response information determination unit that determines the response information based on the scenario data and the state control index;
An output unit that outputs the response information determined by the response information determination unit; the maintenance device 3 has a terminal device virtual construction unit that virtually constructs a conversation control terminal device including these units.

本実施の形態による保守装置３は、図３に示すように、端末装置仮想構築部を有する。この端末装置仮想構築部によって、会話制御端末装置２’が仮想的に構築される。 The maintenance device 3 according to the present embodiment has a terminal device virtual construction unit as shown in FIG. The conversation control terminal device 2 ′ is virtually constructed by the terminal device virtual construction unit.

さらに、本実施の形態による保守装置３は、図３に示すように、シナリオデータ検証部を有する。このシナリオデータ検証部に状態制御指標記憶部を加えたものが、端末装置仮想構築部を構成する。 Furthermore, the maintenance device 3 according to the present embodiment has a scenario data verification unit as shown in FIG. The scenario data verification unit plus the state control index storage unit constitutes a terminal device virtual construction unit.

このように、保守装置３には、会話制御端末装置２’が機能として備えられている。シミュレーション用のパッケージとして会話制御端末装置２’の機能を保守装置３に備えることができる。あるいは、仮想的に構築された会話制御端末装置２’から保守機能を省いたものを、実際の会話制御端末装置２とすることもできる。仮想的に構築される会話制御端末装置２’は、ハードウェアとして実現してもソフトウェアで実現してもよい。ハードウェアとして実現する場合には、実際の会話制御端末装置２とは異なる別個の装置を仮想的に構築される会話制御端末装置２’とすればよい。また、ソフトウェアで実現する場合には、保守装置３においてエミュレーションなどによって会話制御端末装置２’を実現させればよい。 Thus, the maintenance device 3 includes the conversation control terminal device 2 'as a function. The maintenance device 3 can have the function of the conversation control terminal device 2 ′ as a simulation package. Alternatively, the actual conversation control terminal device 2 can be obtained by omitting the maintenance function from the virtually constructed conversation control terminal device 2 ′. The virtually constructed conversation control terminal device 2 'may be realized as hardware or software. When realized as hardware, a separate device different from the actual conversation control terminal device 2 may be used as the conversation control terminal device 2 ′ virtually constructed. Further, in the case of realizing by software, the maintenance device 3 may realize the conversation control terminal device 2 ′ by emulation or the like.

保守装置３は、主として、話題提供システム１の契約者が使用するものであり、一般のユーザに話題を提供するためのシナリオデータの保守をするための装置であればよい。 The maintenance device 3 is mainly used by a contractor of the topic providing system 1 and may be any device for maintaining scenario data for providing a topic to a general user.

端末装置仮想構築部及びシナリオデータ検証部は、図３に示すように、主に、入力部と応答情報決定部と出力部と、を備える。さらに、端末装置仮想構築部は、状態制御指標記憶部を備える。 As illustrated in FIG. 3, the terminal device virtual construction unit and the scenario data verification unit mainly include an input unit, a response information determination unit, and an output unit. Furthermore, the terminal device virtual construction unit includes a state control index storage unit.

入力部は、ユーザが入力情報を入力するための部材や部位である。入力部は、ユーザが所望する情報を入力情報として入力できるものであればよい。たとえば、入力部は、キーボードやタッチパネルやマイクロフォンやカメラなどがある。ユーザは、入力部からテキストデータや音声データや画像データなどを入力できる。 The input unit is a member or part with which the user enters input information. Any input unit may be used as long as it allows the user to enter desired information as input information. Examples of the input unit include a keyboard, a touch panel, a microphone, and a camera. The user can enter text data, voice data, image data, and the like through the input unit.

上述したように、ここで、入力情報は、仮想的にユーザが入力した情報でよい。ユーザも、仮想的なユーザでよく、また、実際のユーザが入力するであろうと想定される情報を入力情報とすればよい。 As described above, here, the input information may be information virtually input by the user. The user may also be a virtual user, and information that is assumed to be input by an actual user may be input information.

状態制御指標記憶部は、状態制御指標を記憶する。状態制御指標は、入力情報と応答情報とに関する指標である。状態制御指標は、主に履歴に関する指標である。たとえば、ユーザが過去に入力した入力情報に関する指標や、ユーザに過去に提供した応答情報に関する指標などがある。 The state control index storage unit stores a state control index. The state control index is an index related to input information and response information. The state control index is mainly an index related to history. For example, there are indicators relating to input information input by the user in the past and indicators relating to response information provided in the past to the user.

応答情報決定部は、シナリオデータ及び入力特定情報のほかに状態制御指標を加えて応答情報を決定する。 The response information determination unit determines the response information by adding a state control index in addition to the scenario data and the input specifying information.

出力部は、応答情報決定部によって決定された応答情報を出力する。 The output unit outputs the response information determined by the response information determination unit.

このように、仮想的に構築される会話制御端末装置２’で、編集したシナリオデータを使用したり、検証済みのシナリオデータを使用したりすることができ、ユーザに対して利用可能にするよりも前に、シナリオデータの内容や制御を確認することができる。ユーザに適切な話題を提供することができる。 In this way, the edited scenario data or the verified scenario data can be used on the virtually constructed conversation control terminal device 2 ′, so that the contents and control of the scenario data can be confirmed before the scenario data is made available to the user. As a result, appropriate topics can be provided to the user.
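A software realization of the virtually constructed terminal might look like the following Python sketch, in which the response is chosen from the scenario data while a history-type state control index is consulted to avoid repeating a response already given. The class name `VirtualTerminal`, the scenario layout, and the "prefer a response not yet given" rule are hypothetical; the description only states that the state control index relates to past input information and past response information.

```python
class VirtualTerminal:
    """Hypothetical stand-in for the virtually constructed terminal device."""

    def __init__(self, scenario):
        self.scenario = scenario
        # History of inputs and responses, used as the state control index.
        self.state_index = {"inputs": [], "responses": []}

    def decide_response(self, input_text):
        """Pick a response using scenario data and the recorded history."""
        candidates = [
            s["response"]
            for s in self.scenario["statements"]
            if s["trigger"] in input_text
        ]
        # Assumed rule: prefer a candidate that has not been given before.
        fresh = [r for r in candidates if r not in self.state_index["responses"]]
        chosen = (fresh or candidates or [None])[0]
        self.state_index["inputs"].append(input_text)
        if chosen is not None:
            self.state_index["responses"].append(chosen)
        return chosen
```

Running edited scenario data through such an object on the maintenance device would let the contractor observe how the history affects responses before the data is released to actual users.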

端末装置仮想構築部は、会話制御端末装置２’を保守装置３において仮想的に構築して実行することができる。したがって、一般のユーザが使用する会話制御端末装置２と同様の環境を保守装置３において実現することができる。これにより、ユーザが実際に会話を進める環境と同様の環境で、シナリオデータの内容や動作を予め確認することができ、ユーザと会話をする前にシナリオデータの内容を検証することができ、サーバに接続してくる複数のユーザの各々に対して施したカスタムが適切であるか否かを事前に検証することができる。 The terminal device virtual construction unit can virtually construct and execute the conversation control terminal device 2 ′ in the maintenance device 3. Therefore, the maintenance device 3 can realize the same environment as the conversation control terminal device 2 used by general users. As a result, the contents and operation of the scenario data can be checked in advance in an environment similar to the one in which the user actually carries on a conversation, the contents of the scenario data can be verified before any conversation with the user, and it can be verified in advance whether the customization applied for each of the plurality of users connecting to the server is appropriate.

さらに、図３に示すように、本実施の形態による保守装置３の特徴は、
話題を関係付ける関連詞を介して話題の近さや繋がり方を付与した話題リストを生成するための話題解析部を、さらに備え、
前記シナリオデータ編集部は、前記話題リストと前記関連詞を利用してユーザに話題を紹介するための話題紹介シナリオおよびユーザの入力に応答するための入力関連シナリオを前記シナリオデータとして編集可能にすることである。 Furthermore, as shown in FIG. 3, the features of the maintenance device 3 according to the present embodiment are as follows:
A topic analysis unit is further provided for generating a topic list in which the closeness of topics and the way they are connected are assigned via related words that relate the topics,
The scenario data editing unit makes editable, as the scenario data, a topic introduction scenario for introducing a topic to a user and an input related scenario for responding to a user input, by using the topic list and the related words.

本実施の形態による保守装置３は、話題解析部（図示せず）を備える。話題解析部は、話題リストを生成するための装置や部材である。話題リストは、話題を関係付ける関連詞を介して話題の近さや繋がり方を付与したデータである。話題解析部によって、話題に関連付けられる関連詞が話題リストに蓄積されていく。保守装置３においては、話題リストは、話題提供システム１の契約者に提供されるデータであり、シナリオデータを生成する際に用いられる。 The maintenance device 3 according to the present embodiment includes a topic analysis unit (not shown). The topic analysis unit is a device or member for generating a topic list. The topic list is data to which the closeness and connection method of the topics are given through the related words that relate the topics. The topic analysis unit accumulates related terms associated with the topic in the topic list. In the maintenance device 3, the topic list is data provided to the contractor of the topic providing system 1 and is used when generating scenario data.
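One possible way to represent such a topic list is sketched below in Python. Each topic maps to the set of related words observed with it; the way two topics are connected is taken to be the related words they share, and their closeness is taken to be the Jaccard similarity of their related-word sets. Both measures, and all function names, are assumptions chosen for illustration; the description does not specify how closeness is computed.

```python
def build_topic_list(observations):
    """Accumulate (topic, related_word) pairs into a topic -> set-of-words mapping."""
    topic_list = {}
    for topic, word in observations:
        topic_list.setdefault(topic, set()).add(word)
    return topic_list


def connection(topic_list, a, b):
    """Related words through which topics a and b are connected."""
    return topic_list.get(a, set()) & topic_list.get(b, set())


def closeness(topic_list, a, b):
    """Jaccard similarity of the related-word sets (assumed closeness measure)."""
    words_a = topic_list.get(a, set())
    words_b = topic_list.get(b, set())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0
```

Because `build_topic_list` only accumulates pairs, newly observed related words can be added simply by rebuilding or extending the mapping with further observations.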

たとえば、保守装置３の出力部には、後述する図１７に示すように、話題リストが出力される。話題提供システム１の契約者は、保守装置３の出力部に出力された話題リストを参照して、ユーザに提供するための話題紹介シナリオのデータ及び入力関連シナリオのデータを構築することができる。このようにすることで、話題提供システム１の契約者は、容易かつ簡便に話題紹介シナリオのデータ及び入力関連シナリオのデータを構築することができる。 For example, a topic list is output to the output unit of the maintenance device 3 as shown in FIG. 17, described later. The contractor of the topic providing system 1 can construct topic introduction scenario data and input related scenario data to be provided to the user by referring to the topic list output to the output unit of the maintenance device 3. By doing so, the contractor of the topic providing system 1 can construct the topic introduction scenario data and the input related scenario data simply and easily.

シナリオデータ編集部は、話題紹介シナリオと入力関連シナリオとをシナリオデータとして編集可能にする。話題紹介シナリオ及び入力関連シナリオは、話題リスト及び関連詞によって編集可能にされる。話題リスト及び関連詞を用いて編集できるので、容易かつ簡便に入力関連シナリオを構築できる。 The scenario data editing unit makes it possible to edit the topic introduction scenario and the input related scenario as scenario data. The topic introduction scenario and the input related scenario are made editable by means of the topic list and the related words. Since editing can be performed using the topic list and the related words, an input related scenario can be constructed simply and easily.

話題紹介シナリオは、ユーザに話題を紹介するためのシナリオである。ユーザは、話題紹介シナリオによって話題が提供される。入力関連シナリオは、ユーザの入力に応答するためのシナリオである。ユーザが所定の情報、たとえば挨拶などの情報を入力すると入力関連シナリオによって、対応する挨拶などの情報がユーザに回答される。 The topic introduction scenario is a scenario for introducing a topic to a user. The user is provided with a topic by a topic introduction scenario. The input-related scenario is a scenario for responding to a user input. When the user inputs predetermined information, for example, information such as a greeting, the corresponding information such as a greeting is answered to the user by an input-related scenario.

このシナリオデータ編集部によって、話題提供システム１の契約者は、話題紹介シナリオと入力関連シナリオとを編集して所望するものにできる。 With this scenario data editing unit, the contractor of the topic providing system 1 can edit the topic introduction scenario and the input related scenario into the desired form.

話題リストは、各種のログ、たとえば、ツイッターやブログなどの取得可能なデータに基づいて、更新することができる。すなわち、話題リストには、最新の情報を関連詞として蓄積していくことができる。このため、話題リストに基づいて話題紹介シナリオ及び入力関連シナリオを編集したり検証したりする際に、話題提供システム１の契約者は、話題リストによって最新の情報を知得して、話題紹介シナリオ及び入力関連シナリオを編集することができ、ユーザに最新の情報を含めた話題を提供することができるとともに、最新の情報によって新たなユーザ層を開拓することもできる。 The topic list can be updated based on various logs, for example, acquirable data such as Twitter and blog. That is, the latest information can be accumulated as related terms in the topic list. Therefore, when editing or verifying the topic introduction scenario and the input-related scenario based on the topic list, the subscriber of the topic providing system 1 knows the latest information from the topic list, and the topic introduction scenario. In addition, the input-related scenario can be edited, a topic including the latest information can be provided to the user, and a new user group can be cultivated by the latest information.

会話制御システムにおけるデータは、ユーザが入力した入力情報を分析して入力特定情報を生成するためのデータと、この入力特定情報に基づいて応答情報を決定するためのシナリオデータとの双方がある。このシナリオデータは、ユーザへの回答である応答情報を多様化することができるデータである。上述した構成によれば、シナリオデータ編集部によって、話題解析部と連携しながらシナリオデータを編集でき、シナリオデータ検証部によって、編集したシナリオデータの応答を検証できる。このようにすることで、会話制御システムの全般に亘る高度に専門的な知識や技術がなくても、シナリオデータについて、会話制御システムを利用する複数のユーザの各々に対してカスタムを施すことができる。 Data in the conversation control system includes both data for analyzing the input information entered by a user to generate input specifying information and scenario data for determining response information based on that input specifying information. The scenario data is data that can diversify the response information given as an answer to the user. According to the configuration described above, the scenario data editing unit can edit the scenario data in cooperation with the topic analysis unit, and the scenario data verification unit can verify the response of the edited scenario data. In this way, the scenario data can be customized for each of a plurality of users of the conversation control system without requiring highly specialized knowledge or skills covering the conversation control system as a whole.

また、図３に示した送信部は、入力情報を外部に送信するための装置や部材である。入力情報を外部に送信するものであればよい。外部は、たとえば、話題提供サーバ４（図２参照）や、会話制御端末装置２などにすることができる。 Further, the transmission unit shown in FIG. 3 is a device or member for transmitting input information to the outside. Any device that transmits input information to the outside may be used. The outside can be, for example, the topic providing server 4 (see FIG. 2), the conversation control terminal device 2, or the like.

さらに、図３に示した入力情報分析部は、入力情報を分析して入力特定情報を生成する。入力特定情報は、入力情報に含まれる各種の情報を分析した結果、生成される情報である。たとえば、特定のキーワード（後述する関連詞など）が入力情報に含まれる数や頻度などの統計的な分析などがある。 Further, the input information analysis unit shown in FIG. 3 analyzes the input information and generates input specifying information. The input specifying information is information generated as a result of analyzing various types of information included in the input information. For example, there is a statistical analysis of the number and frequency of specific keywords (such as related terms described later) included in the input information.

さらにまた、図３に示したシナリオデータ記憶部（複数のシナリオデータ）は、複数のシナリオデータを記憶する。ここで、複数のシナリオデータは、ユーザと会話をするために必要な話題名に対応する全てのシナリオデータである。全てのシナリオデータのうち、入力特定情報に基づいて必要であると判断されたシナリオデータが抽出される。 Furthermore, the scenario data storage unit (a plurality of scenario data) shown in FIG. 3 stores a plurality of scenario data. Here, the plurality of scenario data are all scenario data corresponding to the topic names necessary for talking with the user. Of all scenario data, scenario data determined to be necessary based on the input identification information is extracted.

＜＜＜話題提供システム１のシステム構成＞＞＞
図４は、話題提供システム１のシステム構成の概略を示すブロック図である。 <<< System configuration of topic providing system 1 >>>
FIG. 4 is a block diagram showing an outline of the system configuration of the topic providing system 1.

話題提供システム１は、Topiclet２０とiWA３０とiWA Manager４０とを有する。 The topic providing system 1 includes a Topiclet 20, iWA 30, and iWA Manager 40.

＜＜Topiclet２０＞＞
本実施の形態において、Topiclet２０は、たとえば、ユーザが使用する端末装置などのハードウェアに相当する。また、Topiclet２０は、図２に示した会話制御端末装置２に対応する。Topiclet２０によって話題がユーザに提供される。なお、本実施の形態において、「Topiclet」は、話題をユーザに提供するために端末装置で実行されるソフトウェアや、端末装置やこれらのソフトウェアによって実現できる話題提供環境と同義に用いる場合がある。 << Topiclet 20 >>
In the present embodiment, Topiclet 20 corresponds to hardware such as a terminal device used by the user, for example. The Topiclet 20 corresponds to the conversation control terminal device 2 shown in FIG. The topic is provided to the user by the Topiclet 20. In this embodiment, “Topiclet” may be used synonymously with software executed by a terminal device to provide a topic to a user, or a topic providing environment that can be realized by the terminal device or these software.

具体的には、Topiclet２０は、ＣＰＵ（中央処理装置）、ＲＯＭ（リードオンリーメモリ）、ＲＡＭ（ランダムアクセスメモリ）、ディスプレイ、キーボード（いずれも図示せず）などを有する。Topiclet２０は、パーソナルコンピュータや携帯端末装置などにすることができる。 Specifically, the Topiclet 20 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), a display, a keyboard (all not shown), and the like. The Topiclet 20 can be a personal computer or a mobile terminal device.

Topiclet２０は、入力部２１０と出力部２２０と送信部２３０と受信部２４０と応答情報決定部２５０と切替入力情報入力部２６０とシナリオデータ記憶部２７０と状態制御指標記憶部２８０とを有する。 The Topiclet 20 includes an input unit 210, an output unit 220, a transmission unit 230, a reception unit 240, a response information determination unit 250, a switching input information input unit 260, a scenario data storage unit 270, and a state control index storage unit 280.

＜入力部２１０＞
入力部２１０は、ユーザが入力情報を入力するための装置や部材である。入力部２１０は、キーボードやタッチパネルやマイクなどがある。入力部２１０は、ユーザが質問などの情報を入力できる装置や部材であればよい。 <Input unit 210>
The input unit 210 is a device or member for a user to input input information. The input unit 210 includes a keyboard, a touch panel, a microphone, and the like. The input unit 210 may be any device or member that allows the user to input information such as questions.

＜出力部２２０＞
出力部２２０は、後述する応答情報決定部によって決定された応答情報を出力する。出力部２２０には、ディスプレイやスピーカなどがある。出力部２２０は、応答情報をユーザに認識可能に出力できるものであればよい。 <Output unit 220>
The output unit 220 outputs response information determined by a response information determination unit described later. The output unit 220 includes a display and a speaker. The output unit 220 only needs to output the response information so that the user can recognize it.

このように、ユーザは、入力部２１０に入力情報を入力し、出力部２２０に出力された応答情報を認識することで、会話を進めることができる。 Thus, the user can advance the conversation by inputting the input information to the input unit 210 and recognizing the response information output to the output unit 220.

＜送信部２３０及び受信部２４０＞
送信部２３０は、入力部２１０に入力された入力情報をiWA３０に送信するための装置や部材である。たとえば、送信部２３０は、通信用インターフェースなどがある。 <Transmitter 230 and Receiver 240>
The transmission unit 230 is a device or member for transmitting the input information input to the input unit 210 to the iWA 30. For example, the transmission unit 230 includes a communication interface.

なお、iWA３０に送信する入力情報には、ユーザが入力部２１０から入力情報のほかに、ユーザを識別するためのユーザＩＤも含まれる。ユーザＩＤは、ユーザを識別できる情報であればよい。ユーザの各々に割り当てた情報でもよい。さらに、ユーザＩＤは、Topiclet２０のシリアル番号などのTopiclet２０を一義的に識別できる情報にすることもできる。ユーザＩＤは、話題提供システム１や会話制御端末装置２を利用するユーザの各々を識別できる情報であればよい。 Note that the input information transmitted to the iWA 30 includes, in addition to the input information entered by the user through the input unit 210, a user ID for identifying the user. The user ID may be any information that can identify the user, such as information assigned to each user. The user ID may also be information that uniquely identifies the Topiclet 20, such as the serial number of the Topiclet 20. In short, the user ID need only be information that can identify each user of the topic providing system 1 or the conversation control terminal device 2.

受信部２４０は、iWA３０から送信された入力特定情報とシナリオデータとを受信するための装置や部材である。たとえば、受信部２４０は、通信用インターフェースなどがある。Topiclet２０は、送信部２３０及び受信部２４０によって、iWA３０と通信可能に接続される。 The receiving unit 240 is a device or member for receiving the input specifying information and scenario data transmitted from the iWA 30. For example, the receiving unit 240 is a communication interface or the like. The Topiclet 20 is communicably connected to the iWA 30 by the transmission unit 230 and the receiving unit 240.

＜応答情報決定部２５０＞
応答情報決定部２５０は、入力特定情報及びシナリオデータに基づいて応答情報を決定する。たとえば、応答情報決定部２５０は、Topiclet２０のＣＰＵ、ＲＯＭ、ＲＡＭなどから構成される。このように、iWA３０から送信された入力特定情報及びシナリオデータを用いて応答情報を決定する。 <Response information determination unit 250>
The response information determination unit 250 determines response information based on the input specifying information and scenario data. For example, the response information determination unit 250 includes the CPU, ROM, RAM, etc. of the Topiclet 20. As described above, the response information is determined using the input specifying information and the scenario data transmitted from the iWA 30.

応答情報決定部２５０では、入力とは関係なくシナリオデータに基づいて応答情報は動的に変化する。 In the response information determination unit 250, the response information dynamically changes based on the scenario data regardless of the input.

応答情報決定部２５０は、応答情報を決定する。応答情報は、シナリオデータと入力特定情報とに基づいて決定される。すなわち、ユーザが入力した入力情報を分析して得られた入力特定情報を用いて応答情報を決定する。したがって、ユーザの意思を反映させた応答情報を生成することができ、ユーザが所望する話題を提供することによってユーザとの会話を円滑に進めることができる。 The response information determination unit 250 determines response information. The response information is determined based on the scenario data and the input specifying information. That is, the response information is determined using the input specifying information obtained by analyzing the input information input by the user. Therefore, the response information reflecting the user's intention can be generated, and the conversation with the user can be smoothly advanced by providing the topic desired by the user.

応答情報には、入力特定情報に基づいてシナリオデータのステートメントが含められる。ステートメントに含まれるユーザに提供する出力用情報のみならず、出力用コマンドなどの各種のコマンドも応答情報に含めることができる。このようにすることで、出力部において、話題の情報や挨拶の情報をさまざまな仕様で出力することができる。 The response information includes a statement of scenario data based on the input specifying information. In addition to the output information provided to the user included in the statement, various commands such as an output command can be included in the response information. In this way, topic information and greeting information can be output with various specifications in the output unit.

応答情報決定部２５０は、シナリオデータ及び入力特定情報のほかに状態制御指標を加えて応答情報を決定する。このように、状態制御指標も用いて応答情報を決定することで、ユーザとの過去の会話を踏まえて話題を提供したり会話を進めたりすることができる。したがって、同じ話題を重複してユーザに提供したり、飛躍した話題をユーザに提供したりすることを防止でき、より円滑な会話を進めることができる。 The response information determination unit 250 determines the response information by adding a state control index in addition to the scenario data and the input specifying information. In this way, by determining the response information using the state control index, it is possible to provide a topic or advance the conversation based on the past conversation with the user. Therefore, it is possible to prevent the same topic from being provided to the user in duplicate or to provide the user with a jumping topic, and a smoother conversation can be promoted.

また、応答情報決定部２５０は、シナリオデータと状態制御指標とに基づいて応答情報を決定してもよい。 Further, the response information determination unit 250 may determine the response information based on the scenario data and the state control index.
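As a rough illustration of how the response information determination unit 250 might combine scenario data, input specifying information, and a state control index, the following Python sketch selects a statement from locally held scenario data. The `Statement` fields, the trigger-matching rule, and all names here are assumptions made for illustration; the description does not specify a concrete data layout or selection rule.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Statement:
    statement_id: str
    trigger: str   # related term that activates this statement (assumed rule)
    output: str    # output information shown to the user


@dataclass
class StateControlIndex:
    provided: set = field(default_factory=set)  # ids of statements already shown


def determine_response(scenario, matched_terms, state) -> Optional[str]:
    """Return the output of the first unseen statement whose trigger matched."""
    for st in scenario:
        if st.statement_id in state.provided:
            continue  # skip topics the user has already been given
        if st.trigger in matched_terms:
            state.provided.add(st.statement_id)
            return st.output
    return None  # nothing suitable is held locally


scenario = [
    Statement("s1", "earthquake", "Here is how to prepare an emergency kit."),
    Statement("s2", "earthquake", "Here is where local shelters are listed."),
]
state = StateControlIndex()
first = determine_response(scenario, {"earthquake"}, state)
second = determine_response(scenario, {"earthquake"}, state)
third = determine_response(scenario, {"earthquake"}, state)
```

Because the state control index records which statements were already provided, the same trigger yields a different statement each time until the local scenario data is exhausted, which mirrors the stated goal of not repeating a topic.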

＜切替入力情報入力部２６０＞
切替入力情報入力部２６０は、Topiclet２０のＣＰＵ、ＲＯＭ、ＲＡＭなどから構成される。 <Switching input information input unit 260>
The switching input information input unit 260 includes the CPU, ROM, RAM, and the like of the Topiclet 20.

切替入力情報入力部２６０は、異なる話題への遷移を規定する情報に応じて話題切替入力情報を生成する。異なる話題への遷移を規定する情報は、たとえば、後述する話題切替情報などがある。また、話題切替入力情報は、たとえば、後述する性格指標などがある。 The switching input information input unit 260 generates topic switching input information according to information defining transition to a different topic. Information defining transition to a different topic includes, for example, topic switching information described later. The topic switching input information includes, for example, a personality index described later.

＜シナリオデータ記憶部２７０＞
シナリオデータ記憶部２７０は、話題に関する応答情報を規定するためのシナリオデータを抽出する。たとえば、シナリオデータ記憶部２７０は、Topiclet２０のＲＯＭやＲＡＭなどから構成される。 <Scenario data storage unit 270>
The scenario data storage unit 270 extracts scenario data for defining response information related to the topic. For example, the scenario data storage unit 270 is composed of the ROM, RAM, etc. of the Topiclet 20.

後述するように、本実施の形態では、シナリオデータは、複数のステートメントからなる。Topiclet２０において、一のステートメントから他のステートメントに遷移させていくことで、ユーザに話題を提供しつつ、ユーザと会話をすることができる。シナリオデータ記憶部２７０は、ユーザとの会話を進めていくための複数のステートメントを記憶する。ステートメントを遷移させていくことでユーザとの会話を進める具体例は、図５〜図１４で具体的に説明する。 As will be described later, in the present embodiment, scenario data consists of a plurality of statements. In the Topiclet 20, transitioning from one statement to another makes it possible to converse with the user while providing topics to the user. The scenario data storage unit 270 stores the plurality of statements used to advance the conversation with the user. Specific examples of advancing a conversation with the user by transitioning between statements will be described in detail with reference to FIGS. 5 to 14.

後述するように、iWA３０もシナリオデータ記憶部３２０を備える。iWA３０のシナリオデータ記憶部３２０は、全てのシナリオデータを記憶する。これに対して、Topiclet２０のシナリオデータ記憶部２７０は、一部のシナリオデータとして記憶すればよい。Topiclet２０を使用するユーザの会話に必要なシナリオデータとしてTopiclet２０に送信すればよい。 As will be described later, the iWA 30 also includes a scenario data storage unit 320. The scenario data storage unit 320 of the iWA 30 stores all of the scenario data. In contrast, the scenario data storage unit 270 of the Topiclet 20 need only store part of the scenario data. It suffices to transmit to the Topiclet 20 the scenario data needed for the conversation of the user who uses that Topiclet 20.

本実施の形態では、ユーザがTopiclet２０と会話をする際に、会話をする度に、回答などの情報がiWA３０からTopiclet２０に送信されるわけではない。ユーザに提供したい話題を含むシナリオデータがiWA３０から送信されたときには、Topiclet２０のシナリオデータ記憶部２７０にシナリオデータが記憶される。ユーザがTopiclet２０と会話をするときには、シナリオデータ記憶部２７０に既に記憶されているシナリオデータが用いられる。本実施の形態では、会話をする度にiWA３０からTopiclet２０に回答などの情報が送信されるわけではないので、ユーザと円滑に会話をすることができる。 In the present embodiment, when the user has a conversation with the Topiclet 20, information such as an answer is not transmitted from the iWA 30 to the Topiclet 20 every time the user has a conversation. When scenario data including a topic to be provided to the user is transmitted from the iWA 30, the scenario data is stored in the scenario data storage unit 270 of the Topiclet 20. When the user has a conversation with Topiclet 20, the scenario data already stored in scenario data storage unit 270 is used. In the present embodiment, since information such as an answer is not transmitted from iWA 30 to Topiclet 20 every time a conversation is made, a conversation with the user can be made smoothly.

なお、iWA３０のシナリオデータ記憶部３２０に記憶されている全てのシナリオデータをTopiclet２０に送信するようにしてもよい。より円滑に話題をユーザに提供することができる。 Note that all scenario data stored in the scenario data storage unit 320 of the iWA 30 may be transmitted to the Topiclet 20. The topic can be provided to the user more smoothly.

＜状態制御指標記憶部２８０＞
状態制御指標記憶部２８０は、状態制御指標を記憶する。状態制御指標は、入力情報及び応答情報に関する指標である。状態制御指標記憶部２８０は、Topiclet２０のＲＯＭやＲＡＭなどから構成される。 <State control index storage unit 280>
The state control index storage unit 280 stores a state control index. The state control index is an index related to input information and response information. The state control index storage unit 280 includes the ROM, RAM, and the like of the Topiclet 20.

Topiclet２０は、状態制御指標記憶部２８０を有し、状態制御指標記憶部２８０に状態制御指標が記憶される。本実施の形態では、状態制御指標は、サーバ（後述するiWA３０）には送信されず、Topiclet２０で保持される情報である。Topiclet２０は、状態制御指標記憶部２８０に記憶されている状態制御指標を参照して応答情報を決定する。状態制御指標をTopiclet２０で保持するようにすることで、iWA３０との通信量を減らすことができる。また、Topiclet２０で状態制御指標を参照してシナリオデータを用いればよいので、迅速に処理をすることができ、ユーザと円滑に会話をすることができる。 The Topiclet 20 has a state control index storage unit 280, and the state control index is stored in the state control index storage unit 280. In the present embodiment, the state control index is information that is not transmitted to the server (iWA 30 described later) but is stored in the Topiclet 20. The Topiclet 20 determines the response information with reference to the state control index stored in the state control index storage unit 280. By holding the state control index in the Topiclet 20, the amount of communication with the iWA 30 can be reduced. Further, since scenario data can be used by referring to the state control index in Topiclet 20, it is possible to perform processing quickly and to have a smooth conversation with the user.

＜状態制御指標＞
本実施の形態では、状態制御指標には、入力指標と進捗指標と性格指標との三種類の指標を用いて、ユーザに提供する話題を制御している。なお、他の指標を用いてもよい。 <State control index>
In the present embodiment, the topic provided to the user is controlled using three types of indicators, that is, an input indicator, a progress indicator, and a personality indicator, as the state control indicators. Other indicators may be used.

入力指標は、これまでにユーザがどのような入力をしてきたのか、すなわち、ユーザの入力の履歴を示す情報である。入力指標を用いることによって、ユーザが質問をしやすい状況に誘導することができる。すなわち、入力指標を用いることにより、ユーザの以前の入力を踏まえた話題をユーザに提供することができる。これにより、ユーザが、同じような質問を繰り返すなど無駄な質問をすることなく、ユーザに話題を提供することができる。 The input index is information indicating what input the user has made so far, that is, a history of user input. By using the input index, it is possible to guide to a situation where the user can easily ask a question. That is, by using the input index, it is possible to provide the user with a topic based on the user's previous input. Thereby, a user can be provided with a topic without making a useless question such as repeating a similar question.

進捗指標は、これまでにユーザに対してどのような話題を提供してきたのか、すなわち、ユーザに提供した話題の履歴を示す情報である。進捗指標によって、ユーザに提供したい話題を維持（記憶）することができる。これにより、ユーザがストレスを感じさせることなく、ユーザに話題を提供することができる。進捗指標を用いることにより、一連の説明を話題として提供している際に、途中でユーザから質問されても、一連の説明の続きを再開することができる。 The progress index is information indicating what topic has been provided to the user so far, that is, a history of the topic provided to the user. A topic desired to be provided to the user can be maintained (stored) by the progress index. Thereby, a topic can be provided to a user, without making a user feel stress. By using the progress indicator, even when a series of explanations are provided as topics, even if the user asks questions on the way, the continuation of the series of explanations can be resumed.
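The resume-after-interruption behaviour attributed to the progress index can be sketched as follows. This is a minimal illustration only: the class name, the representation of the progress index as an integer position, and the fallback answer are all assumptions, not details taken from the description.

```python
class GuidedExplanation:
    """Steps through an explanation while tolerating interruptions."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.progress = 0  # progress index: number of steps already provided

    def next_step(self):
        """Return the next unprovided step, or None when finished."""
        if self.progress >= len(self.steps):
            return None
        step = self.steps[self.progress]
        self.progress += 1
        return step

    def answer(self, question, answers):
        """Answer a question without touching the progress index."""
        return answers.get(question, "Sorry, I do not know.")


talk = GuidedExplanation(["Step 1", "Step 2", "Step 3"])
outputs = [
    talk.next_step(),
    talk.answer("Why?", {"Why?": "Because."}),  # interruption by the user
    talk.next_step(),                           # explanation resumes here
]
```

Keeping the progress index separate from question handling is what lets the explanation pick up at the second step after the user's question.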

性格指標は、これまでにユーザがどのような姿勢で入力してきたのか、すなわち、ユーザの姿勢の履歴を示す情報である。たとえば、ある話題について、ユーザが積極的な入力してきたのか消極的な入力してきたのかを示す情報である。積極的な場合には、あるテーマに関する話題を提供し続けることができると判断することができる。一方、消極的な場合には、あるテーマとは別のテーマに切り替えて話題を提供しなければならないと判断することができる。 The personality index is information indicating what posture the user has input so far, that is, the history of the user's posture. For example, it is information indicating whether the user has made a positive input or a passive input for a certain topic. In the case of being active, it can be determined that it can continue to provide a topic on a certain theme. On the other hand, in a negative case, it can be determined that a topic must be provided by switching to a theme different from a certain theme.

たとえば、車に興味があると思われるユーザには、車に関する話題を提供し続ければよいと判断することができる。一方、車に興味がないと思われるユーザには、車とは関係のない食べ物などに関する話題を提供しなければならないと判断することができる。 For example, it can be determined that a user who is interested in a car may continue to provide a topic about the car. On the other hand, it is possible to determine that a user who is not interested in a car must be provided with a topic related to food that is not related to the car.

このように、性格指標により、ユーザに提供する話題の話題名（テーマ）を切り替えることができる。ユーザは、話題の話題名を意識することなく切り替えられた話題名に属する話題に触れることができる。 Thus, the topic name (theme) of the topic provided to the user can be switched by the personality index. The user can touch a topic belonging to the switched topic name without being aware of the topic name of the topic.
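One way to picture theme switching driven by the personality index is a running score that falls when the user responds passively. The sketch below assumes an integer score, a step of one per input, and a threshold of -2; none of these values come from the description, which only states that active input keeps the theme and passive input triggers a switch.

```python
def update_personality(index: int, positive: bool) -> int:
    """Raise the index on an engaged input, lower it on a disengaged one."""
    return index + 1 if positive else index - 1


def choose_theme(current: str, fallback: str, index: int, threshold: int = -2) -> str:
    """Keep the current theme unless the index has fallen to the threshold."""
    return fallback if index <= threshold else current


index = 0
for positive in [False, False]:  # two disengaged inputs about cars
    index = update_personality(index, positive)
theme = choose_theme("cars", "food", index)
```

In this toy run the user shows no interest in cars twice, so the theme switches to food, matching the car and food example in the text.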

＜iWA３０＞
本実施の形態において、iWA３０は、たとえば、サーバなどのハードウェアに相当する。iWA３０は、図２に示した話題提供サーバ４に対応する。iWA３０は、Topiclet２０と通信可能に接続される。Topiclet２０においてユーザに提供される話題に関する処理を実行するためのハードウェアである。 <iWA 30>
In the present embodiment, the iWA 30 corresponds to hardware such as a server, for example. The iWA 30 corresponds to the topic providing server 4 shown in FIG. The iWA 30 is communicably connected to the Topiclet 20. This is hardware for executing processing related to a topic provided to a user in Topiclet 20.

具体的には、iWA３０は、ＣＰＵ（中央処理装置）、ＲＯＭ（リードオンリーメモリ）、ＲＡＭ（ランダムアクセスメモリ）、ＨＤＤ（ハードディスクドライブ）、ディスプレイ、キーボード（いずれも図示せず）などを有する。 Specifically, the iWA 30 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), an HDD (Hard Disk Drive), a display, a keyboard (all not shown), and the like.

iWA３０は、入力情報分析部３１０とシナリオデータ記憶部３２０とを備える。 The iWA 30 includes an input information analysis unit 310 and a scenario data storage unit 320.

＜入力情報分析部３１０＞
入力情報分析部３１０は、入力情報を分析して入力特定情報を生成する。入力情報は、ユーザによって入力部２１０にから入力された情報である。入力特定情報は、入力情報を統計的に解析した結果や、その結果から話題提供に必要な情報が含まれる。たとえば、入力情報に関連詞が出現する回数や頻度などの情報がある。また、その結果によって話題提供に必要になると判断されたシナリオデータ（ステートメント）などが含まれる。 <Input Information Analysis Unit 310>
The input information analysis unit 310 analyzes the input information and generates input specifying information. The input information is information input from the input unit 210 by the user. The input specifying information includes a result obtained by statistically analyzing the input information and information necessary for providing a topic based on the result. For example, there is information such as the number and frequency of appearance of related terms in the input information. Moreover, scenario data (statement) determined to be necessary for providing topics according to the result is included.

さらに、入力情報の分析により、ユーザが入力した質問などからユーザの意思や嗜好を分析することができる。他のユーザの入力情報やその入力特定情報との比較により相対的な分析結果も取得することができる。また、分析用辞書などのデータを予め生成しておき、分析用辞書によって入力情報を分析することもできる。 Furthermore, by analyzing the input information, it is possible to analyze the user's intention and preference from a question input by the user. Relative analysis results can also be obtained by comparison with input information of other users and input specific information. It is also possible to generate data such as an analysis dictionary in advance and analyze input information using the analysis dictionary.

たとえば、入力特定情報には、関連詞、シナリオデータ、シナリオデータに含まれる関連詞の数などの各種の情報を含む。シナリオデータには、ユーザに提供する話題の情報や、ユーザと会話をするために必要な挨拶の情報などが含まれる。 For example, the input specifying information includes various information such as related terms, scenario data, and the number of related terms included in the scenario data. The scenario data includes topic information provided to the user, greeting information necessary for conversation with the user, and the like.
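The statistical part of the analysis performed by the input information analysis unit 310, counting how often related terms appear in the input, can be sketched in Python as below. The `InputSpecifyingInfo` structure, the assumption that the input is already tokenized, and the definition of frequency as count divided by total tokens are illustrative choices, not details given in the description.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class InputSpecifyingInfo:
    counts: dict       # occurrences of each related term
    frequencies: dict  # occurrences divided by the total token count


def analyze_input(tokens, related_terms) -> InputSpecifyingInfo:
    """Count how often each related term occurs in already-tokenized input."""
    counter = Counter(t for t in tokens if t in related_terms)
    total = len(tokens)
    frequencies = {term: n / total for term, n in counter.items()} if total else {}
    return InputSpecifyingInfo(counts=dict(counter), frequencies=frequencies)


info = analyze_input(
    ["earthquake", "is", "scary", "earthquake", "news"],
    {"earthquake", "news", "car"},
)
```

A real system would need tokenization suited to Japanese text and would attach the selected scenario data as well; this sketch covers only the counting step.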

＜関連詞＞
本実施の形態による話題提供システム１、会話制御端末装置２及び保守装置３で用いる各種のデータは、関連詞と呼ばれるデータを基礎として構成されている。関連詞は、通常の検索処理などに用いられる通常のキーワードとは異なり、履歴情報や嗜好などの各種の情報を互いに関連付けることができる。関連詞が保持している関連情報に基づいて、入力情報を分析することができる。 <Related words>
Various data used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 according to the present embodiment are built on data called related terms. Unlike ordinary keywords used in ordinary search processing and the like, related terms can associate various kinds of information, such as history information and preferences, with one another. The input information can be analyzed based on the related information held by the related terms.

＜シナリオデータ記憶部３２０＞
シナリオデータ記憶部３２０は、複数のシナリオデータを記憶する。ここで、複数のシナリオデータは、Topiclet２０でユーザと会話をするために必要な話題名に対応する全てのシナリオデータである。全てのシナリオデータのうち、入力特定情報に基づいて必要であると判断されたシナリオデータがTopiclet２０に送信される。したがって、Topiclet２０においてユーザと会話をする際に、会話をする度に、シナリオデータがTopiclet２０に送信されるわけではない。上述したように、シナリオデータは、複数のステートメントからなる。したがって、入力特定情報に基づいて必要であると判断されたシナリオデータを構成する複数のステートメントがTopiclet２０に送信される。 <Scenario data storage unit 320>
The scenario data storage unit 320 stores a plurality of scenario data. Here, the plurality of scenario data are all scenario data corresponding to topic names necessary for talking with the user on the Topiclet 20. Of all the scenario data, the scenario data determined to be necessary based on the input identification information is transmitted to the Topiclet 20. Therefore, when talking with the user in the Topiclet 20, the scenario data is not transmitted to the Topiclet 20 every time a conversation is made. As described above, the scenario data is composed of a plurality of statements. Therefore, a plurality of statements constituting scenario data determined to be necessary based on the input identification information are transmitted to the Topiclet 20.

生成した入力特定情報に基づいて、必要であると判断された場合にステートメントがシナリオデータとしてTopiclet２０に送信される。必要でないと判断された場合には、既にTopiclet２０に送信しているステートメントで十分であり、この場合には、ステートメントはTopiclet２０に送信されない。 When it is determined that it is necessary based on the generated input specifying information, a statement is transmitted to the Topiclet 20 as scenario data. If it is determined that it is not necessary, the statement already transmitted to the Topiclet 20 is sufficient, and in this case, the statement is not transmitted to the Topiclet 20.
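The decision of whether statements need to be sent at all can be pictured as a set difference between what the conversation requires and what the Topiclet 20 already holds. The following sketch assumes the server tracks sent statement ids per client in a set; the function name and the id scheme are hypothetical.

```python
def select_statements_to_send(required, already_sent):
    """Return required statement ids not yet sent, and record them as sent."""
    missing = [sid for sid in required if sid not in already_sent]
    already_sent.update(missing)
    return missing


sent_to_client = {"s1", "s2"}
first_batch = select_statements_to_send(["s2", "s3", "s4"], sent_to_client)
second_batch = select_statements_to_send(["s3", "s4"], sent_to_client)
```

The second call returns an empty list, which corresponds to the case in which the statements already held by the Topiclet 20 are sufficient and nothing is transmitted.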

なお、入力特定情報に基づいて必要であると判断されたシナリオデータのみをTopiclet２０に送信するのではなく、シナリオデータ記憶部３２０に記憶されている全てのシナリオデータをTopiclet２０に送信するようにしてもよい。Topiclet２０に全てのシナリオデータを既に送信しているので、シナリオデータの送受信に要する時間を短縮でき、ユーザとの会話を円滑に進めることができる。 Note that, instead of transmitting to the Topiclet 20 only the scenario data determined to be necessary based on the input specifying information, all of the scenario data stored in the scenario data storage unit 320 may be transmitted to the Topiclet 20. In that case, because all of the scenario data has already been transmitted to the Topiclet 20, the time required to send and receive scenario data can be shortened, and the conversation with the user can proceed smoothly.

また、話題名を切り替える場合には、シナリオデータ記憶部３２０に記憶されているシナリオデータを組み替えて、その話題名に対応するシナリオデータをTopiclet２０に送信する。すなわち、その話題名に対応するシナリオデータに対応するステートメントがTopiclet２０に送信される。組み替えたシナリオデータは、Topiclet２０のシナリオデータ記憶部２７０に記憶される。このシナリオデータの組み替えは、話題名に応じて実行できる。 Further, when switching the topic name, the scenario data stored in the scenario data storage unit 320 is rearranged and the scenario data corresponding to the topic name is transmitted to the Topiclet 20. That is, a statement corresponding to the scenario data corresponding to the topic name is transmitted to Topiclet 20. The rearranged scenario data is stored in the scenario data storage unit 270 of the Topiclet 20. This rearrangement of scenario data can be executed according to the topic name.

ユーザと会話が進むに従って、一の話題名から他の話題名に移す必要が生ずる場合も想定される。このような場合には、一の話題名に対応するシナリオデータでは十分に対応できなくなる。このような場合のため、他の話題名に対応するシナリオデータに切り替えることができる。 As the conversation with the user progresses, it may be necessary to move from one topic name to another. In such a case, scenario data corresponding to one topic name cannot be sufficiently handled. For such a case, it is possible to switch to scenario data corresponding to other topic names.

＜iWA Manager４０＞
本実施の形態において、iWA Manager４０は、たとえば、サーバなどのハードウェアに相当する。iWA Manager４０は、図３に示した保守装置３に対応する。iWA Manager４０は、iWA３０と通信可能に接続される。iWA Manager４０は、主に、iWA３０で用いるシナリオデータに関する処理を実行するためのハードウェアである。 <iWA Manager 40>
In the present embodiment, the iWA Manager 40 corresponds to hardware such as a server, for example. The iWA Manager 40 corresponds to the maintenance device 3 shown in FIG. iWA Manager 40 is communicably connected to iWA 30. The iWA Manager 40 is mainly hardware for executing processing related to scenario data used in the iWA 30.

具体的には、iWA Manager４０は、ＣＰＵ（中央処理装置）、ＲＯＭ（リードオンリーメモリ）、ＲＡＭ（ランダムアクセスメモリ）、ＨＤＤ（ハードディスクドライブ）、ディスプレイ、キーボード（いずれも図示せず）などを有する。 Specifically, the iWA Manager 40 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), a HDD (Hard Disk Drive), a display, a keyboard (all not shown), and the like.

iWA Manager４０は、シナリオデータ編集部４１０と、シナリオデータ検証部４２０と、シナリオデータ送信部４３０と、を有する。 The iWA Manager 40 includes a scenario data editing unit 410, a scenario data verification unit 420, and a scenario data transmission unit 430.

＜シナリオデータ編集部４１０＞
シナリオデータ編集部４１０は、シナリオデータを編集可能にする装置又は部材である。シナリオデータは、話題提供システム１の契約者の担当者がキーボードなどを操作することによって、編集することができる。編集は、シナリオデータの追加、削除、変更などである。具体的には、編集は、シナリオデータを構成するステートメントを追加したり、削除したり、変更したりする工程である。 <Scenario data editing unit 410>
The scenario data editing unit 410 is a device or member that enables editing of scenario data. The scenario data can be edited by the person in charge of the contractor of the topic providing system 1 operating the keyboard or the like. Editing includes adding, deleting, and changing scenario data. Specifically, editing is a process of adding, deleting, or changing a statement that constitutes scenario data.

新しい商品が販売されたり、新しいサービスが提供されたり、各種の事件が起こったり、新しい層のユーザが増えたりするなどに応じて、最新の話題に対応できるようにシナリオデータを更新する必要がある。このため、担当者は、ネットワークを介して各種の情報を取得し、これらの情報に基づいてシナリオデータを最新のものに更新することができる。シナリオデータを更新することで、最新の情報に対応した話題をユーザに提供することができる。 Scenario data needs to be updated so that it can handle the latest topics as new products go on sale, new services are offered, various incidents occur, or new groups of users join. Therefore, the person in charge can acquire various kinds of information via the network and bring the scenario data up to date based on that information. By updating the scenario data, topics that reflect the latest information can be provided to the user.

また、シナリオデータ編集部４１０によって、誤字・脱字など不適切な情報や誤った情報を訂正することにより、適切な情報に対応した話題をユーザに提供することができる。 Further, by correcting inappropriate information such as typographical errors and omissions and incorrect information by the scenario data editing unit 410, a topic corresponding to appropriate information can be provided to the user.

＜シナリオデータ検証部４２０＞
シナリオデータ検証部４２０は、入力情報分析部で生成された入力特定情報に基づいて編集したシナリオデータの応答を検証可能にする装置や部材である。すなわち、シナリオデータ検証部４２０は、シナリオデータ編集部４１０で編集したシナリオデータの応答が適切であるか否かを検証するための装置や部材である。 <Scenario data verification unit 420>
The scenario data verification unit 420 is a device or member that enables verification of the response of scenario data edited based on the input specifying information generated by the input information analysis unit. That is, the scenario data verification unit 420 is a device or member for verifying whether the response of the scenario data edited by the scenario data editing unit 410 is appropriate.

シナリオデータの内容が適切であれば、シナリオデータの応答は適切になる。本実施の形態のシナリオデータは、出力部２２０で出力される出力用情報や、出力部２２０への出力の仕様を制御するための出力用コマンドや、ステートメントを制御するための判断や、話題名を切り替えたり、状態制御指標を変更したりするための制御コマンドを含む。このため、出力部２２０に出力されるデータが適切であるかどうかを検証するだけでなく、出力部２２０への出力の制御が適切であるかどうかの検証や、シナリオデータの遷移などの制御が適切であるかどうかの検証をする必要がある。 If the content of the scenario data is appropriate, the response of the scenario data will be appropriate. The scenario data of the present embodiment includes output information to be output by the output unit 220, output commands for controlling how output is presented on the output unit 220, and control commands for making the judgments that control statements, for switching topic names, and for changing the state control index. For this reason, it is necessary not only to verify whether the data output to the output unit 220 is appropriate, but also to verify whether the control of output to the output unit 220 is appropriate and whether control such as the transition of scenario data is appropriate.

シナリオデータ検証部４２０は、想定される様々な入力情報を用いて入力特定情報を生成し、シナリオデータ編集部４１０によって編集されたシナリオデータの応答を検証することができる。このため、あらゆるユーザに対してシナリオデータの応答が適切であるか否かを検証できるので、ユーザの各々に対してカスタムを施すことができる。 The scenario data verification unit 420 can generate input specifying information using various assumed input information, and can verify the response of the scenario data edited by the scenario data editing unit 410. For this reason, since it is possible to verify whether or not the response of the scenario data is appropriate for every user, it is possible to customize each of the users.

＜端末装置仮想構築部＞
シナリオデータ検証部４２０は、端末装置仮想構築部によって、Topiclet２０と同様の環境を仮想的に構築することができる。仮想的な環境下でシナリオデータを検証することにより、ユーザが実際に使用する環境に近い環境で、シナリオデータの出力や動作を検証することができ、シナリオデータが適切であるか否かを容易かつ的確に判断することができる。 <Terminal device virtual construction unit>
The scenario data verification unit 420 can virtually construct an environment similar to Topiclet 20 by the terminal device virtual construction unit. By verifying scenario data in a virtual environment, it is possible to verify the output and operation of scenario data in an environment close to the environment actually used by the user, and it is easy to determine whether scenario data is appropriate. It is possible to judge accurately.
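Verification against a range of assumed inputs in a simulated terminal environment can be sketched as a small test harness. Here `simulated_terminal` is a toy stand-in for the virtually constructed Topiclet environment, and the test cases and expected responses are invented for illustration; the description does not define how expected responses are specified.

```python
def verify_scenario(respond, cases):
    """Run each assumed input through `respond` and collect mismatches."""
    failures = []
    for text, expected in cases:
        actual = respond(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def simulated_terminal(text):
    """Toy stand-in for a virtually constructed Topiclet environment."""
    if "earthquake" in text:
        return "Shifting to a topic that relieves anxiety."
    return "Tell me more."


cases = [
    ("I am worried about the earthquake", "Shifting to a topic that relieves anxiety."),
    ("I like cars", "Let's talk about cars."),
]
failures = verify_scenario(simulated_terminal, cases)
```

The single reported failure shows the kind of feedback an editor could use to decide that the edited scenario data still lacks a statement for the car theme.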

＜シナリオデータ送信部４３０＞
シナリオデータ送信部４３０は、編集したシナリオデータを外部、たとえば、iWA３０に送信する。特に、シナリオデータ編集部４１０によって編集され、さらに、シナリオデータ検証部４２０によって検証されたシナリオデータをiWA３０に送信する。したがって、シナリオデータ送信部４３０は、検証済みのシナリオデータをiWA３０に送信する。 <Scenario data transmission unit 430>
The scenario data transmission unit 430 transmits the edited scenario data to the outside, for example, the iWA 30. In particular, the scenario data edited by the scenario data editing unit 410 and further verified by the scenario data verification unit 420 are transmitted to the iWA 30. Therefore, the scenario data transmission unit 430 transmits the verified scenario data to the iWA 30.

このようにすることで、iWA３０からTopiclet２０に送信されるシナリオデータを常に適切な状態に保つことができる。したがって、適切なシナリオデータを用いた話題をiWA３０を介してユーザに提供することができる。 By doing in this way, the scenario data transmitted from iWA30 to Topiclet 20 can always be kept in an appropriate state. Accordingly, a topic using appropriate scenario data can be provided to the user via the iWA 30.

＜＜＜話題提供サーバが提供するデータ構成＞＞＞
図１５に示すように、iWA３０からは、シナリオデータと入力特定情報と話題紹介リストとが出力される。以下では、シナリオデータの例と入力特定情報の例と話題紹介リストの例とを説明する。 <<< Data structure provided by topic providing server >>>
As shown in FIG. 15, scenario data, input specifying information, and a topic introduction list are output from the iWA 30. Hereinafter, an example of scenario data, an example of input specifying information, and an example of a topic introduction list will be described.

＜＜＜シナリオデータの実例＞＞＞
図１４は、本実施の形態の話題提供システム１、会話制御端末装置２及び保守装置３で用いるシナリオデータの例である。以下では、この図１４に示したシナリオデータを具体的な処理手順によって説明する。図１４に示したシナリオデータは、第１〜第１３の複数のステートメントからなる。図５〜図１２は、これらの第１〜第１３のシナリオデータの処理手順を示すフローチャートである。図１３は、第１〜第１３のシナリオデータを処理することによって出力部２２０に出力される例を示す図である。 <<< Examples of scenario data >>>
FIG. 14 is an example of scenario data used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 of the present embodiment. Below, the scenario data shown in FIG. 14 is explained by way of its specific processing procedure. The scenario data shown in FIG. 14 consists of a plurality of statements, the first to the thirteenth. FIGS. 5 to 12 are flowcharts showing the processing procedures for the first to thirteenth scenario data. FIG. 13 is a diagram showing an example of what is output to the output unit 220 when the first to thirteenth scenario data are processed.

As described above, the scenario data (statements) used in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 of the present embodiment include output information, output commands, and control commands. Each piece of scenario data (statement) is made up of elements of these kinds.

FIG. 5 is a flowchart showing the processing of the first, second, and third statements.

<First statement>
Processing begins with a transition to the first statement. In FIG. 5, steps S511 to S521 correspond to the processing of the first statement.

First, the progress index is set to -1 (step S511). This value indicates the state of progress. Next, the output unit 220 is cleared (step S513), and “Shifting to a topic that relieves anxiety.” is output to the output unit 220 (step S515). As a result, when the user inputs the topic “I am worried about earthquakes” as shown in FIG. 13(a), for example, “Shifting to a topic that relieves anxiety.” is displayed in the text data display area of the output unit 220 (1311 in FIG. 13).

Next, a predetermined image is output (step S517), and the process waits for 3 seconds (step S519). As a result, as shown in FIG. 13(a), a face image M1 of a predetermined color is displayed in the image data display area of the output unit 220, and the process waits for 3 seconds (1311 in FIG. 13).

Next, with the phrase “I'm worried.” as input information, the topic is switched to the topic name “topic material” (話題ネタ) (step S521). In this example, this causes a transition to the second statement (1313 in FIG. 13).

<Second statement>
In FIG. 5, steps S523 to S525 correspond to the processing of the second statement.

On transition to the second statement, monitoring of whether an absolute time, for example 12:00, has been reached is first started (step S523). Next, it is determined whether the absolute time has arrived (step S525) (1315 in FIG. 13). If it has not arrived (NO), a transition is made to the third statement. If it has arrived (YES), monitoring of the absolute time is cancelled and a transition is made to the thirteenth statement (reference EE).
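The time-based branch of the second statement can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the string labels for statements, and the reading of "has arrived" as "current time is at or past the configured time" are not taken from the patent.

```python
from datetime import datetime, time

def second_statement(now: datetime, absolute_time: time = time(12, 0)) -> str:
    """Return a label for the next statement, following steps S523 to S525.

    Assumption: the absolute time counts as "arrived" once the clock time
    is at or past `absolute_time` on the current day.
    """
    if now.time() >= absolute_time:
        return "statement_13"  # YES branch (reference EE)
    return "statement_3"       # NO branch
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the branch easy to exercise with fixed timestamps.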

<Third statement>
In FIG. 5, steps S527 to S533 correspond to the processing of the third statement.

On transition to the third statement, it is determined whether the progress index is -1 (step S527) (1317 in FIG. 13). If the progress index is -1 (YES), a transition is made to the fourth statement (reference E1).

If the progress index is not -1 (NO), it is determined whether the progress index is -2 (step S529). If the progress index is -2 (YES), a transition is made to the fifth statement (reference E2).

If the progress index is not -2 (NO), it is determined whether the progress index is -3 (step S531). If the progress index is -3 (YES), a transition is made to the sixth statement (reference E3).

If the progress index is not -3 (NO), it is determined whether the progress index is -4 (step S533). If the progress index is -4 (YES), a transition is made to another statement (sta:200). If the progress index is not -4 (NO), nothing is done.

The second and third statements described above consist only of control commands. A statement thus need not contain output information to be output to the output unit 220.
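The chain of checks in the third statement amounts to a lookup from the progress index to the next statement. A minimal Python sketch, assuming hypothetical names (`NEXT_BY_PROGRESS`, `third_statement`) and illustrative string labels for the statements; only `sta:200` appears in the text itself:

```python
from typing import Optional

# Progress index -> next statement, following steps S527 to S533.
# The labels are illustrative, not identifiers defined by the patent.
NEXT_BY_PROGRESS = {
    -1: "statement_4",  # reference E1
    -2: "statement_5",  # reference E2
    -3: "statement_6",  # reference E3
    -4: "sta:200",      # the other statement named in the text
}

def third_statement(progress_index: int) -> Optional[str]:
    """Return the next statement, or None when no check matches (nothing is done)."""
    return NEXT_BY_PROGRESS.get(progress_index)
```

A table lookup gives the same result as the sequential checks because the four conditions are mutually exclusive.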

<Fourth statement and seventh statement>
FIG. 6 shows the processing corresponding to the fourth and seventh statements. The fourth statement corresponds to steps S611 to S621, and the seventh statement to steps S623 to S629.

<Fourth statement>
On transition to the fourth statement, measurement of a relative time, for example 120 seconds, is started (step S611). Next, the input index is set to 1 (step S613), and the number of times the input index has been set to 1 is counted (step S615).

Next, it is determined whether the number of times the input index has been set to 1 has reached 5 (step S617) (1319 in FIG. 13). If it has reached 5 (YES), a transition is made to the tenth statement (reference E11).

If it has not reached 5 (NO), it is determined whether the relative time, for example 120 seconds, has elapsed (step S619) (1319 in FIG. 13). If the relative time has elapsed (YES), measurement of the relative time is ended (step S621) and a transition is made to the tenth statement (reference E11).

If the relative time has not elapsed (NO), a transition is made to the seventh statement.
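The counter and timer logic of the fourth statement can be sketched as below. The state class, its field names, and the statement labels are hypothetical. The sketch also assumes that the relative-time measurement starts only on the first visit and keeps running across revisits; the text does not say whether the timer restarts each time the statement is re-entered.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FourthStatementState:
    """Hypothetical per-conversation state; field names are not from the patent."""
    started_at: Optional[float] = None  # when the relative-time measurement began
    set_count: int = 0                  # times the input index has been set to 1
    input_index: int = 0

def fourth_statement(state: FourthStatementState, now: float,
                     limit_s: float = 120.0, max_sets: int = 5) -> str:
    """Return a label for the next statement, following steps S611 to S621."""
    if state.started_at is None:           # S611 (assumed: first visit only)
        state.started_at = now
    state.input_index = 1                  # S613
    state.set_count += 1                   # S615
    if state.set_count >= max_sets:        # S617
        return "statement_10"              # reference E11
    if now - state.started_at >= limit_s:  # S619
        state.started_at = None            # S621: end the measurement
        return "statement_10"              # reference E11
    return "statement_7"
```

The fifth and sixth statements follow the same pattern with input index values 2 and 3 and different destinations.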

<Seventh statement>
On transition to the seventh statement, the output unit 220 is cleared (step S623), and “Do you have any questions about ‘I'm worried’?” is output to the output unit 220 (step S625). Next, a predetermined image (for example, a face image M1 of a predetermined color) is output (step S627), and the process waits for 10 seconds (step S629). As a result, as shown in FIG. 13(b), the face image M1 of the predetermined color is displayed in the image data display area of the output unit 220, and the process waits for 10 seconds. A transition is then made to the second statement (reference ES).

<Fifth statement and eighth statement>
FIG. 7 shows the processing corresponding to the fifth and eighth statements. The fifth statement corresponds to steps S711 to S721, and the eighth statement to steps S723 to S729.

<Fifth statement>
On transition to the fifth statement, measurement of a relative time, for example 120 seconds, is started (step S711). Next, the input index is set to 2 (step S713), and the number of times the input index has been set to 2 is counted (step S715).

Next, it is determined whether the number of times the input index has been set to 2 has reached 5 (step S717) (1319 in FIG. 13). If it has reached 5 (YES), a transition is made to the eleventh statement (reference E12).

If it has not reached 5 (NO), it is determined whether the relative time, for example 120 seconds, has elapsed (step S719) (1319 in FIG. 13). If the relative time has elapsed (YES), measurement of the relative time is ended (step S721) and a transition is made to the eleventh statement (reference E12).

If the relative time has not elapsed (NO), a transition is made to the eighth statement.

<Eighth statement>
On transition to the eighth statement, the output unit 220 is cleared (step S723), and “Do you have any questions about ‘It's okay’?” is output to the output unit 220 (step S725). Next, a predetermined image (for example, a face image M1 of a predetermined color) is output (step S727), and the process waits for 10 seconds (step S729). A transition is then made to the second statement (reference ES).

<Sixth statement and ninth statement>
FIG. 8 shows the processing corresponding to the sixth and ninth statements. The sixth statement corresponds to steps S811 to S821, and the ninth statement to steps S823 to S829.

<Sixth statement>
On transition to the sixth statement, measurement of a relative time, for example 120 seconds, is started (step S811). Next, the input index is set to 3 (step S813), and the number of times the input index has been set to 3 is counted (step S815).

Next, it is determined whether the number of times the input index has been set to 3 has reached 5 (step S817) (1319 in FIG. 13). If it has reached 5 (YES), a transition is made to the twelfth statement (reference E13).

If it has not reached 5 (NO), it is determined whether the relative time, for example 120 seconds, has elapsed (step S819) (1319 in FIG. 13). If the relative time has elapsed (YES), measurement of the relative time is ended (step S821) and a transition is made to the twelfth statement (reference E13).

If the relative time has not elapsed (NO), a transition is made to the ninth statement.

<Ninth statement>
On transition to the ninth statement, the output unit 220 is cleared (step S823), and “Do you have any questions about ‘It's dangerous’?” is output to the output unit 220 (step S825). Next, a predetermined image (for example, a face image M1 of a predetermined color) is output (step S827), and the process waits for 10 seconds (step S829). A transition is then made to the second statement (reference ES).

<Tenth statement>
FIG. 9 shows the processing corresponding to the tenth statement. The transition to the tenth statement is made by the processing of FIG. 6 described above (reference E11).

On transition to the tenth statement, the progress index is set to -2 (step S911). Next, the output unit 220 is cleared (step S913), and “Shifting to the next topic” is output to the output unit 220 (step S915). A predetermined image (for example, a face image M1 of a predetermined color) is output (step S917), and the process waits for 3 seconds (step S919). Next, with the phrase “It's okay” as input information, the topic is switched to the topic name corresponding to this phrase (step S921). A transition is then made to the second statement (reference ES).

As a result of the processing described above, “Shifting to the next topic” is displayed in the text data display area of the output unit 220, as shown in FIG. 13(d). In this way, a transition can be made from the scenario data of one topic name to the scenario data of another topic name.

<Eleventh statement>
FIG. 10 shows the processing corresponding to the eleventh statement. The transition to the eleventh statement is made by the processing of FIG. 7 described above (reference E12).

On transition to the eleventh statement, the progress index is set to -3 (step S1011). Next, the output unit 220 is cleared (step S1013), and “Shifting to the next topic” is output to the output unit 220 (step S1015). A predetermined image (for example, a face image M1 of a predetermined color) is output (step S1017), and the process waits for 3 seconds (step S1019). Next, with the phrase “It's dangerous” as input information, the topic is switched to the topic name corresponding to this phrase (step S1021). A transition is then made to the second statement (reference ES).

<Twelfth statement>
FIG. 11 shows the processing corresponding to the twelfth statement. The transition to the twelfth statement is made by the processing of FIG. 8 described above (reference E13).

On transition to the twelfth statement, the progress index is set to -4 (step S1111). Next, the output unit 220 is cleared (step S1113), and “Time is up” is output to the output unit 220 (step S1115). A predetermined image (for example, a face image M1 of a predetermined color) is output (step S1117), and the process waits for 3 seconds (step S1119). A transition is then made to the thirteenth statement (reference EE).

<Thirteenth statement>
FIG. 12 shows the processing corresponding to the thirteenth statement. The transition to the thirteenth statement is made by the processing of FIG. 11 described above (reference EE).

On transition to the thirteenth statement, the output unit 220 is cleared (step S1211), and “This concludes the explanation” is output to the output unit 220 (step S1213) (1321 in FIG. 8). A predetermined image (for example, a face image M1 of a predetermined color) is output (step S1215), and the process waits for 3 seconds (step S1217). A transition is then made to another statement (sta:200).

Note that if, in the state shown in FIG. 13(b), the user inputs “Actually, I'm worried about politics” (FIG. 13(c)), “Politicians do not lie, so there is no need to worry” is displayed in the text data display area of the output unit 220 (FIG. 13(c)), and the state then returns to that shown in FIG. 13(b).

<Structure of scenario data>
As with the first to thirteenth statements described above, scenario data in the present embodiment consists of a plurality of statements. After a transition to one statement and execution of the processing based on that statement, a transition is made to another statement and the processing based on that statement is executed. Repeating these transitions and the processing within each statement makes it possible to keep providing topics to the user. Scenario data (a plurality of statements) is used in this way in the topic providing system 1, the conversation control terminal device 2, and the maintenance device 3 of the present embodiment. The structure of a single statement is described here.

As with the first to thirteenth statements described above, a statement in the present embodiment is made up of various elements such as transition information and judgments. A statement can contain whatever elements are needed to control the provision of topics to the user and to control the conversation with the user.

The identification information is information attached to a statement in order to identify it. It is referred to when a transition between statements is made. Each statement also contains transition destination information.

That is, each statement contains both identification information and transition destination information. The identification information identifies the statement and refers to the statement itself. The transition destination information, on the other hand, designates the statement to transition to next. Both are used when a transition is made from another statement to a given statement: following the transition destination information defined in the other statement, the system searches for the statement whose identification information matches it and transitions to that statement. By using identification information and transition destination information together in this way, statements can be transitioned one after another.
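The matching of transition destination information against identification information can be sketched in Python as follows. The `Statement` class, its field names, and the `run` helper are hypothetical, and a real statement would carry further elements (output information, judgments, and so on) that are omitted here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Statement:
    ident: str                 # identification information of this statement
    next_ident: Optional[str]  # transition destination information (None = stop)

def run(statements: List[Statement], start: str, max_steps: int = 100) -> List[str]:
    """Follow transition destinations from `start` and return the visited identifiers."""
    by_ident: Dict[str, Statement] = {s.ident: s for s in statements}
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None and len(visited) < max_steps:
        stmt = by_ident[current]  # find the statement whose identifier matches
        visited.append(stmt.ident)
        current = stmt.next_ident
    return visited
```

The `max_steps` bound is a safeguard added for the sketch, since scenario data such as the example above can loop back to an earlier statement indefinitely.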

Judgments are of two kinds: judgments based on an index and judgments based on time. A judgment based on an index determines whether the index satisfies a predetermined condition. If the index satisfies the condition the result is true, and if not the result is false, and processing branches accordingly. A judgment based on time determines whether a time satisfies a predetermined condition. If the elapsed time or the clock time satisfies the condition the result is true, and if not the result is false, and processing branches accordingly.

The output information is text data to be output to the output unit 220. By including identification information that specifies image data (for example, a file name), the output information can also cause an image to be output to the output unit 220.

The setting element is an element for setting a state index. Statements can be transitioned or branched based on the state index that has been set.

Other elements include, for example, line break and erase. A line break causes text to be output with a line break in the output unit 220. An erase removes the text or image that has been output to the output unit 220.

The output control element is an element for controlling the output of the output information described above. For example, it can specify how long the output information is output, or specify an image to be output together with the output information.

The transition destination information is information for designating the next statement to transition to. The system searches for the identification information that matches this transition destination information and transitions to the statement having that identification information.
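To illustrate how a statement built from such elements might be processed, here is a small Python sketch of an interpreter. The element kinds (`"set"`, `"clear"`, `"text"`, `"goto"`), the tuple encoding, and the use of a plain list as a stand-in for the output unit 220 are all assumptions made for this example; images, waits, and judgments are left out for brevity.

```python
from typing import Any, Dict, List, Optional, Tuple

Element = Tuple[str, Any]  # (kind, argument); the kinds are illustrative

def execute(elements: List[Element], display: List[str],
            indices: Dict[str, int]) -> Optional[str]:
    """Apply the elements in order and return the transition destination, if any."""
    for kind, arg in elements:
        if kind == "set":      # setting element: (index name, value)
            name, value = arg
            indices[name] = value
        elif kind == "clear":  # erase element
            display.clear()
        elif kind == "text":   # output information (text data)
            display.append(str(arg))
        elif kind == "goto":   # transition destination information
            return str(arg)
    return None
```

Under this encoding, the first statement of the example would be a list that sets the progress index to -1, clears the display, outputs one line of text, and ends with a transition element.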

The topic switching information is an element for switching the topic name. The scenario data (a plurality of statements) of the present embodiment is defined per topic name, so that each of a plurality of topic names has its own corresponding scenario data. Each piece of scenario data, in turn, is made up of a plurality of statements.

When a single topic name is sufficient for the conversation with the user, it is enough to transition through statements one after another using the scenario data corresponding to that topic name. As the conversation progresses, however, it may become necessary to move from one topic name to another. In that case the scenario data corresponding to the first topic name can no longer cope. For this reason, scenario data corresponding to each of a plurality of topic names is defined in advance, so that even when the topic needs to move from one topic name to another, the system can switch to the scenario data corresponding to the other topic name. That scenario data is also made up of a plurality of statements. By transitioning through its statements one after another, the conversation with the user can proceed on the other topic name.

When the topic name is switched, all the scenario data stored in the iWA 30 is recombined to generate the scenario data corresponding to the new topic name. The recombined scenario data is stored in the scenario data storage unit 270 of the Topiclet 20. This recombination can be carried out by defining, for each topic name, the combination of statements to use.

Whether to switch the topic name is preferably determined based on the personality index described above. The personality index is information indicating whether the user is active or passive with respect to a given topic. If the user is active, it can be determined that topics can continue to be provided without switching the topic name. If the user is passive, it can be determined that the topic name must be switched before providing further topics.
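The decision described here can be sketched in Python as follows. The text does not say how the personality index is represented, so the sketch assumes a number where a positive value means the user is active on the current topic; the function name and the dictionary of scenario data keyed by topic name are likewise hypothetical.

```python
from typing import Dict, List

def next_topic(scenarios: Dict[str, List[str]], current: str,
               candidate: str, personality_index: float) -> str:
    """Keep `current` for an active user; otherwise switch to `candidate` if it is defined.

    Assumption: personality_index > 0 means the user is active on the topic.
    """
    if personality_index > 0:
        return current
    return candidate if candidate in scenarios else current
```

Falling back to the current topic name when the candidate has no scenario data is a choice made for the sketch, not something the text specifies.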

<< Example of input specifying information >>
Identification information corresponding to the input information, or information for specifying that identification information, is attached to the input specifying information, and the scenario data can be started using this information.

Note that in FIG. 15, the input specifying information of FIGS. 1 to 4 also includes the topic introduction list described next.

<< Example of topic introduction list >>
FIG. 16 shows an example of the structure of a topic introduction list.

<Structure of the topic introduction list>
The topic introduction list is a list of pairs, each consisting of a related term set and a topic. A related term set is the set of related terms contained in a topic. A neighborhood of a related term A is a set of related terms that contains A. The neighborhood system of a related term is the set of its neighborhoods. The neighborhood count of a related term is the number of elements in its neighborhood system. The topology of a related term can be understood by looking at its neighborhood system. Performing topic analysis makes it possible to display the neighborhood system for every combination of related terms. Performing preference analysis makes it possible to display the neighborhood systems of related terms in order of preference. A scenario for topic introduction is one that can be constructed, using the neighborhood systems of related terms, according to how close or how connected the topics are.
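The definitions above map directly onto set operations. A minimal Python sketch, assuming that each entry of the topic introduction list is modeled as a pair of a frozen set of related terms and a topic text; the type alias, function names, and sample entries are made up for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

TopicEntry = Tuple[FrozenSet[str], str]  # (related term set, topic text)

# Made-up sample entries, not data from the patent.
SAMPLE: List[TopicEntry] = [
    (frozenset({"earthquake", "insurance"}), "Earthquake insurance basics"),
    (frozenset({"earthquake", "stockpile"}), "What to stockpile"),
    (frozenset({"politics", "election"}), "Election schedule"),
]

def neighborhood_system(entries: List[TopicEntry], term: str) -> Set[FrozenSet[str]]:
    """Return every related term set (neighborhood) that contains `term`."""
    return {terms for terms, _topic in entries if term in terms}

def neighborhood_count(entries: List[TopicEntry], term: str) -> int:
    """Return the number of elements in the neighborhood system of `term`."""
    return len(neighborhood_system(entries, term))
```

Because the neighborhood system is a set, two topics with identical related term sets contribute a single neighborhood in this model.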

A topic is text to which an action and an index have been assigned. An action is a change triggered by, for example, clicking the text. A related term set is attached to each topic as its index. Clicking the index displays the related term companions of the related terms contained in the index.

<Use of the topic introduction list>
FIG. 17 shows that the topics in the topic introduction list are connected by related terms into which a related term structure, such as a related term dictionary or a preference dictionary, has been introduced. A user can detect topics by paying attention to how the topics in the topic introduction list are connected.

In addition, by switching topics based on related terms while referring to the related term structure in the topic introduction list, the user can detect topics from a variety of viewpoints.

<Topic switching by related terms>
In scenario data that defines response information about topics, it is necessary not only to switch topics by topic name but also to switch between the topics themselves and introduce them. Through the topic introduction list, topic analysis provides the “related terms and related term structure” needed to switch between the topics themselves in scenario data.

As shown in FIG. 17, related terms make it possible to see how close topics are and how they are connected, so using related terms in scenario data makes it possible to switch between the topics themselves. For example, from the current topic, the system can switch to a topic with similar content or to a topic whose content is connected to it.
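One way to picture switching to a "close" topic is to rank the other topics by how many related terms they share with the current one. The text does not define closeness numerically, so the overlap count used below is an assumption of this sketch, as are the function name and the pair-based entry type.

```python
from typing import FrozenSet, List, Optional, Tuple

TopicEntry = Tuple[FrozenSet[str], str]  # (related term set, topic text)

def closest_topic(entries: List[TopicEntry], current: int) -> Optional[str]:
    """Return the topic sharing the most related terms with entries[current].

    Returns None when no other topic shares any related term.
    Closeness as overlap size is an assumption, not the patent's definition.
    """
    current_terms = entries[current][0]
    best: Optional[str] = None
    best_overlap = 0
    for i, (terms, topic) in enumerate(entries):
        overlap = len(terms & current_terms)
        if i != current and overlap > best_overlap:
            best, best_overlap = topic, overlap
    return best
```

On ties, the sketch keeps the first topic encountered in list order.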

Furthermore, because a related term structure has been introduced into the related terms, scenario data can use variable related terms, that is, related terms expressed as variables (for example, the most frequent related term or the most popular related term), which makes it possible to introduce topics in diverse ways.
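A variable related term such as "the most frequent related term" can be resolved at run time from the topic introduction list. In the Python sketch below, the `$most_frequent` placeholder syntax, the function names, and the tie-breaking rule (alphabetical within a topic, then first seen) are assumptions made for illustration.

```python
from collections import Counter
from typing import FrozenSet, List, Optional, Tuple

TopicEntry = Tuple[FrozenSet[str], str]  # (related term set, topic text)

def most_frequent_term(entries: List[TopicEntry]) -> Optional[str]:
    """Return the related term that appears in the most topics, or None if empty."""
    # sorted() makes the counting order, and so the tie-breaking, deterministic.
    counts = Counter(term for terms, _topic in entries for term in sorted(terms))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def resolve(term_or_variable: str, entries: List[TopicEntry]) -> Optional[str]:
    """Replace the hypothetical variable "$most_frequent" with a concrete term."""
    if term_or_variable == "$most_frequent":
        return most_frequent_term(entries)
    return term_or_variable
```

A "most popular" variable would work the same way but would rank terms using the preference dictionary rather than raw frequency.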

Because topics can be switched by related terms, the manual work a user does to detect topics using the topic introduction list can be reproduced by the topic providing system 1 as a topic providing service using scenario data.

<<< Topic analysis >>>
Next, the topic analysis unit provided in the maintenance device 3 in FIG. 4 is described again with reference to FIGS. 15 to 18. As described above, the maintenance device 3 constitutes a scenario data verification unit with which a subscriber of the topic providing system 1 creates scenario data in advance, including the information the subscriber wants to provide to users. Although it is not needed for creating that information (the topic introduction list), the state control index storage unit described above is added to the scenario data verification unit, so that the maintenance device 3 of this embodiment also functions as a terminal device virtual construction unit that makes it operate virtually as the conversation control terminal device 2. In other words, adding the state control index storage unit to the scenario data verification unit forms the terminal device virtual construction unit, and this terminal device virtual construction unit corresponds to the Topiclet 20 and the iWA Manager 40 described above.

As shown in FIG. 15, the topic analysis unit can analyze topics and can also produce output for visualizing the topic introduction list. That is, the topic analysis unit generates a topic list annotated with how close topics are and how they connect, by way of the related terms that link them. In addition to the topic analysis unit, the maintenance device 3 has a scenario data editing unit that makes it possible to edit, as the scenario data, a topic introduction list (corresponding to the topic list in FIG. 17) for introducing topics to the user by means of the topic list and the related terms, and input-related scenarios for responding to user input. Because the maintenance device 3 is virtually configured as the conversation control terminal device 2, the user input here corresponds to input by a simulator (the person in charge).

The following describes topic analysis in the topic analysis unit, generation of the topic introduction list, and scenario data editing by the scenario data editing unit.

<<< Response output based on topic analysis >>>
FIG. 15 is a diagram showing the process of generating response information based on topic analysis and outputting it to an output unit. FIG. 16 is a diagram showing an example of the configuration of a topic introduction list. FIG. 17 is a diagram showing the process of topic extraction, generation of the related-term dictionary, and generation of the preference dictionary. Topic analysis is executed by the topic analysis unit and builds the preference dictionary and the related-term dictionary of the iWA 30 serving as a server. Using these dictionaries, the control unit of the maintenance device 3, which consists of a CPU and the like, automatically assigns to each of the topics (topic material) entered by the contractor's person in charge the related terms contained in that topic.


As a result of topic analysis, the iWA 30 generates scenario data from input-related scenarios and topic introduction scenarios, generates input specifying information from elements such as identification information, and generates a topic list from topics and their sets of related terms.

Response information is then generated based on the generated scenario data, the input specifying information, and the topic introduction list, and is output from the output unit 220.

In this way, response information can be generated from topic analysis and output. Generating response information through topic analysis makes it possible to tailor the response information used in a conversation to each user, so that conversations with users go more smoothly.

<Configuration of the topic introduction list>
FIG. 16 is a diagram showing an example of the configuration of a topic introduction list.

The topic introduction list is a list of pairs, each consisting of a related-term set and a topic. A related-term set is the set of related terms contained in a topic. A neighborhood of a related term A is a set of related terms that includes A. The neighborhood system of a related term is the set of its neighborhoods, and the neighborhood count of a related term is the number of elements in its neighborhood system. The topology of a related term can be understood by looking at its neighborhood system. Performing topic analysis makes it possible to display the neighborhood system for every combination of related terms, and performing preference analysis makes it possible to display the neighborhood system of a related term in order of preference. A scenario for topic introduction is one that can be constructed from the closeness and connections of topics, based on the neighborhood systems of related terms.
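The definitions above can be illustrated with a minimal Python sketch. The names `neighborhood_system` and `neighborhood_count` and the sample data are hypothetical and do not appear in the specification; the sketch assumes each topic's related-term set is one neighborhood of every term it contains.

```python
from collections import defaultdict


def neighborhood_system(topic_list):
    """Map each related term to the set of its neighborhoods.

    topic_list is an iterable of (related_term_set, topic) pairs.
    A neighborhood of term A is any related-term set that contains A.
    Identical related-term sets collapse into one neighborhood,
    because the neighborhood system is itself a set.
    """
    system = defaultdict(set)
    for terms, _topic in topic_list:
        neighborhood = frozenset(terms)
        for term in neighborhood:
            system[term].add(neighborhood)
    return dict(system)


def neighborhood_count(system, term):
    """Number of elements in the neighborhood system of a term."""
    return len(system.get(term, ()))


# Hypothetical topic introduction list: (related-term set, topic) pairs.
topic_list = [
    ({"panic", "most scary"}, "topic 1"),
    ({"panic", "relieved"}, "topic 2"),
    ({"relieved"}, "topic 3"),
]
system = neighborhood_system(topic_list)
```

Here `neighborhood_count(system, "panic")` counts the two distinct related-term sets that contain "panic".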

<Topic extraction, generation of the related-term dictionary, generation of the preference dictionary>
FIG. 17 is a diagram showing creation of a topic introduction list and profiling of users with a preference dictionary.

The topic introduction list is visualized in the display form shown on the screens (reference numerals 1813 and 1811) of the display device of the maintenance device 3 in FIG. 17. While the related-term dictionary F and the topic list G obtained from external news sources are extracted from the iWA 30, topics are added and the list is built up in two ways: through manual topic entry by the person in charge, and automatically, under control of the maintenance device 3, from the topic list G, which consists of topic data taken in from outside.

Content entered and displayed as a topic in the topic introduction list (the topic material setting field on screens 1813 and 1811) has two input sources. In the first form, the contractor's person in charge enters and sets topic material directly with the input keyboard serving as the input device of the maintenance device 3. In the second form, the iWA 30 automatically extracts topic material (1811), based on the topic entered by the person in charge at the input device of the maintenance device 3, from log data that the iWA 30 has collected from outside, for example data that can be collected over a network such as Twitter or blogs. Topic browsing list data, which are candidates for the topic introduction list, are generated from the extracted topics. The processing consists mainly of two stages. First, for a topic entered manually by the person in charge through the maintenance device 3, the iWA 30 refers to the related-term dictionary F and, under its own control, associates a plurality of related terms with the entered topic for display on screen 1815. Next, using the related terms associated with the manually entered topic as keys, the iWA 30 refers to the topic list G, which consists of general news items collected from outside, and automatically extracts external topics related to the entered topic; the maintenance device 3 receives the extracted data and displays it as shown on screen 1811.

The person in charge thus builds the topic introduction list by repeatedly entering topics with a direct input device such as the keyboard, while topics obtained from the topic list database, in which the iWA 30 has accumulated external information in advance, are automatically removed or added.

Meanwhile, in addition to generating the topic introduction list described above, the maintenance device 3 of this embodiment refers to the preference dictionary E obtained from the iWA 30 and compares two sets of related terms: the group of related terms displayed on screen 1815, which are linked to the entered topic on the basis of the related-term dictionary F, and the related terms in the user type list data, in which user IDs and user types generated from other users' response histories are associated with related terms. Through this comparison, as shown on screen 1817 for example, the device can take “most scary”, extracted as a related term for a certain topic, refer to the preference dictionary E built from other users' past histories, analyze which user types entered information associated with similar related terms, and display the result for visualization.

The analysis result is stored together with the user ID identifying the user who entered the same related term, the user type (for example, “yesterday's customer”), and the related terms common to all users, and the stored data can be used to analyze user preferences. For example, because a destination such as the user's e-mail address can be identified from the user ID, a specific service thought to match a preference can be delivered to users thought to share that preference, and topics matching the preference, not only services, can be provided to the identified destination.

FIG. 18 shows the flow of processing from the point where a topic is entered manually on the topic material setting screen, in the first form of the maintenance device 3 described above, through generation of the topic introduction list using the related-term dictionary and the topic list, to output to the iWA 30 serving as a server. FIG. 17 was used to describe profiling of users from entered topics with the preference dictionary, but that profiling is separate from generation of the topic introduction list, so it is omitted from the description of the flow of FIG. 18.

Referring to FIG. 18, the control unit of the maintenance device 3 displays the topic material setting screen on the display of the maintenance device 3 and waits for the person in charge to enter a topic (S2000).

Next, when it is determined that a topic has been entered on the input screen via an input device such as a keyboard and that the topic material setting switch has been operated (S2001: YES), the topic material that was set is transmitted to the iWA 30 serving as a server, and one or more related terms are extracted from the related-term dictionary F under control of the CPU of the iWA 30 according to the content of the transmitted topic material. In the case of screen 1813 in FIG. 17, “panic discussion material” is transmitted from the maintenance device 3 to the iWA 30 as the topic material, and the related term “panic” is extracted from its content under control of the CPU of the iWA 30. When related topics are then extracted, using the related term “panic” as a key, from the data of the topic list G held in advance in the database of the iWA 30, screen 1813 displays “Topic list (selected topics: 1424, total topics: 1424) Throughput: 17.25”, as illustrated. What screen 1813 shows is that 1424 topics were extracted with the related term “panic” as a key under control of the CPU of the iWA 30, and that these 1424 topics, which may eventually make up the topic introduction list, and the related terms associated with each of them were received by the maintenance device 3 and displayed. In short, the maintenance device 3 receives from the iWA 30 the data for displaying screen 1813 of FIG. 17, namely list data in which, with the related term “panic” as a key, a plurality of topics are each associated with a plurality of representative related terms (S2002).

With the first entry, 1424 topics were generated and displayed as topic introduction list candidates using the related term “panic” as a key. If this number is too large, the person in charge can judge the content of the visualized topics and narrow them down to topics better suited to the intended topic introduction list. That is, referring to the flow of FIG. 18, instead of confirming the list data of the topic introduction list in step S2004, the process times out after a fixed period and additional topic material can be entered again in step S2001. For example, screen 1815 shows the representative related terms and topics received after the second topic material, “I feel relieved”, was entered, and displays “(selected topics: 136, total topics: 1424) Throughput: 16.54”. The list data being generated as the topic provision list has thus been reduced from 1424 topics to 136, bringing it closer to the list the person in charge intends.
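The narrowing from one entry to the next can be sketched as repeated filtering on a related-term key. This is only an illustration: the function `narrow` and the sample data are hypothetical, and the sketch assumes that each new key is applied to the candidates left by the previous entry, which the specification suggests but does not state in these terms.

```python
def narrow(candidates, key_term):
    """Keep only the candidates whose related-term set contains key_term.

    candidates is a list of (related_term_set, topic) pairs.
    """
    return [(terms, topic) for terms, topic in candidates if key_term in terms]


# Hypothetical stand-in for the topic list G held on the server.
all_topics = [
    ({"panic", "relieved"}, "topic A"),
    ({"panic", "most scary"}, "topic B"),
    ({"weather"}, "topic C"),
]

first = narrow(all_topics, "panic")   # first entry: key "panic"
second = narrow(first, "relieved")    # second entry narrows further
```

In this toy data the first entry selects two of three topics and the second leaves one, mirroring on a small scale the reduction from 1424 to 136 candidates.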

Furthermore, when the person in charge cannot think of a new topic, the display can be switched to a form centered on related terms. As shown on screen 1815, entering “condition setting: priority related terms” through the input device displays the list in a form that gives priority to related terms. That is, as shown on screen 1815 of FIG. 17, the related terms that the iWA 30 assigned to the second topic under control of its CPU can be displayed with one row for each of the 136 topics (1, 2, 3, ..., 136) and the related terms of each topic laid out in multiple columns. As described above, the control unit (not shown) of the maintenance device 3 makes this switch in response to a screen switching input from the input device, listing the related terms of each topic, shown row by row, in the column direction. From this display the person in charge can pick, among the related terms shown on screen 1815, one suited to the topic introduction being created and enter it with the input device (S2001: YES). Screen 1811 shows the state after this entry; in this case the related term “most scary” was entered as new topic material. The topic list database is then consulted for the related term “most scary” under control of the CPU of the iWA 30, and list data consisting of a plurality of related terms and topics based on the entered topic material is received as a new candidate for the topic introduction list (S2002) and displayed (S2003).

The topic list G is a list of information that the iWA 30, serving as a server, has collected from outside through external information collecting means such as the Internet. Each topic in the topic list G is stored in the database of the iWA 30 with a plurality of related terms associated in advance. The variety of topics that a person in charge sets may be stale because of limited knowledge or ability. In this embodiment, other related topics are extracted from the topic list of the iWA 30 and visualized by displaying, on the screen of the maintenance device 3, each topic together with its related terms. This increases the likelihood of enriching the variations of topic transitions described above, using topics obtained from the topic list as a base.

Until an input indicating completion is made (S2004: YES), modification of the topic provision list candidates continues as described above: each time the predetermined period elapses, the process times out, the input screen for entering a topic is displayed again, the next topic is awaited, and the processing described above, through S2003, is repeated in order.

When the input indicating completion is made (S2004: YES), the data of the topic introduction list is output to the iWA 30 serving as a server (S2005).
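The loop of steps S2000 to S2005 can be sketched in Python as a replay over a scripted sequence of events. Everything named here is hypothetical: the sentinel `DONE` stands for the input-complete action, `None` stands for a timeout, and `fetch_candidates` stands in for the round trip to the iWA 30; none of these names come from the specification.

```python
DONE = object()  # hypothetical sentinel for the "input complete" action


def edit_session(events, fetch_candidates):
    """Replay steps S2000 to S2005 over a scripted sequence of events.

    Each event is a topic string (S2001: YES), None (a timeout, after
    which the input screen is shown again), or DONE (S2004: YES).
    """
    candidates = []
    for event in events:          # S2000: wait for the next input
        if event is DONE:         # S2004: YES
            return candidates     # S2005: output the list to the server
        if event is None:         # timeout: return to the input screen
            continue
        # S2001 and S2002: send the topic material, receive new candidates.
        candidates = fetch_candidates(event, candidates)
        # S2003: the candidates would be displayed here.
    return None  # the session ended without confirmation


def fake_fetch(topic, previous):
    """Stand-in for the server: just records the topics entered so far."""
    return previous + [topic]


result = edit_session(["panic", None, "relieved", DONE], fake_fetch)
```

The design choice of passing the previous candidates into `fetch_candidates` reflects the assumption that each entry refines the list built so far.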

<<< Overview of the Technical Idea of the Information Search System >>>
The information search system of the present invention uses the mechanism of the topic providing system 1 shown in FIG. 1 to provide a user with keywords (character strings) with which the user can obtain the latest topics not yet known to the user. First, an overview of the technical idea of the information search system is given with reference to FIG. 19.

In the information search system of the present invention, an external log 502 containing, for example, hearsay information generated by individuals passes through sentence analysis processing 511, preference analysis processing 512, and topic analysis processing 513, and the relationships and distribution of the important character strings (related terms) that can identify topics are displayed. By viewing this display, the user can grasp the topics. Everything from input of the external log 502 to presentation of the related terms is performed instantaneously.

For example, if the results of searching WEB pages and the like on the Internet are regarded as topic information, compressing and summarizing that topic information yields a topic dictionary, which is a summary of the topic information. In the information search system of the present invention, compression for the topic dictionary takes place before the sentence analysis processing 511 described above. For example, from the results of a WEB page search, only the text data that can serve as topic information is extracted, excluding tags, script statements, and the like, and the extracted text data is what the sentence analysis processing 511 processes.
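As one possible illustration of this pre-processing step, the sketch below strips tags and script content from an HTML page using Python's standard `html.parser` module. The class `TextExtractor`, the function `extract_text`, and the sample page are hypothetical; the specification does not say how the extraction is implemented.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of script and style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text of an HTML page with tags and scripts removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = (
    "<html><head><script>var x = 1;</script></head>"
    "<body><p>Panic at the station</p><p>Everyone is <b>relieved</b></p></body></html>"
)
text = extract_text(page)
```

The resulting `text` contains only the paragraph text and would be the input to the sentence analysis step.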

In a more specific example, company information (topic information) is information about a company made up of text data generated by individuals, and constitutes a knowledge space about the company. The processing described above compresses and summarizes this knowledge space, without using any language dictionary, and converts it into a partial knowledge space. The result is a dictionary (a partial knowledge space about the company) consisting of a set of decomposed text data (related terms) by which the company information can be identified. As described later, this partial knowledge space includes information representing the connections between related terms.

In the conversation control terminal device 2″, conditions for collecting the external log 502 are given (for example, by the user of the device), and related terms are provided to the device as the result of the processing described above (sentence analysis processing 511, preference analysis processing 512, and topic analysis processing 513). The conversation control terminal device 2″ is, for example, a PC (personal computer), a smartphone, or a robot. If it is a PC, the resulting related terms are shown on its display and are provided to its user as information for grasping topics instantly. The conversation control terminal device 2″ is configured as a variation of the conversation control terminal device 2 or the conversation control terminal device 2′ described above.

The sentence analysis processing 511, preference analysis processing 512, and topic analysis processing 513 described above are performed by the topic providing server 4′, which is configured as a variation of the topic providing server 4 described above.

<< Overview of sentence analysis processing >>
The sentence analysis processing 511 analyzes the sentence information contained in the external log 502 on the basis of the appearance characteristics of character strings, and selects related terms 503.

The sentence analysis processing 511 selects (extracts) related terms that can identify topics from the external log 502 without using dictionary data stored and prepared in advance, such as morpheme data. Specifically, it searches for character strings that occur repeatedly in the external log 502 and extracts related terms according to how varied the characters immediately preceding those strings are and how varied the characters immediately following them are.
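A minimal Python sketch of dictionary-free extraction along these lines follows. It counts, for every repeated substring, the distinct characters seen immediately before and after it, and keeps substrings whose neighbors vary on both sides. The function name, the length limits, and the thresholds are all assumptions chosen for illustration; the specification does not give concrete criteria.

```python
from collections import defaultdict


def select_related_terms(texts, min_len=2, max_len=6, min_count=2, min_variety=2):
    """Pick repeated substrings whose adjacent characters vary on both sides.

    All thresholds are illustrative assumptions, not values from the source.
    """
    left = defaultdict(set)    # distinct characters seen just before each string
    right = defaultdict(set)   # distinct characters seen just after each string
    count = defaultdict(int)
    for text in texts:
        padded = "\0" + text + "\0"   # sentinels stand for the text boundaries
        n = len(padded)
        for start in range(1, n - 1):
            for length in range(min_len, max_len + 1):
                end = start + length
                if end > n - 1:
                    break
                s = padded[start:end]
                count[s] += 1
                left[s].add(padded[start - 1])
                right[s].add(padded[end])
    return sorted(
        s for s in count
        if count[s] >= min_count
        and len(left[s]) >= min_variety
        and len(right[s]) >= min_variety
    )


# Toy external log: "panic" recurs with different neighbors on each side.
terms = select_related_terms(["xpanicq", "ypanicr", "zpanics"])
```

In this toy input, fragments such as "pani" are always followed by the same character and are therefore rejected, while "panic" is kept because both of its neighbors differ across occurrences.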

As noted above, the external log 502 includes hearsay information created by individuals (for example, data stored in a predetermined log format, text data of WEB pages (home pages) and blogs published on the Internet, and tweets on TWITTER (registered trademark)), as well as data generated and edited in advance by any organization and text information in databases. It may also be other kinds of data, such as text data obtained from audio or video files through speech recognition.

The external log 502 is data collected according to collection conditions. It may be, for example, text data on a WEB page (home page) 501 returned as a keyword search result, text written in the blogs of users having a certain attribute, or tweets on TWITTER. The user can specify the search conditions and the like for a keyword search from the conversation control terminal device 2″. A single external log 502 may contain multiple text files (for example, the multiple WEB pages (HTML files) of one WEB site), or may be one part of a divided text file (for example, one of the parts obtained by dividing the text in one file every 10,000 lines).

<< Overview of preference analysis processing >>
The preference analysis processing 512 uses the internal log 506 to determine how the related terms extracted by the sentence analysis processing 511 are used, and judges their importance.

The internal log 506 is data indicating the preferences of a user (including the institution or organization to which the user belongs) and is, for example, data stored in a predetermined log format. It includes, for example, data indicating which related terms the user tends to use. In this specification, related terms that are associated with one another by importance according to the user's preferences in this way are called a topic key (cluster) 504.
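One simple way to picture this step is to rank the extracted related terms by how often they appear in the internal log. The sketch below does exactly that; the function `rank_by_preference` and the log format (a flat list of terms the user has used) are hypothetical simplifications, since the specification only says the internal log is stored in a predetermined format.

```python
from collections import Counter


def rank_by_preference(related_terms, internal_log):
    """Order related terms by how often the user has used them.

    internal_log is assumed to be a list of terms used by the user.
    Ties are broken alphabetically so the result is deterministic.
    """
    candidates = set(related_terms)
    usage = Counter(term for term in internal_log if term in candidates)
    return sorted(candidates, key=lambda term: (-usage[term], term))


extracted = ["panic", "relieved", "most scary"]
log = ["relieved", "panic", "relieved", "weather"]
ranked = rank_by_preference(extracted, log)
```

Terms that never occur in the log stay in the result with a usage of zero, so they are ranked last rather than dropped.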

<< Overview of topic analysis processing >>
The topic analysis processing 513 uses the topic material 507 to determine the distribution of the topic keys 504 generated by the preference analysis processing 512, and provides the user with the distribution of the mutually associated related terms.

As described above, the topic material 507 is either entered and set directly by the contractor's person in charge with the input device of the maintenance device 3, or extracted automatically by the topic providing server 4′ from the external log 502 collected from outside (for example, data that can be collected over a network, such as TWITTER or blogs), based on keywords entered by the person in charge at the input device of the maintenance device 3.

This topic analysis processing 513 can show how related terms are distributed within topics, and can also recommend related terms suited to the user of the conversation control terminal device 2″.

<<< Overview of the Information Search System >>>
Next, an overview of the information search system is given with reference to FIG. 20. The information search system 100 shown in FIG. 20 includes the conversation control terminal device 2″ and the topic providing server 4′, which are connected by a predetermined network (LAN, Internet, WAN, wireless communication, or the like).

<< Overview of the conversation control terminal device 2″ >>
The conversation control terminal device 2″ includes an input control unit 21, a search control unit 22, a transmission control unit 23, a reception control unit 24, a response information determination unit 25, an output control unit 26, and a network interface (I/F) unit 27. It stores scenario data 28 in a main storage device such as RAM or in an external storage device such as a hard disk or semiconductor memory.

The input control unit 21 accepts input that the user of the conversation control terminal device 2″ makes with a keyboard, mouse, or the like, and passes the input data to the corresponding functional unit according to the content of the input. For example, the user types a search keyword on the keyboard or clicks, with the mouse, the display area of a related term of interest.

検索制御部２２は、会話制御端末装置２’’で動作する一般的なＷＥＢブラウザを含む。会話制御端末装置２’’は例えばインターネットに接続されており、ユーザがこのＷＥＢブラウザを操作して、ＷＥＢページの検索（一般的に利用可能なインターネット検索）を行うと、検索制御部２２は、得られた検索結果を送信制御部２３に送信する。検索結果には、検索キーワードに関連するＷＥＢページのアドレス（例えば、ＵＲＬ等のインターネットアドレス識別情報）が含まれている。 The search control unit 22 includes a general WEB browser that operates on the conversation control terminal device 2 ″. The conversation control terminal device 2 ″ is connected to the Internet, for example, and when the user operates the WEB browser to search a WEB page (generally available Internet search), the search control unit 22 The obtained search result is transmitted to the transmission control unit 23. The search result includes the address of the WEB page related to the search keyword (for example, Internet address identification information such as URL).

送信制御部２３は、検索制御部２２から検索結果を受信すると、これを、例えば、ＡＰＩ送信により、入力情報として、話題提供サーバ４’の入力情報分析部４１に送信する。 Upon receiving the search result from the search control unit 22, the transmission control unit 23 transmits this as input information to the input information analysis unit 41 of the topic providing server 4 ′ by API transmission, for example.

受信制御部２４は、話題提供サーバ４’の入力情報分析部４１から送信される入力特定情報等を受信し、これを応答情報決定部２５に供給する。 The reception control unit 24 receives the input specifying information transmitted from the input information analysis unit 41 of the topic providing server 4 ′ and supplies it to the response information determination unit 25.

応答情報決定部２５は、シナリオデータ２８と入力特定情報とに基づいて応答情報を決定する。入力情報分析部４１から、入力特定情報（例えば、関連詞の分布を表示するためのデータ）と当該表示に必要であると判断されたシナリオデータとに基づいて応答情報を決定する。 The response information determination unit 25 determines response information based on the scenario data 28 and the input specifying information. That is, it determines the response information based on the input specifying information received from the input information analysis unit 41 (for example, data for displaying the distribution of related terms) and the scenario data determined to be necessary for that display.

出力制御部２６は、応答情報決定部２５により決定された応答情報を会話制御端末装置２’’に表示するよう制御する。 The output control unit 26 performs control so that the response information determined by the response information determination unit 25 is displayed on the conversation control terminal device 2 ″.

ネットワークインタフェース部２７は、ネットワークを介して接続された話題提供サーバ４’との間のアクセスやデータ送受信、及びその他のコンピュータ（例えば、インターネットを介して接続されるインターネット検索エンジンを備えるサーバ等）との間のアクセスやデータ送受信を制御する。 The network interface unit 27 controls access to and data transmission/reception with the topic providing server 4' connected via the network, as well as access to and data transmission/reception with other computers (for example, a server having an Internet search engine and connected via the Internet).

シナリオデータ２８は、図１４に示すような、ユーザに提供する話題に関する応答情報を規定するためデータである。シナリオデータ２８は、後述する話題提供サーバ４’のシナリオデータ５５に予め記憶されているデータである。話題提供サーバ４’の入力情報分析部４１によって生成された入力特定情報に基づいて必要であると判断されたシナリオデータが、シナリオデータ５５から抽出され、抽出されたシナリオデータが、会話制御端末装置２’’のシナリオデータ２８に記憶される。シナリオデータ５５から抽出されたシナリオデータは、受信制御部２４と応答情報決定部２５の処理によってシナリオデータ２８に記憶される。 The scenario data 28 is data for defining response information on the topics provided to the user, as shown in FIG. 14. The scenario data 28 is data stored in advance in the scenario data 55 of the topic providing server 4' described later. Scenario data determined to be necessary based on the input specifying information generated by the input information analysis unit 41 of the topic providing server 4' is extracted from the scenario data 55, and the extracted scenario data is stored in the scenario data 28 of the conversation control terminal device 2''. The scenario data extracted from the scenario data 55 is stored in the scenario data 28 through the processing of the reception control unit 24 and the response information determination unit 25.

話題提供サーバ４’のシナリオデータ５５にすべてのシナリオデータを記憶させておき、異なる話題への遷移を規定する情報に基づいて、シナリオデータ５５のシナリオデータから組み替え直したシナリオデータを生成し、組み替え直されたシナリオデータのみを会話制御端末装置２’’のシナリオデータ２８に記憶することができる。 All scenario data can be kept in the scenario data 55 of the topic providing server 4', rearranged scenario data can be generated from the scenario data 55 based on information that defines transitions to different topics, and only the rearranged scenario data can be stored in the scenario data 28 of the conversation control terminal device 2''.

＜＜話題提供サーバ４’の概要＞＞ << Overview of Topic Providing Server 4' >>
話題提供サーバ４’は、入力情報分析部４１、及びネットワークインタフェース（Ｉ／Ｆ）部４７を含む。また、ＲＡＭのような主記憶装置、またはハードディスクや半導体メモリのような外部記憶装置に、検索結果データ４８、関連詞候補データ４９、関連詞辞書５０、嗜好データ５１、関連詞・共起語データ５２、話題データ５３、比較結果データ５４、及びシナリオデータ５５が記憶される。なお、これらのデータは、様々なデータフォーマット、データ記憶形式をとることができる。 The topic providing server 4' includes an input information analysis unit 41 and a network interface (I/F) unit 47. In addition, search result data 48, related term candidate data 49, a related term dictionary 50, preference data 51, related term / co-occurrence word data 52, topic data 53, comparison result data 54, and scenario data 55 are stored in a main storage device such as a RAM, or in an external storage device such as a hard disk or a semiconductor memory. These data can take various data formats and data storage formats.

入力情報分析部４１は、会話制御端末装置２’’から受信した入力情報を分析して入力特定情報を生成する。入力特定情報は、入力情報に含まれる各種の情報を分析した結果、生成される情報であり、例えば、後述する関連詞の分布などが含まれる。入力情報分析部４１はさらに、外部ログ取得制御部４２、文解析部４３、嗜好解析部４４、話題解析部４５、及び情報更新部４６を含む。 The input information analysis unit 41 analyzes the input information received from the conversation control terminal device 2 ″ to generate input specifying information. The input specifying information is information generated as a result of analyzing various types of information included in the input information, and includes, for example, the distribution of related terms described later. The input information analysis unit 41 further includes an external log acquisition control unit 42, a sentence analysis unit 43, a preference analysis unit 44, a topic analysis unit 45, and an information update unit 46.

外部ログ取得制御部４２は、会話制御端末装置２’’から受信した入力情報が、外部ログ５０２を識別する識別情報（例えば、検索キーワードに関連するＷＥＢページ５０１のアドレスを含む検索結果）である場合に、例えば、インターネット経由でその識別情報にアクセスし、対応するＨＴＭＬデータ等を取得する。また、会話制御端末装置２’’から受信した入力情報が、関連詞を抽出する対象となるテキストデータそのものを含んでいる場合は、そのデータを外部ログ５０２として文解析部４３に提供する。 When the input information received from the conversation control terminal device 2'' is identification information that identifies the external log 502 (for example, a search result including the address of a WEB page 501 related to the search keyword), the external log acquisition control unit 42 accesses the location indicated by that identification information, for example via the Internet, and acquires the corresponding HTML data or the like. When the input information received from the conversation control terminal device 2'' contains the text data itself from which related terms are to be extracted, the unit provides that data to the sentence analysis unit 43 as the external log 502.
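
The step from acquired HTML data to text data for the external log 502 can be sketched as follows. This is a minimal illustration that assumes Python's standard `html.parser` module; the class `_TextExtractor` and the function `html_to_text` are hypothetical names, and the description does not state how the text is actually extracted. Fetching the page over the network is omitted.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> content (assumed policy)."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one text fragment per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

For example, `html_to_text("<p>Hello</p><script>var x = 1;</script><p>World</p>")` yields the two fragments `Hello` and `World` on separate lines.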

また、所定の間隔で自動起動されるクローラー（図３９参照）から、外部ログ５０２を識別する情報（例えば、検索キーワードに関連するＷＥＢページ５０１のアドレスを含む検索結果）を受信した場合は、当該外部ログ５０２にアクセスし、対応するデータを取得して、取得したデータを、関連詞辞書を比較するために情報更新部４６に提供する。 When information identifying the external log 502 (for example, a search result including the address of a WEB page 501 related to the search keyword) is received from a crawler (see FIG. 39) that is started automatically at predetermined intervals, the external log acquisition control unit 42 accesses that external log 502, acquires the corresponding data, and provides the acquired data to the information update unit 46 so that related term dictionaries can be compared.

文解析部４３は、外部ログ取得制御部４２によって取得された外部ログ５０２からテキストデータを取得し、文字列の出現特性に応じて当該テキストデータに含まれる重要な関連詞を抽出し、関連詞辞書５０に記憶する。 The sentence analysis unit 43 acquires the text data from the external log 502 acquired by the external log acquisition control unit 42, extracts important related terms included in the text data according to the appearance characteristics of the character string, and the related terms Store in dictionary 50.

嗜好解析部４４は、文解析部４３によって関連詞辞書５０に記憶された関連詞について、嗜好データ５１に基づいて重要性を判定し、判定結果を関連詞・共起語データ５２に記憶する。嗜好データ５１は、ユーザによる関連詞の利用態様を記憶した内部ログ５０６を含むデータである。 The preference analysis unit 44 determines the importance of the related terms stored in the related term dictionary 50 by the sentence analysis unit 43 based on the preference data 51 and stores the determination result in the related term / co-occurrence word data 52. The preference data 51 is data including an internal log 506 that stores the usage of the related terms by the user.

話題解析部４５は、嗜好解析部４４によって生成された関連詞・共起語データ５２に記憶された関連詞について、話題データ５３に基づいて、その分布を捉え、互いに関連詞を関連付け、関連詞・共起語データ５２を更新する。話題データ５３は、契約者の担当者がインプットして設定、または自動的に抽出された話題ネタ５０７を含むデータである。 The topic analysis unit 45 captures the distribution of the related terms stored in the related term / co-occurrence word data 52 generated by the preference analysis unit 44 based on the topic data 53 and associates the related terms with each other. Update the co-occurrence word data 52. The topic data 53 is data including a topic material 507 that is input and set by the person in charge of the contractor or is automatically extracted.

情報更新部４６は、異なる収集条件により収集されたテキストデータに基づいて、それぞれ関連詞を選出して関連詞辞書を生成し、こうして生成された関連詞辞書を比較し、比較結果を比較結果データ５４に記憶する。 The information update unit 46 selects related terms from each set of text data collected under different collection conditions and generates a related term dictionary for each, compares the related term dictionaries thus generated, and stores the comparison result in the comparison result data 54.

ネットワークインタフェース部４７は、ネットワークを介して接続された会話制御端末装置２’’との間のアクセスやデータ送受信、及びその他のコンピュータ（例えば、インターネットを介して接続されるインターネット検索エンジンを備えるサーバ等）との間のアクセスやデータ送受信を制御する。 The network interface unit 47 controls access to and data transmission/reception with the conversation control terminal device 2'' connected via the network, as well as access to and data transmission/reception with other computers (for example, a server having an Internet search engine and connected via the Internet).

なお、この実施例では、情報検索システム１００を、会話制御端末装置２’’、及び話題提供サーバ４’を含むシステムとして説明したが、会話制御端末装置２’’、及び話題提供サーバ４’を一体化させた１つのコンピュータとして構成することもできる。また逆に、同様の機能を、ネットワーク接続された３つ以上のコンピュータに分散させて実現することもできる。 In this embodiment, the information search system 100 has been described as a system including the conversation control terminal device 2 ″ and the topic providing server 4 ′. However, the conversation control terminal device 2 ″ and the topic providing server 4 ′ are It can also be configured as one integrated computer. Conversely, the same function can also be realized by distributing it to three or more computers connected to the network.

＜文解析部の概要＞ <Overview of the Sentence Analysis Unit>
次に、図２１を参照して、文解析部４３の概要について説明する。文解析部４３では、テキストデータから同じ文字列を検索し、当該検索された同じ文字列についてそれぞれ、前の隣接文字の異なり度合いと後の隣接文字の異なり度合いを判定し、その判定された異なり度合いに基づいて、その検索された「同じ文字列」が、話題に関して重要性が高く、テキストデータを意味識別可能な関連詞であるか否かを決定する。前の隣接文字の異なり度合いとは、検索された「同じ文字列」の直前に出現する文字が、どの程度異なっているかを示す指標である。同様に、後の隣接文字の異なり度合いとは、検索された「同じ文字列」の直後に出現する文字が、どの程度異なっているかを示す指標である。 Next, an overview of the sentence analysis unit 43 will be described with reference to FIG. 21. The sentence analysis unit 43 searches the text data for occurrences of the same character string, determines for those occurrences the degree of difference of the preceding adjacent characters and the degree of difference of the following adjacent characters, and, based on the determined degrees, decides whether the retrieved "same character string" is a related term, that is, a term that is highly important to the topic and by which the meaning of the text data can be identified. The degree of difference of the preceding adjacent characters is an index of how varied the characters appearing immediately before the retrieved "same character string" are. Similarly, the degree of difference of the following adjacent characters is an index of how varied the characters appearing immediately after the retrieved "same character string" are.

ここで、例えば、検索された「同じ文字列」のうち、前の隣接文字の異なり度合いと後の隣接文字の異なり度合いが大きい文字列が、関連詞として決定される。このようにして決定された１つまたは複数の文字列は、必要に応じて、所定の記憶手段に記憶される。 Here, for example, among the searched “same character strings”, a character string having a large difference between the preceding adjacent characters and a large difference between the subsequent adjacent characters is determined as a related term. One or more character strings determined in this way are stored in a predetermined storage unit as necessary.

このような文字列の抽出は、テキストデータに含まれる複数の同じ文字列に注目したときに、それぞれの文字列の直前に位置する文字として多くのバリエーションの文字が出現するとともに、それぞれの文字列の直後に位置する文字として多くのバリエーションの文字が出現するという出現特性が認められる場合、その文字列が、独立した、よく用いられる用語である、との考えに基づくものである。このように、本発明の文解析部４３では、文字列の運動学（kinematics）を基礎におく考えに基づいて文字列が抽出される。 This kind of character string extraction is based on the following idea: when attention is paid to multiple occurrences of the same character string in the text data, if many different characters appear immediately before the occurrences and many different characters appear immediately after them, that character string is an independent, frequently used term. In this way, the sentence analysis unit 43 of the present invention extracts character strings based on an idea grounded in the kinematics of character strings.

ここで、「いろは」という文字列が１００回出現する日本語テキストデータを仮定すると、この文字列「いろは」を１００個検索し、それぞれの文字列「いろは」について、直前の文字が何かを調べる。その結果、「あ」や「い」を含む３０通りの文字が出現するという事実が得られるものとする。このことは、例えば、「・・・あいろは・・・」や「・・・いいろは・・・」といった表現が、上記の日本語テキストデータに存在するということを示している。一方、それぞれの文字列「いろは」について、直後の文字が何かを調べる。その結果、「わ」や「ん」を含む２０通りの文字が出現するという事実が得られるものとする。このことは、例えば、「・・・いろはわ・・・」や「・・・いろはん・・・」といった表現が、上記の日本語テキストデータに存在するということを示している。 Here, assume Japanese text data in which the character string "Iroha" appears 100 times. All 100 occurrences of "Iroha" are retrieved, and for each occurrence the immediately preceding character is examined. Suppose this shows that 30 different characters, including "a" and "i", appear in that position. This means that expressions such as "... a-Iroha ..." and "... i-Iroha ..." exist in the Japanese text data. Next, for each occurrence of "Iroha", the immediately following character is examined. Suppose this shows that 20 different characters, including "wa" and "n", appear in that position. This means that expressions such as "... Iroha-wa ..." and "... Iroha-n ..." exist in the Japanese text data.

この場合、前の隣接文字の異なり度合いは、例えば、「あ」や「い」を含む３０通りというバリエーションの数に基づいて判定され、後の隣接文字の異なり度合いは、例えば、「わ」や「ん」を含む２０通りというバリエーションの数に基づいて判定される。ここで、前の隣接文字の異なり度合いと後の隣接文字の異なり度合いが大きいと判定された場合は、「いろは」という文字列の前後の文字が大きな多様性をもって変化しており、これによって文字列「いろは」が、独立した用語であって重要性の高い語である可能性が高いと判断され、関連詞として決定され、必要に応じて記憶手段に記憶される。前の隣接文字の異なり度合いと後の隣接文字の異なり度合いが大きいか否かは、共通の、または個別の判断基準により判断される。 In this case, the degree of difference of the preceding adjacent characters is determined based on, for example, the number of variations, namely the 30 characters including "a" and "i", and the degree of difference of the following adjacent characters is determined based on, for example, the number of variations, namely the 20 characters including "wa" and "n". If both degrees of difference are determined to be large, the characters before and after the character string "Iroha" vary with great diversity. The character string "Iroha" is therefore judged likely to be an independent term of high importance, is determined to be a related term, and is stored in the storage means as necessary. Whether the two degrees of difference are large is judged using a common criterion or separate criteria.
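
The "Iroha" example can be sketched in code as follows. This is a minimal illustration, assuming that the degree of difference is simply the number of distinct adjacent characters and that a fixed threshold decides whether it is "large". The function names and the threshold parameters `min_before` and `min_after` are hypothetical; the description leaves the actual criteria open.

```python
from typing import Tuple


def neighbor_diversity(text: str, term: str) -> Tuple[int, int, int]:
    """Return (occurrences, distinct preceding chars, distinct following chars)."""
    before, after = set(), set()
    count = 0
    start = text.find(term)
    while start != -1:
        count += 1
        if start > 0:
            before.add(text[start - 1])      # character immediately before
        end = start + len(term)
        if end < len(text):
            after.add(text[end])             # character immediately after
        start = text.find(term, start + 1)   # continue, allowing overlaps
    return count, len(before), len(after)


def is_related_term(text: str, term: str,
                    min_before: int = 2, min_after: int = 2) -> bool:
    """Assumed rule: both neighbor varieties must reach their thresholds."""
    _, n_before, n_after = neighbor_diversity(text, term)
    return n_before >= min_before and n_after >= min_after
```

With the figures from the example, a string with 30 distinct preceding characters and 20 distinct following characters would pass any thresholds up to 30 and 20 respectively; separate thresholds correspond to the "separate criteria" mentioned above.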

文解析部４３は、テキストデータ取得処理部４３ａ、文字列検索処理部４３ｂ、異なり度合い判定処理部４３ｃ、及び関連詞決定処理部４３ｄを備える。さらに、関連詞決定処理部４３ｄには、関連詞決定部４３ｄ−１、及びランク付け管理部４３ｄ−２が含まれる。 The sentence analysis unit 43 includes a text data acquisition processing unit 43a, a character string search processing unit 43b, a different degree determination processing unit 43c, and a related term determination processing unit 43d. Furthermore, the related term determination processing unit 43d includes a related term determination unit 43d-1 and a ranking management unit 43d-2.

テキストデータ取得処理部４３ａは、外部ログ５０２（処理の対象となるテキストデータ）を取得し、これを文字列検索処理部４３ｂに提供する（後述の図２５に示すテキストデータ取得処理５２０）。文字列検索処理部４３ｂは、図２５に示す文字列検索処理５３０を行う。異なり度合い判定処理部４３ｃは、図２５に示す異なり度合い判定処理５４０を行う。 The text data acquisition processing unit 43a acquires the external log 502 (text data to be processed) and provides it to the character string search processing unit 43b (text data acquisition processing 520 shown in FIG. 25 described later). The character string search processing unit 43b performs a character string search process 530 shown in FIG. The difference degree determination processing unit 43c performs a difference degree determination process 540 shown in FIG.

関連詞決定処理部４３ｄは、関連詞を決定し、必要に応じて、決定された関連詞を関連詞辞書５０に記憶する（図２５に示す関連詞決定処理５５０）。 The related term determination processing unit 43d determines a related term, and stores the determined related term in the related term dictionary 50 as necessary (related term determination processing 550 shown in FIG. 25).

また、関連詞決定部４３ｄ−１は、外部ログ５０２に含まれる同じ文字列に関する前後の隣接文字の異なり度合い等から、当該同じ文字が関連詞であるか否かを決定する。ランク付け管理部４３ｄ−２は、１つの外部ログ５０２において、複数の関連詞が決定される場合に、必要に応じてその関連詞についてランク付けを行う。 Also, the related term determining unit 43d-1 determines whether or not the same character is a related term from the degree of difference between adjacent characters before and after the same character string included in the external log 502. When a plurality of related terms are determined in one external log 502, the ranking management unit 43d-2 ranks the related terms as necessary.
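
The ranking performed by the ranking management unit 43d-2 could look like the sketch below. The description does not state the ranking criterion, so the rule used here (order by the smaller of the two neighbor varieties, then by occurrence count) is purely an assumption for illustration, as is the function name `rank_related_terms`.

```python
def rank_related_terms(stats):
    """Rank terms given stats: term -> (count, n_before, n_after).

    Assumed criterion: larger min(n_before, n_after) first, then larger
    count, then the term itself to make the order deterministic.
    """
    return sorted(
        stats,
        key=lambda term: (-min(stats[term][1], stats[term][2]),
                          -stats[term][0],
                          term),
    )
```

Other criteria, such as weighting by occurrence frequency alone, would fit the description equally well.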

＜情報更新部の概要＞ <Overview of the Information Update Unit>
次に、図２２を参照して、情報更新部４６の概要について説明する。情報更新部４６は、テキストデータ取得処理部４６ａ、文字列抽出処理部４６ｂ、辞書比較処理部４６ｃ、及び比較結果出力部４６ｄを備える。 Next, an overview of the information update unit 46 will be described with reference to FIG. 22. The information update unit 46 includes a text data acquisition processing unit 46a, a character string extraction processing unit 46b, a dictionary comparison processing unit 46c, and a comparison result output unit 46d.

テキストデータ取得処理部４６ａは、外部ログ５０２（処理の対象となるテキストデータ）を取得し、これを文字列抽出処理部４６ｂに提供する（後述の、図３９に示すテキストデータ取得処理７００）。文字列抽出処理部４６ｂは、外部ログ５０２から関連詞を抽出し、これを、対応する関連詞辞書５０に記憶する（図３９に示す文字列抽出処理７１０）。文字列抽出処理部４６ｂは、例えば、上述した文解析部４３による処理と同様の処理である。 The text data acquisition processing unit 46a acquires the external log 502 (the text data to be processed) and provides it to the character string extraction processing unit 46b (text data acquisition processing 700 shown in FIG. 39, described later). The character string extraction processing unit 46b extracts related terms from the external log 502 and stores them in the corresponding related term dictionary 50 (character string extraction processing 710 shown in FIG. 39). The processing of the character string extraction processing unit 46b is, for example, similar to the processing by the sentence analysis unit 43 described above.

辞書比較処理部４６ｃは、複数の関連詞辞書５０を比較し、比較結果を比較結果データ５４に記憶する（図３９に示す辞書比較処理７２０）。 The dictionary comparison processing unit 46c compares a plurality of related terminology dictionaries 50 and stores the comparison result in the comparison result data 54 (dictionary comparison processing 720 shown in FIG. 39).
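
The dictionary comparison can be sketched as a set comparison between two related term dictionaries. This is a minimal illustration under the assumption that each dictionary can be reduced to a collection of term strings; the function name `compare_dictionaries` and the keys of the result are hypothetical and do not reflect the actual layout of the comparison result data 54.

```python
def compare_dictionaries(old_terms, new_terms):
    """Compare two related term dictionaries given as iterables of terms."""
    old_set, new_set = set(old_terms), set(new_terms)
    return {
        "added": sorted(new_set - old_set),    # only in the newer dictionary
        "removed": sorted(old_set - new_set),  # only in the older dictionary
        "common": sorted(old_set & new_set),   # present in both
    }
```

Terms that appear only in the dictionary built from the more recent collection are natural candidates for the "unknown latest topics" that the system aims to surface.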

比較結果出力部４６ｄは、比較結果データ５４から表示すべき比較結果を取得し、これを含む入力特定情報を会話制御端末装置２’’に送信する。 The comparison result output unit 46d acquires the comparison result to be displayed from the comparison result data 54, and transmits input specifying information including that comparison result to the conversation control terminal device 2''.

＜＜＜本発明の情報検索システムをＦＡＱ検索に適用した実施例の説明＞＞＞ <<< Description of an Embodiment in Which the Information Search System of the Present Invention Is Applied to FAQ Search >>>
次に、本発明の一実施形態に係る情報検索システムを用いて、ユーザの指示に応じてＦＡＱ検索の結果を表示するＦＡＱ検索システムについて説明する。 Next, an FAQ search system that uses the information search system according to an embodiment of the present invention to display FAQ search results in response to user instructions will be described.

＜＜ＦＡＱ検索システムの画面遷移＞＞ << Screen Transitions of the FAQ Search System >>
図２３には、ＦＡＱ検索システムの画面遷移が示されている。ユーザは最初に、会話制御端末装置２’’において所定の指示を行い、ディスプレイにＦＡＱ検索画面６００を表示させて、そこで所望の検索キーワードを（キーボード等を用いて）入力する。ＦＡＱ検索画面６００は、例えば、図３５（Ａ）に示すような入力指示画面であり、ＦＡＱ検索画面６００には、検索キーワード入力部６０１と「ＦＡＱ検索」ボタン６０２が表示されている。 FIG. 23 shows the screen transitions of the FAQ search system. The user first gives a predetermined instruction on the conversation control terminal device 2'' to display the FAQ search screen 600 on the display, and enters a desired search keyword there (using a keyboard or the like). The FAQ search screen 600 is, for example, an input screen as shown in FIG. 35(A), on which a search keyword input unit 601 and an "FAQ search" button 602 are displayed.

ユーザがここで、検索キーワード入力部６０１に検索キーワード（図３５（Ａ）の例では、「ネットワーク」）を入力し、「ＦＡＱ検索」ボタン６０２をマウス等でクリックすると、ＦＡＱ候補表示画面６１０が表示される。ＦＡＱ候補表示画面６１０は、例えば、図３５（Ｂ）に示すような表示画面であり、関連詞索引表示部６１１、候補質問文表示部６１２、及び「ＦＡＱ検索画面に戻る」ボタン６１３が表示されている。候補質問文表示部６１２に示された質問は、すべて「ネットワーク」に関するもので、ユーザが入力した検索キーワードに基づいて検索された結果が表示されている。関連詞索引表示部６１１に示された関連詞の集合は、それぞれ対応する質問に含まれる関連詞の集合である。ユーザがここで、「ＦＡＱ検索画面に戻る」ボタン６１３をクリックすると、会話制御端末装置２’’のディスプレイの表示がＦＡＱ検索画面６００に戻る。 When the user inputs a search keyword (“network” in the example of FIG. 35A) to search keyword input unit 601 and clicks “FAQ search” button 602 with a mouse or the like, FAQ candidate display screen 610 is displayed. Is displayed. The FAQ candidate display screen 610 is, for example, a display screen as shown in FIG. 35B, and displays a related term index display portion 611, a candidate question sentence display portion 612, and a “return to FAQ search screen” button 613. ing. The questions shown in the candidate question sentence display unit 612 are all related to “network”, and the search results based on the search keyword input by the user are displayed. The set of related terms shown in the related term index display unit 611 is a set of related terms included in the corresponding question. When the user clicks a “return to FAQ search screen” button 613 here, the display on the display of the conversation control terminal device 2 ″ returns to the FAQ search screen 600.

ＦＡＱ候補表示画面６１０において、ユーザが候補質問文表示部６１２に表示された候補質問文のうちの１つをマウスのクリック等によって選択すると（矢印（１））、ＦＡＱ表示画面６３０が表示される。ＦＡＱ表示画面６３０は、例えば、図３６に示すような表示画面であり、質問表示部６３１、関連詞索引表示部６３２、回答表示部６３３、及び「ＦＡＱ候補表示画面に戻る」ボタン６３４が表示されている。ユーザがここで、「ＦＡＱ候補表示画面に戻る」ボタン６３４をクリックすると、会話制御端末装置２’’のディスプレイの表示がＦＡＱ候補表示画面６１０に戻る。 In the FAQ candidate display screen 610, when the user selects one of the candidate question sentences displayed in the candidate question sentence display unit 612 by clicking the mouse or the like (arrow (1)), the FAQ display screen 630 is displayed. . The FAQ display screen 630 is, for example, a display screen as shown in FIG. 36, and displays a question display portion 631, a related term index display portion 632, an answer display portion 633, and a “return to FAQ candidate display screen” button 634. ing. When the user clicks a “return to FAQ candidate display screen” button 634 here, the display on the conversation control terminal device 2 ″ returns to the FAQ candidate display screen 610.

ＦＡＱ候補表示画面６１０において、ユーザが関連詞索引表示部６１１に表示された関連詞索引のうちの１つをマウスのクリック等によって選択すると（矢印（２））、関連詞・共起語一覧表示画面６５０が表示される。関連詞・共起語一覧表示画面６５０は、例えば、図３７に示すような表示画面であり、ＮＯ表示部６５１、関連詞表示部６５２、近傍関連詞表示部（６５３〜６５６）、及び「ＦＡＱ候補表示画面に戻る」ボタン６５７が表示されている。ユーザがここで、「ＦＡＱ候補表示画面に戻る」ボタン６５７をクリックすると、会話制御端末装置２’’のディスプレイの表示がＦＡＱ候補表示画面６１０に戻る。 When the user selects one of the related term indexes displayed on the related term index display unit 611 by clicking the mouse on the FAQ candidate display screen 610 (arrow (2)), the related term / co-occurrence word list is displayed. A screen 650 is displayed. The related term / co-occurrence word list display screen 650 is, for example, a display screen as shown in FIG. 37, and includes a NO display unit 651, a related term display unit 652, a neighborhood related term display unit (653-656), and “FAQ”. A “return to candidate display screen” button 657 is displayed. Here, when the user clicks a “return to FAQ candidate display screen” button 657, the display on the conversation control terminal device 2 ″ returns to the FAQ candidate display screen 610.

関連詞・共起語一覧表示画面６５０において、ユーザが関連詞表示部６５２または近傍関連詞表示部（６５３〜６５６）に表示された関連詞のうちの１つをマウスのクリック等によって選択すると、ＦＡＱ検索画面６６０が表示される。ＦＡＱ検索画面６６０は、例えば、図３８（Ｂ）に示すような表示画面であり、これは、図３５（Ａ）に示すＦＡＱ検索画面６００と実質的に同様のものであり、画面制御が元に戻ったことを示している。そして、ＦＡＱ検索画面６６０の検索キーワード入力部６６１には、検索キーワードとして、関連詞・共起語一覧表示画面６５０で選択した関連詞（例えば、図３７の例では、「ＳＮＳ」）が、自動的にセットされる。 On the related term / co-occurrence word list display screen 650, when the user selects one of the related terms displayed on the related term display unit 652 or the neighborhood related term display unit (653-656) by clicking a mouse or the like, The FAQ search screen 660 is displayed. The FAQ search screen 660 is a display screen as shown in FIG. 38B, for example, which is substantially the same as the FAQ search screen 600 shown in FIG. It shows that it has returned to. In the search keyword input unit 661 of the FAQ search screen 660, as a search keyword, the related term (for example, “SNS” in the example of FIG. 37) selected on the related term / co-occurrence word list display screen 650 is automatically displayed. Is set automatically.

ユーザがこの状況で、「ＦＡＱ検索」ボタン６６２をクリックすると、再び、ＦＡＱ候補表示画面６１０が表示され、今度は、「ＳＮＳ」に関する質問文が、候補質問文表示部６１２に示される。 When the user clicks the “FAQ search” button 662 in this situation, the FAQ candidate display screen 610 is displayed again, and the question sentence regarding “SNS” is displayed in the candidate question sentence display unit 612 this time.
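
The screen transitions described above can be summarized as a small transition table. This is a sketch only: the screen identifiers reuse the reference numerals from the description, while the event names and the functions `next_screen` and `run` are hypothetical.

```python
# (current screen, user event) -> next screen
TRANSITIONS = {
    ("search_600", "click_faq_search"): "candidates_610",
    ("candidates_610", "click_back"): "search_600",
    ("candidates_610", "select_question"): "faq_630",       # arrow (1)
    ("faq_630", "click_back"): "candidates_610",
    ("candidates_610", "select_index"): "terms_650",        # arrow (2)
    ("terms_650", "click_back"): "candidates_610",
    ("terms_650", "select_term"): "search_660",             # keyword pre-set
    ("search_660", "click_faq_search"): "candidates_610",
}


def next_screen(screen, event):
    """Return the screen shown after `event`; stay put if no transition exists."""
    return TRANSITIONS.get((screen, event), screen)


def run(events, start="search_600"):
    """Apply a sequence of user events and return the final screen."""
    screen = start
    for event in events:
        screen = next_screen(screen, event)
    return screen
```

The path search, select a related term index, select a related term, search again corresponds to `run(["click_faq_search", "select_index", "select_term", "click_faq_search"])`, which ends on the FAQ candidate display screen 610 with results for the newly selected term.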

＜＜ＦＡＱ候補表示画面の表示処理に関する説明＞＞ << Display Processing of the FAQ Candidate Display Screen >>
次に、図２４を参照して、ＦＡＱ候補表示画面の表示処理について説明する。図２４は、ＦＡＱ候補表示画面の表示処理を表すフローチャートであり、会話制御端末装置２’’と話題提供サーバ４’においてそれぞれどのような処理が行われるかを示している。会話制御端末装置２’’では、例えば、上述したTopiclet２０によって各処理が行われ、図３５〜図３８に示した、会話制御端末装置２’’のディスプレイへの画面表示は、ここでは、Topiclet２０によって、またはTopiclet２０の制御によって動作するＷＥＢブラウザ等によって行われる。 Next, the display processing of the FAQ candidate display screen will be described with reference to FIG. 24. FIG. 24 is a flowchart of the display processing of the FAQ candidate display screen, and shows the processing performed in the conversation control terminal device 2'' and in the topic providing server 4', respectively. In the conversation control terminal device 2'', each process is performed by, for example, the Topiclet 20 described above. The screens shown in FIGS. 35 to 38 are displayed on the display of the conversation control terminal device 2'' here by the Topiclet 20, or by a WEB browser or the like operating under the control of the Topiclet 20.

最初に、ステップＳ１１において、ユーザがＦＡＱ検索画面６００で「ＦＡＱ検索」ボタン６０２をクリックしたか否かが判定される。「ＦＡＱ検索」ボタン６０２がクリックされない間は（ＮＯ）、この判定が繰り返される。「ＦＡＱ検索」ボタン６０２がクリックされた場合（ＹＥＳ）、ステップＳ１２において、ユーザによって検索キーワード入力部６０１に入力された検索キーワードによる検索結果を入力情報として話題提供サーバ４’に送信する。この実施例においては、検索結果は、一般的なインターネット検索サイトでキーワード検索を行った結果であり、Topiclet２０は、このインターネット検索サイトでの検索を制御し、検索結果をＡＰＩ送信により話題提供サーバ４’に送信する。検索結果は、例えば、当該キーワード検索にヒットしたＷＥＢページのアドレスである。 First, in step S11, it is determined whether the user has clicked the "FAQ search" button 602 on the FAQ search screen 600. While the "FAQ search" button 602 has not been clicked (NO), this determination is repeated. When the "FAQ search" button 602 is clicked (YES), in step S12 the result of a search using the search keyword entered by the user in the search keyword input unit 601 is transmitted to the topic providing server 4' as input information. In this embodiment, the search result is the result of a keyword search on a general Internet search site; the Topiclet 20 controls the search on this Internet search site and transmits the search result to the topic providing server 4' by API transmission. The search result is, for example, the addresses of the WEB pages hit by the keyword search.

話題提供サーバ４’が会話制御端末装置２’’から入力情報を受け取ると、ステップＳ１３において、入力情報を分析し、入力情報に含まれるＷＥＢページのアドレスにアクセスして、ＷＥＢページに対応するＨＴＭＬデータ等から、対象となるテキストデータとなる外部ログ５０２を取得する。 When the topic providing server 4 ′ receives the input information from the conversation control terminal device 2 ″, in step S13, the input information is analyzed, the address of the WEB page included in the input information is accessed, and the HTML corresponding to the WEB page is accessed. The external log 502 as the target text data is acquired from the data or the like.

次に、話題提供サーバ４’は、ステップＳ１４において、取得した外部ログ５０２に対して文解析処理を実行し、外部ログ５０２から関連詞を抽出する。文解析処理については、後で詳細に説明する。 Next, in step S14, the topic providing server 4' executes sentence analysis processing on the acquired external log 502 and extracts related terms from the external log 502. The sentence analysis processing will be described in detail later.

その後、話題提供サーバ４’は、ステップＳ１５において、ステップＳ１４で外部ログ５０２から抽出された関連詞から、関連詞辞書５０を生成する。関連詞辞書５０には、外部ログ５０２内のそれぞれの文情報に対する関連詞索引５０ａが含まれる。 Thereafter, in step S15, the topic providing server 4 'generates a related term dictionary 50 from the related terms extracted from the external log 502 in step S14. The related term dictionary 50 includes a related term index 50 a for each sentence information in the external log 502.
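
A related term index for each piece of sentence information can be sketched as follows. This is a minimal illustration that assumes a term belongs to a sentence's index when it occurs in that sentence as a substring; the function name `build_related_term_index` and the dictionary layout are hypothetical and do not reflect the actual format of the related term index 50a.

```python
def build_related_term_index(sentences, related_terms):
    """Map each sentence position to the related terms that occur in it."""
    return {
        position: [term for term in related_terms if term in sentence]
        for position, sentence in enumerate(sentences)
    }
```

In the FAQ example, each candidate question shown in the candidate question display unit 612 would be paired with the list built for it, which is what the related term index display unit 611 presents.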

次に、話題提供サーバ４’は、ステップＳ１６において、ＦＡＱ候補表示画面６１０に表示するために、関連詞辞書５０から関連詞索引５０ａ等を取得し、これらの情報を入力特定情報として会話制御端末装置２’’に送信する。 Next, in step S16, the topic providing server 4' acquires the related term index 50a and other information from the related term dictionary 50 for display on the FAQ candidate display screen 610, and transmits this information to the conversation control terminal device 2'' as input specifying information.

会話制御端末装置２’’は、話題提供サーバ４’から入力特定情報を受信すると（ステップＳ１７）、ステップＳ１８において、受信した入力特定情報とシナリオデータ２８に基づいて、応答情報を決定する。なお、話題提供サーバ４’は、必要に応じてシナリオデータ５５を会話制御端末装置２’’に送信し、会話制御端末装置２’’はこれをシナリオデータ２８に記憶する。 When receiving the input specifying information from the topic providing server 4 ′ (step S 17), the conversation control terminal device 2 ″ determines response information based on the received input specifying information and the scenario data 28 in step S 18. The topic providing server 4 ′ transmits the scenario data 55 to the conversation control terminal device 2 ″ as necessary, and the conversation control terminal device 2 ″ stores this in the scenario data 28.

次に、ステップＳ１９において、ステップＳ１８で決定された応答情報を会話制御端末装置２’’のディスプレイに表示する。例えば、図３５（Ｂ）に示すようなＦＡＱ候補表示画面６１０が表示さる。この実施例では、例えば、候補質問文表示部６１２には、収集された質問文の一部（Ｑ１、Ｑ８、Ｑ１３、Ｑ２４、Ｑ２５）が候補質問文としてリスト表示される。また、関連詞索引表示部６１１には、候補質問文として表示された質問文にそれぞれ対応する関連詞索引が示されている。 Next, in step S19, the response information determined in step S18 is displayed on the display of the conversation control terminal device 2 ''. For example, a FAQ candidate display screen 610 as shown in FIG. 35 (B) is displayed. In this embodiment, for example, a part of the collected question sentences (Q1, Q8, Q13, Q24, Q25) is displayed as a list of candidate question sentences on the candidate question sentence display unit 612. The related term index display unit 611 shows related term indexes corresponding to the question sentences displayed as candidate question sentences.

＜＜文解析処理の詳細な説明＞＞ << Detailed Description of the Sentence Analysis Processing >>
次に、図２５を参照して、話題提供サーバ４’の文解析部４３（図２０、図２１参照）で実行される文解析処理の概要を説明する。最初に、文解析部４３は、テキストデータである外部ログ５０２を取得する（テキストデータ取得処理５２０）。外部ログ５０２は、前述のように、様々なデータソースから受信することができる。この実施例では、会話制御端末装置２’’から受信したＷＥＢページのアドレスに基づいて、各ＷＥＢページにアクセスし、対応するＨＴＭＬデータ等からテキストデータを取得している。また、取得した外部ログ５０２、または外部ログ５０２を取得する際に、特定のテキストデータだけを取得するようフィルタ処理を行ったり、特定の分類によりグルーピングをしたりすることもできる。 Next, an outline of the sentence analysis processing executed by the sentence analysis unit 43 (see FIGS. 20 and 21) of the topic providing server 4' will be described with reference to FIG. 25. First, the sentence analysis unit 43 acquires the external log 502, which is text data (text data acquisition processing 520). As described above, the external log 502 can be received from various data sources. In this embodiment, each WEB page is accessed based on the WEB page addresses received from the conversation control terminal device 2'', and text data is acquired from the corresponding HTML data or the like. In addition, either on the acquired external log 502 or at the time the external log 502 is acquired, filtering can be applied so that only specific text data is acquired, or the data can be grouped according to a specific classification.

次に、文解析部４３は、テキストデータ取得処理５２０により取得された外部ログ５０２から、同じ（共通の）文字列を検索する（文字列検索処理５３０）。この処理は、例えば、取得した外部ログ５０２の中の「いろは」という同じ文字列を検索し、取り出す処理である。１つのテキストデータに１００個の文字列「いろは」が存在する場合は、そのすべてが取り出される。また、テキストデータの中には、「いろは」以外にも同じ文字列が複数存在する可能性があるが、その場合は、それらの文字列も同様に検索して取り出す。例えば、文字列「いろは」の他に、文字列「にほへと」が複数含まれていれば、その文字列も同様に取り出される。なお、文字列「いろはに」などのように、同じ文字列としてすでに取り出されている「いろは」をそのまま含む文字列が複数ある場合も、文字列「いろは」とは別に、同じ文字列として検索される。 Next, the sentence analysis unit 43 searches for the same (common) character string from the external log 502 acquired by the text data acquisition process 520 (character string search process 530). This processing is, for example, processing for searching for and retrieving the same character string “Iroha” in the acquired external log 502. If 100 character strings “Iroha” exist in one text data, all of them are extracted. In text data, there may be a plurality of the same character strings other than “Iroha”. In this case, these character strings are similarly searched and extracted. For example, in addition to the character string “Iroha”, if a plurality of character strings “Nihoheto” are included, the character string is similarly extracted. In addition, even if there are multiple character strings that contain “Iroha” that has already been extracted as the same character string, such as the character string “Iroha ni”, the same character string is searched separately from the character string “Iroha”. Is done.

The character string search processing 530 further stores each occurrence found as an identical character string in the search result data 48, together with the adjacent character immediately before it and the adjacent character immediately after it. For the character string "いろは" in the example above, the stored data comprises, for each of the 100 occurrences, the character string "いろは", the adjacent character before it, and the adjacent character after it. In the same example, the character strings "にほへと" and "いろはに" are likewise stored in the search result data 48 together with their preceding and following adjacent characters.

The character string and its preceding and following adjacent characters are stored in the character string search processing 530 so that related terms carrying important meaning can ultimately be determined. However, even when the same character string occurs multiple times in the external log 502, if its frequency of occurrence does not reach a predetermined frequency, it can be judged at this point that the character string cannot become a related term, and the data for that character string need not be stored as search result data 48. This is because a word (character string) that appears only a few times in an external log 502 consisting of many characters can be judged not to be important in the first place.

To search the external log 502 for identical character strings and store them in the character string search processing 530, this embodiment uses a search data structure called a suffix array and searches it by binary search, which allows identical character strings to be found at high speed. Although this embodiment performs the character string search processing 530 by the method described above, the same search can be carried out by various other methods. The character string search processing 530 using the suffix array and binary search is described in detail later.

Next, from the character strings and their preceding and following adjacent characters stored in the search result data 48 by the character string search processing 530, the sentence analysis unit 43 determines the degree of difference of the preceding adjacent characters and the degree of difference of the following adjacent characters (difference degree determination processing 540).

Here, one character is represented as s(i),
a character string m(i, j) composed of s(i) to s(j) is represented as
m(i, j) = (s(i), s(i+1), s(i+2), ..., s(j-2), s(j-1), s(j)),
a character string m(i, j-1) composed of s(i) to s(j-1) is represented as
m(i, j-1) = (s(i), s(i+1), s(i+2), ..., s(j-2), s(j-1)),
and a character string m(i+1, j) composed of s(i+1) to s(j) is represented as
m(i+1, j) = (s(i+1), s(i+2), ..., s(j-2), s(j-1), s(j)).

In this case, the boundary conditions for the preceding adjacent character are defined by
T(i-1) = {S(i-1) | m(i, j)}
T(i) = {S(i) | m(i+1, j)}
and the boundary conditions for the following adjacent character are defined by
B(j) = {S(j) | m(i, j-1)}
B(j+1) = {S(j+1) | m(i, j)}.

Here, for example, {S(i-1) | m(i, j)} denotes the set of characters that appear immediately before occurrences of the common character string m(i, j). Note that s(i) ∈ T(i) and s(j) ∈ B(j) hold.

For example, when T(i-1) has many elements and T(i) has one element, s(i) is likely to be the first character of a related term. Likewise, when B(j) has one element and B(j+1) has many elements, s(j) is likely to be the last character of a related term. As a result, the character string m(i, j) is judged to be a candidate for a related term.
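The four boundary sets can be sketched as follows. This is only an illustration under stated assumptions: the function name boundary_sets, the use of Python's str.find to locate occurrences, and the sample text are not part of the embodiment, and the positions i and j are 1-based with j greater than i.

```python
def boundary_sets(text, i, j):
    """Return (T(i-1), T(i), B(j), B(j+1)) for the substring m(i, j) of text.

    i and j are 1-based inclusive character positions with j > i.
    Hypothetical helper for illustration only.
    """
    def neighbors(sub, before):
        # Collect the characters adjacent to every occurrence of sub.
        found = set()
        start = text.find(sub)
        while start != -1:
            pos = start - 1 if before else start + len(sub)
            if 0 <= pos < len(text):
                found.add(text[pos])
            start = text.find(sub, start + 1)
        return found

    m_ij = text[i - 1:j]
    t_prev = neighbors(m_ij, before=True)                 # T(i-1): before m(i, j)
    t_head = neighbors(text[i:j], before=True)            # T(i): before m(i+1, j)
    b_tail = neighbors(text[i - 1:j - 1], before=False)   # B(j): after m(i, j-1)
    b_next = neighbors(m_ij, before=False)                # B(j+1): after m(i, j)
    return t_prev, t_head, b_tail, b_next


sample = "このコードがコードリストにある"
t_prev, t_head, b_tail, b_next = boundary_sets(sample, 3, 5)  # m(3, 5) = "コード"
print(t_prev, t_head, b_tail, b_next)
```

For m(3, 5) = "コード" in this short sample, T(i) and B(j) each contain a single character while T(i-1) and B(j+1) each contain two, which is the pattern the text associates with a related term candidate.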

In this way, for occurrences of the same character string, the degree of difference of the adjacent characters is determined from how the preceding (or following) adjacent characters appear, that is, from how many variations of adjacent character occur. Once the degrees of difference for the preceding and following adjacent characters have been determined, they are stored in the related term candidate data 49 together with the corresponding character string. The difference degree determination processing 540 is described in detail later.

Next, based on the degrees of difference of the preceding and following adjacent characters determined by the difference degree determination processing 540, the sentence analysis unit 43 decides whether the character string is a related term and, if so, stores it in the related term dictionary 50 (related term determination processing 550).

As described above, consider what variations exist in the adjacent characters before and after occurrences of the same character string. When the number of distinct adjacent characters is small, the adjacent character and the character string can be regarded as together forming another, frequently used character string. When the number of distinct adjacent characters is large, on the other hand, a boundary lies between the adjacent characters and the character string, and the character string is likely to be an independent term of high importance. Whether the character string is a related term can also be decided by taking further factors into account in addition to the degrees of difference of the preceding and following adjacent characters. The related term determination processing 550 is described in detail later.

Furthermore, when a plurality of related terms are determined, the related term determination processing 550 can rank the character strings determined to be related terms. Such a ranking is, for example, a ranking by importance of the character string, and it can be decided by taking further factors into account in addition to the degrees of difference of the preceding and following adjacent characters. For example, the ranking can be based on the character length of the character string, its frequency of occurrence, and so on. The ranking can also be expressed as numerical values so that it shows not only the order but also the relative size of the differences.

<Description of sentence analysis processing for a specific external log example>
FIG. 26 shows an external log 502a, which is an example of the external log 502. As shown in FIG. 26(A), the external log 502a is a collection, obtained as the result of a search by a search keyword, of only the sentence information whose question text contains "network". The original data is, for example, problem-solving text data written by various users on various servers on the Internet. Typical examples of such text data include the text data of WEB pages (home pages) and blogs published on the Internet, and TWITTER tweet information. Data generated and edited in advance by any organization and text information in databases may also be included. In the original data, question sentences (Q (question)) and answer sentences (A (answer)) are assumed to correspond one to one. Here, only the sentence information whose question text contains the character string "network" is extracted from the original data, but various variations are conceivable, such as extracting sentence information in which the character string "network" appears in the question text and the answer text.

When the sentence analysis unit 43 performs the sentence analysis processing, the external log 502a shown in FIG. 26(A) is acquired by the text data acquisition processing 520, and the character string search processing 530, the difference degree determination processing 540, and the related term determination processing 550 described above are then performed. As a result, as shown in FIG. 26(B), a plurality of related terms are extracted for each question sentence of the extracted sentence information. For example, for question Q1, "network", "trouble", "handling", and "setting" are selected. In the text of question Q1, "network", which corresponds to the search keyword, is underlined, and the other related terms are enclosed in rectangles. The set of extracted related terms for question Q1 is shown as {network, setting, trouble, handling}. The terms in this set are listed in the ranking order described above for the related term determination processing 550.

Similarly, for question Q8, "network", "setting", and "by event" are selected. In the text of question Q8, "network", which corresponds to the search keyword, is underlined, and the other related terms are enclosed in rectangles. The set of extracted related terms for question Q8 is shown as {network, setting, by event}. Related terms can also be extracted from the text of the answer sentences, not only the question sentences, and included in the related term index.

FIG. 27(A) shows an example of the related term dictionary 50 generated by this related term determination processing 550. The sets of related terms for the respective questions shown in FIG. 26(B) are stored as they are as the related term index 50a. In some cases it is sufficient for the related term dictionary 50 to store only this related term index 50a, but in this embodiment the question sentence corresponding to each entry of the related term index is stored as the question sentence 50b, and the answer corresponding to the question sentence 50b is stored as the answer sentence 50c.

As shown in FIG. 27(A), each entry of the related term index 50a stored in the related term dictionary 50 is stored in association with its corresponding sentence information. As a result, related terms belonging to different sets are associated with one another through the related terms that the sets have in common. For example, as shown in FIG. 27(B), suppose that {network, setting, trouble, handling} is identified as the set of related terms for question Q1 and {network, setting, by event} as the set for question Q8. Apart from the related term "network", which is the search keyword, the two sets share the related term "setting". From this it can be seen that the related terms associated with "setting" (which may also be called co-occurring related terms) are {trouble, handling, by event}. It can further be seen that the related terms "trouble" and "handling" are related to the related term "by event" in that they all co-occur with the related term "setting".
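The association through shared related terms can be sketched as below. The dictionary layout, the function name cooccurring_terms, and the English term strings are assumptions made for illustration; the embodiment does not specify how the related term index 50a is represented in memory.

```python
from collections import defaultdict


def cooccurring_terms(index, exclude=()):
    """Map each related term to the terms that share a set with it.

    index maps a question id to its list of related terms (hypothetical layout).
    Terms in exclude (for example the search keyword) are ignored.
    """
    links = defaultdict(set)
    for terms in index.values():
        kept = [t for t in terms if t not in exclude]
        for term in kept:
            links[term].update(other for other in kept if other != term)
    return links


related_term_index = {
    "Q1": ["network", "setting", "trouble", "handling"],
    "Q8": ["network", "setting", "by event"],
}
links = cooccurring_terms(related_term_index, exclude={"network"})

print(links["setting"])                      # terms co-occurring with "setting"
print(links["trouble"] & links["by event"])  # term linking "trouble" and "by event"
```

In this sketch, "trouble" and "by event" never appear in the same set, yet the intersection of their co-occurring terms exposes "setting" as the term that connects them.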

In this embodiment, such relationships between related terms are found among the sentence information of the external log 502a collected with the search keyword "network". However, such relationships may also be found between related terms of sentence information collected with entirely different search keywords, in which case latent relationships between topics can be discovered.

<Detailed description of the character string search processing in the sentence analysis unit>
The character string search processing 530 will be described with reference to FIGS. 28 to 30. FIG. 28 is a flowchart showing the procedure of the character string search processing 530. FIGS. 29 and 30 illustrate how a character string search using a suffix array and binary search works. As the character string to be searched, text data 502-1 reading "このコードがコードリストにある" ("this code is in the code list"), which is, for example, part of the character string of the external log 502, is set. Normally the entire text portion of the external log 502 is searched, but only part of the text portion is used here for illustration.

First, a suffix array is created in order to search the character string "このコードがコードリストにある" for identical character strings. In step S21 of FIG. 28, the suffixes of the text data are expanded from its first character to its last character. Referring to FIG. 29, the suffixes indexed 1 to 15 shown in FIG. 29(B) are expanded from the text data 502-1 to be searched ("このコードがコードリストにある") shown in FIG. 29(A). Each suffix is the character string running from the index position (start character position) to the end of the text data 502-1. For example, index 1 corresponds to the character string "このコードがコードリストにある", which runs from the 1st character to the end (the 15th character) of the text data 502-1. Index 10 corresponds to the character string "リストにある", which runs from the 10th character to the end, and the last index, 15, corresponds to the final 15th character "る".

Next, in step S22 of FIG. 28, the expanded suffixes are sorted in a predetermined order to create the suffix array. Referring to FIG. 29, the suffixes shown in FIG. 29(B) are sorted, and the sorted suffix array is shown in FIG. 29(C). The sort is performed, for example, by the character code of each character (JIS code in this example). From the 1st record to the 4th record, the first characters are "ー" (JIS code 213C), "あ" (JIS code 2422), and "が" (JIS code 242C), showing that the records are sorted in character code order. The 1st and 2nd records have the same first and second characters but differ in the third character, "が" (JIS code 242C) versus "リ" (JIS code 256A), so the record with "が" is placed first. Each record keeps the index it was given in FIG. 29(B).
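Steps S21 and S22 can be sketched as follows. The function name build_suffix_array is hypothetical, and sorting on EUC-JP encoded bytes is an assumption used here to reproduce JIS code order (EUC-JP bytes are the JIS X 0208 code bytes with the high bit set); the embodiment only states that the sort uses character codes such as JIS codes. A naive sort of whole suffixes is used for clarity, not for speed.

```python
def build_suffix_array(text):
    """Return the 1-based start positions of all suffixes in JIS code order."""
    # Step S21: one suffix per start position 1..len(text).
    # Step S22: sort the suffixes; comparing EUC-JP byte strings orders the
    # characters by their JIS X 0208 codes (assumption of this sketch).
    return sorted(range(1, len(text) + 1),
                  key=lambda index: text[index - 1:].encode("euc_jp"))


text = "このコードがコードリストにある"
suffix_array = build_suffix_array(text)
for index in suffix_array:
    # Each line shows the original index and the suffix starting there.
    print(index, text[index - 1:])
```

With this ordering the first two records are the suffixes starting with "ードが" and "ードリ", and the middle (8th) record is index 15, "る", in line with the description of FIG. 29(C) and FIG. 30(C).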

Next, in step S23 of FIG. 28, search character strings are determined one at a time from within the text data. Because the aim is to find whether identical character strings exist within the text data, every substring of the text data is used as a search character string and matched against the text data. For the text data 502-1 of FIG. 29, for example, the one-character search strings "こ" to "る", the two-character search strings "この", "のコ", ..., "にあ", "ある", the three-character search strings "このコ", "のコー", ..., "トにあ", "にある", and so on are determined in turn, up to the 15-character string "このコードがコードリストにある". In this case, however, one-character and 15-character search strings can be omitted, because matching them against the text data 502-1 is not meaningful. The character length of the search character string can also be limited to some other predetermined range.
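The enumeration of search character strings in step S23 can be sketched as below. The function name candidate_search_strings, the default length range (2 up to one less than the text length), and the removal of duplicate substrings are assumptions of this sketch; the embodiment only says that the length can be limited to a predetermined range.

```python
def candidate_search_strings(text, min_len=2, max_len=None):
    """List the substrings of text to use as search strings, shortest first.

    Duplicates are dropped so that each distinct string is searched once
    (an assumption; the source does not say how repeats are handled).
    """
    if max_len is None:
        max_len = len(text) - 1   # skip the full-length string by default
    seen = set()
    ordered = []
    for length in range(min_len, max_len + 1):
        for start in range(len(text) - length + 1):
            sub = text[start:start + length]
            if sub not in seen:
                seen.add(sub)
                ordered.append(sub)
    return ordered


text = "このコードがコードリストにある"
candidates = candidate_search_strings(text)
print(candidates[:5])
```

The number of substrings grows quadratically with the text length, which is one practical reason to restrict the length range.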

Next, in step S24 of FIG. 28, it is determined whether all search character strings have been processed. If they have, the character string search processing ends. If not, in step S25 the suffix array is searched using the search character string as the search key. FIG. 30(A) and FIG. 30(B) show the search processing for the case where the three-character string "コード" ("code") has been determined as the search character string from within the text data 502-1. A binary search is then performed on the suffix array of FIG. 30(C). The suffix array of FIG. 30(C) is the same as that shown in FIG. 29(C), and FIG. 30(C) shows the course of the binary search.

Referring to FIG. 30(C), the search character string "コード" is first compared with record (1) at the center of the suffix array (index 15, character string "る"). The JIS code of "る" is 246B and that of "コ" is 2533; since "コ" is larger, the search string is next compared with record (2) (index 12, character string "トにある"), which lies below record (1) at the center of the lower half of the suffix array. The JIS code of "ト" is 2548 and that of "コ" is 2533; since "コ" is smaller, the search string is then compared with record (3) (index 7, character string "コードリストにある"), which lies midway between record (1) and record (2).

The first three characters of record (3) match the search character string "コード", so a character string identical to the search character string has been found in the text data 502-1. The records immediately above and below record (3) are then compared as well; the first three characters of record (4) also match "コード", so another identical character string has been found in the text data 502-1. This binary search shows that the character string "コード" occurs twice in the text data 502-1 (one of the occurrences being the search character string "コード" itself, which was extracted from the text data 502-1).
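The binary search of step S25 can be sketched as follows: probe the middle record, narrow the range by comparing JIS codes, and on a hit check the neighboring records above and below. The function names and the use of EUC-JP byte order to stand in for JIS code order are assumptions of this sketch, not details given in the embodiment.

```python
def jis_key(s):
    # EUC-JP bytes follow JIS X 0208 code order (assumption of this sketch).
    return s.encode("euc_jp")


def build_suffix_array(text):
    # 1-based start positions of the suffixes, sorted in JIS code order.
    return sorted(range(1, len(text) + 1), key=lambda i: jis_key(text[i - 1:]))


def search(text, suffix_array, query):
    """Return (positions, probes): sorted 1-based hit positions and the
    indexes of the records examined by the binary search, in probe order."""
    q = jis_key(query)

    def head(k):
        # First len(query) characters of the k-th record of the suffix array.
        start = suffix_array[k] - 1
        return jis_key(text[start:start + len(query)])

    probes = []
    lo, hi, hit = 0, len(suffix_array) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(suffix_array[mid])
        if head(mid) == q:
            hit = mid
            break
        if head(mid) < q:
            lo = mid + 1      # query sorts after this record: go down
        else:
            hi = mid - 1      # query sorts before this record: go up
    if hit is None:
        return [], probes
    # Matching records are contiguous, so widen the hit in both directions.
    first = last = hit
    while first > 0 and head(first - 1) == q:
        first -= 1
    while last + 1 < len(suffix_array) and head(last + 1) == q:
        last += 1
    return sorted(suffix_array[first:last + 1]), probes


text = "このコードがコードリストにある"
positions, probes = search(text, build_suffix_array(text), "コード")
print(positions, probes)
```

For the search string "コード" the probes visit indexes 15, 12, and 7 in that order, matching records (1) to (3) in the description, and the widening step then picks up the second occurrence at index 3.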

Next, in step S26 of FIG. 28, it is determined whether the search character string has been hit a predetermined number of times. If not, the character string is not made a related term candidate, and the processing returns to step S23 to search with the next search character string. The predetermined number can be set based on various factors, such as the number of characters in the text data 502-1 and the number of characters in the search character string. Failing to reach the predetermined number of hits indicates that the search character string occurs infrequently in the text data 502-1 and is not an important word. Alternatively, the character string can be stored as a related term candidate at this stage without evaluating its frequency of occurrence, with the final decision left to the subsequent related term determination processing or the like.

If it is determined in step S26 that the search character string has been hit the predetermined number of times, the processing proceeds to step S27. There, each character string that matched the search key (search character string) is made a related term candidate and stored in the related term candidate data 49 as one record together with its preceding and following adjacent characters. Referring to FIG. 30(C), for record (3) and record (4), in which a character string matching "コード" was found at the head, the character string "コード" and its preceding and following adjacent characters are each stored as one record. For record (3), "が", "コード", and "リ" are stored as one record: the preceding adjacent character is "が" and the following adjacent character is "リ". For record (4), "の", "コード", and "が" are stored as one record: the preceding adjacent character is "の" and the following adjacent character is "が".
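Steps S26 and S27 can be sketched as below. For brevity the occurrences are located with Python's str.find rather than the suffix array, and the function name collect_records, the threshold min_hits=2, and the use of an empty string for a missing neighbor at the start or end of the text are all assumptions of this sketch.

```python
def collect_records(text, query, min_hits=2):
    """Return (preceding char, query, following char) for each occurrence.

    Returns an empty list when the query is hit fewer than min_hits times,
    mirroring the frequency check of step S26 (threshold is hypothetical).
    """
    records = []
    start = text.find(query)
    while start != -1:
        end = start + len(query)
        before = text[start - 1] if start > 0 else ""   # "" at start of text
        after = text[end] if end < len(text) else ""    # "" at end of text
        records.append((before, query, after))
        start = text.find(query, start + 1)
    return records if len(records) >= min_hits else []


text = "このコードがコードリストにある"
print(collect_records(text, "コード"))
print(collect_records(text, "リスト"))   # a single hit falls below the threshold
```

The records are produced in order of position in the text, so the occurrence at index 3 (record (4) in the description) comes before the occurrence at index 7 (record (3)).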

As described above, the sentence analysis unit 43 of this embodiment is configured to find identical character strings in the text data at high speed using a suffix array and binary search, but the present invention is not limited to this processing method. Identical character strings in the text data can also be found by other methods.

<Detailed description of the difference degree determination processing in the sentence analysis unit>
Next, the difference degree determination processing of the sentence analysis unit 43 in this embodiment will be described in more detail with reference to FIGS. 31 and 32.

FIG. 31 is a flowchart showing the procedure of the difference degree determination processing 540. FIG. 32 illustrates how the degree of difference of the preceding and following adjacent characters is determined. It shows a situation in which the character string search processing with the search character string "コード", as in FIG. 30, has been performed on a search target containing many characters, 26 occurrences of "コード" have been obtained as the search result, and the 26 corresponding records are processed.

First, in step S31 of FIG. 31, the records for one character string are taken from the character string records (including the preceding and following adjacent characters) stored in the search result data 48 by the character string search processing 530. FIG. 32(A) shows the records for the character string "コード" stored in the search result data 48 (26 records in all) after they have been taken out and loaded into memory.

Next, in step S32 of FIG. 31, if it is determined that all character string records stored in the search result data 48 have already been taken and no data remains for the difference degree determination processing, the processing of FIG. 31 ends. If the processing is not complete and all the records for one of the character strings stored in the search result data 48 have been obtained, the processing proceeds to step S33.

Next, in step S33 of FIG. 31, all the records obtained for one of the character strings stored in the search result data 48 are sorted by the preceding adjacent character, and the number of distinct characters (patterns) appearing as the preceding adjacent character is found. FIG. 32(A) shows the records obtained for the character string "コード" (26 in all) sorted by the preceding adjacent character 561. As when creating the suffix array in the character string search processing, this sort can be performed using character codes (for example, JIS codes). After the sort, the number of patterns is found by counting how many times the value of the preceding adjacent character 561 changes (breaks) between records. In FIG. 32(A), the preceding adjacent character 561 takes 13 patterns: "", "（", "、", "「", "が", "た", "で", "ど", "の", "は", "べ", "も", and "り".

As described above, the sentence analysis unit 43 of this embodiment sorts the preceding adjacent characters 561 and finds the number of patterns by checking whether the value changes between records, but the present invention is not limited to this processing method. The number of patterns can be found by various other methods. The handling of particular characters and character patterns can also be set flexibly according to the specification. For example, control characters such as line breaks and page breaks can be disregarded. Full stops and commas can be either counted or ignored. Uppercase and lowercase letters can be treated either as the same character or as different characters. As for the sort, one-byte code characters such as half-width alphanumerics are sorted by their one byte, and two-byte code characters such as kanji are sorted by their two bytes. In this embodiment a single adjacent character is sorted when sorting the preceding adjacent characters 561, but two or more adjacent characters can also be sorted and their degree of difference determined.

Next, in step S34 of FIG. 31, the degree of difference of the preceding adjacent character 561 is determined. Here, the degree of difference is an index of how much the adjacent character 561 varies (among the 26 records of FIG. 32(A)), and it is therefore determined from the number of character patterns appearing as the preceding adjacent character described above. The degree of difference may be the number of patterns itself, but it can also be determined taking into account, for example, the appearance frequency of the character string (in FIG. 32(A), the appearance frequency of 「コード」 is 26). The degree of difference can also be evaluated in several stages (for example, three stages) using predetermined thresholds. Furthermore, when the adjacent character 561 is a specific character, arbitrary weighting can be applied to the counting of that character and to the determination of the degree of difference.
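One way to take the appearance frequency into account is to normalize the pattern count by it and then bucket the result into three stages. The sketch below assumes this ratio, the stage names, and the threshold values 0.3 and 0.6; none of them come from the text, which leaves the concrete formula open.

```python
def difference_degree(pattern_count: int, frequency: int) -> float:
    """Ratio of distinct adjacent-character patterns to occurrences (assumed formula)."""
    if frequency <= 0:
        return 0.0
    return pattern_count / frequency


def grade(degree: float, low: float = 0.3, high: float = 0.6) -> str:
    """Three-stage evaluation with hypothetical thresholds."""
    if degree >= high:
        return "high"
    if degree >= low:
        return "medium"
    return "low"


# FIG. 32(A): 13 patterns among 26 occurrences of the character string.
degree = difference_degree(13, 26)
print(degree, grade(degree))  # 0.5 medium
```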

Next, in step S35 of FIG. 31, all records acquired for one of the character strings stored in the search result data 48 are sorted by the following adjacent character, and the number of distinct character patterns appearing as the following adjacent character is obtained. FIG. 32(B) shows the result of sorting the records (26 in total) acquired for the character string 「コード」 stored in the search result data 48 by the following adjacent character 563. As a result of this sort, the records 565 and 566 shown in FIG. 32(A) are placed at the positions indicated by the dotted arrows.

As in the creation of the suffix array in the character string search process described above, this sort can be performed using character codes (for example, JIS codes). After the sort, counting the number of times the value of the following adjacent character 563 changes (breaks) between records gives the number of character patterns appearing as the following adjacent character 563. In the case of FIG. 32(B), the following adjacent character 563 takes 12 patterns: 「、」, 「「」, 「」」, 「が」, 「で」, 「と」, 「に」, 「の」, 「は」, 「を」, 「リ」, and 「支」.

As described above, the sentence analysis unit 43 of this embodiment sorts the following adjacent characters 563 and obtains the number of character patterns appearing as the following adjacent character from whether the value changes between records, but the present invention is not limited to this processing method. The number of character patterns can be obtained by various other methods. Predetermined characters can also be excluded from the pattern count. As for the sort, the corresponding one byte is sorted for one-byte code characters such as half-width alphanumerics, and the corresponding two bytes are sorted for two-byte code characters such as kanji. Although this embodiment sorts a single adjacent character when sorting the following adjacent characters 563, two or more characters can also be sorted to determine the degree of difference.

Next, in step S36 of FIG. 31, the degree of difference of the following adjacent character 563 is determined. Here, the degree of difference is an index of how much the adjacent character 563 varies (among the 26 records of FIG. 32(B)), and it is therefore determined from the number of character patterns appearing as the following adjacent character described above. The degree of difference may be the number of patterns itself, but it can also be determined taking into account, for example, the appearance frequency of the character string (in FIG. 32(B), the appearance frequency of 「コード」 is 26). The degree of difference can also be evaluated in several stages (for example, three stages) using predetermined thresholds. Furthermore, when the adjacent character 563 is a specific character, arbitrary weighting can be applied to the counting of that character and to the determination of the degree of difference.

Next, in step S37 of FIG. 31, the character string under evaluation is stored in the related term candidate data 49 together with the degree of difference of the preceding adjacent character and the degree of difference of the following adjacent character determined for that character string.
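Steps S33 to S37 taken together can be sketched as one pass over a text that collects both neighbours of every occurrence and stores the result as a candidate record. The record layout, the function name, and the sample text are hypothetical, and a set is used in place of the sort-and-count-breaks procedure, which the description permits as one of the "other methods".

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    string: str
    frequency: int
    before_patterns: int  # distinct preceding adjacent characters
    after_patterns: int   # distinct following adjacent characters


def analyse(text: str, string: str) -> Candidate:
    """Collect both neighbours of every occurrence of `string` in `text`."""
    befores, afters = [], []
    pos = text.find(string)
    while pos != -1:
        end = pos + len(string)
        befores.append(text[pos - 1] if pos > 0 else "")
        afters.append(text[end] if end < len(text) else "")
        pos = text.find(string, pos + 1)
    return Candidate(string, len(befores), len(set(befores)), len(set(afters)))


# Hypothetical sample text with three occurrences of the string.
sample = "コードが、コードが、エラーコードは"
print(analyse(sample, "コード"))
```

In this sample the string appears three times, preceded by three different contexts (text start, 「、」, 「ー」) and followed by two different characters (「が」, 「は」).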

When the process of step S37 of FIG. 31 ends, the flow returns to step S31, and the next "same character string" is processed.

<Detailed description of the related term determination process in the sentence analysis unit>
The related term determination process 550 sequentially reads the data stored in the related term candidate data 49 by the difference degree determination process 540 and decides, from the determined degrees of difference of the preceding and following adjacent characters, whether the corresponding character string is a related term. Each character string decided to be a related term is stored in the related term dictionary 50, for example in the form of the related term index described above, together with the sentence information corresponding to that related term index (in the example above, the question sentence and the answer sentence). The related term determination process 550 decides whether the corresponding character string is a related term according to, for example, the magnitude of the degrees of difference of the preceding and following adjacent characters.

The magnitude of the degrees of difference of the preceding and following adjacent characters can be judged by a criterion common to both sides or by a different criterion for each side. When the degrees of difference of the preceding and following adjacent characters are judged to have a predetermined magnitude, the corresponding character string is decided to be a related term, that is, an independent term that is an important word for identifying a topic. For example, a score may be calculated from the degrees of difference of the preceding and following adjacent characters, and whether the corresponding character string is a related term may be decided from that score.

In addition to the degrees of difference of the preceding and following adjacent characters, the related term determination process 550 may calculate the score taking into account the length of the corresponding character string, its appearance frequency, the probability or frequency with which a specific character appears as the preceding adjacent character, the probability or frequency with which a specific character appears as the following adjacent character, the probability or frequency with which a specific combination of characters appears as the preceding and following adjacent characters, and so on, and may decide from the calculated score whether the corresponding character string is a related term.
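A score of this kind could look like the sketch below. Everything about the formula is an assumption made for illustration: taking the smaller of the two degrees of difference (so that a string needs varied neighbours on both sides), the linear bonuses for length and frequency, the weights, and the threshold of 5.0.

```python
def related_term_score(before_degree: float, after_degree: float,
                       length: int, frequency: int,
                       length_weight: float = 0.1,
                       frequency_weight: float = 0.05) -> float:
    """Hypothetical score: weaker boundary plus small bonuses for length and frequency."""
    boundary = min(before_degree, after_degree)
    return boundary + length_weight * length + frequency_weight * frequency


def is_related_term(score: float, threshold: float = 5.0) -> bool:
    """Decide with a hypothetical fixed threshold."""
    return score >= threshold


# 13 preceding and 12 following patterns, 3 characters, 26 occurrences.
independent = related_term_score(13, 12, length=3, frequency=26)
# Hypothetical fragment that is always followed by the same character.
fragment = related_term_score(13, 1, length=2, frequency=26)
print(is_related_term(independent), is_related_term(fragment))  # True False
```

A fragment of a longer word tends to have a fixed character on one side, which keeps its score low under this choice of formula.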

Furthermore, when one piece of text data (the character string to be searched) contains a plurality of related term candidates, the conditions for deciding a candidate to be a related term can be varied according to the number of candidates and the distribution of the scores calculated for them.

Furthermore, when a plurality of related terms are decided for the external log 502 or for each piece of sentence information contained in the external log 502, the related term determination process 550 can rank the character strings decided to be related terms. Such a ranking is, for example, a ranking of the importance of each character string to the topic, and it is determined from the score calculated from the degrees of difference of the preceding and following adjacent characters described above, or from a score calculated by adding various other factors to those degrees of difference. The ranking need not merely order the related terms by importance; it can be expressed with concrete numerical values, for example by using the score values themselves, so that the relative importance between related terms is also shown.
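Ranking that keeps the numeric score alongside the rank position can be sketched as follows. The function name, the tie-breaking rule (by the term itself), and the score values are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_related_terms(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Order terms by descending score and keep the score as the relative importance."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, term, score) for rank, (term, score) in enumerate(ordered, start=1)]


# Hypothetical scores for four related terms.
scores = {"ネットワーク": 9.0, "接続": 7.5, "モデム": 3.0, "ルータ": 8.0}
for rank, term, score in rank_related_terms(scores):
    print(rank, term, score)
```

Returning the score with each rank lets a caller see not only that one term outranks another but also by how much.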

Such ranking is performed when a plurality of related terms have been decided. It can be applied to the related terms decided for a single piece of text data, or to the related terms decided for a plurality of pieces of text data grouped under a predetermined condition.

Examples of a plurality of pieces of text data grouped under a predetermined condition include the text data of the web pages hit by an entered search keyword and the TWITTER posts of users who match a predetermined attribute.

<<Description of the display process of the FAQ display screen>>
Next, the display process of the FAQ display screen will be described with reference to FIG. 33. FIG. 33 is a flowchart of the display process of the FAQ display screen and shows what processing is performed in the conversation control terminal device 2'' and in the topic providing server 4'. In the conversation control terminal device 2'', each process is performed by, for example, the Topiclet 20 described above.

First, in step S41, it is determined whether the user has selected, by a mouse click or the like, one of the candidate question sentences displayed in the candidate question sentence display section 612 of the FAQ candidate display screen 610 shown in FIG. 35(B). While no candidate question sentence is selected (NO), this determination is repeated. When one of the candidate question sentences is selected (YES), the selected candidate question sentence is transmitted to the topic providing server 4' as input information in step S42. The input information may contain the selected question sentence itself, but an identifier that identifies the question sentence is sufficient. The user may also be allowed to select several question sentences of interest at the same time.

When the topic providing server 4' receives the input information from the conversation control terminal device 2'', it analyzes the input information in step S43 and acquires from the related term dictionary 50 the answer sentence corresponding to the question sentence contained in the input information. In this embodiment, as shown in FIG. 27(A), the related term dictionary 50 stores the question sentence 50b and the corresponding answer sentence 50c, but the question sentence 50b and the answer sentence 50c can instead be stored in a separate file while remaining associated with the related term index 50a of the related term dictionary 50.
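The lookup in step S43, where the input information carries only identifiers of the selected questions, might be sketched as below. The record layout mirrors the related term index 50a, question sentence 50b, and answer sentence 50c, but the class name, the identifier format, and the sample texts are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class DictionaryEntry:
    related_term_index: Tuple[str, ...]  # corresponds to 50a
    question: str                        # corresponds to 50b
    answer: str                          # corresponds to 50c


def answers_for(dictionary: Dict[str, DictionaryEntry],
                question_ids: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs for the identifiers found in the dictionary."""
    return [(dictionary[qid].question, dictionary[qid].answer)
            for qid in question_ids if qid in dictionary]


# Hypothetical dictionary content keyed by a question identifier.
dictionary = {
    "Q24": DictionaryEntry(("ネットワーク", "接続", "ルータ", "モデム"),
                           "sample question text", "sample answer text"),
}
print(answers_for(dictionary, ["Q24", "Q99"]))
```

Keying the dictionary by identifier is what makes it unnecessary to send the full question sentence from the terminal.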

Next, in step S44, the topic providing server 4' stores in the related term / co-occurrence word data 52 the information to be displayed on the FAQ display screen 630, which includes the answer sentence 50c corresponding to the question sentence 50b acquired from the related term dictionary 50, and transmits this information to the conversation control terminal device 2'' as input specifying information.

When the conversation control terminal device 2'' receives the input specifying information from the topic providing server 4' (step S45), it determines response information in step S46 from the received input specifying information and the scenario data 28. The topic providing server 4' transmits the scenario data 55 to the conversation control terminal device 2'' as necessary, and the conversation control terminal device 2'' stores it in the scenario data 28.

Next, in step S47, the response information determined in step S46 is displayed on the display of the conversation control terminal device 2''. For example, when one of the question sentences listed in the candidate question sentence display section 612 of the FAQ candidate display screen 610 shown in FIG. 35(B) is selected (for example, question Q24 indicated by arrow (1)), the FAQ display screen 630 shown in FIG. 36 is displayed. On that screen, the question display section 631 shows the selected question Q24, the related term index display section 632 shows the related term index corresponding to question Q24, and the answer display section 633 shows the answer corresponding to question Q24 (answer A24).

Through this screen transition from the FAQ search screen 600 to the FAQ display screen 630, the user can search the FAQ with a search keyword of the user's choice and have a plurality of candidate question sentences displayed as the search result. By looking at the related term index, the user can then easily grasp which important keywords appear in each candidate question sentence, that is, what matters each one relates to.

In this embodiment, the FAQ candidate display screen 610 does not display the answer sentences corresponding to the question sentences, but the corresponding answer sentences can also be displayed at the stage where the candidate question sentences are listed.

<<Description of the display process of the related term / co-occurrence word list screen>>
Next, the display process of the related term / co-occurrence word list screen will be described with reference to FIG. 34. FIG. 34 is a flowchart of the display process of the related term / co-occurrence word list screen and shows what processing is performed in the conversation control terminal device 2'' and in the topic providing server 4'. In the conversation control terminal device 2'', each process is performed by, for example, the Topiclet 20 described above.

First, in step S51, it is determined whether the user has selected, by a mouse click or the like, one of the related term indexes displayed in the related term index display section 611 of the FAQ candidate display screen 610 shown in FIG. 35(B). While no related term index is selected (NO), this determination is repeated. When one of the related term indexes is selected (YES), the selected related term index is transmitted to the topic providing server 4' as input information in step S52. The input information may contain the selected related term index itself, but an identifier that identifies the related term index is sufficient. The user may also be allowed to select several related term indexes of interest at the same time.

When the topic providing server 4' receives the input information from the conversation control terminal device 2'', it analyzes the input information in step S53 and acquires from the related term dictionary 50 all related term indexes, including the related term index contained in the input information.

Next, in step S54, the topic providing server 4' performs preference analysis on all related terms contained in the related term indexes acquired in step S53, using an internal log 506 such as the preference data 51. The preference data 51 is a log file that records how each user has used related terms so far. For example, as shown in FIG. 38(A), it records, for each user ID 51a, which related terms 51b that user has referred to or entered as search keywords. The preference data 51 can also record the date and time of use and the detailed content of use of each related term, and the preference analysis can take this information into account as well.

The topic providing server 4' performs preference analysis on all related terms contained in the related term indexes using this preference data 51 and determines the importance of each related term. For example, referring to the preference data 51, it sets the importance so that, for the same user, a related term used more frequently receives a higher importance. Related terms associated by such importance correspond to the topic key (cluster) 504 described above.
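The frequency-based importance described above can be sketched by counting, per user, how often each related term appears in the preference log. The log layout as (user ID, related term) pairs, the function name, and the sample entries are assumptions for illustration; the tie-breaking by term is likewise arbitrary.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def order_by_preference(preference_log: Iterable[Tuple[str, str]],
                        user_id: str, terms: Iterable[str]) -> List[Tuple[str, int]]:
    """Order `terms` by how often `user_id` used each one (more frequent first)."""
    usage = Counter(term for uid, term in preference_log if uid == user_id)
    # A Counter returns 0 for terms the user has never used.
    return sorted(((term, usage[term]) for term in terms),
                  key=lambda pair: (-pair[1], pair[0]))


# Hypothetical preference log of (user ID, related term) pairs.
log = [("u1", "ネットワーク"), ("u1", "接続"), ("u1", "ネットワーク"),
       ("u2", "SNS"), ("u1", "ルータ")]
print(order_by_preference(log, "u1", ["接続", "ネットワーク", "SNS"]))
```

Usage by other users (here, "u2") does not affect the result, which reflects the per-user nature of the analysis.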

Next, in step S55, the topic providing server 4' performs further topic analysis on the related terms whose importance was set in step S54, using topic material 507 such as the topic data 53. The topic data 53 may be topics entered by a person in charge at the contractor, or topics automatically extracted from the external log 502 on the basis of the topics that person entered. The distribution is captured from this topic data 53, and the distribution of mutually associated related terms is provided to the user. For example, related terms that are topical within the FAQ can be associated with their co-occurrence words to show how the related terms are distributed within the topic. The input and automatic extraction of topic material can also be adjusted for each user of the conversation control terminal device 2'' so that related terms are recommended to suit the user. For example, the topic can be adjusted so that the target question sentences are limited to a predetermined range.

Next, in step S56, the topic providing server 4' transmits the information to be displayed on the related term / co-occurrence word list display screen 650, which includes the related terms finally associated in step S55, to the conversation control terminal device 2'' as input specifying information.

When the conversation control terminal device 2'' receives the input specifying information from the topic providing server 4' (step S57), it determines response information in step S58 from the received input specifying information and the scenario data 28. The topic providing server 4' transmits the scenario data 55 to the conversation control terminal device 2'' as necessary, and the conversation control terminal device 2'' stores it in the scenario data 28.

Next, in step S59, the response information determined in step S58 is displayed on the display of the conversation control terminal device 2''. For example, the related term / co-occurrence word list display screen 650 shown in FIG. 37 is displayed on the display of the conversation control terminal device 2''. The related term / co-occurrence word list display screen 650 shows a NO display section 651, a related term display section 652, neighborhood related term display sections (653 to 656), and a "Return to FAQ candidate display screen" button 657, with the related terms presented as a two-dimensional matrix.

In the vertical direction of the matrix, the related term display section 652 lists, in order and without duplication, all related terms that were extracted as related terms for the FAQ search of this embodiment and that appear in the related term indexes of the related term dictionary 50. The display order follows the importance of each related term determined by the preference analysis. In this embodiment, the smaller the number shown in the NO display section 651 (that is, the nearer to the top of FIG. 37), the higher the importance of the corresponding related term. In FIG. 37 the numbers in the NO display section 651 run from 1 to 17, but more related terms can be viewed by moving the slider bar of the related term / co-occurrence word list display screen 650 downward.

In the horizontal direction of the matrix, the neighborhood related term display sections 653 to 656 are shown to the right of each related term displayed in the related term display section 652, and they display the neighborhood related terms of that related term. As described with reference to FIG. 16, when attention is paid to a related term A, the related term set containing the related term A is the "neighborhood of the related term", and the related terms in that set are referred to here as neighborhood related terms. A related term set is the set of related terms contained in a given topic; here, the related term index corresponds to it.

In FIG. 37 only four neighborhood related terms, neighborhood related term 1 to neighborhood related term 4, are shown, but more can be viewed by moving the slider bar of the related term / co-occurrence word list display screen 650 to the right. A neighborhood related term can be described as a related term that co-occurs with the related term displayed in the related term display section 652 (a co-occurrence related term, that is, a related term that appears together with it in the same topic). The horizontal display order of the neighborhood related terms is adjusted so that the stronger the co-occurrence relation, that is, the more frequently a term appears together with the related term displayed in the related term display section 652, the closer it is placed to that related term. This horizontal display order can also be determined taking into account the ranking of the related terms, settings made by the user or the information search system, and the like.
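The horizontal ordering by co-occurrence strength can be sketched by treating each related term index as a set and counting how often other terms share an index with the term of interest. The function name and the sample indexes are hypothetical, and ties are broken by the term itself purely to keep the output deterministic.

```python
from collections import Counter
from typing import Iterable, List, Set


def neighbourhood_terms(indexes: Iterable[Set[str]], term: str) -> List[str]:
    """Terms sharing an index with `term`, most frequently co-occurring first."""
    counts: Counter = Counter()
    for index in indexes:
        if term in index:
            counts.update(index - {term})
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [neighbour for neighbour, _ in ordered]


# Hypothetical related term indexes, one set per question sentence.
indexes = [
    {"ネットワーク", "接続", "ルータ"},
    {"ネットワーク", "接続"},
    {"ネットワーク", "SNS"},
    {"モデム", "接続"},
]
print(neighbourhood_terms(indexes, "ネットワーク"))  # ['接続', 'SNS', 'ルータ']
```

The first element of the returned list would be placed in the column nearest to the related term, the second in the next column, and so on.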

On the related term / co-occurrence word list display screen 650, among the related terms displayed in the related term display section 652, those that were contained in the related term index selected by a mouse click or the like in the related term index display section 611 of the FAQ candidate display screen 610 shown in FIG. 35(B) (in this embodiment, for example, "network", "connection", "router", and "modem") are highlighted so that the user can recognize them easily.

Through the matrix of related terms displayed in the related term display section 652 and the neighborhood related term display sections (653 onward) of the related term / co-occurrence word list display screen 650, the user can view other related terms connected to the related term (search keyword) the user originally focused on, both from the viewpoint of how the user has used related terms so far and from the viewpoint of how close they are to the topics that the user or the information search system focuses on or recommends. This can make the user aware of new related terms. For example, the matrix of FIG. 37 results from a search with the search keyword "network". For the rows numbered 1 to 13 and 15 in the NO display section 651, the set of displayed related terms shows that the topic is broadly about communication networks, whereas for the rows numbered 14, 16, and 17 the topic is about social networks, which shows that a topic different from the others has appeared.

When the user selects, by a mouse click or the like, one of the related terms displayed in the related term display section 652 or the neighborhood related term display sections (653 onward) of the related term / co-occurrence word list display screen 650 (arrow in FIG. 37), the screen automatically changes to the FAQ search screen 660 shown in FIG. 38(B), and the selected related term (in this embodiment, the neighborhood related term "SNS") is automatically set in the search keyword input section 661 of the FAQ search screen 660. When the user clicks the "FAQ search" button 662 in this state, the FAQ candidate display screen 610 is displayed again, this time with question sentences about "SNS" shown in the candidate question sentence display section 612.

<<< Description of processing outline in information update unit >>>
The information update unit 46 extracts semantically identifiable character strings from external logs 502 (text data) collected under different collection conditions, stores the extracted character strings in the related term dictionary corresponding to each piece of text data, and stores and updates, in the comparison result data 54, the comparison results obtained by comparing these related term dictionaries. The comparison process is performed automatically when a related term dictionary is updated.

異なる関連詞辞書に対応付けられたテキストデータは、異なる収集条件によって収集されたテキストデータであり、これらのテキストデータは、例えば、同様の対象やデータソースについて異なるタイミングで収集される複数のテキストデータであったり、同様のタイミングにおいて、異なる主題や検索条件によって収集される複数のテキストデータであったりする。 Text data associated with different related terminology dictionaries is text data collected under different collection conditions. These text data are, for example, a plurality of text data collected at different timings for the same target or data source. Or a plurality of text data collected according to different subjects and search conditions at the same timing.

The comparison process compares a plurality of related term dictionaries and determines the appearance status of each related term: newly appeared, disappeared, appearing in common, or (for dictionaries corresponding to three or more pieces of time-series text data) reappeared. When a related term has one of these appearance statuses, the related term is stored as a comparison result.

Further, when a plurality of related terms are extracted from one piece of text data, those related terms can be associated with each other as one set (as co-occurrence related terms) and stored in the corresponding related term dictionary. With this configuration, when the comparison process determines that a related term is common to a plurality of related term dictionaries, the co-occurrence related terms of that related term can be compared to make a further determination.

By capturing the history of how related terms appear in this way, the meaning of the related terms can be brought out. That is, repeating this processing clarifies the topic name to which a related term belongs, makes it possible to treat the usual related terms separately from newly arrived (newly appeared) related terms, and allows the similarity or difference between topics to be judged by comparing related term dictionaries. These functions are referred to as the related term learning function. The related term learning function can be expected to diversify the means of identifying end-user input.

The processing outline of the information update unit 46 will be described with reference to FIG. 39. First, the information update unit 46 acquires the external log 502, which is text data (text data acquisition process 700). The external log 502 is collected by, for example, the crawler 730. When the crawler 730 returns the network address (URL or the like) of a WEB page, the external log 502 can also be acquired by accessing that network address. Furthermore, either on the acquired external log 502 or at the time the external log 502 is acquired, filtering can be performed so that only specific text data is acquired, and the data can be grouped according to a specific classification.

For example, the crawler 730 is started automatically and performs topic analysis on predetermined topic names at predetermined times (that is, it runs searches and periodically collects topics). A topic name is stored, for example, in the service that holds the related term dictionary 50 (an area assigned to a service ID corresponding to each topic the user handles); a user who wants to handle 10 topics uses 10 services to handle them. It is also possible to set up a corresponding topic chip for each of these services, have each topic chip constantly collect information on its topic, and link and integrate the relevant topic chips in response to user input, thereby realizing a more diverse topic providing service.

クローラー７３０による検索は、例えば、インターネット上の既存のインターネット検索サイトにアクセスし、そこで検索キーワードを指定することにより、当該インターネット検索サイトの検索サーバから検索結果を受信する。検索結果には、例えば、検索キーワードに合致または類似するコンテンツを含んだＷＥＢページのアドレス（ＷＥＢページ１のアドレス、ＷＥＢページ２のアドレス、ＷＥＢページ３のアドレス、・・・、ＷＥＢページＸのアドレス）が含まれる。 In the search by the crawler 730, for example, an existing Internet search site on the Internet is accessed, and a search keyword is specified there, thereby receiving a search result from a search server of the Internet search site. The search result includes, for example, the address of a WEB page that includes content that matches or is similar to the search keyword (the address of WEB page 1, the address of WEB page 2, the address of WEB page 3,..., The address of WEB page X. ) Is included.

In this embodiment, the crawler 730 acquires search results by executing a search on an existing Internet search site, but the addresses of WEB pages that satisfy a predetermined condition can also be acquired by various other methods. The search target is not limited to WEB pages on the Internet; it may be TWITTER tweet information, data (on a network or local) generated and edited in advance by any institution or organization, or text information in a database.

既存のインターネット検索サイトは、そのインターネット検索サイトが使用する検索サーバに備えられた検索エンジンにより、インターネット上のデータソースから検索キーワードに合致、または類似するＷＥＢページのアドレスを、検索のリクエストに応じて（あるいは事前の定期的収集活動により）収集する。 In an existing Internet search site, a search engine provided in a search server used by the Internet search site allows an address of a WEB page that matches or is similar to a search keyword from a data source on the Internet in response to a search request. Collect (or through pre-periodic collection activities).

クローラー７３０は、検索サーバから検索結果が送信されると、クローラー７３０が動作するコンピュータから、その検索結果を（例えば、ＡＰＩ送信により）話題提供サーバ４’に送信する。また、クローラー７３０は、フィルタを用いて、検索結果のうち、所定の条件を満たすものを除外するよう構成することができる。 When the search result is transmitted from the search server, the crawler 730 transmits the search result to the topic providing server 4 ′ (for example, by API transmission) from the computer on which the crawler 730 operates. In addition, the crawler 730 can be configured to exclude a search result that satisfies a predetermined condition using a filter.

In this embodiment, the crawler 730 is started automatically at predetermined times, but the operation of the crawler 730 may instead be controlled by the topic providing server 4′ to acquire the search results. Alternatively, the crawler 730 can acquire search results at predetermined intervals and hold them on the computer on which it runs, and the topic providing server 4′ can access that computer at the required timing to acquire the search results. The crawler 730 can also be configured to run on the topic providing server 4′.

また、この実施例では、クローラー７３０が動作するコンピュータが、検索結果として検索キーワードに関連するＷＥＢページのアドレスを話題提供サーバ４’に送信するが、クローラー７３０が動作するコンピュータにおいて、これらのＷＥＢページにアクセスし、その結果得られたテキストデータを外部ログ５０２として話題提供サーバ４’に送信するようにもできる。 In this embodiment, the computer on which the crawler 730 operates transmits the address of the WEB page related to the search keyword as a search result to the topic providing server 4 ′. In the computer on which the crawler 730 operates, these WEB pages And the text data obtained as a result can be transmitted as the external log 502 to the topic providing server 4 ′.

次に、情報更新部４６は、テキストデータ取得処理７００により取得された外部ログ５０２から、意味識別可能な文字列を抽出し、抽出された文字列を関連詞辞書５０に記憶する（文字列抽出処理７１０）。このように抽出された文字列は、上述の関連詞に相当するものであり、これらの関連詞は、その関連詞が抽出された外部ログ５０２に対応する関連詞辞書１〜３等に、それぞれ記憶される。 Next, the information update unit 46 extracts a character string whose meaning can be identified from the external log 502 acquired by the text data acquisition process 700, and stores the extracted character string in the related term dictionary 50 (character string extraction). Process 710). The character strings thus extracted correspond to the above-mentioned related terms, and these related terms are respectively stored in the related term dictionary 1 to 3 corresponding to the external log 502 from which the related terms are extracted. Remembered.

外部ログ５０２から関連詞を抽出する方法として様々な方法が考えられる。例えば、上述した文解析部４３による方法で関連詞の抽出を行うことができる。 Various methods can be considered as a method for extracting a related term from the external log 502. For example, the related words can be extracted by the method by the sentence analysis unit 43 described above.

文字列抽出処理７１０では、複数の外部ログ５０２から関連詞が抽出され、それぞれ対応する関連詞辞書５０に記憶される。例えば、複数の外部ログ５０２は、同様の対象やデータソースについて異なるタイミングで収集されたテキストデータであったり、同様のタイミングにおいて、異なる主題や検索条件によって収集される複数のテキストデータであったりする。文字列抽出処理７１０の詳細な処理については、後で説明する。 In the character string extraction process 710, related terms are extracted from the plurality of external logs 502 and stored in the corresponding related term dictionary 50, respectively. For example, the plurality of external logs 502 may be text data collected at different timings for the same target or data source, or may be a plurality of text data collected by different subjects or search conditions at the same timing. . Detailed processing of the character string extraction processing 710 will be described later.

次に、情報更新部４６は、文字列抽出処理７１０により、それぞれ関連詞が記憶された複数の関連詞辞書５０を比較し、関連詞の出現状況に応じて、比較結果を比較結果データ５４に記憶する（辞書比較処理７２０）。 Next, the information updating unit 46 compares a plurality of related terminology dictionaries 50 each storing a related term by character string extraction processing 710, and compares the comparison result into the comparison result data 54 according to the appearance status of the related term. Store (dictionary comparison process 720).

For example, when two related term dictionaries collected at different timings (related term dictionary 1 and related term dictionary 2) are compared, a related term that does not exist in related term dictionary 1 but exists in related term dictionary 2 is stored in the comparison result data 54 as a newly appeared (newly arrived) related term. Conversely, a related term that exists in related term dictionary 1 but not in related term dictionary 2 is stored in the comparison result data 54 as a disappeared related term.

Also, for example, when three related term dictionaries collected at the same timing for different subjects (related term dictionaries 1 to 3) are compared, a related term that exists in all of related term dictionaries 1 to 3 is stored in the comparison result data 54 as a common related term.
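The two comparisons described above can be sketched with plain set operations. This is a minimal illustration, assuming each related term dictionary is reduced to a set of strings; the function names and the sample terms are hypothetical and are not taken from the specification.

```python
def compare_two(older: set[str], newer: set[str]) -> dict[str, set[str]]:
    """Compare two related term dictionaries collected at different timings."""
    return {
        "new": newer - older,          # absent from the older dictionary, present in the newer
        "disappeared": older - newer,  # present in the older dictionary, absent from the newer
    }


def common_terms(*dictionaries: set[str]) -> set[str]:
    """Terms present in every dictionary (e.g. three subjects collected at the same timing)."""
    if not dictionaries:
        return set()
    return set.intersection(*dictionaries)


# Hypothetical sample data for illustration only.
dict1 = {"stock", "account", "dividend"}
dict2 = {"stock", "tax rate", "dividend"}
dict3 = {"stock", "bond"}

result = compare_two(dict1, dict2)
shared = common_terms(dict1, dict2, dict3)
```

Which appearance statuses are actually written to the comparison result data 54 is left open by the text, so a real implementation would likely make the set of reported statuses configurable.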

なお、複数の関連詞辞書において、関連詞がどのような出現状況のときに比較結果データ５４に記憶するかは、情報更新部４６の利用態様に応じて柔軟に規定することができる。辞書比較処理７２０の詳細な処理については、後で詳細に説明する。 It should be noted that, in a plurality of related terminology dictionaries, what kind of appearance state of the related term is stored in the comparison result data 54 can be flexibly defined according to the use mode of the information updating unit 46. Detailed processing of the dictionary comparison processing 720 will be described later in detail.

会話制御端末装置２’’から、ユーザが直接、あるいは応答シナリオに応じて、所定の関連詞辞書を比較した比較結果データ５４の表示が要求されると、話題提供サーバ４’がこれらの比較結果データ５４を含む入力特定情報を会話制御端末装置２’’に送信し、会話制御端末装置２’’は、この入力特定情報を受け取ると、入力特定情報とシナリオデータ２８に基づいて応答情報を決定し、会話制御端末装置２’’のディスプレイに当該応答情報を表示するよう制御する。 When the user requests the conversation control terminal device 2 ″ to display the comparison result data 54 comparing the predetermined relational dictionary directly or according to the response scenario, the topic providing server 4 ′ displays the comparison result. The input specific information including the data 54 is transmitted to the conversation control terminal device 2 ″. When the conversation control terminal device 2 ″ receives the input specific information, it determines response information based on the input specific information and the scenario data 28. Then, control is performed so that the response information is displayed on the display of the conversation control terminal device 2 ″.

会話制御端末装置２’’のディスプレイには、例えば、話題名と、この話題における関連詞の変化が表示される。関連詞の変化の表示には、例えば、関連詞の出現状況とこれに対応する関連詞が含まれる。 On the display of the conversation control terminal device 2 ″, for example, a topic name and a change in a related term in this topic are displayed. The display of the change of the related term includes, for example, the appearance status of the related term and the corresponding related term.

For the comparison result data 54 of FIG. 46, described later, the display shows, for example, for the topic name “topic of ‘stock trading’ at t2, October 10, 2013”, the related term “tax rate” with the appearance status “newly arrived related term” and the related term “account” with the appearance status “disappeared related term”. This display corresponds to the record 54a of the comparison result data 54 shown in FIG. 46. From the displayed content, the user can notice that, in the topic with this topic name, the related term “tax rate” newly appeared at timing t2 and, at the same time, the related term “account” disappeared.

<< Description of character string extraction processing in information update unit >>
The character string extraction process 710 will be described with reference to FIG. 40. FIG. 40 is a flowchart showing the procedure of the character string extraction process 710. First, in step S61, the external log 502 (text data) from which related terms are to be extracted is read. As described above, the data may be of any kind as long as text data can be obtained from it.

次に、ステップＳ６２において、ステップＳ６１で読み込んだテキストデータから、意味識別可能な文字列である関連詞を抽出する。テキストデータから関連詞を抽出する方法は、上述のように、文解析部４３による、前後の隣接文字の異なり度合いに基づく方法や、形態素解析を用いた方法などを含む様々な方法がある。 Next, in step S62, a related term which is a character string whose meaning can be identified is extracted from the text data read in step S61. As described above, there are various methods for extracting related terms from text data, including a method based on the degree of difference between adjacent characters by the sentence analysis unit 43 and a method using morphological analysis.

次に、ステップＳ６３において、ステップＳ６２で１つのテキストデータに対して複数の関連詞が抽出された場合に、所定の判断基準により、その複数の関連詞にランク付けを行う。例えば、テキストデータにおける関連詞の重要度に応じてランク付けを行うことができ、関連詞の文字長や出現頻度に応じてランク付けが行われうる。また、関連詞を、前後の隣接文字の異なり度合いに基づく方法により抽出する場合は、前後の隣接文字の異なり度合いに応じてランク付けが行われる。なお、ランク付けは、このような基準のほか様々な要素、及びこれらの組合せによって行うことができる。このような「ランク」は、話題との関連性を示すものである。また、関連詞が複数抽出された場合であっても、このようなランク付けを行わないようにすることもできる。 Next, in step S63, when a plurality of related terms are extracted for one text data in step S62, the plurality of related terms are ranked according to a predetermined criterion. For example, ranking can be performed according to the importance of related terms in text data, and ranking can be performed according to the character length and appearance frequency of related terms. In the case where the related terms are extracted by a method based on the degree of difference between the adjacent characters before and after, ranking is performed according to the degree of difference between the adjacent characters before and after. The ranking can be performed by various factors in addition to such criteria and combinations thereof. Such “rank” indicates relevance to a topic. Further, even when a plurality of related terms are extracted, it is possible not to perform such ranking.

次に、ステップＳ６４において、ステップＳ６３でランク付けされた関連詞を、テキストデータに対応する関連詞辞書に記憶する。例えば、１つのテキストデータから抽出された関連詞は、１つのレコードにまとめて記憶され、各関連詞は、ランク付けに応じた記憶位置（配列エントリー）に記憶される。関連詞は、このように記憶されることにより、複数の関連詞が、１つのテキストデータ（それらの関連詞が抽出されたテキストデータ）に関連付けられた集合として定義される。ランク付けは、その集合のなかで関連詞を順位付けるものである。 Next, in step S64, the related terms ranked in step S63 are stored in the related term dictionary corresponding to the text data. For example, the related terms extracted from one text data are collectively stored in one record, and each related term is stored in a storage position (array entry) corresponding to the ranking. By storing the related terms in this way, a plurality of related terms are defined as a set associated with one text data (text data from which the related terms are extracted). Ranking is to rank related terms in the set.

文字列抽出処理は、処理対象のテキストデータが複数ある場合は、上述したステップＳ６１からステップＳ６４までの処理を、テキストデータごとに繰り返す。 In the character string extraction process, when there are a plurality of text data to be processed, the processes from step S61 to step S64 described above are repeated for each text data.
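Steps S61 to S64 can be sketched as a small pipeline. This is a simplified illustration under stated assumptions: the regular-expression tokenizer is only a stand-in for the extraction performed by the sentence analysis unit 43 (or by morphological analysis), the ranking rule (frequency, then character length) is one possible choice among the criteria mentioned above, and all function names are hypothetical.

```python
import re
from collections import Counter


def extract_terms(text: str) -> Counter:
    """S62 stand-in: count word-like tokens of three or more letters."""
    return Counter(re.findall(r"[A-Za-z]{3,}", text.lower()))


def rank_terms(counts: Counter) -> list[str]:
    """S63: order by appearance frequency, then character length, then alphabetically."""
    return sorted(counts, key=lambda t: (-counts[t], -len(t), t))


def build_dictionary(texts: list[str]) -> list[list[str]]:
    """Repeat S61 to S64 for each text; each text yields one record (a ranked list)."""
    dictionary = []
    for text in texts:                # S61: read one piece of text data
        counts = extract_terms(text)  # S62: extract related terms
        ranked = rank_terms(counts)   # S63: rank the extracted terms
        dictionary.append(ranked)     # S64: store them as one record, position = rank
    return dictionary


# Hypothetical input texts for illustration only.
records = build_dictionary([
    "Stock trading tax: the tax rate on stock trading changed.",
    "Open a trading account.",
])
```

Storing each record as an ordered list keeps the terms extracted from one text together as a set while preserving their rank through the list position.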

<< Description of dictionary comparison processing in information update unit >>
The dictionary comparison process 720 will be described with reference to FIG. 41. FIG. 41 is a flowchart showing the procedure of the dictionary comparison process 720. In this embodiment, it is assumed that related terms have been extracted from two pieces of text data collected as time-series data (text data 1 and text data 2) and stored in the corresponding related term dictionary (i-1) and related term dictionary (i), respectively, and that the comparison process is performed in this state.

First, in step S71, the related terms stored in the related term dictionary (i-1) and the related term dictionary (i) are read. Next, in step S72, the related terms of the related term dictionary (i-1) and the related term dictionary (i) are compared, and any related term that does not exist in the related term dictionary (i-1) but exists in the related term dictionary (i) is stored in the comparison result data 54 as a newly appeared related term (newly arrived related term). Each related term dictionary is associated with, for example, a topic name, and the dictionary comparison process 720 can use this topic name in the comparison. A newly arrived related term is stored in the comparison result data 54 together with the topic name that identifies the corresponding related term dictionary and the appearance status (in this case, the characters “new arrival” indicating a new appearance, a code corresponding to them, or the like).

Next, in step S73, the related terms of the related term dictionary (i-1) and the related term dictionary (i) are compared, and any related term that exists in the related term dictionary (i-1) but does not exist in the related term dictionary (i) is stored in the comparison result data 54 as a disappeared related term. Each related term dictionary is associated with, for example, a topic name, and a disappeared related term is stored in the comparison result data 54 together with this topic name and the appearance status (in this case, the characters “disappeared” indicating that the term has disappeared, a code corresponding to them, or the like).

その後、ステップＳ７４において、関連詞辞書（ｉ−１）の内容を関連詞辞書（ｉ）にコピーする。これは、次のタイミングにおいて、文字列抽出処理７１０が関連詞を記憶するための関連詞辞書（ｉ−１）を用意するためであり、その後、この新たな関連詞辞書（ｉ−１）と、関連詞辞書（ｉ−１）の内容がコピーされた関連詞辞書（ｉ）が、辞書比較処理７２０によって比較される。 Thereafter, in step S74, the contents of the related term dictionary (i-1) are copied to the related term dictionary (i). This is because, at the next timing, the character string extraction process 710 prepares a related term dictionary (i-1) for storing related terms, and thereafter, this new related term dictionary (i-1) and The related term dictionary (i) to which the content of the related term dictionary (i-1) is copied is compared by the dictionary comparison processing 720.

このように、文字列抽出処理７１０と辞書比較処理７２０は、所定のタイミングで繰り返し実行されるが、詳細な説明については後述する。また、辞書比較処理７２０が繰り返し処理されることによって、比較結果データ５４に、その処理タイミングにおいてそれぞれ比較結果が記憶されることになるが、比較結果を記憶する際に、それ以前に記憶されていた比較結果を消去するか、累積的に記憶するかは、本発明に係る情報検索システム１００の仕様に応じて決定される。また、比較結果データ５４を、辞書比較処理７２０ごとに別個に用意するようにしてもよい。 As described above, the character string extraction processing 710 and the dictionary comparison processing 720 are repeatedly executed at a predetermined timing, and detailed description thereof will be described later. Further, by repeatedly performing the dictionary comparison process 720, the comparison result is stored in the comparison result data 54 at the processing timing. However, when the comparison result is stored, it is stored before that. Whether the comparison result is deleted or stored cumulatively is determined according to the specification of the information search system 100 according to the present invention. Further, the comparison result data 54 may be prepared separately for each dictionary comparison process 720.

また、この例では省略したが、関連詞辞書（ｉ−１）と関連詞辞書（ｉ）を比較して、共通する関連詞（共通関連詞）を比較結果データ５４に記憶することもできる。この場合、例えば、関連詞辞書（ｉ−１）において共通関連詞とともに記憶されている他の関連詞（共起関連詞）と、関連詞辞書（ｉ）において共通関連詞とともに記憶されている他の関連詞（共起関連詞）との間に共通性があるか否かをさらに比較して、当該共通性に関する情報を比較結果データ５４に記憶することができる。 Although omitted in this example, the related term dictionary (i-1) and the related term dictionary (i) can be compared, and a common related term (common related term) can be stored in the comparison result data 54. In this case, for example, other related terms (co-occurrence related terms) stored together with the common related term in the related term dictionary (i-1) and others stored together with the common related term in the related term dictionary (i) It is possible to further compare whether or not there is a commonality with other related terms (co-occurrence related terms), and to store information on the commonality in the comparison result data 54.

さらに、上記のような共起関連詞を比較する場合に、それらの共起関連詞に関連付けられたランクを考慮して共通性に関する情報を判定してもよい。例えば、ランクの高い（それらの関連詞で示される話題にとって重要性が高い）共起関連詞が、関連詞辞書（ｉ−１）と関連詞辞書（ｉ）において共通する場合、共通関連詞の共通性はより高く評価されうる。 Further, when comparing the co-occurrence related terms as described above, information on the commonality may be determined in consideration of the ranks associated with the co-occurrence related terms. For example, when co-occurrence related terms having high rank (high importance for the topic indicated by those related terms) are common in the related term dictionary (i-1) and the related term dictionary (i), Commonality can be appreciated more.
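One way to take rank into account when comparing co-occurrence related terms is a rank-weighted overlap score. The sketch below is an assumption for illustration: the specification does not define a formula, so the 1/position weighting and the weighted-Jaccard score are hypothetical choices, as are the function names and sample records.

```python
def rank_weights(record: list[str], target: str) -> dict[str, float]:
    """Weight each co-occurring term by 1/(rank position); position 1 is the top rank."""
    return {t: 1.0 / (pos + 1) for pos, t in enumerate(record) if t != target}


def commonality(prev_record: list[str], curr_record: list[str], target: str) -> float:
    """Rank-weighted overlap (0.0 to 1.0) of the terms co-occurring with `target`."""
    prev_w = rank_weights(prev_record, target)
    curr_w = rank_weights(curr_record, target)
    union = prev_w.keys() | curr_w.keys()
    if not union:
        return 0.0
    shared = prev_w.keys() & curr_w.keys()
    overlap = sum(min(prev_w[t], curr_w[t]) for t in shared)
    total = sum(max(prev_w.get(t, 0.0), curr_w.get(t, 0.0)) for t in union)
    return overlap / total


# Hypothetical ranked records containing the common related term "network".
prev = ["network", "router", "cable", "speed"]
shares_high_rank = ["network", "router", "sns", "friend"]  # shares "router" (rank 2)
shares_low_rank = ["network", "sns", "friend", "speed"]    # shares "speed" (rank 4)

score_high = commonality(prev, shares_high_rank, "network")
score_low = commonality(prev, shares_low_rank, "network")
```

With this weighting, sharing a highly ranked co-occurring term raises the score more than sharing a low-ranked one, which matches the idea that commonality is rated higher when the shared terms are important to the topic.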

<<< Description of character string extraction processing and dictionary comparison processing in information update unit >>>
FIG. 42 shows an example in which related terms are extracted at different timings by the character string extraction process 710 from five pieces of text data (text data 1 to 5) collected in time series from the same WEB page, the extracted related terms are stored in the corresponding related term dictionary (i-1) or related term dictionary (i), the dictionary comparison process 720 is then performed on the related term dictionary (i-1) and the related term dictionary (i) whenever the related term dictionary (i) is updated, and these processes are performed periodically from time (T = t1) to (T = t5) (time (T = t6) and later are omitted).

最初に、時間（Ｔ＝ｔ１）において、この時点で所定のＷＥＢページから収集されたテキストデータ１から、文字列抽出処理７１０ａによって関連詞が抽出され、抽出された関連詞が関連詞辞書（ｉ−１）に記憶される。この文字列抽出処理７１０ａは、図４０を参照して説明した文字列抽出処理７１０に対応する。 First, at time (T = t1), a related term is extracted by the character string extraction processing 710a from the text data 1 collected from a predetermined WEB page at this time, and the extracted related term is stored in the related term dictionary (i -1). This character string extraction process 710a corresponds to the character string extraction process 710 described with reference to FIG.

次の、時間（Ｔ＝ｔ２）において、Ｔ＝ｔ１の場合と同様に、同じＷＥＢページから収集されたテキストデータ２から、文字列抽出処理７１０ｂによって関連詞が抽出され、抽出された関連詞が関連詞辞書（ｉ）に記憶される。ここで、対象のＷＥＢページにおいて話題や記載内容の変化があれば、抽出される関連詞もそれに応じて変化することになる。関連詞辞書（ｉ）に関連詞が記憶されると（更新されると）、辞書比較処理７２０ａによって関連詞辞書（ｉ−１）と関連詞辞書（ｉ）の比較が行われ、関連詞の出現状況に応じて、例えば、新たに出現した新着関連詞等が比較結果データ５４に記憶される。また、比較処理が終わると、関連詞辞書（ｉ）の内容が、関連詞辞書（ｉ−１）にコピーされる。 At the next time (T = t2), as in the case of T = t1, a related phrase is extracted from the text data 2 collected from the same WEB page by the character string extraction processing 710b. It is stored in the related term dictionary (i). Here, if there is a change in the topic or description content in the target WEB page, the extracted related terminology also changes accordingly. When a related term is stored in the related term dictionary (i) (updated), the dictionary comparison process 720a compares the related term dictionary (i-1) with the related term dictionary (i), and Depending on the appearance status, for example, newly-arrived related words that have newly appeared are stored in the comparison result data 54. When the comparison process is completed, the contents of the related term dictionary (i) are copied to the related term dictionary (i-1).

この辞書比較処理７２０ａは、図４１を参照して説明した辞書比較処理７２０に対応する。なお、この図では、辞書比較処理７２０ａが、Ｔ＝ｔ２のタイミングで行われているように記載されているが、関連詞辞書（ｉ）が更新された後に行われるものである。 This dictionary comparison process 720a corresponds to the dictionary comparison process 720 described with reference to FIG. In this figure, the dictionary comparison process 720a is described as being performed at the timing of T = t2, but is performed after the related term dictionary (i) is updated.

Next, at time (T = t3), as at T = t1 and t2, related terms are extracted by the character string extraction process 710c from text data 3 collected from the same WEB page, and the extracted related terms are stored in the related term dictionary (i). If the topic or content of the target WEB page has changed, the extracted related terms change accordingly. When the related terms are stored in the related term dictionary (i) (that is, when it is updated), the dictionary comparison process 720b compares the related term dictionary (i-1) with the related term dictionary (i) and, according to the appearance status of the related terms, stores, for example, newly arrived related terms in the comparison result data 54. When the comparison process ends, the contents of the related term dictionary (i) are copied (saved) to the related term dictionary (i-1).

以降、同様にこれらの文字列抽出処理（７１０ｄ、７１０ｅ）及び辞書比較処理（７２０ｃ、７２０ｄ）を繰り返して、比較結果データ５４が、関連詞辞書（ｉ−１）と関連詞辞書（ｉ）を比較した結果得られた関連詞により、各タイミング（Ｔ＝ｔ２〜ｔ５）ごとに更新される。比較結果としての関連詞を最新のものだけ記憶するか累積的に記憶するかは、上述したように、適用する応用システム等の仕様による。 Thereafter, the character string extraction processing (710d, 710e) and the dictionary comparison processing (720c, 720d) are repeated in the same manner, so that the comparison result data 54 is obtained from the related term dictionary (i-1) and the related term dictionary (i). It is updated at each timing (T = t2 to t5) with the related term obtained as a result of the comparison. Whether only the latest related terms as comparison results are stored or cumulatively stored depends on the specifications of the applied system or the like as described above.
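The cycle described for FIG. 42 can be sketched as a loop over time-ordered snapshots. This is a minimal illustration assuming each snapshot is already a set of extracted related terms and that results are stored cumulatively (the text leaves the choice between cumulative and latest-only storage to the system specification); the record layout and names are hypothetical.

```python
def run_time_series(snapshots: list[tuple[str, set[str]]], topic: str) -> list[dict]:
    """snapshots: (time label, related terms extracted at that time), oldest first."""
    results = []     # stands in for the comparison result data 54 (cumulative variant)
    previous = None  # plays the role of related term dictionary (i-1)
    for label, current in snapshots:  # `current` plays the role of dictionary (i)
        if previous is not None:
            for term in sorted(current - previous):
                results.append({"topic": topic, "time": label,
                                "status": "new", "term": term})
            for term in sorted(previous - current):
                results.append({"topic": topic, "time": label,
                                "status": "disappeared", "term": term})
        previous = set(current)  # copy (i) into (i-1) for the next cycle
    return results


# Hypothetical snapshots for illustration only.
log = run_time_series(
    [("t1", {"stock", "account"}),
     ("t2", {"stock", "tax rate"}),
     ("t3", {"stock", "tax rate"})],
    topic="stock trading",
)
```

No comparison runs at the first snapshot because only one dictionary exists at that point, which mirrors the description in which the first dictionary comparison process occurs at T = t2.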

図４３は、同じＷＥＢページから時系列に収集された５つのテキストデータ（テキストデータ１〜５）から、文字列抽出処理７１０によって、それぞれ異なるタイミングで関連詞が抽出され、抽出された関連詞が、それぞれ対応する関連詞辞書（ｉ−１）、関連詞辞書（ｉ）、または関連詞辞書（ｉ＋１）に記憶され、その後、関連詞辞書（ｉ＋１）が更新された場合に、関連詞辞書（ｉ−１）、関連詞辞書（ｉ）、関連詞辞書（ｉ＋１）とを対象として辞書比較処理７２０が行われ、これらの処理が、時間（Ｔ＝ｔ１）から（Ｔ＝ｔ５）まで周期的に行われている例を示している（時間（Ｔ＝ｔ６）以降は省略した）。図４１との相違は、関連詞辞書がサイクリックに３つ用いられている点である。 In FIG. 43, related words are extracted at different timings by the character string extraction processing 710 from five text data (text data 1 to 5) collected in time series from the same WEB page. Are stored in the corresponding related term dictionary (i-1), the related term dictionary (i), or the related term dictionary (i + 1), and then the related term dictionary (i + 1) is updated. i-1), a dictionary comparison process 720 is performed on the related term dictionary (i) and the related term dictionary (i + 1), and these processes are performed periodically from time (T = t1) to (T = t5). (The time after time (T = t6) is omitted). The difference from FIG. 41 is that three related terminology dictionaries are used cyclically.

最初に、時間（Ｔ＝ｔ１）において、この時点で所定のＷＥＢページから収集されたテキストデータ１から、文字列抽出処理７１０ｆによって関連詞が抽出され、抽出された関連詞が関連詞辞書（ｉ−１）に記憶される。 First, at time (T = t1), a related term is extracted by the character string extraction processing 710f from the text data 1 collected from a predetermined WEB page at this time, and the extracted related term is stored in the related term dictionary (i -1).

次の、時間（Ｔ＝ｔ２）において、Ｔ＝ｔ１の場合と同様に、同じＷＥＢページから収集されたテキストデータ２から、文字列抽出処理７１０ｇによって関連詞が抽出され、抽出された関連詞が関連詞辞書（ｉ）に記憶される。ここで、対象のＷＥＢページにおいて話題や記載内容の変化があれば、抽出される関連詞もそれに応じて変化することになる。その後、時間（Ｔ＝ｔ３）において、Ｔ＝ｔ１、ｔ２の場合と同様に、同じＷＥＢページから収集されたテキストデータ３から、文字列抽出処理７１０ｈによって関連詞が抽出され、抽出された関連詞が関連詞辞書（ｉ＋１）に記憶される。ここで、対象のＷＥＢページにおいて話題や記載内容の変化があれば、抽出される関連詞もそれに応じて変化することになる。 At the next time (T = t2), as in the case of T = t1, a related phrase is extracted from the text data 2 collected from the same WEB page by the character string extraction processing 710g. It is stored in the related term dictionary (i). Here, if there is a change in the topic or description content in the target WEB page, the extracted related terminology also changes accordingly. Thereafter, at time (T = t3), as in the case of T = t1 and t2, the related terms are extracted from the text data 3 collected from the same WEB page by the character string extraction processing 710h. Is stored in the related term dictionary (i + 1). Here, if there is a change in the topic or description content in the target WEB page, the extracted related terminology also changes accordingly.

At time (T = t3), when related terms are stored in related term dictionary (i+1) (that is, when it is updated), dictionary comparison processing 720f compares the three related term dictionaries ((i-1), (i), and (i+1)), and related terms and associated information are stored in the comparison result data 54 according to the appearance status of each related term. Because three related term dictionaries are compared in this embodiment, the related terms to be stored include not only those identified from changes between two dictionaries, such as newly appearing related terms (new related terms) and related terms that have disappeared (disappeared related terms), but also those identified from changes across all three dictionaries. One example is a revived related term, which disappeared in related term dictionary (i) and reappeared in related term dictionary (i+1). In addition, provided that the related term dictionaries store the appearance frequency of each related term, it is also possible to identify related terms whose appearance frequency rose sharply within a short period (across the three dictionaries), related terms whose appearance frequency dropped sharply, related terms whose appearance frequency rose again, and related terms that kept an appearance frequency within a certain range while the frequencies of other related terms changed.
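The frequency-based cases described above can be illustrated with a minimal sketch. It assumes each dictionary stores an integer appearance frequency per related term; the function name `frequency_trend`, the `ratio` and `tolerance` parameters, and the specific decision rules are hypothetical choices made for illustration, since the description does not define concrete thresholds. In particular, the "steady" case here only checks that the three frequencies stay close to their mean and does not examine how other related terms changed.

```python
def frequency_trend(f_prev, f_cur, f_next, ratio=2.0, tolerance=0.1):
    """Classify one related term from its frequencies in dictionaries
    (i-1), (i), and (i+1). Thresholds are illustrative assumptions."""
    # Sharp rise: strictly increasing and at least `ratio` times the start.
    if f_prev < f_cur < f_next and f_next >= ratio * max(f_prev, 1):
        return "surging"
    # Sharp drop: strictly decreasing and the start is `ratio` times the end.
    if f_prev > f_cur > f_next and f_prev >= ratio * max(f_next, 1):
        return "plunging"
    # Rose again: dipped in dictionary (i), then increased in (i+1).
    if f_cur < f_prev and f_next > f_cur:
        return "rebounding"
    # Stayed in a fixed range: all three values within `tolerance` of the mean.
    mean = (f_prev + f_cur + f_next) / 3
    if mean > 0 and all(abs(f - mean) <= tolerance * mean
                        for f in (f_prev, f_cur, f_next)):
        return "steady"
    return "other"


print(frequency_trend(1, 3, 9))     # a term that became frequent quickly
print(frequency_trend(5, 1, 6))     # a term that dipped and came back
```

A real system would likely tune these rules to the collection interval and to the size of the text data.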

When the comparison in dictionary comparison processing 720f is finished, the contents of related term dictionary (i) are copied to related term dictionary (i-1), and the contents of related term dictionary (i+1) are copied to related term dictionary (i). Although the figure depicts dictionary comparison processing 720f as being performed at the timing T = t3, it is performed after related term dictionary (i-1) and related term dictionary (i) have been updated.
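The cyclic use of the three dictionaries can be sketched as follows. This is an illustrative reading of the procedure, not the implementation of the embodiment: the function name `run_cycle`, the list-based `slots` structure, and the `compare` callback are assumptions. The first three extractions fill slots (i-1), (i), and (i+1); each later extraction overwrites slot (i+1); after every comparison the contents are shifted down by one slot.

```python
def run_cycle(snapshots, compare):
    """Feed time-ordered extraction results through three dictionary slots.

    snapshots: iterable of related-term lists, one per timing (t1, t2, ...).
    compare:   callable taking dictionaries (i-1), (i), (i+1).
    Returns the list of comparison results, one per timing from t3 onward.
    """
    slots = []      # [dictionary (i-1), dictionary (i), dictionary (i+1)]
    results = []
    for terms in snapshots:
        if len(slots) < 3:
            slots.append(list(terms))      # t1, t2, t3 fill the slots in order
        else:
            slots[2] = list(terms)         # later timings update (i+1)
        if len(slots) == 3:
            results.append(compare(slots[0], slots[1], slots[2]))
            slots[0] = slots[1]            # copy (i)   -> (i-1)
            slots[1] = slots[2]            # copy (i+1) -> (i)
    return results


# Five timings yield three comparisons (at t3, t4, and t5).
windows = run_cycle([["a"], ["b"], ["c"], ["d"], ["e"]],
                    lambda prev, cur, nxt: (prev, cur, nxt))
print(len(windows))
```

Using more than three slots would follow the same pattern with a longer list.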

Next, at time (T = t4), as at T = t1 to t3, related terms are extracted by character string extraction processing 710i from text data 4 collected from the same WEB page, and the extracted related terms are stored in related term dictionary (i+1). If the topics or content of the target WEB page have changed, the extracted related terms change accordingly. When the related terms are stored in related term dictionary (i+1) (that is, when it is updated), dictionary comparison processing 720g compares the three related term dictionaries ((i-1), (i), and (i+1)), and related terms and associated information are stored in the comparison result data 54 according to the appearance status of each related term. When the comparison is finished, the contents of related term dictionary (i) are copied to related term dictionary (i-1), and the contents of related term dictionary (i+1) are copied to related term dictionary (i).

以降、同様にこれらの文字列抽出処理７１０ｊ及び辞書比較処理７２０ｈを繰り返して、比較結果データ５４が、関連詞辞書（ｉ−１）、関連詞辞書（ｉ）、関連詞辞書（ｉ＋１）を比較した結果得られた関連詞により、各タイミング（Ｔ＝ｔ３〜ｔ５）ごとに更新される。比較結果としての関連詞を最新のものだけ記憶するか累積的に記憶するかは、上述したように、適用する応用システム等の仕様による。 Thereafter, the character string extraction processing 710j and the dictionary comparison processing 720h are similarly repeated, and the comparison result data 54 compares the related term dictionary (i-1), the related term dictionary (i), and the related term dictionary (i + 1). It is updated at each timing (T = t3 to t5) with the related term obtained as a result. Whether only the latest related terms as comparison results are stored or cumulatively stored depends on the specifications of the applied system or the like as described above.

Note that the comparison result data 54 is stored and updated using two related term dictionaries in the embodiment of FIG. 42 and three related term dictionaries (cyclically) in the embodiment of FIG. 43, but dictionary comparison processing may be performed using more related term dictionaries than this. Doing so makes it possible to track the appearance status of a related term over more timings and to store the related term in the comparison result data 54 when that appearance status satisfies a predetermined condition.

FIG. 44 shows an example in which related terms are extracted by character string extraction processing 710 from three sets of text data (text data A to C) collected at the same timing from different WEB pages (WEB pages on different subjects). The extracted related terms are stored in the corresponding related term dictionary A, related term dictionary B, or related term dictionary C, and dictionary comparison processing 720 is then performed on these three related term dictionaries. These processes are performed periodically from time (T = t1) to (T = t3); time (T = t4) and later are omitted.

First, at time (T = t1), related terms are extracted by character string extraction processing 710k, 710m, and 710n from the three sets of text data (text data A to C) collected at this point from predetermined, different WEB pages, and the extracted related terms are stored in related term dictionary A, related term dictionary B, and related term dictionary C, respectively.

After that, dictionary comparison processing 720k compares the three related term dictionaries (related term dictionaries A, B, and C), and related terms and associated information are stored in the comparison result data 54 according to the appearance status of the related terms. In this embodiment, for example, related terms present in all three related term dictionaries are stored in the comparison result data 54. Because text data A to C are collected from WEB pages on different subjects, focusing on the related terms common to the three dictionaries (common related terms), rather than on the related terms that differ, makes it possible to discover shared topics and is often the more meaningful approach.

It is also possible to further compare whether there is commonality among the other related terms stored together with the common related term in related term dictionary A (co-occurring related terms), the co-occurring related terms stored together with the common related term in related term dictionary B, and the co-occurring related terms stored together with the common related term in related term dictionary C, and to store information on that commonality in the comparison result data 54.

さらに、上記のような共起関連詞を比較する場合に、それらの共起関連詞に関連付けられたランクを考慮して共通性に関する情報を判定してもよい。例えば、ランクの高い（それらの関連詞で示される話題にとって重要性が高い）共起関連詞が、３つの関連詞辞書において共通する場合、共通関連詞の共通性はより高く評価されうる。 Further, when comparing the co-occurrence related terms as described above, information on the commonality may be determined in consideration of the ranks associated with the co-occurrence related terms. For example, if co-occurrence related terms with high rank (high importance for the topic indicated by those related terms) are common in the three related term dictionaries, the commonness of the common related terms can be evaluated more highly.
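One way to take rank into account when comparing co-occurring related terms is sketched below. The scoring rule (mean reciprocal rank of each co-occurring term shared by every dictionary) and the name `cooccurrence_score` are assumptions introduced only for illustration; the description states that rank may be considered but does not specify a formula. Each dictionary is modeled as a rank-ordered list in which position 0 holds the highest-ranked related term.

```python
def cooccurrence_score(common_term, *dictionaries):
    """Score how strongly the co-occurring terms of `common_term` agree.

    Higher-ranked shared co-occurring terms contribute more to the score.
    """
    # Co-occurring terms: everything stored alongside the common term.
    others = [[t for t in d if t != common_term] for d in dictionaries]
    shared = set(others[0]).intersection(*others[1:])
    score = 0.0
    for term in shared:
        # Reciprocal of the 1-based rank in each dictionary, averaged.
        score += sum(1.0 / (d.index(term) + 1) for d in dictionaries) / len(dictionaries)
    return score


x = ["a", "b", "c"]
y = ["b", "a", "d"]
z = ["a", "e", "b"]
print(cooccurrence_score("a", x, y, z))   # only "b" co-occurs in all three
```

A score of 0.0 means the dictionaries share the common related term but none of its co-occurring terms.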

このような、時間（Ｔ＝ｔ１）における３つの関連詞辞書の比較を、時間（Ｔ＝ｔ２）において繰り返し行うことができる。このような処理を行うことにより、比較結果データ５４を時系列に更新することができる。 Such comparison of the three related term dictionaries at time (T = t1) can be repeatedly performed at time (T = t2). By performing such processing, the comparison result data 54 can be updated in time series.

At time (T = t2), as at time (T = t1), related terms are extracted by character string extraction processing 710k', 710m', and 710n' from the three sets of text data (text data A' to C') collected at this point from predetermined, different WEB pages, and the extracted related terms are stored in related term dictionary A', related term dictionary B', and related term dictionary C', respectively. In this embodiment, text data A' is assumed to come from the same WEB page as text data A or from a WEB page on the same subject. Likewise, text data B' comes from the same WEB page as text data B or from a WEB page on the same subject, and text data C' comes from the same WEB page as text data C or from a WEB page on the same subject.

その後、辞書比較処理７２０ｋ’によって３つの関連詞辞書（関連詞辞書Ａ’、関連詞辞書Ｂ’、関連詞辞書Ｃ’）の比較が行われ、関連詞の出現状況に応じて、関連詞等が比較結果データ５４に記憶される。この実施例では、例えば、３つの関連詞辞書に共通して存在する関連詞が比較結果データ５４に記憶される。 After that, the dictionary comparison process 720k ′ compares the three related terminology dictionaries (the related term dictionary A ′, the related term dictionary B ′, the related term dictionary C ′), and the related terms etc. according to the appearance status of the related term. Is stored in the comparison result data 54. In this embodiment, for example, related terms existing in common in the three related term dictionaries are stored in the comparison result data 54.

Furthermore, at time (T = t3), as at times (T = t1, t2), related terms are extracted by character string extraction processing 710k'', 710m'', and 710n'' from the three sets of text data (text data A'' to C'') collected at this point from predetermined, different WEB pages, and the extracted related terms are stored in related term dictionary A'', related term dictionary B'', and related term dictionary C'', respectively. In this embodiment, text data A'' is assumed to come from the same WEB page as text data A and text data A' or from a WEB page on the same subject. Likewise, text data B'' comes from the same WEB page as text data B and text data B' or from a WEB page on the same subject, and text data C'' comes from the same WEB page as text data C and text data C' or from a WEB page on the same subject.

その後、辞書比較処理７２０ｋ’’によって３つの関連詞辞書（関連詞辞書Ａ’’、関連詞辞書Ｂ’’、関連詞辞書Ｃ’’）の比較が行われ、関連詞の出現状況に応じて、関連詞等が比較結果データ５４に記憶される。この実施例では、例えば、３つの関連詞辞書に共通して存在する関連詞が比較結果データ５４に記憶される。 Thereafter, three related terminology dictionaries (related term dictionary A ″, related term dictionary B ″, and related term dictionary C ″) are compared by dictionary comparison processing 720k ″, and according to the appearance status of the related terminology. , Related terms and the like are stored in the comparison result data 54. In this embodiment, for example, related terms existing in common in the three related term dictionaries are stored in the comparison result data 54.

In the embodiment of FIG. 44, related terms are extracted from three sets of text data collected at the same timing from different WEB pages (WEB pages on different subjects), but related terms may instead be extracted from each of two sets of text data, or from each of four or more sets of text data.

<< Detailed Description of Character String Extraction Processing and Dictionary Comparison Processing in the Information Update Unit >>
Next, the example of character string extraction processing 710 and dictionary comparison processing 720 shown in FIG. 43 is described in more detail with reference to FIG. 45. FIG. 45 shows character string extraction processing (710f, 710g, 710h) being performed on three sets of text data (text data 1 to 3), respectively, and dictionary comparison processing 720f being performed on the corresponding related term dictionaries (i-1), (i), and (i+1).

In this embodiment, the three sets of text data are collected from the same WEB pages related to the common subject "stock trading". For example, the search keyword "stock trading" is entered in a WEB search, and the three resulting WEB pages are treated as one set of text data. In FIG. 45, text data 1 includes, at time (T = t1), text data 1-1 obtained from the first WEB page, text data 1-2 obtained from the second WEB page, and text data 1-3 obtained from the third WEB page. Similarly, text data 2 includes, at time (T = t2), text data 2-1 obtained from the first WEB page, text data 2-2 obtained from the second WEB page, and text data 2-3 obtained from the third WEB page, and text data 3 includes, at time (T = t3), text data 3-1 obtained from the first WEB page, text data 3-2 obtained from the second WEB page, and text data 3-3 obtained from the third WEB page. Here, the URL of the first WEB page is the same at every timing, as are the URL of the second WEB page and the URL of the third WEB page.

Here, the text data corresponding to the three WEB pages included in one set of text data (text data 1-1, text data 1-2, and text data 1-3) can each be thought of as corresponding to a question sentence such as those shown in FIG. 26. For example, text data 1-1 is question Q1, text data 1-2 is question Q8, and text data 1-3 is question Q13.

Character string extraction processing 710f extracts related terms from text data 1 by a predetermined method at time (T = t1) and stores them in related term dictionary (i-1). In this embodiment, four related terms are extracted, arranged in rank order, and stored as one record in related term dictionary (i-1). The ranking of related terms can be determined based on, for example, appearance frequency. The four extracted related terms (related term 1 to related term 4) are, in rank order, "Company ○", "△ Bank", "申込みは" ("application" followed by the topic particle "wa"), and "account". Morphological analysis and similar techniques identify character strings by breaking text into the smallest meaningful units (morphemes), but other methods can extract units larger than a morpheme (for example, a sentence or part of a sentence) as related terms. A character string consisting of a noun and a particle, such as "申込みは" above, is thus also extracted as a related term.

また、この実施例では、それぞれのテキストデータに関して４つの関連詞が抽出されるようになっているが、これは説明の便宜のためのものである（以降の実施例も同様である）。実際には、関連詞がいくつ抽出されてもよく、テキストデータによってその数が異なる。文字列抽出処理において、一定の判定基準において閾値を超えた場合に文字列が関連詞として決定される場合は、その判定に応じて抽出される関連詞の数が変わってくる。また、抽出する関連詞の数を固定数とすることもできるし、テキストデータの文字数等に応じて設定するようにもできる。 In this embodiment, four related terms are extracted for each text data, but this is for convenience of explanation (the same applies to the following embodiments). Actually, any number of related terms may be extracted, and the number varies depending on the text data. In the character string extraction process, when a character string is determined as a related term when a threshold is exceeded according to a certain criterion, the number of related terms extracted is changed according to the determination. Also, the number of related terms to be extracted can be fixed, or can be set according to the number of characters of text data.
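The two ways of deciding how many related terms to keep (a threshold on some criterion, or a fixed count) can be sketched as follows. The sketch assumes that candidate strings have already been cut out of the text by some unspecified method and that appearance frequency is the criterion; the function name `extract_terms` and its parameters are hypothetical.

```python
from collections import Counter


def extract_terms(candidates, threshold=None, top_n=None):
    """Rank candidate strings by frequency and select related terms.

    threshold: keep only strings whose count exceeds this value (variable count).
    top_n:     keep at most this many strings (fixed count).
    """
    counts = Counter(candidates)
    # most_common() sorts by count, highest first; ties keep first-seen order.
    ranked = [term for term, _ in counts.most_common()]
    if threshold is not None:
        ranked = [term for term in ranked if counts[term] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


candidates = ["a", "a", "a", "b", "b", "c"]
print(extract_terms(candidates, threshold=1))   # count depends on the data
print(extract_terms(candidates, top_n=1))       # count is fixed
```

Setting `top_n` from the length of the text data would correspond to the last option mentioned above.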

文字列抽出処理７１０ｇは、時間（Ｔ＝ｔ２）において、テキストデータ２から所定の方法により関連詞を抽出し、関連詞辞書（ｉ）に記憶する。この実施例では、関連詞が４つ抽出され、それぞれをランクの順に配列して、１レコードとし関連詞辞書（ｉ）に格納する。関連詞のランク付けは、例えば、出現頻度等に基づいて決定することができる。抽出された４つの関連詞（関連詞１〜関連詞４）は、ランク順に「税率」、「○社」、「△銀行」、「申込みは」である。 The character string extraction processing 710g extracts a related term from the text data 2 by a predetermined method at time (T = t2), and stores it in the related term dictionary (i). In this embodiment, four related terms are extracted, and each is arranged in order of rank and stored as one record in the related term dictionary (i). The ranking of related terms can be determined based on, for example, the appearance frequency. The extracted four related terms (related terms 1 to 4) are “tax rate”, “○ company”, “△ bank”, and “application” in order of rank.

同様に、文字列抽出処理７１０ｈは、時間（Ｔ＝ｔ３）において、テキストデータ３から所定の方法により関連詞を抽出し、関連詞辞書（ｉ＋１）に記憶する。この実施例では、関連詞が４つ抽出され、それぞれをランクの順に配列して、１レコードとし関連詞辞書（ｉ＋１）に格納する。関連詞のランク付けは、例えば、出現頻度等に基づいて決定することができる。抽出された４つの関連詞（関連詞１〜関連詞４）は、ランク順に「○社」、「口座」、「△銀行」、「申込みは」である。 Similarly, the character string extraction process 710h extracts a related term from the text data 3 by a predetermined method at time (T = t3), and stores it in the related term dictionary (i + 1). In this embodiment, four related terms are extracted, and each is arranged in order of rank, and is stored as one record in the related term dictionary (i + 1). The ranking of related terms can be determined based on, for example, the appearance frequency. The four extracted related terms (related terms 1 to 4) are “○ company”, “account”, “△ bank”, and “application” in the rank order.

次に、関連詞辞書（ｉ−１）、関連詞辞書（ｉ）、関連詞辞書（ｉ＋１）に対して、辞書比較処理７２０ｆが行われる。この実施例では、辞書比較処理７２０ｆは、新たに出現した関連詞（新着関連詞）、消滅した関連詞（消滅関連詞）、及び再度出現した関連詞（復活関連詞）を検出し、これらを比較結果データ５４に記憶するものとする。 Next, dictionary comparison processing 720f is performed on the related term dictionary (i-1), the related term dictionary (i), and the related term dictionary (i + 1). In this embodiment, the dictionary comparison process 720f detects newly appearing related terms (new arrival related terms), disappeared related terms (disappearing related terms), and again appearing related terms (resurrection related terms). It is assumed that the comparison result data 54 is stored.

For example, comparing related term dictionary (i-1) with related term dictionary (i) shows that the related term "tax rate" has newly appeared in related term dictionary (i) and that the related term "account" has disappeared. Accordingly, the related terms "tax rate" and "account" are stored in the comparison result data 54, as shown in record 54a of FIG. 46. Together with these related terms, data indicating the appearance status ("new" for a newly appearing related term and "disappeared" for a related term that has disappeared, in this embodiment) is stored in the same record. In addition, in this embodiment, "topic name" data identifying the related term dictionary is stored to indicate the timing at which the appearance status arose. Each related term dictionary is associated with a topic name and a date; here, related term dictionary (i) is associated with a topic name such as "the topic 'stock trading' at t2 on October 10, 2013".

次に、関連詞辞書（ｉ）と関連詞辞書（ｉ＋１）を比較すると、関連詞辞書（ｉ＋１）で、関連詞「税率」が消滅しており、さらに、関連詞「口座」が復活している（関連詞辞書（ｉ−１）に存在し、関連詞辞書（ｉ）で消滅していた）。そこで、比較結果データ５４には、図４６のレコード５４ｂに示すように、関連詞「口座」「税率」が記憶される。また、比較結果データ５４には、これらの関連詞とともに、出現状況を表すデータ（この実施例では、再度出現した（復活した）関連詞の場合「復活」、消滅した関連詞の場合「消滅」）が同じレコードに記憶される。さらに、この実施例では、当該出現状況となったタイミングを示すために、関連詞辞書を識別する「話題名」のデータが記憶される。各関連詞辞書は、話題名や日付けと対応付けられ、関連詞辞書（ｉ＋１）は、ここでは「２０１３年１０月１０日、ｔ３における「株の取引」の話題」といった話題名に対応付けられている。 Next, when the related term dictionary (i) and the related term dictionary (i + 1) are compared, the related term “tax rate” disappears in the related term dictionary (i + 1), and the related term “account” is restored. (It exists in the related term dictionary (i-1) and disappears in the related term dictionary (i)). Therefore, as shown in the record 54b in FIG. 46, the comparison result data 54 stores the related terms “account” and “tax rate”. In addition, the comparison result data 54 includes data indicating the appearance status together with these related terms (in this embodiment, “resurrection” in the case of related terms that have reappeared (resurrected), and “disappear” in the case of related terms that have disappeared). ) Is stored in the same record. Furthermore, in this embodiment, “topic name” data for identifying a related term dictionary is stored in order to indicate the timing at which the appearance situation is reached. Each related term dictionary is associated with a topic name and date, and the related term dictionary (i + 1) is associated with a topic name such as “topic of“ stock trading ”on October 10, 2013 at t3”. It has been.
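The two comparisons walked through above can be reproduced with a short sketch. The dictionary contents are the four related terms given for each timing, written with English glosses; the function name `diff_dictionaries`, the tuple layout `(term, status, topic name)`, and the shortened topic-name strings are assumptions for illustration and are not the actual layout of record 54a or 54b in FIG. 46.

```python
def diff_dictionaries(older, newer, history, topic_name):
    """Compare two rank-ordered dictionaries and label each change.

    history: terms seen in dictionaries earlier than `older`; a term that
             reappears after being absent from `older` is labeled "revived".
    """
    records = []
    for term in newer:
        if term not in older:
            status = "revived" if term in history else "new"
            records.append((term, status, topic_name))
    for term in older:
        if term not in newer:
            records.append((term, "disappeared", topic_name))
    return records


dict_prev = ["Company ○", "△ Bank", "application", "account"]   # (i-1), t1
dict_cur = ["tax rate", "Company ○", "△ Bank", "application"]   # (i),   t2
dict_next = ["Company ○", "account", "△ Bank", "application"]   # (i+1), t3

record_54a = diff_dictionaries(dict_prev, dict_cur, set(), "stock trading, t2")
record_54b = diff_dictionaries(dict_cur, dict_next, set(dict_prev), "stock trading, t3")
print(record_54a)
print(record_54b)
```

With these inputs, the first comparison labels "tax rate" as new and "account" as disappeared, and the second labels "account" as revived and "tax rate" as disappeared, matching the description.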

Next, the example of character string extraction processing 710 and dictionary comparison processing 720 shown in FIG. 44 is described in more detail with reference to FIG. 47. FIG. 47 shows, at time (T = t1), character string extraction processing (710k, 710m, 710n) being performed on three sets of text data (text data A to C), respectively, and dictionary comparison processing 720k being performed on the corresponding related term dictionaries A, B, and C.

この実施例では、３つのテキストデータは、同じ時間（Ｔ＝ｔ１）において、異なる主題に関連するＷＥＢページから収集されたものである。すなわち、テキストデータＡは、「Ａ社の技術」を主題としたＷＥＢページに基づくものであり、テキストデータＢは、「Ｂ社の技術」を主題としたＷＥＢページに基づくものであり、テキストデータＣは、「ＡＩ（人工知能）関連技術」を主題としたＷＥＢページに基づくものである。 In this example, the three text data were collected from WEB pages related to different subjects at the same time (T = t1). That is, the text data A is based on a WEB page whose theme is “Technology of Company A”, and the text data B is based on a WEB page whose theme is “Technology of Company B”. C is based on a WEB page whose theme is “AI (artificial intelligence) related technology”.

例えば、テキストデータＡに関しては、ＷＥＢ検索により、検索キーワード「Ａ社の技術」を入力し、その結果得られた３つのＷＥＢページを１つのテキストデータとして扱う。同様に、テキストデータＢに関しては、ＷＥＢ検索により、検索キーワード「Ｂ社の技術」を入力し、その結果得られた３つのＷＥＢページを１つのテキストデータとして扱い、テキストデータＣに関しては、ＷＥＢ検索により、検索キーワード「ＡＩ（人工知能）関連技術」を入力し、その結果得られた３つのＷＥＢページを１つのテキストデータとして扱う。 For example, regarding the text data A, a search keyword “Technology of Company A” is input by WEB search, and three WEB pages obtained as a result are handled as one text data. Similarly, for the text data B, the search keyword “Technology of company B” is input by WEB search, and the resulting three WEB pages are treated as one text data. For text data C, the WEB search is performed. Thus, the search keyword “AI (artificial intelligence) related technology” is input, and the three WEB pages obtained as a result are handled as one text data.

In FIG. 47, text data A includes text data A-1 obtained from the first WEB page, text data A-2 obtained from the second WEB page, and text data A-3 obtained from the third WEB page, all related to the subject "Company A's technology". Similarly, text data B includes text data B-1 obtained from the first WEB page, text data B-2 obtained from the second WEB page, and text data B-3 obtained from the third WEB page, all related to the subject "Company B's technology", and text data C includes text data C-1 obtained from the first WEB page, text data C-2 obtained from the second WEB page, and text data C-3 obtained from the third WEB page, all related to the subject "AI (artificial intelligence) related technology".

文字列抽出処理７１０ｋは、テキストデータＡから所定の方法により関連詞を抽出し、関連詞辞書Ａに記憶する。この実施例では、関連詞が４つ抽出され、それぞれをランクの順に配列して、１レコードとし関連詞辞書Ａに格納する。関連詞のランク付けは、例えば、出現頻度等に基づいて決定することができる。抽出された４つの関連詞（関連詞１〜関連詞４）は、ランク順に「Ａ社」、「音声」、「音声認識」、「営業」となっている。 The character string extraction process 710k extracts a related term from the text data A by a predetermined method and stores it in the related term dictionary A. In this embodiment, four related terms are extracted, and each is arranged in order of rank and stored as one record in the related term dictionary A. The ranking of related terms can be determined based on, for example, the appearance frequency. The extracted four related terms (related terms 1 to 4) are “Company A”, “Voice”, “Voice Recognition”, and “Sales” in order of rank.

文字列抽出処理７１０ｍは、テキストデータＢから所定の方法により関連詞を抽出し、関連詞辞書Ｂに記憶する。この実施例では、関連詞が４つ抽出され、それぞれをランクの順に配列して、１レコードとし関連詞辞書Ｂに格納する。関連詞のランク付けは、例えば、出現頻度等に基づいて決定することができる。抽出された４つの関連詞（関連詞１〜関連詞４）は、ランク順に「音声」、「研究開発」、「Ｂ社の業績」、「音声認識」となっている。 The character string extraction process 710m extracts a related term from the text data B by a predetermined method and stores it in the related term dictionary B. In this embodiment, four related terms are extracted, and each is arranged in order of rank and stored in the related term dictionary B as one record. The ranking of related terms can be determined based on, for example, the appearance frequency. The extracted four related terms (related terms 1 to 4) are “voice”, “research and development”, “business achievements of company B”, and “voice recognition” in order of rank.

同様に、文字列抽出処理７１０ｎは、テキストデータＣから所定の方法により関連詞を抽出し、関連詞辞書Ｃに記憶する。この実施例では、関連詞が４つ抽出され、それぞれをランクの順に配列して、１レコードとし関連詞辞書Ｃに格納する。関連詞のランク付けは、例えば、出現頻度等に基づいて決定することができる。抽出された４つの関連詞（関連詞１〜関連詞４）は、ランク順に「ＡＩ」、「ロボット」、「音声認識」、「エージェント」となっている。 Similarly, the character string extraction process 710n extracts a related term from the text data C by a predetermined method and stores it in the related term dictionary C. In this embodiment, four related terms are extracted, and each is arranged in order of rank and stored as one record in the related term dictionary C. The ranking of related terms can be determined based on, for example, the appearance frequency. The extracted four related terms (related terms 1 to 4) are “AI”, “robot”, “voice recognition”, and “agent” in order of rank.

次に、関連詞辞書Ａ、関連詞辞書Ｂ、関連詞辞書Ｃに対して、辞書比較処理７２０ｋが行われる。この実施例では、辞書比較処理７２０ｋは、３つの辞書に共通する関連詞（共通関連詞）を検出し、これらを比較結果データ５４に記憶するものとする。 Next, a dictionary comparison process 720k is performed on the related term dictionary A, the related term dictionary B, and the related term dictionary C. In this embodiment, the dictionary comparison process 720k detects related terms common to the three dictionaries (common related terms) and stores them in the comparison result data 54.

Comparing related term dictionaries A, B, and C, each of which stores the related terms listed above, shows that "voice recognition" is present in all three as a common related term. This "voice recognition" is therefore stored in the comparison result data 54.
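Detecting the common related term amounts to intersecting the three dictionaries, as the following minimal sketch shows. The dictionary contents are the related terms listed for dictionaries A, B, and C, written with English glosses; the function name `common_terms` is hypothetical.

```python
def common_terms(*dictionaries):
    """Return the related terms present in every dictionary,
    in the rank order of the first dictionary."""
    first, *rest = dictionaries
    return [term for term in first if all(term in d for d in rest)]


dict_a = ["Company A", "voice", "voice recognition", "sales"]
dict_b = ["voice", "research and development",
          "Company B's performance", "voice recognition"]
dict_c = ["AI", "robot", "voice recognition", "agent"]

print(common_terms(dict_a, dict_b, dict_c))
print(common_terms(dict_a, dict_b))
```

Note that "voice" appears in dictionaries A and B only, so it is common to that pair but not to all three dictionaries.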

Identifying such common related terms makes it possible to analyze inter-company relationships effectively. For example, the related terms frequently used in descriptions of Company A's technology are extracted by character string extraction processing 710k based on text data A, the related terms frequently used in descriptions of Company B's technology are extracted by character string extraction processing 710m based on text data B, and the related terms frequently used in descriptions of AI (artificial intelligence) related technology are extracted by character string extraction processing 710n based on text data C. This makes it possible to grasp objectively which AI-related technologies Company A and Company B may have in common.

In the embodiment shown in FIG. 47, related term dictionaries based on text data collected at the same time (T = t1) from WEB pages on different subjects are compared, and each WEB page was collected by a deliberate WEB search. However, it is also conceivable that comparing related term dictionaries obtained from WEB pages gathered entirely at random reveals a common related term by chance.

次に、図４８を参照して、情報更新部４６における文字列抽出処理７１０と辞書比較処理７２０の他の実施例を説明する。図４８は、図４５に示した文字列抽出処理７１０と辞書比較処理７２０の変形例を示すものである。図４８には、２つのテキストデータ（テキストデータ１、テキストデータ２）に対してそれぞれ文字列抽出処理７１０が行われ、対応する関連詞辞書（ｉ−１）、関連詞辞書（ｉ）に対して、辞書比較処理７２０が行われるところを示している。図４５に示す、テキストデータ３に関する処理については表示を省略した。 Next, another embodiment of the character string extraction process 710 and the dictionary comparison process 720 in the information update unit 46 will be described with reference to FIG. FIG. 48 shows a modification of the character string extraction process 710 and the dictionary comparison process 720 shown in FIG. In FIG. 48, character string extraction processing 710 is performed on each of two text data (text data 1 and text data 2), and the corresponding related term dictionary (i-1) and related term dictionary (i) are processed. Thus, the dictionary comparison process 720 is performed. The display related to the text data 3 shown in FIG. 45 is omitted.

In this embodiment, the two text data sets are collected from the same web pages related to the common subject "stock trading". For example, the search keyword "stock trading" is entered in a web search, and the three web pages obtained as a result are treated as one text data set, while related terms are extracted per web page. Related terms are thereby managed per web page; alternatively, three text data items based on the three web pages may be prepared and related terms extracted from each of them. What matters here is that the related terms are extracted from a plurality of text data items.

In FIG. 48, text data 1 includes, at time (T = t1), text data 1-1 obtained from the first web page, text data 1-2 obtained from the second web page, and text data 1-3 obtained from the third web page. Similarly, text data 2 includes, at time (T = t2), text data 2-1 obtained from the first web page, text data 2-2 obtained from the second web page, and text data 2-3 obtained from the third web page. Here, the URL of the first web page is the same throughout, as are the URL of the second web page and the URL of the third web page.

At time (T = t1), the character string extraction process 710 extracts related terms from text data 1 by a predetermined method. As described above, this is done for each text data item included in text data 1; for example, related terms are extracted by a predetermined method from text data 1-1, from text data 1-2, and from text data 1-3. In this embodiment, four related terms are extracted from each item and arranged in rank order. The rank of a related term can be determined based on, for example, its appearance frequency. For text data 1-1, the four extracted related terms (related term 1 to related term 4) are, in rank order, "Company ○", "account", "application procedure", and "Bank △". For text data 1-2, they are "account", "Bank △", "the application is", and "Company ◇". For text data 1-3, they are "stock purchase", "Bank △", "limit price", and "Company ○".
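The ranking step can be illustrated with a minimal sketch. The extraction method itself is described only as "a predetermined method", so the sketch assumes that candidate terms have already been found in one text data item and simply ranks them by appearance frequency. The function name `rank_related_terms`, the input format, and the occurrence counts are illustrative assumptions and do not come from the embodiment.

```python
from collections import Counter

def rank_related_terms(occurrences, top_n=4):
    """Return the top_n terms ordered by descending appearance frequency.

    `occurrences` holds one entry per appearance of a candidate term in a
    single text data item (a hypothetical input format).
    """
    counts = Counter(occurrences)
    # Sort by count (descending); break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]

# Hypothetical occurrences for text data 1-1 (the counts are made up).
occurrences_1_1 = (
    ["Company ○"] * 5
    + ["account"] * 4
    + ["application procedure"] * 3
    + ["Bank △"] * 2
    + ["fee"] * 1
)
ranked_1_1 = rank_related_terms(occurrences_1_1)
print(ranked_1_1)
```

With these assumed counts, the four highest-ranked terms come out in the order given for text data 1-1, and the fifth candidate is dropped.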

Next, the character string extraction process 710 obtains the neighborhood related terms of each related term extracted in this way and stores them in the related term dictionary (i-1). A neighborhood related term of a given related term is a related term that appears together (co-occurs) with it. For each text data item (1-1, 1-2, 1-3), the set of related terms contained in the topic corresponding to that item is a related term set. For a given related term, a related term set that contains it is called a neighborhood of the related term, and the collection of its neighborhoods is called the neighborhood system of the related term. The related term dictionary (i-1) stores the neighborhood system of each related term.

For example, the related term "Company ○" is extracted from text data 1-1, where its neighborhood is {Company ○, account, application procedure, Bank △}. It is also extracted from text data 1-3, where its neighborhood is {stock purchase, Bank △, limit price, Company ○}. The neighborhood system of "Company ○" is therefore {Company ○, account, stock purchase, Bank △, application procedure, limit price} (the related term "Bank △", which appears in both the neighborhood for text data 1-1 and the neighborhood for text data 1-3, is included only once).

The neighborhood systems obtained in this way are stored in the related term dictionary (i-1) for each of the related terms "Company ○", "account", "Bank △", "application procedure", "stock purchase", "the application is", "limit price", and "Company ◇". For each related term, its neighborhood system (neighborhood related terms 1 to 7) is stored, and the order of these terms is determined in consideration of the ranking made by the character string extraction process 710, the degree of co-occurrence, and the like.
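The construction of neighborhoods and neighborhood systems described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_related_term_dictionary` and the dictionary layout are hypothetical, and the neighbors are kept in first-seen order, whereas the embodiment orders them by rank and degree of co-occurrence in a way that is not specified here.

```python
def build_related_term_dictionary(pages):
    """Map each related term to its neighborhood system.

    `pages` maps a text data label to the ranked related terms extracted
    from it. The neighborhood of a term within one page is that page's
    term set; the neighborhood system is the union over every page that
    contains the term, with duplicates kept only once.
    """
    dictionary = {}
    for terms in pages.values():
        for term in terms:
            system = dictionary.setdefault(term, [])
            for neighbor in terms:
                if neighbor not in system:
                    system.append(neighbor)
    return dictionary

# Related terms listed for text data 1-1 to 1-3 at time T = t1.
pages_t1 = {
    "1-1": ["Company ○", "account", "application procedure", "Bank △"],
    "1-2": ["account", "Bank △", "the application is", "Company ◇"],
    "1-3": ["stock purchase", "Bank △", "limit price", "Company ○"],
}
dictionary_t1 = build_related_term_dictionary(pages_t1)
print(dictionary_t1["Company ○"])
```

In this sketch each stored list includes the term itself, so the number of neighborhood related terms for an entry is the list length minus one.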

Similarly, at time (T = t2), the character string extraction process 710 extracts related terms from text data 2 by a predetermined method. As described above, this is done for each text data item included in text data 2; for example, related terms are extracted by a predetermined method from text data 2-1, from text data 2-2, and from text data 2-3. In this embodiment, four related terms are extracted from each item and arranged in rank order. The rank of a related term can be determined based on, for example, its appearance frequency. For text data 2-1, the four extracted related terms (related term 1 to related term 4) are, in rank order, "Company ○", "account", "new system", and "application procedure". For text data 2-2, they are "account", "Bank △", "Company ◇", and "stock purchase". For text data 2-3, they are "stock purchase", "Bank △", "Company ○", and "new system".

Next, the character string extraction process 710 obtains the neighborhood related terms of each related term extracted in this way and stores them in the related term dictionary (i). For example, the related term "Company ○" is extracted from text data 2-1, where its neighborhood is {Company ○, account, new system, application procedure}. It is also extracted from text data 2-3, where its neighborhood is {stock purchase, Bank △, Company ○, new system}. The neighborhood system of "Company ○" is therefore {Company ○, account, stock purchase, new system, application procedure, Bank △} (the related term "new system", which appears in both the neighborhood for text data 2-1 and the neighborhood for text data 2-3, is included only once).

The neighborhood systems obtained in this way are stored in the related term dictionary (i) for each of the related terms "Company ○", "account", "Bank △", "application procedure", "stock purchase", "new system", and "Company ◇". For each related term, its neighborhood system (neighborhood related terms 1 to 6) is stored, and the order of these terms is determined in consideration of the ranking made by the character string extraction process 710, the degree of co-occurrence, and the like.

Next, the dictionary comparison process 720 compares the related term dictionary (i-1) with the related term dictionary (i). As a result, the related terms "limit price" and "the application is" are identified as disappeared related terms that are no longer present at time (T = t2) (see reference numeral 753 in FIG. 48), and the related term "new system" is identified as a newly arrived related term that first appears at time (T = t2) (see reference numeral 752 in FIG. 48). These related terms are stored in the comparison result data 54 as the difference between the related term dictionary (i-1) and the related term dictionary (i).

Furthermore, the related terms "Company ○", "account", "Bank △", "application procedure", "stock purchase", and "Company ◇" are present at both time (T = t1) and time (T = t2), and the dictionary comparison process 720 also compares the neighborhood related terms of each of them. This shows that the neighborhood related terms (or their order) have changed (see reference numeral 751 in FIG. 48), which indicates that the relationship or position of the related term within the topic has changed. When the information search system 100 according to the present invention is to visualize such changes in neighborhood related terms as well, this information is also stored in the comparison result data 54.
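The comparison of the two dictionaries can be sketched as a set difference over their keys plus a per-term comparison of neighborhoods. The sketch below is only an illustration: the function names and the result layout are assumptions, and it compares neighborhood membership only, not the order of the neighborhood related terms.

```python
def neighborhood_systems(pages):
    """Map each related term to the union of the page term sets containing it."""
    systems = {}
    for terms in pages:
        for term in terms:
            systems.setdefault(term, set()).update(terms)
    return systems

def compare_dictionaries(old, new):
    """Return newly arrived, disappeared, and neighborhood-changed terms."""
    shared = old.keys() & new.keys()
    return {
        "new": sorted(new.keys() - old.keys()),
        "disappeared": sorted(old.keys() - new.keys()),
        "changed": sorted(t for t in shared if old[t] != new[t]),
    }

# Related terms listed for time T = t1 and time T = t2.
pages_t1 = [
    ["Company ○", "account", "application procedure", "Bank △"],
    ["account", "Bank △", "the application is", "Company ◇"],
    ["stock purchase", "Bank △", "limit price", "Company ○"],
]
pages_t2 = [
    ["Company ○", "account", "new system", "application procedure"],
    ["account", "Bank △", "Company ◇", "stock purchase"],
    ["stock purchase", "Bank △", "Company ○", "new system"],
]
diff = compare_dictionaries(
    neighborhood_systems(pages_t1), neighborhood_systems(pages_t2)
)
print(diff["new"], diff["disappeared"], diff["changed"])
```

A structure like `diff` is one possible shape for what the text calls comparison result data 54; the actual storage format is not described in this passage.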

Next, another embodiment of the character string extraction process 710 and the dictionary comparison process 720 in the information update unit 46 will be described with reference to FIG. 49. FIG. 49 shows a modification of the character string extraction process 710 and the dictionary comparison process 720 shown in FIG. 47. In FIG. 49, the character string extraction process 710 is performed on each of three text data sets (text data A to C), and the dictionary comparison process 720 is performed on the corresponding related term dictionaries A, B, and C.

In this embodiment, the three text data sets are collected, at the same time (T = t1), from the same web pages related to different subjects. That is, text data A is based on web pages on the subject "Company A's technology", text data B is based on web pages on the subject "Company B's technology", and text data C is based on web pages on the subject "AI (artificial intelligence) related technology".

For example, for text data A, the search keyword "Company A's technology" is entered in a web search, two text data items (text data A-1 and text data A-2) are obtained from the two resulting web pages, and these are handled individually in the character string extraction process 710. Similarly, for text data B, the search keyword "Company B's technology" is entered in a web search, two text data items (text data B-1 and text data B-2) are obtained from the two resulting web pages, and these are handled individually in the character string extraction process 710. Likewise, for text data C, the search keyword "AI (artificial intelligence) related technology" is entered in a web search, two text data items (text data C-1 and text data C-2) are obtained from the two resulting web pages, and these are handled individually in the character string extraction process 710. In FIG. 48, text data A, text data B, and text data C each contained three text data items, whereas in this embodiment each contains two.

At time (T = t1), the character string extraction process 710 extracts related terms from text data A by a predetermined method. As described above, this is done for each text data item included in text data A; for example, related terms are extracted by a predetermined method from text data A-1 and from text data A-2. In this embodiment, four related terms are extracted from each item and arranged in rank order. The rank of a related term can be determined based on, for example, its appearance frequency. For text data A-1, the four extracted related terms (related term 1 to related term 4) are, in rank order, "Company A", "speech", "speech recognition", and "robot". For text data A-2, they are "compression technology", "speech recognition", "sales", and "speech".

Next, the character string extraction process 710 obtains the neighborhood related terms of each related term extracted in this way and stores them in the related term dictionary A. A neighborhood related term of a given related term is a related term that appears together (co-occurs) with it. For each text data item (A-1, A-2), the set of related terms contained in the topic corresponding to that item is a related term set. For a given related term, a related term set that contains it is called a neighborhood of the related term, and the collection of its neighborhoods is called the neighborhood system of the related term. The related term dictionary A stores the neighborhood system of each related term.

For example, the related term "speech recognition" is extracted from text data A-1, where its neighborhood is {Company A, speech, speech recognition, robot}. It is also extracted from text data A-2, where its neighborhood is {compression technology, speech recognition, sales, speech}. The neighborhood system of "speech recognition" is therefore {speech recognition, Company A, compression technology, speech, robot, sales} (the related term "speech", which appears in both the neighborhood for text data A-1 and the neighborhood for text data A-2, is included only once).

The neighborhood systems obtained in this way are stored in the related term dictionary A for each of the related terms "Company A", "speech recognition", "speech", "compression technology", "sales", and "robot". For each related term, its neighborhood system (neighborhood related terms 1 to 5) is stored, and the order of these terms is determined in consideration of the ranking made by the character string extraction process 710, the degree of co-occurrence, and the like.

Further, at time (T = t1), the character string extraction process 710 extracts related terms from text data B by a predetermined method. As described above, this is done for each text data item included in text data B; for example, related terms are extracted by a predetermined method from text data B-1 and from text data B-2. In this embodiment, four related terms are extracted from each item and arranged in rank order. The rank of a related term can be determined based on, for example, its appearance frequency. For text data B-1, the four extracted related terms (related term 1 to related term 4) are, in rank order, "speech", "Company B's performance", "speech recognition", and "research and development". For text data B-2, they are "research and development", "speech", "speech recognition", and "authentication technology".

Next, the character string extraction process 710 obtains the neighborhood related terms of each related term extracted in this way and stores them in the related term dictionary B. For example, the related term "speech" is extracted from text data B-1, where its neighborhood is {speech, Company B's performance, speech recognition, research and development}. It is also extracted from text data B-2, where its neighborhood is {research and development, speech, speech recognition, authentication technology}. The neighborhood system of "speech" is therefore {speech, research and development, Company B's performance, speech recognition, authentication technology} (the related terms "speech recognition" and "research and development", which appear in both the neighborhood for text data B-1 and the neighborhood for text data B-2, are each included only once).

The neighborhood systems obtained in this way are stored in the related term dictionary B for each of the related terms "speech", "research and development", "Company B's performance", "speech recognition", and "authentication technology". For each related term, its neighborhood system (neighborhood related terms 1 to 4) is stored, and the order of these terms is determined in consideration of the ranking made by the character string extraction process 710, the degree of co-occurrence, and the like.

Further, at time (T = t1), the character string extraction process 710 extracts related terms from text data C by a predetermined method. As described above, this is done for each text data item included in text data C; for example, related terms are extracted by a predetermined method from text data C-1 and from text data C-2. In this embodiment, four related terms are extracted from each item and arranged in rank order. The rank of a related term can be determined based on, for example, its appearance frequency. For text data C-1, the four extracted related terms (related term 1 to related term 4) are, in rank order, "AI", "agent", "robot", and "speech recognition". For text data C-2, they are "robot", "speech recognition", "AI", and "learning function".

Next, the character string extraction process 710 obtains the neighborhood related terms of each related term extracted in this way and stores them in the related term dictionary C. For example, the related term "AI" is extracted from text data C-1, where its neighborhood is {AI, agent, robot, speech recognition}. It is also extracted from text data C-2, where its neighborhood is {robot, speech recognition, AI, learning function}. The neighborhood system of "AI" is therefore {AI, robot, agent, speech recognition, learning function} (the related terms "robot" and "speech recognition", which appear in both the neighborhood for text data C-1 and the neighborhood for text data C-2, are each included only once).

The neighborhood systems obtained in this way are stored in the related term dictionary C for each of the related terms "AI", "robot", "speech recognition", "agent", and "learning function". For each related term, its neighborhood system (neighborhood related terms 1 to 4) is stored, and the order of these terms is determined in consideration of the ranking made by the character string extraction process 710, the degree of co-occurrence, and the like.

Next, the dictionary comparison process 720 compares the related term dictionaries A to C. As a result, the related term "speech recognition" is identified as a related term common to the three related term dictionaries (a common related term) at time (T = t1) (see reference numerals 755, 755', and 755'' in FIG. 49), and this is stored in the comparison result data 54. Grasping such common related terms makes it possible to discover common topics among different subjects, and choosing subjects as in this embodiment also leads to analysis of inter-company relationship information.

The dictionary comparison process 720 also compares, for this common related term, the neighborhood related terms it has in each dictionary. This makes it possible to grasp, for example, which neighborhood related terms are shared and whether their order is shared, and thereby to determine the level of commonality between the instances of the common related term.
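The text does not say how the level of commonality is computed. As one possible illustration only, the sketch below scores each pair of dictionaries by the Jaccard similarity of the neighborhood related terms of "speech recognition", excluding the term itself. The choice of Jaccard similarity, the function name `jaccard`, and the variable names are assumptions introduced here, and the sketch ignores the order of the neighborhood related terms.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (1.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Neighborhood related terms of "speech recognition" in each dictionary,
# derived from the related terms listed for text data A, B, and C.
neighbors = {
    "A": {"Company A", "speech", "robot", "compression technology", "sales"},
    "B": {"speech", "Company B's performance", "research and development",
          "authentication technology"},
    "C": {"AI", "agent", "robot", "learning function"},
}

commonality = {
    (x, y): jaccard(neighbors[x], neighbors[y])
    for x, y in combinations(sorted(neighbors), 2)
}
for pair, score in commonality.items():
    print(pair, score)
```

Under this assumed measure, a higher score means the common related term is surrounded by more of the same terms in both subjects; an order-aware measure would be needed to capture the order comparison the text also mentions.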

Note that "speech" can be identified as a related term common to related term dictionary A and related term dictionary B (see reference numerals 756 and 756' in FIG. 49). Because a related term common to only some of the related term dictionaries can also be important information, it can be stored in the comparison result data 54.
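Finding related terms common to all dictionaries, and those common to only some of them, can be sketched with plain set operations. The variable names and the layout of the result are illustrative assumptions; only the related terms themselves are taken from the description of text data A, B, and C.

```python
from itertools import combinations

# Related terms stored in each dictionary at time T = t1.
dictionaries = {
    "A": {"Company A", "speech", "speech recognition", "robot",
          "compression technology", "sales"},
    "B": {"speech", "Company B's performance", "speech recognition",
          "research and development", "authentication technology"},
    "C": {"AI", "agent", "robot", "speech recognition", "learning function"},
}

# Terms present in every dictionary (common related terms).
common_to_all = set.intersection(*dictionaries.values())

# Terms shared by exactly one pair of dictionaries but not by all three.
partially_common = {
    (x, y): (dictionaries[x] & dictionaries[y]) - common_to_all
    for x, y in combinations(sorted(dictionaries), 2)
}

print(common_to_all)
print(partially_common)
```

With the terms listed in this embodiment, the same comparison also reports "robot" as shared by dictionaries A and C only, which is the kind of partial commonality that may be worth recording.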

<<< Context Learning Function and Service ID Switching >>>
In the information search system 100 according to an embodiment of the present invention, in the dialogue between the conversation control terminal device 2'' and the user of the device, the input index, the progress index, and the like are learned as described above, and a dialogue that reflects them can be realized. This function is the context learning function. The input index is information indicating what the user has input so far, that is, the user's input history. The progress index is information indicating what topics have been provided to the user so far, that is, the history of topics provided to the user.

Furthermore, in the information search system 100, the information update unit 46 of the topic providing server 4' tracks the history of related term appearances and the like, which realizes a related term learning function. This function can clarify the topic name to which a related term belongs, can treat usual related terms separately from newly arrived (newly appearing) related terms, and can judge the similarity or difference between topics by comparing related term dictionaries. With this related term learning function, the user of the conversation control terminal device 2'' can handle input types relating to many topic names, and the means of identifying the user's input can be diversified.

The scenario data performs control so that the service ID is switched based on the input type determined from the user's input at the conversation control terminal device 2''. For example, the scenario data automatically switches between two service IDs. One corresponds to the function provided by the sentence analysis unit 43, the preference analysis unit 44, and the topic analysis unit 45 of the topic providing server 4' in the information search system 100, which provides the related term dictionary 50, the related term and co-occurrence word list display screen 650, and the like. The other corresponds to the function provided by the information update unit 46 of the topic providing server 4', which provides display of the related term dictionary 50, the comparison result data 54, and the like.

The corresponding scenario data statement is set, for example, so that, as the action taken when a predetermined input type is entered, a transition is made to the service with the corresponding service ID. Expressed in scenario data of the kind shown in FIG. 14, this is the following statement.
<sto:$IDN$:<sta:$num$>:$input$>
Here, "sto" is the description that causes a state transition (shift to), "$IDN$" is the identification number of the destination service, "<sta:$num$>" is the state number in the destination service, and "<$input$>" is the user's input sentence.
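To make the structure of the statement concrete, the following sketch parses one filled-in instance of it with a regular expression. The full scenario data grammar of FIG. 14 is not reproduced in this passage, so the pattern, the function name `parse_sto`, and the sample values ("S02", state 1, and the input sentence) are hypothetical and only illustrate the three placeholders described above.

```python
import re

# Assumed shape: <sto:IDN:<sta:num>:input>, with the placeholders replaced
# by concrete values. This is not the actual scenario data grammar.
STO_PATTERN = re.compile(
    r"^<sto:(?P<idn>[^:<>]+):<sta:(?P<num>[^:<>]+)>:(?P<input>.*)>$"
)

def parse_sto(statement):
    """Split a state-transition statement into its three fields.

    Returns a dict with keys "idn", "num", and "input", or None when the
    statement does not have the assumed shape.
    """
    match = STO_PATTERN.match(statement)
    if match is None:
        return None
    return match.groupdict()

# Hypothetical statement: switch to service "S02", state 1, carrying the
# user's input sentence along.
parsed = parse_sto("<sto:S02:<sta:1>:show me what changed>")
print(parsed)
```

A dialogue controller could then look up the service registered under the parsed identification number and resume it in the given state, although how the embodiment dispatches services is not described here.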

<<< Description of the Hardware Configuration of the Topic Providing Server According to an Embodiment of the Present Invention >>>
Next, an example of the hardware configuration of a computer constituting the topic providing server 4' according to an embodiment of the present invention will be described with reference to FIG. 50. Note that the configuration of the topic providing server 4' shown in FIG. 50 is merely a representative example.

The topic providing server 4' includes a CPU (Central Processing Unit) 801, a RAM (Random Access Memory) 802, a ROM (Read Only Memory) 803, a network interface 804, an audio control unit 805, a microphone 806, a speaker 807, a display controller 808, a display 809, an input device interface 810, a keyboard 811, a mouse 812, an external storage device 813, an external recording medium interface 814, and a bus 815 that connects these components to one another.

The CPU 801 controls the operation of each component of the topic providing server 4' and, under the control of the OS, controls the execution of processing in the sentence analysis unit 43, the preference analysis unit 44, the topic analysis unit 45, the information update unit 46, and the like according to the present invention.

The RAM 802 temporarily stores the programs for the processes executed by the CPU 801 and the data those programs use while running. As described above, the related term dictionary 50, the comparison result data 54, and the like can also be stored there. The ROM 803 stores, among other things, the program executed when the topic providing server 4' starts up.

The network interface 804 is an interface for connecting to the network 900. The network 900 is, for example, a network linking the server with the conversation control terminal device 2'' shown in FIG. 20 or with the computer on which the crawler 730 runs, or a network such as the Internet.

The audio control unit 805 controls the microphone 806 and the speaker 807 to handle audio input and output. The display controller 808 is a dedicated controller that carries out the drawing commands issued by the CPU 801. The display 809 is a display device such as an LCD (Liquid Crystal Display) or a CRT (Cathode Ray Tube).

The input device interface 810 receives signals input from the keyboard 811 or the mouse 812 and sends a predetermined command to the CPU 801 according to the signal pattern.

The external storage device 813 is a storage device such as a hard disk or a semiconductor memory. The programs and data described above are recorded on it and are loaded from it into the RAM 802 as needed at run time. As described above, the related term dictionary 50, the comparison result data 54, and the like can also be stored on it.

The external recording medium interface 814 accesses the external recording medium 910 and reads the data recorded on it. The external recording medium 910 is, for example, a portable flash memory, a CD (Compact Disc), or a DVD (Digital Versatile Disc). The program that is executed by the CPU 801 and implements the functions of the present invention can be supplied from the external recording medium 910 through the external recording medium interface 814. As another form of distribution, the program may be delivered from a predetermined server on a network and stored in the external storage device 813 or the RAM 802 by way of the network 900 and the network interface 804.

One example of the hardware configuration of the topic providing server 4' according to one embodiment of the present invention has been described above. The hardware configuration of the conversation control terminal device 2'' included in the information search system 100 of the present invention, and of the computer on which the crawler 730 runs, is basically the same as the configuration shown in FIG. 50. Note, however, that for the topic providing server 4' and the computer on which the crawler 730 runs, the audio control unit 805, the microphone 806, the speaker 807, the display controller 808, the display 809, the input device interface 810, the keyboard 811, and the mouse 812 are not essential components.

The information search system 100 described so far displays information such as related terms on the display of the conversation control terminal device 2'' by having the Topiclet 20 running on the conversation control terminal device 2'' and the topic providing server 4' exchange data under the control of the scenario data 28 (or the scenario data 55). The system can also be configured as a so-called cloud computing system in which the Topiclet 20 is, for example, downloaded to the conversation control terminal device 2'' and started at a predetermined timing, and then communicates with the topic providing server 4' over a network such as the Internet.

However, the information search system 100 according to the present invention can effectively realize the technical idea of the present invention through various other configurations and methods. For example, the functions of the topic providing server 4' described above can be implemented by a web server, an ASP (Active Server Pages) server, or the like, and a general-purpose web browser running on the conversation control terminal device 2'' can, without being controlled by scenario data (or under the control of scenario data), display information such as related terms on the display of the conversation control terminal device 2'' and switch between topics, related term dictionaries, and comparison result data in response to user instructions. In this case, the web server, ASP server, or the like that functions as the topic providing server 4' edits and generates the data (for example, HTML data) needed to render the screen on the display of the conversation control terminal device 2''.
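The server-side generation of display data can be pictured with the following minimal Python sketch, which builds an HTML fragment listing the related terms for one topic. The function name `render_related_terms`, the markup layout, and the sample values are assumptions made for illustration; this specification does not prescribe any particular HTML structure.

```python
from html import escape


def render_related_terms(topic: str, related_terms: list[str]) -> str:
    """Build a small HTML fragment listing the related terms for one topic.

    The heading-plus-list layout is an assumed example, not a format
    defined in this specification.
    """
    # Escape every value so that stored terms cannot inject markup.
    items = "\n".join(f"  <li>{escape(term)}</li>" for term in related_terms)
    return f"<h2>{escape(topic)}</h2>\n<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    # Sample topic and terms are invented for illustration.
    print(render_related_terms("soccer", ["stadium", "goal", "transfer & loan"]))
```

In such an arrangement the browser only renders what it receives, so switching the topic or the dictionary amounts to requesting a freshly generated fragment from the server.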

<<< Other System Configuration of the Topic Providing System >>>
Next, an outline of another system configuration of the topic providing system is described with reference to FIG. 51.

The topic providing system 1' shown in FIG. 51 has a conversation control terminal device 1002 (Topiclet 1020), a topic providing server 1004 (iWA 1030), a maintenance device 1003 (iWA Manager 1040), and a topic analysis device 1005. In the topic providing system 1 shown in FIG. 4, the results of the topic analysis processing are shown as being provided (to the scenario data editing unit 410) within the maintenance device 3 (iWA Manager 40). In the topic providing system 1' shown in FIG. 51, by contrast, the topic analysis processing itself is shown as being executed by the topic analysis device 1005, a device separate from the maintenance device 1003 (iWA Manager 1040). The topic providing system can also be realized with this system configuration.

The topic providing system 1 and the topic providing system 1' are otherwise the same in configuration, so a detailed description is omitted. The conversation control terminal device 1002 includes a reception unit 1240 and a transmission unit 1230. The reception unit 1240 corresponds to the reception unit 240 of the conversation control terminal device 2, and the transmission unit 1230 corresponds to the transmission unit 230 of the conversation control terminal device 2. The conversation control terminal device 1002 is basically the same as the conversation control terminal device 2 of FIG. 4, and its other components are not shown. The topic providing server 1004 includes an input information analysis unit 1310 and a scenario data storage unit 1320. The maintenance device 1003 includes a scenario data transmission unit 1430, a scenario data editing unit 1410, and a terminal device virtual construction unit 1420. The topic analysis device 1005 includes a topic analysis unit 1510.

In the topic providing system 1' shown in FIG. 51, the topic analysis unit 1510 of the topic analysis device 1005 is connected to the topic providing server 1004 over a network and provides the scenario data and other data obtained by the topic analysis processing to the input information analysis unit 1310 of the topic providing server 1004. The topic analysis unit 1510 is also connected to the maintenance device 1003 over a network (or indirectly, by way of the topic providing server 1004) and provides the scenario data and other data obtained by the topic analysis processing to the scenario data editing unit 1410 of the maintenance device 1003.

The topic analysis unit 1510 of the topic analysis device 1005 generates a topic list and edits and verifies scenario data based on that list. The topic list is data that records how close topics are to one another and how they are connected, by way of the related terms that link them. The topic analysis unit 1510 accumulates the related terms associated with each topic in the topic list. The topic list provided to the maintenance device 1003 is data provided to the contractors of the topic providing system 1' and is used, for example, when generating scenario data.
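One way to picture such a topic list is sketched below in Python. Each topic accumulates related terms, the terms shared by two topics stand for how the topics are connected, and a Jaccard ratio over the two term sets stands in for their closeness. The class name `TopicList`, its methods, and the choice of the Jaccard ratio are assumptions made for illustration; this specification does not define a concrete data layout or closeness measure.

```python
from collections import defaultdict


class TopicList:
    """Topics keyed by name, each with the related terms accumulated so far."""

    def __init__(self) -> None:
        self._terms: dict[str, set[str]] = defaultdict(set)

    def add(self, topic: str, related_term: str) -> None:
        """Accumulate one related term under a topic."""
        self._terms[topic].add(related_term)

    def connection(self, a: str, b: str) -> set[str]:
        """Related terms shared by both topics, i.e. how they are linked."""
        return self._terms.get(a, set()) & self._terms.get(b, set())

    def closeness(self, a: str, b: str) -> float:
        """Jaccard ratio of the two term sets (an assumed closeness measure)."""
        union = self._terms.get(a, set()) | self._terms.get(b, set())
        return len(self.connection(a, b)) / len(union) if union else 0.0


if __name__ == "__main__":
    # Sample topics and terms are invented for illustration.
    topics = TopicList()
    for topic, term in [("soccer", "stadium"), ("soccer", "goal"),
                        ("baseball", "stadium"), ("baseball", "pitcher")]:
        topics.add(topic, term)
    print(topics.connection("soccer", "baseball"))
    print(topics.closeness("soccer", "baseball"))
```

Keeping the shared terms separate from the numeric score lets an editor of scenario data see both how strongly two topics are related and through which terms.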

The topic analysis unit 1510 of the topic analysis device 1005 may also be configured to perform the processing of the sentence analysis unit 43 of the topic providing server 4' shown in FIG. 20, or to perform the processing of the input information analysis unit 41.

Furthermore, a plurality of topic analysis units 1510 of the topic analysis device 1005 can be deployed, one associated with each contractor of the topic providing system 1'. In that case, the information acquired by the individual topic analysis devices 1005 can be used as is (or after being organized and consolidated) to provide data to the corresponding maintenance device 1003 and topic providing server 1004. The timing at which the topic analysis unit 1510 provides data, and the content of that data, may differ by destination (that is, between the maintenance device 1003 and the topic providing server 1004).

A contractor (or a person in charge belonging to the contractor) can use the maintenance device 1003 to build topic lists and to edit and create scenario data from the data provided by the topic analysis device 1005. With a configuration such as the topic providing system 1' shown in FIG. 51, the functions of the maintenance device 1003 can also be delivered through mechanisms such as PaaS (Platform as a Service) or SaaS (Software as a Service), which can be used remotely as services on the Internet.

The information search system 100 according to one embodiment of the present invention has been described above with reference to several examples in which the present invention can be implemented. These examples are merely illustrations for explaining the present invention, and the scope of rights of the present invention is not limited to them. The technical idea of the present invention can be realized by various methods and configurations other than these examples.

DESCRIPTION OF SYMBOLS
1, 1' Topic providing system
2, 1002 Conversation control terminal device (Topiclet 20, Topiclet 1020)
2', 2'' Conversation control terminal device
3, 1003 Maintenance device (iWA Manager 40, iWA Manager 1040)
4, 4', 1004 Topic providing server (iWA 30, iWA 1030)
10 Topic storage device
21 Input control unit
22 Search control unit
23 Transmission control unit
24 Reception control unit
25 Response information determination unit
26 Output control unit
41 Input information analysis unit
42 External log acquisition control unit
43 Sentence analysis unit
44 Preference analysis unit
45 Topic analysis unit
46 Information update unit
100 Information search system
1005 Topic analysis device

Claims

1. An information search system comprising:
sentence information acquisition means for acquiring sentence information related to a keyword from text data collected by a keyword-based search;
character string selection means for selecting, from each piece of the sentence information, one or more character strings satisfying a predetermined condition, and storing the character strings in character string storage means for each corresponding piece of sentence information; and
information output means for outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.

2. The information search system according to claim 1, wherein the character string selection means selects the character strings without collating them against character string data stored in advance.

3. The information search system according to claim 2, wherein the character string selection means further includes:
character string search means for searching the text data for identical character strings;
difference degree determination means for determining, for the identical character strings, a degree of difference among the preceding adjacent characters and a degree of difference among the following adjacent characters; and
specific character string determination means for determining, based on the degree of difference among the preceding adjacent characters and the degree of difference among the following adjacent characters, whether the identical character string is a specific character string,
and wherein the character string selection means selects the character strings from among the determined specific character strings.

4. The information search system according to claim 1, further comprising:
scenario data storage means for storing scenario data that defines response information on a topic;
response information determination means for determining, based on the scenario data, the response information including the selected character strings; and
response information output means for outputting the response information determined by the response information determination means.

5. The information search system according to claim 1, further comprising dictionary comparison means, wherein:
the character string selection means, when storing the character strings in the character string storage means, stores them in respective dictionaries corresponding to the collection conditions of the text data;
the dictionary comparison means performs comparison processing that compares a plurality of the dictionaries and stores the comparison result in comparison result storage means;
the response information determination means determines, based on the scenario data, the response information including the comparison result;
the response information output means outputs the response information determined by the response information determination means; and
the dictionary comparison means further performs the comparison processing when at least one of the plurality of dictionaries is updated, and automatically updates the comparison result stored in the comparison result storage means.

6. The information search system according to claim 1, wherein, when one of the character strings corresponding to one piece of the sentence information is common with one of the character strings corresponding to another piece of the sentence information, the information output means outputs information for displaying the set of character strings corresponding to the one piece of sentence information in association with the set of character strings corresponding to the other piece of sentence information.

7. The information search system according to claim 1, wherein the information output means outputs information for displaying all of the sets of character strings corresponding to one or more predetermined pieces of the sentence information, and the display order of the character strings is determined according to the user's manner of use of the character strings.

8. An information search device comprising:
keyword input means for inputting a keyword as a collection condition for text data; and
information output means for, when one or more character strings satisfying a predetermined condition have been selected from each piece of sentence information related to the keyword and acquired from the text data collected based on the keyword, outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.

9. An information search method comprising:
a sentence information acquisition step of acquiring sentence information related to a keyword from text data collected by a keyword-based search;
a character string selection step of selecting, from each piece of the sentence information, one or more character strings satisfying a predetermined condition, and storing the character strings in character string storage means for each corresponding piece of sentence information; and
an information output step of outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.

10. A program for causing a computer to function as:
sentence information acquisition means for acquiring sentence information related to a keyword from text data collected by a keyword-based search;
character string selection means for selecting, from each piece of the sentence information, one or more character strings satisfying a predetermined condition, and storing the character strings in character string storage means for each corresponding piece of sentence information; and
information output means for outputting information for displaying the selected character strings to a user for each corresponding piece of sentence information.