JP2001147940A

JP2001147940A - Retrieval system and retrieval method

Info

Publication number: JP2001147940A
Application number: JP33197599A
Authority: JP
Inventors: Miyoshi Fukui; 美佳福井; Masaru Suzuki; 優鈴木; Hideki Tsutsui; 秀樹筒井; Toshihiko Manabe; 俊彦真鍋
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-11-22
Filing date: 1999-11-22
Publication date: 2001-05-29

Abstract

PROBLEM TO BE SOLVED: To provide a retrieval system in which data can be easily managed and reused. SOLUTION: Data stored in a data storage part 10 are grouped by a grouping part 30. Overlap information, that is, information evaluated as being included redundantly across plural pieces of data belonging to the same group, is extracted by an information extracting part 40. An information editing part 50 edits the extracted overlap information to prepare edited information, which is stored in the data storage part 10. A data retrieving part 20 can retrieve data stored in the data storage part 10. From the outside, only the edited information stored in the data storage part 10 can be retrieved.

Description

DETAILED DESCRIPTION OF THE INVENTION

TECHNICAL FIELD

[0001] The present invention relates to a search system and a search method for searching stored information.

DESCRIPTION OF THE RELATED ART

[0002] In recent years, products have become more diverse and more sophisticated, and their user base has broadened. As a result, help desk departments, which handle consultations from inside and outside a company, are receiving more and more inquiries about how to use products and about defects. Help desk work requires many highly specialized personnel, and its rising cost is regarded as a problem. Because the work is stressful, involving troubleshooting and the like, it has become difficult even to retain a large number of specialized staff. Support for help desk work is urgently needed, both to respond quickly and to reduce costs.

[0003] One solution to these problems is a help desk support system that electronically records past cases, such as inquiries and their answers, so that they can be reused. Staff who receive an inquiry can answer quickly by referring to past cases. Some systems can search past cases using natural language, so that even staff without a high level of expertise can handle inquiries to some extent. However, the accumulated past cases are records of situations that differ from customer to customer, and many of them cannot be reused as they are. Problems arise, such as many similar cases being included and some cases being outdated and useless, which makes it hard to find the desired information. Improving efficiency requires maintenance of the huge number of accumulated cases, but because this work demands expertise and is tedious, it is seldom done in practice.

[0004] Meanwhile, to reduce costs, attempts have been made to collect and organize answers to frequently asked questions and publish them as an FAQ on WWW pages or the like. This can be expected to reduce the number of inquiries made directly to the help desk, speeding up responses and cutting costs. In practice, however, creating and managing an FAQ raises the same problems as maintaining past cases and requires tedious work. Moreover, when information is published directly to customers, past inquiry cases cannot be published as they are. The cases to be included in the FAQ must be selected and recast as general situations, and clear, concise explanations must be written. This work demands expressive skill on top of a high level of expertise, so it is difficult to achieve cost reductions by publishing an FAQ. No effective method of supporting FAQ creation has been realized.

[0005] Of course, these problems arise not only in systems that handle information related to help desk work but equally in systems that handle other kinds of information.

PROBLEMS TO BE SOLVED BY THE INVENTION

[0006] As described above, in conventional search systems it is not easy to manage a huge amount of accumulated data for reuse, and it has been difficult to achieve cost reductions through reuse.

[0007] The present invention has been made in view of the above circumstances, and its object is to provide a search system and a search method that make it possible to facilitate the management, reuse, and the like of data.

MEANS FOR SOLVING THE PROBLEMS

[0008] A search system according to the present invention comprises: grouping means for grouping data (for example, text, voice, or image data) stored in a predetermined storage device; extraction means for extracting, from the data grouped by the grouping means, duplicate information that is included redundantly across a plurality of data belonging to the same group; and editing means for editing the duplicate information extracted by the extraction means.

[0009] Preferably, the editing means may include means for presenting the original data group from which the duplicate information to be edited was extracted.

[0010] Preferably, the grouping means may include: means for detecting, in data having a structure, core data (for example, a title) that serves as the core of a group, based on that structure; means for extracting a feature (for example, a word set) from the core data; means for performing grouping by placing data whose core data contains the feature in the same group; and means for assigning, as the label of the group, a predetermined range of the core data that contains the feature.

[0011] Preferably, each item of data stored in the predetermined storage device is given a unique identifier, and the grouping means may place a certain item of data and another item of data in the same group based on the identifier assigned to the other item, or information from which that identifier can be determined, that is written in the content of the certain item.

[0012] Preferably, the extraction means may include means for converting each item of data in the grouped data into a predefined form, and may extract, as the duplicate information, the form items and their contents that are duplicated in at least two items of data.

[0013] Preferably, the search system may further comprise search means that searches only the information that has been extracted by the extraction means and then edited by the editing means.

[0014] Preferably, the editing means may include instruction means for specifying whether the information extracted by the extraction means may be made public, and the search means may search only the information designated as publishable by the instruction means.

[0015] Preferably, the search system may further comprise supplementary information extraction means for extracting, as supplementary information, information other than the duplicate information that satisfies a predetermined criterion.

[0016] Preferably, the search system may further comprise supplementary information extraction means that converts each item of data in the grouped data into a predefined form and extracts, as supplementary information for the group, form items and their contents that, in at least one item of data, do not overlap with other data. The editing means may also be able to edit the supplementary information, and the search means may also be able to search only the supplementary information.

[0017] A search method according to the present invention groups data (for example, text, voice, or image data) stored in a predetermined storage device, extracts from the grouped data duplicate information that is evaluated as being included redundantly across a plurality of data belonging to the same group, and edits the extracted duplicate information.

[0018] The present invention relating to an apparatus also holds as an invention relating to a method, and the present invention relating to a method also holds as an invention relating to an apparatus.

[0019] The present invention relating to an apparatus or a method also holds as a computer-readable recording medium on which is recorded a program for causing a computer to execute the procedures corresponding to the invention (or for causing a computer to function as the means corresponding to the invention, or for causing a computer to realize the functions corresponding to the invention).

[0020] According to the present invention, accumulated data are grouped, duplicate information is extracted from the grouped data, and editing is performed based on the extracted duplicate information, so that it becomes easy to select useful information from a huge amount of data and edit it into a reusable form. In addition, only the edited information can be made the search target.

[0021] According to the present invention, the tedious work involved in reusing information, such as managing accumulated data, is reduced. For example, even staff without advanced specialized knowledge can easily create an FAQ or the like. Furthermore, for example, not only can the costs of a help desk department be reduced, but service to customers can also be enhanced by promptly publishing high-quality information.

EMBODIMENTS OF THE INVENTION

[0022] Embodiments of the present invention will be described below with reference to the drawings.

[0023] FIG. 1 shows a configuration example of a search system according to an embodiment of the present invention.

[0024] This search system assumes, for example, a data administrator as its user. The parts of the configuration that assume users other than the administrator (for example, general users such as customers, or persons within the company other than the administrator) are described later.

[0025] As shown in FIG. 1, the search system includes a data storage unit 10, a data search unit 20, a grouping unit 30, an information extraction unit 40, and an information editing unit 50.

[0026] Each processing part can be implemented in software, and various implementations are conceivable. For example: (1) the data search unit 20, grouping unit 30, information extraction unit 40, and information editing unit 50 are all installed on one computer, and the data storage unit 10 is a local storage device of that computer; (2) the data search unit 20, grouping unit 30, information extraction unit 40, and information editing unit 50 are all installed on one computer, and the data storage unit 10 is managed by another computer on the network; (3) in (1) or (2) above, only the data search unit 20 is installed on a separate computer. Various forms are thus conceivable, differing in how the parts are combined and installed on one or more computers and in whether a network is used.

[0027] FIG. 1 omits the data input unit through which data to be stored in the data storage unit 10 are entered. This data input unit may be installed on a computer on which any of the above parts is installed, or on a computer of its own.

[0028] A GUI (graphical user interface) is preferably used as the user interface.

[0029] The data storage unit 10 stores data such as text, voice, and images supplied through the data input unit.

[0030] The data search unit 20 accepts a search command, such as a natural language sentence, analyzes it, finds data matching the content of the command among the data stored in the data storage unit 10, and presents them.

[0031] The grouping unit 30 creates, from the data stored in the data storage unit 10, groups that each contain a plurality of data.

[0032] Grouping may be performed each time data are input, or collectively at an appropriate time. Groups can also be rebuilt for some or all of the data.

[0033] The information extraction unit 40 refers to the data belonging to a group created by the grouping unit 30 and extracts information that is included redundantly in a plurality of data (duplicate information). Duplicate information is not necessarily limited to exactly identical sentences; it means, for example, information whose content can be evaluated as overlapping. Nor is duplicate information limited to information contained in all the data belonging to a group. For example, it may be information contained in at least a certain number (which may be two) of the data belonging to a group, or in a certain proportion of them. Because duplicate information appears in common in all or (many) some of the data belonging to a group, its content can be evaluated as, for example, information that is commonly valid for the group (or for the theme or characteristics of the group), important information, useful information, or information that should be, or may be, included in common. Therefore, by using duplicate information as raw material and editing (processing) it, useful information can be created, such as information representative of the group, organized information, or information fit to be viewed by general users.

[0034] The information editing unit 50 performs editing based on the duplicate information that the information extraction unit 40 has extracted from the data belonging to a group. For example, it presents the extracted duplicate information, accepts the user's instructions for processing the information, such as additions, changes, and deletions, and processes the information as instructed. The information extracted by the information extraction unit 40 and edited by the information editing unit 50 (hereinafter called edited information) is stored in the data storage unit 10.

[0035] In this embodiment, the data containing the duplicate information used as raw material are not edited directly; instead, edited information is newly created based on those data. The source data are kept even after the edited information has been created.

[0036] The data search unit 20 can also search the edited information stored in the data storage unit 10.

[0037] The processing flow of this embodiment is described below.

[0038] FIG. 2 shows an example of the processing procedure of this embodiment. The procedure represents the flow of grouping the source data, extracting duplicate information from the data belonging to a group, and creating edited information.

[0039] First, the information stored in the data storage unit 10 is read (step S11), and grouping is performed (step S12).

[0040] Several examples of grouping methods are given below.

[0041] Here, "text data" (101) containing the items "title", "question text", and "answer text", as illustrated in FIG. 3, is taken as an example of the data, and a large amount of such text data is assumed to have been accumulated.

[0042] The first grouping method uses a cluster generation algorithm based on "titles", such as the following example, and treats the data contained in each generated cluster as one group.

[0043] (1) For the title tj of an item of data, generate all word sets Wi that satisfy the following conditions:

    1 ≤ N(Wi) ≤ P1
    Wi is contained in Wtj
    N(Wi) / N(Wtj) > P2

where

    Wi: an arbitrary word set (a candidate criterion for cluster generation)
    Wtj: the word set of the title tj of an item of data
    N: a function that returns the number of words in a word set
    P1, P2: constants

This is done for all or some of the data to be grouped.

[0044] (2) Generate a cluster from the data whose titles contain each word set created in step (1).

[0045] For example, the title "About testing for the year 2000 problem" yields Wtj = {year 2000, problem, test}. With P1 = 2, the word sets Wi are {year 2000, problem}, {year 2000, test}, {problem, test}, {year 2000}, {problem}, and {test}. With P2 = 0.25, N(Wtj) = 3 and N(Wi) = 1 or 2, so in this example every word set satisfies N(Wi) / N(Wtj) > P2. Word sets are generated in the same way for the other data. Then, for all or some of the generated word sets, a cluster is generated from the data whose titles contain the word set. For example, if many data have titles containing the word set {year 2000, problem}, that word set is adopted and a {year 2000, problem} cluster is generated from the data whose titles contain it. The data titled "About testing for the year 2000 problem", for instance, then belongs to the {year 2000, problem} cluster.
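Steps (1) and (2) can be sketched roughly as follows. This is a minimal illustration and not the patent's implementation: it assumes titles have already been split into content words (the morphological analysis a Japanese title would need is omitted), and the names `candidate_word_sets`, `build_clusters`, and the `min_size` cut-off (standing in for "many data") are hypothetical.

```python
from itertools import combinations


def candidate_word_sets(title_words, p1, p2):
    """Step (1): every word set Wi with 1 <= N(Wi) <= P1 and N(Wi)/N(Wtj) > P2."""
    wtj = sorted(set(title_words))
    candidates = set()
    for size in range(1, min(p1, len(wtj)) + 1):
        if size / len(wtj) > p2:
            candidates.update(frozenset(c) for c in combinations(wtj, size))
    return candidates


def build_clusters(titles, p1, p2, min_size=2):
    """Step (2): for each word set, collect the data whose title contains it.

    min_size is a hypothetical cut-off; the text only says "many data".
    """
    word_sets = set()
    for words in titles.values():
        word_sets |= candidate_word_sets(words, p1, p2)
    clusters = {}
    for ws in word_sets:
        members = {data_id for data_id, words in titles.items() if ws <= set(words)}
        if len(members) >= min_size:
            clusters[ws] = members
    return clusters


# Hypothetical pre-tokenized titles keyed by data id.
titles = {
    "d1": ["year 2000", "problem", "test"],
    "d2": ["year 2000", "problem", "patch"],
    "d3": ["printer", "driver"],
}
clusters = build_clusters(titles, p1=2, p2=0.25)
```

With these sample titles, the word set {year 2000, problem} collects d1 and d2, while word sets that occur in only one title are dropped by the assumed `min_size` cut-off.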

[0046] With the above algorithm, one item of data can fall into several clusters. One item of data may be allowed to belong to several clusters, or it may be restricted to belonging to only one cluster. In the latter case, the cluster to which data that could belong to several clusters is assigned is decided by another predetermined criterion.

[0047] A hierarchy may also be created among clusters. For example, given clusters Ci and Cj generated from word sets Wi and Wj, if Wi is contained in Wj, then Ci contains Cj. For example, the {year 2000, problem, test} cluster is a child of the {year 2000, problem} cluster.
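The containment rule can be illustrated with a short sketch. It assumes clusters are keyed by their word sets as frozensets; the function name `descendants` is hypothetical, and the sketch lists every cluster whose word set strictly contains the given one rather than only direct children.

```python
def descendants(word_sets):
    """Map each word set Wi to the word sets Wj that strictly contain it.

    If Wi is contained in Wj, the cluster for Wj is nested under the cluster for Wi.
    """
    return {ws: [other for other in word_sets if ws < other] for ws in word_sets}


parent = frozenset({"year 2000", "problem"})
child = frozenset({"year 2000", "problem", "test"})
unrelated = frozenset({"printer"})
tree = descendants([parent, child, unrelated])
```

Here `tree[parent]` contains `child`, mirroring the example in which the {year 2000, problem, test} cluster sits under the {year 2000, problem} cluster.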

[0048] Furthermore, by using a synonym dictionary or the like to convert synonyms into the same word before applying the above algorithm, the number of data contained in a cluster can be increased further. For example, in the example above, if "Y2K" is listed as a synonym of "year 2000" in the synonym dictionary, data with a title such as "Patch for the Y2K problem" is also classified into the same cluster.
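A minimal sketch of this normalization step, assuming the synonym dictionary is a plain mapping from each variant to one canonical word (the `normalize` function and the dictionary contents are illustrative, not taken from the patent):

```python
def normalize(words, synonyms):
    """Replace each word by its canonical form when the dictionary lists one."""
    return [synonyms.get(word, word) for word in words]


# Hypothetical synonym dictionary: variant -> canonical word.
synonyms = {"Y2K": "year 2000"}
normalized = normalize(["Y2K", "problem", "patch"], synonyms)
```

After normalization the title's words include "year 2000" and "problem", so a title-based clustering step would place it in the {year 2000, problem} cluster.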

[0049] As a second grouping method, the words contained in the data can be used to classify the data into a plurality of groups by a hierarchical clustering technique that is common in the field of multivariate analysis.

[0050] As a third grouping method, a category structure for classification can be decided in advance, and data can be allocated to the categories. For example, a dictionary of important words is prepared, and a weight, such as the probability that each word appears in a category, is determined beforehand. Then, each time a word listed in the dictionary appears in an item of data, its weight is added to the score of each category, and the item is classified into the category with the highest total weight.
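The weight-accumulation scheme might look like the following sketch. The dictionary layout (word to per-category weight), the category names, and the weights are all assumptions made for illustration.

```python
from collections import defaultdict


def classify(words, keyword_weights):
    """Add up per-category weights for every dictionary word that occurs,
    then return the category with the highest total (None if nothing matched)."""
    scores = defaultdict(float)
    for word in words:
        for category, weight in keyword_weights.get(word, {}).items():
            scores[category] += weight
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical important-word dictionary: word -> {category: weight}.
keyword_weights = {
    "printer": {"hardware": 0.9, "software": 0.1},
    "driver": {"hardware": 0.3, "software": 0.7},
    "install": {"software": 0.8},
}
category = classify(["driver", "install", "printer"], keyword_weights)
```

Because each occurrence of a word adds its weight again, repeated words count more than once, which matches the "each time a word appears" wording.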

[0051] As a fourth grouping method, if data such as a title contain a statement referring to an earlier inquiry, that statement can be used. For example, when each inquiry record is given a unique ID, a statement such as "Follow-up to Y00000000082" is found in the data, and this data is grouped with the past data it refers to (the data with ID "Y00000000082"). The same applies when the data contain information from which the unique ID of other data can be determined.
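One way to sketch this reference-based grouping is to scan each record for IDs and merge records that mention one another. The ID pattern (the letter Y followed by eleven digits, modeled on the example ID) and the union-find merging are assumptions for illustration, not details given in the patent.

```python
import re

# Assumed ID format, modeled on the example "Y00000000082".
ID_PATTERN = re.compile(r"Y\d{11}")


def group_by_reference(records):
    """Group records (id -> text) so that a record and any record it cites share a group."""
    parent = {record_id: record_id for record_id in records}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for record_id, text in records.items():
        for cited in ID_PATTERN.findall(text):
            if cited in parent and cited != record_id:
                parent[find(record_id)] = find(cited)

    groups = {}
    for record_id in records:
        groups.setdefault(find(record_id), set()).add(record_id)
    return list(groups.values())


records = {
    "Y00000000082": "Cannot print after update",
    "Y00000000095": "Follow-up to Y00000000082: still cannot print",
    "Y00000000101": "Question about licensing",
}
groups = group_by_reference(records)
```

Chains of follow-ups end up in a single group because merging is transitive; references to IDs that are not in the store are ignored.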

[0052] Various other methods are also conceivable. These grouping methods can also be used in combination as appropriate. For example, after clusters have first been determined from titles by the first grouping method, an adjustment can be made in which data containing a statement that refers to data in a cluster are found by the fourth grouping method and added to that cluster.

[0053] Next, duplicate information is extracted from the grouped data (step S13).

[0054] Various methods of extracting duplicate information are conceivable. For example, when grouping has been performed by the title-based cluster generation algorithm described above, the titles of the data belonging to a group all contain the same word set. Therefore, when the elements of the word set of interest are adjacent in a title, or lie within a certain range of each other, the portion of the title containing that adjacent part or that range can be extracted as duplicate information. For example, information such as "About the year 2000 problem" is extracted from the group generated from the word set {year 2000, problem}.
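The "within a certain range" test can be sketched as a search for the shortest run of title tokens that covers the whole word set. The function name `title_span` and the `max_span` parameter are hypothetical, titles are assumed to be pre-tokenized, and the sketch returns only the covering run, whereas the text allows a larger portion that includes it.

```python
def title_span(tokens, word_set, max_span):
    """Return the shortest run of at most max_span tokens containing every word
    in word_set, or None when the words are too far apart or missing."""
    target = set(word_set)
    best = None
    for start, token in enumerate(tokens):
        if token not in target:
            continue
        seen = set()
        for end in range(start, min(len(tokens), start + max_span)):
            if tokens[end] in target:
                seen.add(tokens[end])
            if seen == target:
                if best is None or end - start < best[1] - best[0]:
                    best = (start, end)
                break
    return None if best is None else tokens[best[0]:best[1] + 1]


span = title_span(
    ["about", "the", "year 2000", "problem", "test"],
    {"year 2000", "problem"},
    max_span=3,
)
```

Setting `max_span` equal to the size of the word set restricts the match to strictly adjacent words.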

[0055] Alternatively, for example, the "question text" items and the "answer text" items of the grouped data can each be examined for overlapping word sets. An information unit (for example, a sentence or a paragraph) in which an overlapping word set occurs, or in which the words of an overlapping word set occur adjacently, or in which they occur within a certain range, can then be extracted as duplicate information.

[0056] Duplicate information in question texts, answer texts, and the like can also be extracted based on the presence of individual words rather than word sets, by the following method.

(1) Find words that are duplicated across a plurality of data (or across the same or similar items of a plurality of data). For example, given answer 1, answer 2, and answer 3, find the words that are duplicated among them (that is, words that appear in more than one answer). When there are many answers, the words may be limited to those that appear in at least a certain proportion of the answers rather than merely in more than one.

[0057] (2) Extract sentences that contain many duplicated words. For example, if every word in a sentence satisfies the condition in (1), the sentence is extracted unconditionally. If there is no such sentence, the one sentence containing the most such words is extracted.

[0058] (3) Remove duplicates among the extracted sentences. For example, when several extracted sentences consist of exactly the same words (those satisfying the condition in (1)), only the first of them is kept.
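Steps (1) to (3) can be sketched together as follows. The input layout (each answer as a list of sentences, each sentence as a list of words), the function name, and the `min_ratio` parameter are assumptions for illustration.

```python
from collections import Counter


def extract_duplicate_sentences(answers, min_ratio=0.0):
    """answers: list of answers; each answer is a list of sentences; each sentence a list of words."""
    # Step (1): words that occur in at least two answers (and in at least
    # min_ratio of all answers, when that proportion is the stricter bound).
    answer_freq = Counter()
    for answer in answers:
        answer_freq.update({word for sentence in answer for word in sentence})
    threshold = max(2, min_ratio * len(answers))
    shared = {word for word, count in answer_freq.items() if count >= threshold}

    # Step (2): sentences made up entirely of shared words; failing that,
    # the single sentence holding the most shared words.
    sentences = [sentence for answer in answers for sentence in answer]
    picked = [s for s in sentences if s and all(word in shared for word in s)]
    if not picked and sentences:
        best = max(sentences, key=lambda s: sum(word in shared for word in s))
        if any(word in shared for word in best):
            picked = [best]

    # Step (3): among sentences with identical shared words, keep only the first.
    seen, result = set(), []
    for sentence in picked:
        key = frozenset(word for word in sentence if word in shared)
        if key not in seen:
            seen.add(key)
            result.append(sentence)
    return result


answers = [
    [["restart", "the", "printer"], ["call", "support"]],
    [["restart", "the", "printer"], ["check", "cable"]],
    [["the", "printer", "restart"], ["update", "firmware"]],
]
extracted = extract_duplicate_sentences(answers)
```

Step (3) compares sentences as unordered word sets here, so a reordered sentence counts as a duplicate of the first one; that reading of "exactly the same words" is an assumption.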

[0059] Another method of extracting duplicate information uses a form, as follows.

[0060] In this case, for example, inquiry and answer data such as those shown in FIG. 3 are analyzed and converted, by the following method, into a form (111) that represents the semantic structure of inquiry handling, as shown in FIG. 4.

[0061] The characteristic expressions of each item (112) of the form are described in advance as rules based on parts of speech and word sequences. Each unit sentence in the data is decomposed into part-of-speech units by morphological analysis or the like, and the rules are then applied (the application of such rules is disclosed, for example, in Japanese Patent Laid-Open No. 10-207904). A unit sentence containing a description that matches a rule is extracted as the information (113) for that item. For example, if a rule is written that matches sentences containing words such as "manual", "page", and "refer", a sentence in the data such as "For details, please refer to page ×× of manual ○○" can be extracted as the reference information in the form of FIG. 4.
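A heavily simplified sketch of rule-based form filling is shown below. It replaces morphological analysis and part-of-speech rules with plain keyword sets over lowercase English tokens, so it only illustrates the general idea; the item names, rules, and sample sentences are hypothetical.

```python
import re


def fill_form(sentences, rules):
    """Assign each sentence to every form item one of whose keyword sets it fully contains."""
    form = {item: [] for item in rules}
    for sentence in sentences:
        words = set(re.findall(r"\w+", sentence.lower()))
        for item, keyword_sets in rules.items():
            if any(keywords <= words for keywords in keyword_sets):
                form[item].append(sentence)
    return form


# Hypothetical rules: form item -> list of keyword sets, any one of which may match.
rules = {
    "reference": [{"manual", "page", "refer"}],
    "situation": [{"stops"}],
}
form = fill_form(
    [
        "The printer stops after one page.",
        "For details, please refer to page 12 of manual A.",
    ],
    rules,
)
```

A sentence can fill more than one item when it matches several rules, and sentences that match no rule are left out of the form.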

[0062] In this way, all the data in the group are converted into forms by the above method.

[0063] A plurality of forms may be prepared according to the content of the inquiries.

【００６４】さて、フォームの項目ごとに複数のデータ
から抽出された情報がある場合、二つの情報の重複度
を、例えば、以下のような式により算出する。
重複度＝（二つの情報に重複して含まれる単語数）／（二つの情報に含まれる全単語数）
そして、重複度が閾値以上の単位文を重複情報として抽
出する。この場合、重複情報がそのまま複数抽出され
る。When information has been extracted from a plurality of data for an item of the form, the degree of duplication between two pieces of information is calculated, for example, by the following formula:
Degree of duplication = (number of words contained in both pieces of information) / (total number of words contained in the two pieces of information)
Unit sentences whose degree of duplication is equal to or greater than a threshold are then extracted as duplicate information. In this case, a plurality of pieces of duplicate information are extracted as they are.
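As an illustration of the degree-of-duplication calculation, the following sketch makes two assumptions that the text does not fix: words are obtained by whitespace splitting (instead of morphological analysis), and both counts are taken over distinct words, so the denominator is the size of the union of the two word sets. The function names and the default threshold are hypothetical.

```python
def words(text):
    """Distinct lower-cased words of a piece of information (assumed tokenizer)."""
    return set(text.lower().split())


def duplication_degree(a, b):
    """Shared words divided by all words of the two pieces of information."""
    wa, wb = words(a), words(b)
    total = wa | wb
    if not total:
        return 0.0
    return len(wa & wb) / len(total)


def extract_duplicates(sentences, threshold=0.5):
    """Return the unit sentences whose degree of duplication with at least
    one other sentence is equal to or greater than the threshold."""
    result = []
    for i, sentence in enumerate(sentences):
        others = (t for j, t in enumerate(sentences) if j != i)
        if any(duplication_degree(sentence, t) >= threshold for t in others):
            result.append(sentence)
    return result
```

If the denominator is instead meant to be the sum of the two word counts, only the `total` line changes; the thresholding is unaffected.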

【００６５】なお、抽出された重複情報について、（編
集や検索等の際の提示や利用等のための）順位付けを行
うようにしてもよい。例えば、各項目ごとに、重複度を
順位としてもよい。Note that the extracted duplicate information may be ranked (for presentation or use at the time of editing, searching, etc.). For example, the degree of duplication may be set as a rank for each item.

【００６６】また、例えば、重複度の高い単位文を多く
含むデータを１つ選択し、そのデータに含まれる単位文
のうち、重複度が閾値より高いもののみを、データ中で
の出現順に抽出するようにしてもよい。Alternatively, for example, one piece of data that contains many unit sentences with a high degree of duplication may be selected, and, of the unit sentences contained in that data, only those whose degree of duplication is higher than the threshold may be extracted in the order in which they appear in the data.
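The selection of a single representative piece of data can be sketched as below. The sketch assumes that each piece of data is already split into unit sentences, that a sentence's degree of duplication is its best score against the sentences of the other data in the group, and that ties are resolved in favor of the earlier data; all names are hypothetical.

```python
def degree(a, b):
    """Degree of duplication over distinct whitespace-separated words (assumed)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def best_overlap(sentence, others):
    """Highest degree of duplication of one sentence against other sentences."""
    return max((degree(sentence, other) for other in others), default=0.0)


def representative_summary(group, threshold=0.5):
    """group: list of data, each a list of unit sentences.

    Picks the data with the most sentences above the threshold and returns
    those sentences in their order of appearance in that data."""
    best = []
    for i, data in enumerate(group):
        others = [s for j, d in enumerate(group) if j != i for s in d]
        hits = [s for s in data if best_overlap(s, others) > threshold]
        if len(hits) > len(best):
            best = hits
    return best
```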

【００６７】なお、フォームに沿った重複情報の表示例
を図５に示す。FIG. 5 shows a display example of the overlapping information along the form.

【００６８】以上の他にも、種々の方法が考えられる。
また、それら重複情報抽出手法は、複数のものを適宜組
み合わせて使用することも可能である。例えば、上記の
ようにしてデータをフォームに変換した後、各項目の情
報に関して、前述したような単語の共起関係を用いる方
法により、重複する単語組を含む範囲を重複情報として
抽出することもできる。例えば、図４のフォームの場合
に、製品名、要望、状況、対策等の項目に関して重複情
報を抽出し、グループに含まれるデータの概要とするこ
ともできる。Various methods other than the above are also conceivable. These duplicate information extraction methods can also be used in combination as appropriate. For example, after the data has been converted into forms as described above, the method using word co-occurrence relations described earlier can be applied to the information of each item, so that a range containing duplicated word sets is extracted as duplicate information. For example, in the case of the form of FIG. 4, duplicate information can be extracted for items such as product name, request, situation, and countermeasure, and used as a summary of the data included in the group.

【００６９】さて、続いて、抽出された重複情報を編集
する（ステップＳ１４）。編集済み情報はデータ蓄積部
１０に格納される（ステップＳ１５）。Next, the extracted duplicate information is edited (step S14). The edited information is stored in the data storage unit 10 (Step S15).

【００７０】本実施形態では、抽出された重複情報の編
集では、あるグループについて、そのグループに属する
データから抽出された重複情報をもとにして、編集済み
情報を作成する。例えば、「２０００年問題」に関係す
るグループに属する「一般ユーザに閲覧されることを想
定していない文章」あるいは「個別具体的な質疑応答結
果をそのまま記述した文章」を含むデータから抽出され
た重複情報を素材として、一般ユーザに閲覧されること
を想定した内容を持つ編集済み情報を容易に作成するこ
とができる。In the present embodiment, editing the extracted duplicate information means creating, for a given group, edited information based on the duplicate information extracted from the data belonging to that group. For example, using as raw material the duplicate information extracted from data that belongs to a group related to the "Year 2000 problem" and that contains "text not intended to be viewed by general users" or "text that records the results of individual, specific question-and-answer exchanges as they are", edited information whose content is intended to be viewed by general users can be created easily.

【００７１】以下、重複情報の編集（編集済み情報の作
成）について説明する。The editing of duplicate information (creation of edited information) will be described below.

【００７２】例えば、図６に示すような画面（１３１）
に、抽出した重複情報（の複製）を表示し、ユーザの編
集指示に従い、その内容を修正する。この例では、画面
の上部（１３２）で、あるデータについての「タイト
ル」と「質問文」と「回答文」の項目それぞれから抽出
された重複情報を提示している。ユーザは、枠内の情報
を直接編集することができる。画面の下部（１３３）に
は、同一グループに属するデータが表示可能であり、こ
れを適宜参照しながら上部（１３２）の情報を編集する
ことができる。For example, (a copy of) the extracted duplicate information is displayed on a screen (131) such as that shown in FIG. 6, and its content is modified according to the user's editing instructions. In this example, the upper part (132) of the screen presents the duplicate information extracted from each of the "title", "question text", and "answer text" items for certain data. The user can directly edit the information in the frames. Data belonging to the same group can be displayed in the lower part (133) of the screen, and the information in the upper part (132) can be edited while referring to this data as appropriate.

【００７３】また、図６の上部（１３２）に示すよう
に、「公開」または「非公開」をユーザが選択できるよ
うにし（１３４，１３５）、例えば、ＦＡＱとして外部
の顧客や企業内の管理者以外の者などに公開する場合
に、一部の情報を公開しない等の指定をさせてもよい。
これにより、編集した情報を管理者のみが参照すること
が可能になる。さらに、企業内情報などについて、閲覧
可能な者もしくは部署を細かい単位で指定できるように
してもよい。Further, as shown in the upper part (132) of FIG. 6, the user may be allowed to select "public" or "non-public" (134, 135), so that, for example, when the information is published as an FAQ to external customers or to persons in the company other than the administrator, some of the information can be designated as not to be published. This makes it possible for the edited information to be referred to only by the administrator. Furthermore, for in-company information and the like, the persons or departments permitted to view it may be specified in fine-grained units.

【００７４】前述したような、タイトルによるグループ
化を行うと、グループを構成するデータのタイトルの内
容が揃っているため、グループの特徴をとらえやすく、
ユーザが編集しやすいという利点がある。When grouping by title is performed as described above, the contents of the titles of the data constituting the group are uniform, so that the characteristics of the group can be easily grasped.
There is an advantage that the user can easily edit.

【００７５】また、例えば、図６のような画面に加え
て、さらに図７に示すように編集画面（１４１）の中段
（４２３）にボタン群（１４２〜１４４）を用意し、グ
ループ化そのものの編集も行えるようにすることもでき
る。Further, for example, in addition to a screen such as that of FIG. 6, a group of buttons (142 to 144) may be provided in the middle section (423) of an editing screen (141), as shown in FIG. 7, so that the grouping itself can also be edited.

【００７６】例えば、図７の下部（４２４）に表示され
た当該グループに属するデータのうち、一つ目のデータ
を選択すると、それが選択された表示状態になる。ここ
で、「削除」ボタン（１４２）をクリックすると、当該
選択された一つ目のデータは、当該グループから削除さ
れる、といった機能を持たせることも可能である。For example, when the first of the data items belonging to the group displayed in the lower part (424) of FIG. 7 is selected, it is shown in a selected state. A function can also be provided whereby, if the "delete" button (142) is then clicked, the selected first data item is deleted from the group.

【００７７】同様に、１つまたは複数のデータを選択し
て「分割」ボタン（１４３）を押すと、当該グループか
ら、当該選択されたデータを削除し、削除されたデータ
を含む新規のグループを作成する、といった機能を持た
せることも可能である。Similarly, a function can be provided whereby, when one or more data items are selected and the "split" button (143) is pressed, the selected data is removed from the group and a new group containing the removed data is created.

【００７８】また、「追加」ボタン（１４４）をクリッ
クすることにより、データの検索を受け付ける画面など
が起動され、自然言語文などを入力し検索結果から選択
したデータを、この編集中のグループに追加する、とい
った機能を持たせることも可能である。A function can also be provided whereby clicking the "add" button (144) launches a screen or the like that accepts a data search, and data selected from the results of a search made by entering a natural language sentence or the like is added to the group being edited.
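The three group-editing functions ("delete", "split", "add") can be pictured with the following sketch. It assumes that groups are held as a mapping from a group title to a list of data identifiers; the class `GroupStore`, its methods, and the identifiers are hypothetical and stand in for whatever the data storage unit 10 actually holds.

```python
class GroupStore:
    """Hypothetical in-memory store: group title -> list of data identifiers."""

    def __init__(self):
        self.groups = {}

    def delete(self, group, data_id):
        """'Delete' button: remove the selected data from the group."""
        self.groups[group].remove(data_id)

    def split(self, group, data_ids, new_group):
        """'Split' button: remove the selected data from the group and
        create a new group containing the removed data."""
        for data_id in data_ids:
            self.groups[group].remove(data_id)
        self.groups[new_group] = list(data_ids)

    def add(self, group, data_id):
        """'Add' button: add data chosen from search results to the group."""
        if data_id not in self.groups[group]:
            self.groups[group].append(data_id)
```

After any of these operations the duplicate information for the affected groups would need to be extracted again, as the following paragraph notes.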

【００７９】なお、グループの編集を行ったら、重複情
報を抽出し直すようにしてもよい。After the group is edited, the duplicate information may be extracted again.

【００８０】また、例えば、図４に示すようなフォーム
の各項目ごとに重複する情報を抽出した場合、例えば図
５に示すような画面（１２１）において、それぞれの項
目について、抽出された重複情報のうち一つを提示し
て、ユーザに編集させる。Also, for example, when duplicate information has been extracted for each item of a form such as that of FIG. 4, one of the extracted pieces of duplicate information is presented for each item on a screen (121) such as that shown in FIG. 5, and the user is allowed to edit it.

【００８１】なお、前述したように、重複度の高い単位
文を多く含むデータについての各項目の重複情報を提示
するようにしてもよいし、各項目ごとに最も高い重複度
を有するものを提示するようにしてもよい。As described above, the duplicate information of each item may be presented for the data that contains many unit sentences with a high degree of duplication, or, for each item, the piece with the highest degree of duplication may be presented.

【００８２】図５の画面で、提示されなかった情報に関
しては、項目の枠の横のボタン（１２３）をクリックす
ると、例えば図８の１５２に示すように表示される。こ
れにより、他の重複情報（例えば、順位が２番目以降の
もの）を選択することもでき、また、他のデータを参照
しながら各項目の情報を編集することが可能になる。On the screen of FIG. 5, information that was not presented is displayed, for example as shown at 152 in FIG. 8, when the button (123) beside the item's frame is clicked. This makes it possible to select other duplicate information (for example, pieces ranked second or lower) and to edit the information of each item while referring to other data.

【００８３】作成された編集済み情報は、「登録」など
のボタンを押すことにより、グループ化されたデータと
関連づけられて、データ蓄積部１０に蓄積される。この
とき、作成された編集済み情報には、（編集の素材とな
る）データとは異なる識別子を付加されて登録される。When a button such as "Register" is pressed, the created edited information is associated with the grouped data and stored in the data storage unit 10. At this time, the edited information is registered with an identifier different from that of the data (the raw material for editing).

【００８４】ところで、情報抽出部４０で抽出された重
複情報は、その抽出直後に編集するのではなく、データ
蓄積部１０に蓄積しておき、後で編集作業を行っても良
い。この場合の処理手順の一例を図９に示す。すなわ
ち、自然言語文などを入力してデータ蓄積部１０に蓄積
されたデータを検索し、検索結果を表示する（ステップ
Ｓ２１〜Ｓ２３）。検索結果からユーザがデータを選択
すると（ステップＳ２４）、その選択されたデータが属
するグループの編集画面（例えば、図５、図６、図７、
図８など）が表示され、ユーザの編集作業を促進する。
そして、ユーザは重複情報をもとに編集を行い（ステッ
プＳ２５）、編集済み情報は識別子を付されてデータ蓄
積部１０に格納される（ステップＳ２６）。この場合、
ユーザが編集した情報に関しては、「編集済み」である
ことが識別できる情報を付加して登録しておけば、編集
後の情報を識別することができる。例えば、前述のよう
に、編集済み情報にデータとは異なる識別情報を付加す
ることによって識別することができる。Incidentally, the duplicate information extracted by the information extraction unit 40 need not be edited immediately after extraction; it may be stored in the data storage unit 10 and edited later. FIG. 9 shows an example of the processing procedure in this case. That is, a natural language sentence or the like is input, the data stored in the data storage unit 10 is searched, and the search results are displayed (steps S21 to S23). When the user selects data from the search results (step S24), the editing screen (for example, FIG. 5, 6, 7, or 8) of the group to which the selected data belongs is displayed to prompt the user's editing work. The user then edits based on the duplicate information (step S25), and the edited information is given an identifier and stored in the data storage unit 10 (step S26). In this case, if information edited by the user is registered with added information indicating that it is "edited", the information after editing can be identified. For example, as described above, it can be identified by adding to the edited information identification information different from that of the data.

【００８５】また、データを（データ蓄積部１０に）登
録するときに同時に編集作業を行わせることもできる。
この場合の処理手順の一例を図１０に示す。すなわち、
図３のようなデータを扱う場合に、質問を受けつけたユ
ーザがデータを入力した後に、入力したデータを利用す
るなどして、データ蓄積部１０に蓄積されたデータを検
索し、検索結果を表示する（ステップＳ３１〜Ｓ３
３）。ユーザが検索結果からデータを選択すると（ステ
ップＳ３４）、図７に例示したように、ユーザが選択し
たグループに関して編集画面が表示される。続いて、ユ
ーザは編集を行うことができる（ステップＳ３５）。そ
の際、ユーザが「追加」ボタン（１４４）をクリックす
ると、入力中の新しいデータがそのグループに追加され
る。なお、格納すべき編集済み情報があれば、データ蓄
積部１０に格納する（ステップＳ３６）。また、上記に
おいて、ユーザが入力中のデータが属するであろうグル
ープのデータを参考にしながら、入力中のデータの「回
答」を作成、修正することもできる。すなわち、質問を
受けつけたユーザが「回答」を入力する前または「回
答」の原案を入力した後に、入力したデータを利用する
などして、データ蓄積部１０に蓄積されたデータを検索
し、検索結果を表示する（ステップＳ３１〜Ｓ３３）。
ユーザが検索結果からデータを選択すると（ステップＳ
３４）、図７に例示したように、ユーザが選択したグル
ープに関して編集画面が表示される。続いて、ユーザは
編集を行うことができる（ステップＳ３５）。その際、
ユーザが「追加」ボタン（１４４）をクリックすると、
入力中の新しいデータがそのグループに追加される。ユ
ーザは、同じグループのデータを参考にしながら、入力
中のデータの「回答」を作成し、あるいは「回答」の原
案の修正をすることができる。また、格納すべき編集済
み情報があれば、データ蓄積部１０に格納する（ステッ
プＳ３６）。Editing can also be performed at the same time as data is registered (in the data storage unit 10). FIG. 10 shows an example of the processing procedure in this case. That is, when handling data such as that of FIG. 3, after the user who received the question has entered the data, the data stored in the data storage unit 10 is searched, for example by using the entered data, and the search results are displayed (steps S31 to S33). When the user selects data from the search results (step S34), an editing screen is displayed for the group selected by the user, as illustrated in FIG. 7. The user can then perform editing (step S35). At that time, when the user clicks the "add" button (144), the new data being entered is added to that group. If there is edited information to be stored, it is stored in the data storage unit 10 (step S36). In the above procedure, the "answer" of the data being entered can also be created or revised while referring to the data of the group to which the data being entered is likely to belong. That is, before the user who received the question enters an "answer", or after entering a draft of the "answer", the data stored in the data storage unit 10 is searched, for example by using the entered data, and the search results are displayed (steps S31 to S33). When the user selects data from the search results (step S34), an editing screen is displayed for the group selected by the user, as illustrated in FIG. 7. The user can then perform editing (step S35). At that time, when the user clicks the "add" button (144), the new data being entered is added to that group. While referring to the data of the same group, the user can create the "answer" of the data being entered or revise the draft of the "answer". If there is edited information to be stored, it is stored in the data storage unit 10 (step S36).

【００８６】ここで、データの検索結果の表示画面で
は、従来の検索結果画面のように、スコアの高い順にデ
ータのタイトルを一覧表示するものでもよいが、検索結
果のデータを、所属するグループごとにグルーピングし
て、グループのタイトルとともに、データのタイトルを
表示させることも可能である。例えば、図１１に示すよ
うに、表示画面（２０１）の上部（２０２）によってユ
ーザが検索のための操作を行うと、検索条件に従って検
索が行われる。検索が終了すると、検索結果のデータ
は、（グループにかかわらずに）スコアの高い順に並べ
られたのち、所属するグループ（ここではＦＡＱと表
示）ごとにまとめられ、表示画面（２０１）の下部（２
０３）に表示される。その際、例えば、スコアの高い順
に並べられたのち、例えばスコアの高い順で１位と１０
位のデータが同じグループに属する場合には、１位のデ
ータを一番上に表示（２０４）するとともに、その下に
１０位のデータをも並べて表示（２０５）するようにし
てもよい。あるいは、１位のデータのみ表示し、１０位
のデータは検索結果画面に表示しなくても良い。このよ
うにグルーピングして表示することにより、ユーザは似
たようなデータをまとめて確認することができ、所望の
情報を探す手間が削減される。Here, the display screen for data search results may list the data titles in descending order of score, as on a conventional search result screen, but the data in the search results may instead be grouped by the group to which each belongs, with the data titles displayed together with the group title. For example, as shown in FIG. 11, when the user performs a search operation in the upper part (202) of the display screen (201), a search is performed according to the search conditions. When the search is completed, the data in the search results is first sorted in descending order of score (regardless of group), then collected by group (shown here as FAQ) and displayed in the lower part (203) of the display screen (201). In doing so, if, for example, the data ranked first and tenth by score belong to the same group, the first-ranked data may be displayed at the top (204) with the tenth-ranked data displayed directly beneath it (205). Alternatively, only the first-ranked data may be displayed and the tenth-ranked data omitted from the search result screen. Grouping the display in this way lets the user check similar data together and reduces the effort of finding the desired information.
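The grouped display of search results can be sketched as follows, assuming each search hit is a `(score, group, title)` tuple; the tuple layout and the function name are hypothetical. The hits are first sorted by score regardless of group and then collected under their groups, so that each group appears at the position of its best-scoring member.

```python
def group_results(results):
    """results: list of (score, group, title) tuples.

    Returns a list of (group, [titles]) pairs. Groups are ordered by their
    best-scoring member, and titles within a group are in descending score."""
    ordered = sorted(results, key=lambda r: r[0], reverse=True)
    grouped = {}  # dicts preserve insertion order in Python 3.7+
    for _score, group, title in ordered:
        grouped.setdefault(group, []).append(title)
    return list(grouped.items())
```

To show only the top-ranked data of each group, as in the alternative described above, a caller could keep just the first title of each list.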

【００８７】また、グループの構造全体の編集を行う画
面を用意してもよい。例えば、図７の「全体表示」ボタ
ン（１４５）をクリックすると、図１２に示すようなグ
ループの階層関係の編集画面を起動する。図１２の表示
画面（２１１）の上部（２１２）には、図７と同様に、
編集中のグループのタイトルなどが表示されており、編
集を行うことができる。表示画面（２１１）の下部（２
１３）には、グループの階層関係が表示されている。A screen for editing the overall structure of the groups may also be provided. For example, clicking the "show all" button (145) in FIG. 7 launches an editing screen for the hierarchical relationships among groups, as shown in FIG. 12. As in FIG. 7, the upper part (212) of the display screen (211) of FIG. 12 displays the title and other details of the group being edited, which can be edited there. The lower part (213) of the display screen (211) displays the hierarchical relationships among the groups.

【００８８】図１２の例において、枠に囲まれた文字
は、そのグループのタイトルなどである。また、各タイ
トルの横には、そのグループに含まれるデータの数など
を棒グラフで表示してもよい。これにより、蓄積された
データの分布などを確認することができる。In the example shown in FIG. 12, characters surrounded by a frame are the title of the group and the like. Further, the number of data included in the group may be displayed as a bar graph next to each title. Thus, the distribution of the stored data can be confirmed.

【００８９】図１２の例では、例えば、以下のようなグ
ループの編集作業を行うことができる。In the example of FIG. 12, for example, the following group editing work can be performed.

【００９０】表示画面の下部（２１３）に表示されたグ
ループを選択すると、表示画面の上部（２１２）のタイ
トルなどの表示が、当該選択されたグループのもの更新
される。When a group displayed at the bottom (213) of the display screen is selected, the display such as the title at the top (212) of the display screen is updated for the selected group.

【００９１】グループを選択した状態で「開く」ボタン
（２１８）などを選択すると、選択されたグループに下
位のグループがあれば、表示画面の下部（２１３）に下
位グループを含むグループ階層を表示する。選択された
グループの下位にグループがなければ、図７の画面に切
りかわり、グループに含まれるデータを提示する。When an "open" button (218) or the like is selected with a group selected, if there is a lower group in the selected group, a group hierarchy including the lower group is displayed at the lower part (213) of the display screen. . If there is no group below the selected group, the screen is switched to the screen shown in FIG. 7 and data included in the group is presented.

【００９２】グループを選択した状態で「削除」ボタン
（２１４）をクリックすると、そのグループを削除す
る。When a "delete" button (214) is clicked while a group is selected, the group is deleted.

【００９３】一つまたは複数のグループを選択した状態
で「分割」ボタン（２１５）をクリックすると、選択さ
れたグループを含む新しいグループが、現在の親グルー
プの並びに生成される。例えば、図１２において、「ａ
ａａａ」と「ｂｂｂｂ」を選択して「分割」をクリック
すると、「ＸＸＸＸ」グループの並びに新しくグループ
ができ、その下に、「ａａａａ」と「ｂｂｂｂ」グルー
プが移動する。When the "split" button (215) is clicked with one or more groups selected, a new group containing the selected groups is created alongside (as a sibling of) the current parent group. For example, in FIG. 12, if "aaaa" and "bbbb" are selected and "split" is clicked, a new group is created alongside the "XXXX" group, and the "aaaa" and "bbbb" groups move beneath it.

【００９４】一つまたは複数のグループを選択した状態
で「挿入」ボタン（２１６）をクリックすると、そのグ
ループを含む新しいグループが、現在の親グループの子
供として生成される。例えば、図１２において、「ａａ
ａａ」と「ｂｂｂｂ」を選択して「挿入」をクリックす
ると、「ＸＸＸＸ」グループの子供として新しくグルー
プができ、その下に、「ａａａａ」と「ｂｂｂｂ」グル
ープが移動する。When the "insert" button (216) is clicked with one or more groups selected, a new group containing those groups is created as a child of the current parent group. For example, in FIG. 12, if "aaaa" and "bbbb" are selected and "insert" is clicked, a new group is created as a child of the "XXXX" group, and the "aaaa" and "bbbb" groups move beneath it.

【００９５】一つまたは複数のグループを選択した状態
で「結合」ボタン（２１７）をクリックすると、選択さ
れたグループのメンバーが統合されて、一つのグループ
になる。例えば、図１２において、「ａａａａ」と「ｂ
ｂｂｂ」とを選択して「結合」ボタン（２１７）をクリ
ックすると、「ａａａａ」グループに「ｂｂｂｂ」グル
ープのデータが加わり、一つのグループとなる。一方、
「ｂｂｂｂ」グループ自体は削除されることになる。When the "join" button (217) is clicked with one or more groups selected, the members of the selected groups are merged into a single group. For example, in FIG. 12, if "aaaa" and "bbbb" are selected and the "join" button (217) is clicked, the data of the "bbbb" group is added to the "aaaa" group to form a single group, while the "bbbb" group itself is deleted.
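The "split", "insert", and "join" operations on the group hierarchy can be sketched on a simple tree. The `Group` class and the three functions are hypothetical names, and the sketch assumes that all selected groups share one parent, that "split" is applied only to groups whose parent itself has a parent, and that joined groups have no child groups of their own.

```python
class Group:
    """Hypothetical node in the group hierarchy."""

    def __init__(self, title, data=None):
        self.title = title
        self.data = list(data or [])
        self.children = []
        self.parent = None

    def add_child(self, child):
        """Attach child here, detaching it from its previous parent."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
        return child


def insert_group(selected, title):
    """'Insert': new group as a child of the current parent; selected move under it."""
    new = selected[0].parent.add_child(Group(title))
    for group in selected:
        new.add_child(group)
    return new


def split_group(selected, title):
    """'Split': new group as a sibling of the current parent; selected move under it."""
    new = selected[0].parent.parent.add_child(Group(title))
    for group in selected:
        new.add_child(group)
    return new


def join_groups(selected):
    """'Join': merge the data of the later groups into the first and delete them."""
    target = selected[0]
    for group in selected[1:]:
        target.data.extend(group.data)
        group.parent.children.remove(group)
        group.parent = None
    return target
```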

【００９６】また、図１２の各グループのタイトルの横
に、グループの特徴を表す用語などを併記してもよい。
例えば、前述したようなタイトルを用いたグループ化を
行った場合に、グループを決定する｛２０００年、問
題｝などの単語組を、そのまま併記するようにしてもよ
い。あるいは、公知の確率モデルなどの手法により、グ
ループの特徴を表す用語（ラベル）を求め、これを表示
するようにしてもよい。Further, terms or the like that characterize each group may be displayed beside the group's title in FIG. 12. For example, when grouping by title has been performed as described above, the word set that determines the group, such as {year 2000, problem}, may be displayed as is. Alternatively, a term (label) characterizing the group may be obtained by a technique such as a known probabilistic model and displayed.

【００９７】これまでは、データの管理者等を使用者と
して想定した部分について説明してきた。例えば、デー
タ自体は管理者のみが閲覧可能であって、管理者が、該
データをもとにして、管理者等以外の者にも閲覧可能な
情報を作成するような場合であった。Up to this point, the description has been given of a portion in which a data manager or the like is assumed to be a user. For example, there has been a case where data itself can be viewed only by an administrator, and the administrator creates information that can be viewed by anyone other than the administrator based on the data.

【００９８】以下では、作成された編集済み情報を、管
理者以外のユーザが検索する場合について説明する。Hereinafter, a case will be described in which a user other than the administrator searches the created edited information.

【００９９】図１３に、この場合のシステム構成例を示
す。すなわち、図１３は、図１のシステムに情報検索部
６０を付加したものである。FIG. 13 shows an example of the system configuration in this case. That is, FIG. 13 is obtained by adding the information search unit 60 to the system of FIG.

【０１００】情報検索部６０は、自然言語文などによる
検索命令を受け付け、解析し、（データ蓄積部１０に蓄
積され、情報編集部５０によって抽出され、情報編集部
５０によって編集され、データ蓄積部１０に蓄積され
た）編集済み情報の中から、命令の内容に合致したデー
タを探し出し、提示するためのものである。The information search unit 60 accepts and analyzes a search command given as a natural language sentence or the like, finds data matching the content of the command from among the edited information (which was stored in the data storage unit 10, extracted by the information editing unit 50, edited by the information editing unit 50, and stored in the data storage unit 10), and presents it.

【０１０１】この情報検索部６０は、ソフトウェアで実
現可能であり、例えば、管理者以外のユーザの使用する
計算機（例えば、インターネットにおける一般ユーザの
使用する計算機、当該システムが接続された企業内ＬＡ
Ｎに接続された管理者以外の者など）に搭載される。な
お、図１のシステムがスタンドアローンの装置である場
合に、さらに、この情報検索部６０を当該スタンドアロ
ーンの装置に搭載する形態も可能である。The information search unit 60 can be implemented in software and is installed, for example, on a computer used by a user other than the administrator (for example, a computer used by a general user on the Internet, or one used by a person other than the administrator connected to the in-company LAN to which the system is connected). When the system of FIG. 1 is a stand-alone device, the information search unit 60 may also be installed on that stand-alone device.

【０１０２】図１４に、蓄積された編集済み情報を、管
理者以外のユーザが検索する場合の処理手順の一例を示
す。FIG. 14 shows an example of a processing procedure when a user other than the administrator searches the stored edited information.

【０１０３】管理者以外のユーザは、例えば自然言語検
索文などを入力する。情報検索部６０は、検索文を読み
込み（ステップＳ４１）、データ蓄積部１０に蓄積され
た編集済み情報のうち（もとのデータはここでの検索対
象にしないものとする）、検索要求に合致した情報を探
し出して（ステップＳ４２）、ユーザに表示する（ステ
ップＳ４３）。A user other than the administrator enters, for example, a natural language search sentence. The information search unit 60 reads the search sentence (step S41), finds the information that matches the search request from among the edited information stored in the data storage unit 10 (the original data is not searched here) (step S42), and displays it to the user (step S43).

【０１０４】図１５に、検索画面の一例を示す。例え
ば、図１５の表示画面（３０１）の上部の枠（３０２）
に検索文を入力して、検索ボタン（３０３）を選択す
る。すると、検索文が読み込まれ、データ蓄積部１０に
格納された編集済み情報とのマッチングを行い、あては
まる順に、検索結果を画面下部（３０４）に表示する。
それぞれの検索結果として、例えば、タイトルと質問文
と回答文の一部が表示される。FIG. 15 shows an example of the search screen. For example, a search sentence is entered in the frame (302) at the top of the display screen (301) of FIG. 15 and the search button (303) is selected. The search sentence is then read and matched against the edited information stored in the data storage unit 10, and the search results are displayed in the lower part of the screen (304) in order of how well they match. For each search result, for example, the title and part of the question text and answer text are displayed.

【０１０５】ユーザは、検索結果から所望の情報を選択
し、閲覧する。The user selects desired information from the search results and browses it.

【０１０６】これによって、管理者以外のユーザには、
例えば他の顧客との応答履歴など、編集前のそのままの
データを閲覧されることなく、管理者以外のユーザに閲
覧されることを想定した編集済み情報のみ公開すること
ができる。As a result, users other than the administrator are not shown the unedited data as it is, such as the history of exchanges with other customers; only the edited information intended to be viewed by users other than the administrator is made available to them.

【０１０７】また、前述のように、「公開」または「非
公開」の情報を付加して登録しておけば、編集済み情報
についても、そのうち「公開」とされた情報のみを提示
するようにすることができる。Further, if the edited information is registered with "public" or "non-public" information added, as described above, then only the edited information marked "public" can be presented.
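The restriction of outside searches to edited, public information can be sketched as follows. The `Record` fields, the function name, and the scoring by number of shared words are all assumptions made for illustration; the publication does not specify the matching method.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stored record in the data storage unit."""
    record_id: str
    text: str
    edited: bool = False  # True for edited information, False for original data
    public: bool = False  # True when marked "public" during editing


def search_for_outsiders(store, query):
    """Return ids of edited, public records matching the query, best match first."""
    terms = set(query.lower().split())
    hits = []
    for record in store:
        if not (record.edited and record.public):
            continue  # original data and non-public edits are never searched
        score = len(terms & set(record.text.lower().split()))
        if score > 0:
            hits.append((score, record))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [record.record_id for _score, record in hits]
```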

【０１０８】以上は、同一グループに属するデータ群か
ら抽出した重複情報をもとに編集済み情報を作成する場
合について説明したが、以下では、データに含まれる重
複情報以外の部分をも利用する場合について説明する。The above has described the case where edited information is created based on the duplicate information extracted from a group of data belonging to the same group. The following describes the case where portions of the data other than the duplicate information are also used.

【０１０９】図１６に、この場合のシステム構成例を示
す。すなわち、図１６は、図１のシステムに情報検索部
６０を付加したものである。FIG. 16 shows an example of the system configuration in this case. That is, FIG. 16 is obtained by adding an information search unit 60 to the system of FIG.

【０１１０】データ蓄積部１０、データ検索部２０、グ
ループ化部３０、情報抽出部４０については基本的には
図１の各構成とそれぞれ同様の機能を持つ。The data storage unit 10, the data search unit 20, the grouping unit 30, and the information extraction unit 40 have basically the same functions as those of the respective components in FIG.

【０１１１】以下では、これまで述べてきたものと相違
する部分を中心に説明する。In the following, description will be made focusing on portions different from those described above.

【０１１２】図１６で付加された補完情報抽出部４１
は、重複情報以外の情報の全て、あるいは重複情報以外
の情報であって且つ所定の基準を満たす情報を、補完情
報として抽出するためのものである。例えば、グループ
化部３０によって作成されたグループに属するデータの
うち、少なくともある二つのデータにおいては重複して
いない情報を、補完情報として抽出する。The complementary information extraction unit 41 added in FIG. 16 extracts, as complementary information, either all information other than the duplicate information, or information that is other than the duplicate information and also satisfies a predetermined criterion. For example, among the data belonging to a group created by the grouping unit 30, information that is not duplicated between at least some two pieces of data is extracted as complementary information.

【０１１３】情報編集部５０は、情報抽出部４０によっ
て抽出された重複情報、および補完情報抽出部４１によ
って抽出された補完情報を編集する。The information editing unit 50 edits the duplicate information extracted by the information extracting unit 40 and the complementary information extracted by the complementary information extracting unit 41.

【０１１４】情報抽出部４０によって抽出された重複情
報、補完情報抽出部４１によって抽出された補完情報、
情報編集部５０で重複情報をもとに編集された編集済み
情報（以下、編集済みの補完情報と区別するために編集
済み重複情報と呼ぶ）、および補完情報をもとに編集さ
れた補完済み情報（以下、編集済み補完情報と呼ぶ）
は、データ蓄積部１０に蓄積される。The duplicate information extracted by the information extraction unit 40, the complementary information extracted by the complementary information extraction unit 41, the edited information produced by the information editing unit 50 from the duplicate information (hereinafter called edited duplicate information, to distinguish it from edited complementary information), and the edited information produced from the complementary information (hereinafter called edited complementary information) are stored in the data storage unit 10.

【０１１５】図１７に、この場合の処理手順の一例を示
す。ステップＳ５１〜Ｓ５３は図２のステップＳ１１〜
Ｓ１３と同様である。FIG. 17 shows an example of the processing procedure in this case. Steps S51 to S53 are the same as steps S11 to S13 in FIG. 2.

【０１１６】ステップＳ５６は図２のステップＳ１５と
同様である。Step S56 is the same as step S15 in FIG. 2.

【０１１７】ステップＳ５４の補完情報の抽出は、例え
ば、以下のようなフォームを用いた方法によって実施で
きる。The extraction of the complementary information in step S54 can be carried out, for example, by a method using the following form.

【０１１８】図３のような問い合わせと回答のデータを
解析し、例えば図４に示すような問い合わせ対応の意味
構造をあらわすフォームに変換する。この変換について
は既に説明したものと同様である。The data of the inquiry and the answer as shown in FIG. 3 are analyzed and converted into a form representing the meaning structure corresponding to the inquiry as shown in FIG. 4, for example. This conversion is the same as that already described.

【０１１９】ここでは、グループ化部３０によって作成
されたグループ内の任意の二つのデータの間の、フォー
ムのある項目に関する重複度を、それぞれのデータにつ
いて該項目に変換された情報に着目して、例えば前述の
場合と同様に以下の式で算出する。重複度＝（二つの情報に共通して含まれる単語数）／
（二つの情報に含まれる単語数）あるデータのある項目の情報について、グループ内の他
の全てのデータの同じ項目との重複度が予め指定された
閾値を越えない場合に、補完情報抽出部４１は、該情報
Ａを、該グループの該項目に関する補完情報として、抽
出する。Here, the degree of duplication of a certain item of the form between any two pieces of data in the group created by the grouping unit 30 is focused on the information converted to the item for each data. For example, it is calculated by the following equation in the same manner as described above. Degree of duplication = (number of words commonly included in two pieces of information) /
(Number of words included in two pieces of information) When the degree of duplication of the information of a certain item of certain data with the same item of all other data in the group does not exceed a predetermined threshold, a complementary information extracting unit 41 extracts the information A as complementary information on the item of the group.
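The extraction of complementary information for one form item can be sketched as below. As before, the word counts are assumed to be over distinct whitespace-separated words with the union as denominator, and the function names and default threshold are hypothetical.

```python
def degree(a, b):
    """Degree of duplication over distinct whitespace-separated words (assumed)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def extract_complementary(items, threshold=0.5):
    """items: the information of one form item, one entry per data in the group.

    Returns the entries whose degree of duplication with the same item of
    every other data in the group does not exceed the threshold."""
    return [
        info
        for i, info in enumerate(items)
        if all(degree(info, other) <= threshold
               for j, other in enumerate(items) if j != i)
    ]
```

Note that with this reading a group containing a single piece of data yields that data's information as complementary, since there is nothing to compare it against.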

【０１２０】次に、ステップＳ５５では、情報抽出部４
０によって抽出された重複情報および補完情報抽出部４
１によって抽出された補完情報を、以下のように編集す
る。Next, in step S55, the duplicate information extracted by the information extraction unit 40 and the complementary information extracted by the complementary information extraction unit 41 are edited as follows.

【０１２１】例えば、図１８のように、情報抽出部４０
によって抽出された重複情報および補完情報抽出部４１
によって抽出された補完情報をフォーム形式で表示す
る。For example, as shown in FIG. 18, the duplicate information extracted by the information extraction unit 40 and the complementary information extracted by the complementary information extraction unit 41 are displayed in form format.

【０１２２】図１８では、表示画面（４０１）の左側
（４０２）に情報抽出部４０によって抽出された重複情
報、右側（４０３）に補完情報抽出部４１によって抽出
された補完情報を表示しており、それぞれの項目の枠内
を編集することができる。それぞれの項目に編集すべき
情報が複数あるときには、項目の枠の横のボタン（４０
４０）を押すことで、編集対象とする情報を入れ替える
ことができる。In FIG. 18, the duplicate information extracted by the information extraction unit 40 is displayed on the left side (402) of the display screen (401) and the complementary information extracted by the complementary information extraction unit 41 on the right side (403), and the content of each item's frame can be edited. When an item has more than one piece of information to be edited, the information being edited can be switched by pressing the button (4040) beside the item's frame.

【０１２３】図１９は、他の表示画面例である。表示画
面（４１１）の左側（４１２）と右側（４１３）とは、
同じフォーム構成になっており、重複情報は左の枠に、
補完情報は右の枠に表示する。これによって、それぞれ
の情報を編集する以外にも、例えば、補完情報から重複
情報への移動、あるいは重複情報から補完情報への移動
などをスムーズに行うことができる。FIG. 19 shows another example of the display screen. The left side (412) and right side (413) of the display screen (411) have the same form layout, with duplicate information displayed in the left frames and complementary information in the right frames. In addition to editing each piece of information, this allows, for example, information to be moved smoothly from complementary information to duplicate information, or from duplicate information to complementary information.

【０１２４】図２０の表示画面例は、図７の表示画面の
上部（４２２）の表示を、図１８の表示に代えたものに
相当する例である。この場合には、例えば、図２０の表
示画面の上部（４２５）の補完情報の「操作」項目の編
集中に、編集中のデータを抽出した元データが画面下の
部分（４２４）に表示される。The display screen example of FIG. 20 corresponds to the display screen of FIG. 7 with the display in its upper part (422) replaced by the display of FIG. 18. In this case, for example, while the "operation" item of the complementary information in the upper part (425) of the display screen of FIG. 20 is being edited, the original data from which the data being edited was extracted is displayed in the lower part (424) of the screen.

【０１２５】図２１の表示画面例は、図７の表示画面の
上部（４２２）の重複情報の表示領域の下に、補完情報
の表示領域を追加したものに相当する例である（なお、
図２１においてボタン等の記述を省略している）。この
場合、例えば、補完情報を選択して、上矢印ボタン（４
３４）を押すと、重複情報へ移動する、あるいは、重複
情報を選択して、下矢印ボタン（４３５）を押すと、補
完情報へ移動する、といった操作が可能である。The display screen example of FIG. 21 corresponds to the display screen of FIG. 7 with a display area for complementary information added below the display area for duplicate information in its upper part (422) (the labels of buttons and the like are omitted in FIG. 21). In this case, for example, selecting complementary information and pressing the up arrow button (434) moves it to the duplicate information, and selecting duplicate information and pressing the down arrow button (435) moves it to the complementary information.

【０１２６】図２２は、図５の形式の画面に、補完情報
表示エリアを追加した例である。図２２の例では、重複
情報は太枠内に表示し、補完情報は細枠内に表示するな
どによって、両者の区別を明確にしている。同一項目に
ついて重複情報と補完情報の両方が存在する場合には、
２つの枠を表示する。FIG. 22 shows an example in which a complementary information display area is added to a screen in the format of FIG. 5. In the example of FIG. 22, the two are clearly distinguished, for instance by displaying duplicate information in thick frames and complementary information in thin frames. When both duplicate information and complementary information exist for the same item, two frames are displayed.

【０１２７】次に、図１６の検索システムに、情報検索
部６０が接続され、編集済み重複情報や編集済み補完情
報に対する検索を行う場合について説明する。Next, a case will be described in which the information search unit 60 is connected to the search system of FIG. 16 and searches for edited duplicate information and edited complementary information.

【０１２８】図２３に、この場合の処理手順の一例を示
す。FIG. 23 shows an example of the processing procedure in this case.

【０１２９】例えば、先に示した図１５と同様な検索イ
ンタフェースの表示画面（３０１）の上部の枠（３０
２）に検索文を入力して、検索ボタン（３０３）を選択
する。すると、検索文が読み込まれ、データ蓄積部１０
に格納された編集済み重複情報とのマッチングを行い、
あてはまる順に、検索結果を画面下部（３０４）に表示
する（ステップＳ６１，Ｓ６２）。それぞれの検索結果
として、例えば、タイトルと質問文と回答文の一部が表
示される。For example, a search sentence is entered in the upper frame (302) of the display screen (301) of a search interface similar to that shown in FIG. 15, and the search button (303) is selected. The search sentence is then read and matched against the edited duplicate information stored in the data storage unit 10, and the search results are displayed at the bottom of the screen (304) in order of how well they match (steps S61 and S62). For each search result, for example, a title, a question sentence, and part of the answer sentence are displayed.
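
Steps S61 and S62 can be sketched as follows. The specification does not state how matching is scored, so this sketch assumes a simple count of shared words; the function names, the dictionary keys, and the sample entries are all hypothetical.

```python
def tokenize(text):
    """Split text into a set of lowercase words (assumed matching unit)."""
    return set(text.lower().split())


def search_edited(query, entries):
    """S61-S62: match the query against edited duplicate information and
    return the matching entries, best match first."""
    query_words = tokenize(query)
    scored = []
    for entry in entries:
        words = tokenize(" ".join([entry["title"], entry["question"], entry["answer"]]))
        score = len(query_words & words)  # number of shared words
        if score > 0:
            scored.append((score, entry))
    # sort() is stable, so entries with equal scores keep their stored order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]


def summarize(entry, answer_chars=20):
    """Build the displayed result: title, question, and part of the answer."""
    return (entry["title"], entry["question"], entry["answer"][:answer_chars])


# Illustrative data only (not taken from the specification).
entries = [
    {"title": "No power", "question": "printer does not turn on",
     "answer": "Check the power cable."},
    {"title": "Printer jam", "question": "paper jam in printer",
     "answer": "Open the rear cover and remove the jammed paper."},
    {"title": "Screen flicker", "question": "display flickers",
     "answer": "Update the driver."},
]
results = search_edited("printer paper jam", entries)
```

Any ranking function could replace the word count; the point is only that the search target is the edited duplicate information, and the results are ordered by match quality.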

【０１３０】ここで、ユーザが表示された検索結果から
１つを選択すると、例えば図２４のように表示される。
画面中段（３０５）には、選択され編集済み重複情報が
表示され、画面下部（３０６）には、同一グループから
抽出された編集済み補完情報が表示される。ここで、画
面上部の枠（３０２）に、編集済み補完情報を検索する
ための検索文を入力し、「しぼりこみ」ボタン（３０
７）を押すと、表示中のグループの中の編集済み補完情
報のみから検索を行い、その結果が表示される（ステッ
プＳ６３，Ｓ６４）別の検索インタフェース例として
は、図２５に示すように、画面上部に重複情報用の検索
文入力部（図中の質問の枠）と補完情報用の検索文入力
部（図中の細目の枠）との両方を設けるようにしてもよ
い。この場合、両方の検索文入力部ともに入力して検索
を実行すれば、図２５の下部に表示するように重複情報
と補完情報との両方の条件がマッチングする情報を検索
することができる。Here, when the user selects one of the displayed search results, a screen such as that shown in FIG. 24 is displayed. The selected edited duplicate information is displayed in the middle part of the screen (305), and the edited complementary information extracted from the same group is displayed in the lower part of the screen (306). When a search sentence for searching the edited complementary information is then entered in the frame (302) at the top of the screen and the “Narrow down” button (307) is pressed, a search is performed only over the edited complementary information in the displayed group, and the result is displayed (steps S63 and S64). As another example of a search interface, as shown in FIG. 25, both a search sentence input field for duplicate information (the “question” frame in the figure) and a search sentence input field for complementary information (the “details” frame in the figure) may be provided at the top of the screen. In this case, if a search is executed with both input fields filled in, information that matches both the duplicate information condition and the complementary information condition can be retrieved, as shown in the lower part of FIG. 25.
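
The narrowing search of steps S63 and S64 and the two-field search of FIG. 25 can be sketched as follows. The matching rule (every query word must appear in the text), the function names, and the group layout are assumptions made for illustration and are not taken from the specification.

```python
def matches(query, text):
    """True if every word of the query appears in the text (case-insensitive)."""
    words = set(text.lower().split())
    return all(word in words for word in query.lower().split())


def narrow_down(query, group):
    """S63-S64: search only the edited complementary information of the
    group that is currently displayed."""
    return [item for item in group["complementary"] if matches(query, item)]


def search_both(duplicate_query, complementary_query, groups):
    """FIG. 25 style search: keep a group only when its duplicate information
    and at least one complementary item both satisfy their queries."""
    hits = []
    for group in groups:
        if matches(duplicate_query, group["duplicate"]):
            items = narrow_down(complementary_query, group)
            if items:
                hits.append((group["duplicate"], items))
    return hits


# Illustrative data only (not taken from the specification).
groups = [
    {"duplicate": "printer paper jam remove paper from rear cover",
     "complementary": ["model A100 uses front tray", "model B200 uses rear tray"]},
    {"duplicate": "printer does not power on",
     "complementary": ["model A100 needs adapter"]},
]
narrowed = narrow_down("rear", groups[0])
both = search_both("jam", "b200", groups)
```

The two-field search reuses the narrowing step, which reflects that both interfaces restrict the complementary search to a group already selected by its duplicate information.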

【０１３１】なお、以上の各機能は、ソフトウェアとし
ても実現可能である。Each of the above functions can also be realized as software.

【０１３２】また、本実施形態は、コンピュータに所定
の手段を実行させるための（あるいはコンピュータを所
定の手段として機能させるための、あるいはコンピュー
タに所定の機能を実現させるための）プログラムを記録
したコンピュータ読取り可能な記録媒体としても実施す
ることもできる。Further, the present embodiment can also be implemented as a computer-readable recording medium on which is recorded a program for causing a computer to execute predetermined means (or for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions).

【０１３３】本発明は、上述した実施の形態に限定され
るものではなく、その技術的範囲において種々変形して
実施することができる。The present invention is not limited to the above-described embodiments, but can be implemented with various modifications within the technical scope thereof.

【０１３４】[0134]

【発明の効果】本発明によれば、データの再利用のため
の情報の選別や編集等を支援することにより、データの
管理や再利用等を容易にすることができる。また、編集
作業を行った情報のみ外部からの検索対象とすることが
できる。According to the present invention, management and reuse of data can be facilitated by supporting selection and editing of information for data reuse. Further, only the information on which the editing work has been performed can be a search target from the outside.

[Brief description of the drawings]

【図１】本発明の一実施形態に係る検索システムの構成
例を示す図FIG. 1 is a diagram showing a configuration example of a search system according to an embodiment of the present invention.

【図２】同実施形態における処理手順の一例を示すフロ
ーチャートFIG. 2 is an exemplary flowchart illustrating an example of a processing procedure in the embodiment.

【図３】同実施形態におけるデータの一例を示す図FIG. 3 is a view showing an example of data in the embodiment.

【図４】同実施形態におけるフォームの一例を示す図FIG. 4 is an exemplary view showing an example of a form in the embodiment.

【図５】同実施形態におけるフォームに沿った重複情報
の表示例を図５FIG. 5 is a diagram showing a display example of duplicate information arranged according to a form in the embodiment.

【図６】同実施形態におけるＧＵＩ画面例を示す図FIG. 6 is an exemplary view showing an example of a GUI screen according to the embodiment.

【図７】同実施形態におけるＧＵＩ画面例を示す図FIG. 7 shows an example of a GUI screen according to the embodiment.

【図８】同実施形態におけるＧＵＩ画面例を示す図FIG. 8 is a view showing an example of a GUI screen according to the embodiment.

【図９】同実施形態における処理手順の他の例を示すフ
ローチャートFIG. 9 is an exemplary flowchart illustrating another example of the processing procedure in the embodiment.

【図１０】同実施形態における処理手順のさらに他の例
を示すフローチャートFIG. 10 is an exemplary flowchart illustrating still another example of the processing procedure in the embodiment.

【図１１】同実施形態におけるＧＵＩ画面例を示す図FIG. 11 is a view showing an example of a GUI screen according to the embodiment.

【図１２】同実施形態におけるＧＵＩ画面例を示す図FIG. 12 is a view showing an example of a GUI screen according to the embodiment.

【図１３】本発明の一実施形態に係る検索システムの他
の構成例を示す図FIG. 13 is a diagram showing another configuration example of the search system according to the embodiment of the present invention.

【図１４】同実施形態における処理手順のさらに他の例
を示すフローチャートFIG. 14 is an exemplary flowchart illustrating still another example of the processing procedure in the embodiment.

【図１５】同実施形態におけるＧＵＩ画面例を示す図FIG. 15 is a view showing an example of a GUI screen according to the embodiment.

【図１６】本発明の一実施形態に係る検索システムのさ
らに他の構成例を示す図FIG. 16 is a diagram showing still another configuration example of the search system according to the embodiment of the present invention.

【図１７】同実施形態における処理手順のさらに他の例
を示すフローチャートFIG. 17 is an exemplary flowchart illustrating still another example of the processing procedure in the embodiment.

【図１８】同実施形態におけるＧＵＩ画面例を示す図FIG. 18 is a view showing an example of a GUI screen according to the embodiment.

【図１９】同実施形態におけるＧＵＩ画面例を示す図FIG. 19 is a view showing an example of a GUI screen according to the embodiment.

【図２０】同実施形態におけるＧＵＩ画面例を示す図FIG. 20 is an exemplary view showing a GUI screen example in the embodiment.

【図２１】同実施形態におけるＧＵＩ画面例を示す図FIG. 21 is an exemplary view showing an example of a GUI screen according to the embodiment.

【図２２】同実施形態におけるＧＵＩ画面例を示す図FIG. 22 is an exemplary view illustrating an example of a GUI screen according to the embodiment.

【図２３】同実施形態における処理手順のさらに他の例
を示すフローチャートFIG. 23 is a flowchart showing still another example of the processing procedure in the embodiment.

【図２４】同実施形態におけるＧＵＩ画面例を示す図FIG. 24 is a view showing an example of a GUI screen according to the embodiment.

【図２５】同実施形態におけるＧＵＩ画面例を示す図FIG. 25 is a view showing an example of a GUI screen according to the embodiment.

[Explanation of symbols]

１０…データ蓄積部２０…データ検索部３０…グループ化部４０…情報抽出部４１…補完情報抽出部５０…情報編集部６０…情報検索部
10 ... Data storage unit
20 ... Data search unit
30 ... Grouping unit
40 ... Information extraction unit
41 ... Complementary information extraction unit
50 ... Information editing unit
60 ... Information search unit

フロントページの続き
(72)発明者 筒井秀樹 神奈川県川崎市幸区小向東芝町１番地 株式会社東芝研究開発センター内
(72)発明者 真鍋俊彦 神奈川県川崎市幸区小向東芝町１番地 株式会社東芝研究開発センター内
Ｆターム(参考） 5B075 ND03 ND06 ND14 ND16 ND20 NR03 NR12 NS10 PP24 PQ02 PQ34 PQ46 PQ74 PR06 QM08
Continued from the front page
(72) Inventor: Hideki Tsutsui, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Toshihiko Manabe, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
F-term (reference): 5B075 ND03 ND06 ND14 ND16 ND20 NR03 NR12 NS10 PP24 PQ02 PQ34 PQ46 PQ74 PR06 QM08

Claims

[Claims]

1. A search system comprising: grouping means for grouping data stored in a predetermined storage device; extraction means for extracting, from the data grouped by the grouping means, duplicate information that is evaluated to be included redundantly in a plurality of data belonging to the same group; and editing means for editing the duplicate information extracted by the extraction means.

2. The search system according to claim 1, wherein the editing means includes means for presenting the original data group from which the duplicate information to be edited was extracted.

3. The search system according to claim 1, wherein the grouping means includes: means for detecting, from data having a structure, core data serving as the nucleus of a group based on the structure; means for extracting a feature from the core data; means for performing grouping by placing data whose core data includes the same feature in the same group; and means for assigning a predetermined range of the core data that includes the feature as a label of the group.

4. The search system according to claim 1, wherein each piece of data stored in the predetermined storage device is given a unique identifier, and the grouping means places certain data and other data in the same group based on the identifier assigned to the other data, or information capable of specifying the other data, that is described as content in the certain data.

5. The search system according to claim 1, wherein the extracting means includes means for converting each piece of data in the grouped data group into a predefined form, and extracts, as the duplicate information, a form item and its contents that overlap in at least two pieces of data.

6. The search system according to claim 1, further comprising search means for searching only the information extracted by the extraction means and edited by the editing means as a search target.

7. The search system according to claim 6, wherein the editing means includes instructing means for instructing whether or not the information extracted by the extracting means may be disclosed, and the search means searches only the information permitted to be disclosed by the instructing means as a search target.

8. The search system according to claim 1, further comprising complementary information extracting means for extracting, as complementary information, information other than the duplicate information that satisfies a predetermined criterion.

9. The search system according to claim 6 or 7, further comprising complementary information extracting means for converting each piece of data in the grouped data group into a predefined form and extracting, as complementary information, at least one form item that does not overlap with other data in the group, together with its contents, wherein the editing means can also edit the complementary information, and the search means can also search only the complementary information.

10. A search method comprising: grouping data stored in a predetermined storage device; extracting, from the grouped data, duplicate information that is evaluated to be included redundantly in a plurality of data belonging to the same group; and editing the extracted duplicate information.
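
The form-based extraction recited in claims 5 and 9 can be illustrated with the following sketch. It assumes that each piece of data in a group has already been converted into a form represented as a dictionary of item names and contents, and it treats an item as overlapping only when both the item name and its contents are identical; the `extract` function and the sample forms are hypothetical.

```python
from collections import Counter


def extract(forms):
    """Split the (item, contents) pairs of one group into duplicate and
    complementary information.

    forms: one dict per piece of data, mapping form item -> contents.
    A pair found in at least two pieces of data is duplicate information;
    a pair found in only one is a candidate for complementary information.
    """
    # Each dict holds an item at most once, so the count of a pair equals
    # the number of pieces of data that contain it.
    counts = Counter(
        (item, contents) for form in forms for item, contents in form.items()
    )
    duplicate = sorted(pair for pair, n in counts.items() if n >= 2)
    complementary = sorted(pair for pair, n in counts.items() if n == 1)
    return duplicate, complementary


# Illustrative data only (not taken from the specification).
forms = [
    {"symptom": "paper jam", "operation": "open rear cover"},
    {"symptom": "paper jam", "operation": "pull tray"},
    {"symptom": "paper jam", "model": "A100"},
]
duplicate, complementary = extract(forms)
```

A practical system would likely need a looser test than exact equality of contents, and claim 8 additionally requires complementary information to satisfy a predetermined criterion, which this sketch does not model.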