JPH10301942A

JPH10301942A - Data mining device

Info

Publication number: JPH10301942A
Application number: JP9106234A
Authority: JP
Inventors: Nobuyoshi Wada; 信義和田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1997-04-23
Filing date: 1997-04-23
Publication date: 1998-11-13

Abstract

PROBLEM TO BE SOLVED: To obtain a data mining device that, in order to discover regularities through the user's interactive use, simultaneously satisfies two goals that tend to conflict: preventing the pruning of necessary candidates and suppressing the generation of unnecessary candidates. SOLUTION: A preprocessing part 102 is placed before the data input of a data mining engine 101 that generates association rules. It lets the user refer to the attributes of the data and interactively select the level or attributes of the data to be processed, so that the engine 101 generates regularities of the required accuracy as the mining result. A post-processing part 103 is placed before the output of the engine 101 is displayed. It lets the user interactively select among the obtained regularities, stores the history of those selections, and reuses the history in subsequent processing, so that the selection gradually becomes automatic and the device can cope even when many unnecessary regularities are generated.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a data mining apparatus that generates more useful association rules from a large volume of data in a database by using information related to that data.

[0002]

2. Description of the Related Art

As one data mining method for discovering regularities in large volumes of data, the association rule method, which finds groups of items that frequently co-occur within the same record, has been proposed, as described in Reference 1 (Agrawal, R., "Fast Algorithm for Mining Association Rules") and the corresponding JP-A-8-263346, "System and method for mining sequential patterns in a large-scale database".

[0003] In this association rule method, the item names appearing in the data have basically been mined as they are, but the resulting rules can be too detailed to be useful to the user. Several solutions to this problem are conceivable. Reference 2 (Srikant, R., Agrawal, R., "Mining Generalized Association Rules", pp. 407-419, VLDB '95) and the corresponding JP-A-8-287106, "Database mining system", propose a method that automatically generalizes rules up to a level judged to be effective. In this method, the system automatically deletes rules at levels judged to be unnecessary.

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION

In contrast, no method has been proposed that takes the position that only the user, at the time of use, can decide which level is effective for that user, and that therefore determines the level interactively and selects rules interactively. The aim is to discover rules expressed in data items that are useful to the data mining user and to delete unnecessary rules before they are shown to the user. Rarely occurring events can be highly significant, yet they tend to be buried in large volumes of data. The conventional association rule method provides a frequency criterion and a confidence criterion for accepting or rejecting candidates. For a rare event, a large frequency threshold rejects the candidate early, whereas a small threshold leads to a large number of mining results, and valuable results are then likely to be buried while the user sifts through them.
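
The threshold trade-off described above can be illustrated with a minimal sketch. It assumes the usual definitions of frequency (support) and confidence for a rule; the toy records, function names and threshold values are hypothetical and are not taken from the publication.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(r) for r in records) / len(records)


def confidence(records, condition, conclusion):
    """Support of condition plus conclusion, divided by support of condition."""
    base = support(records, condition)
    if base == 0:
        return 0.0
    return support(records, set(condition) | set(conclusion)) / base


# Hypothetical toy data: 100 records, one rare but perfectly reliable pattern.
records = [["hepatitis", "heavy drinker"]] * 60 + [["gastroenteritis"]] * 38
records += [["metal allergy", "drug A effective"]] * 2

rare_support = support(records, ["metal allergy", "drug A effective"])
rare_confidence = confidence(records, ["metal allergy"], ["drug A effective"])

# A large frequency threshold rejects the rare rule early,
# while a small one keeps it (along with many other candidates).
kept_with_high_threshold = rare_support >= 0.10
kept_with_low_threshold = rare_support >= 0.01
```

In this toy data the rare rule holds in every record where its condition appears, yet it survives only the lower frequency threshold, which is the situation the paragraph describes.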

[0005] The present invention has been made to solve the above problems of the prior art. Its object is to obtain a data mining apparatus that, in order to discover valuable regularities through the user's interactive use, can simultaneously satisfy two goals that tend to conflict: preventing the pruning of necessary candidates and suppressing the generation of unnecessary candidates.

[0006]

MEANS FOR SOLVING THE PROBLEMS

A data mining apparatus according to the present invention comprises: a first database that stores a plurality of related data items for each record; a second database that stores, for each data item, hierarchically organized data including attribute names and attribute values; a preprocessing unit that reads the data items of each record stored in the first database and the hierarchical data structure of each data item stored in the second database and, in response to a data item selection operation input, outputs records in which the data items of each record stored in the first database have been converted with reference to that hierarchical data structure; a data mining engine that, over all records input via the preprocessing unit, picks up data items that frequently appear within the same record, generates association rules among those data items that satisfy thresholds on frequency of appearance and confidence, and outputs them as a mining result; a post-processing unit that, based on a selection operation input for accepting or rejecting the mining result output from the data mining engine, refers to the hierarchical data structure of each data item stored in the second database and outputs a mining result from which rules have been deleted or in which rules have been reordered according to a priority designation; and a third database that stores the mining result output from the post-processing unit.

[0007] The preprocessing unit comprises: a read-data interpretation unit that reads the data items of each record stored in the first database and, each time a data item is read, outputs frequency increment information for the corresponding data item; a data item processing unit that, each time the frequency increment information is input, calculates and stores the frequency of appearance of the corresponding data item, compares that frequency with a set reference value, and outputs a highlight display instruction when the frequency is higher than the reference value; and a display unit that displays the data items of the records read via the read-data interpretation unit and highlights the corresponding data items based on the highlight display instruction.

[0008] The preprocessing unit further comprises an input unit that receives selection operation inputs used for mining, and a data conversion unit that converts the data items of the records input via the read-data interpretation unit based on input item conversion information and outputs the converted records to the data mining engine. For the data items of the records input via the read-data interpretation unit, the data item processing unit obtains the hierarchical data structure of the corresponding data items from the second database and displays it on the display unit; selects, based on the selection operation input received via the input unit, the data items to be used for mining from the hierarchical data structure; outputs a highlight display instruction for the selected data items to the display unit; and outputs item conversion information for the selected data items to the data conversion unit.

[0009] The data item processing unit also causes the calculated and stored frequency information to be displayed together with the data items displayed on the display unit.

[0010] The data item processing unit also selects, based on a selection operation input for a plurality of data items received via the input unit with respect to the hierarchical data structure displayed on the display unit, a plurality of data items to be used for mining from that hierarchical data structure, and outputs item conversion information for the selected data items to the data conversion unit so that records for the plurality of hierarchically related data items are output to the data mining engine.

[0011] The data item processing unit also reads the attribute values of each data item stored in the second database, displays them alongside the corresponding data items in the hierarchical data structure shown on the display unit, and, based on an attribute value selection operation input received via the input unit, outputs item conversion information to the data conversion unit so that records are output to which a data item with the attribute value appended, or a data item with a changed attribute value, has been added.

[0012] The data item processing unit also reads, for the hierarchical data structure shown on the display unit, the hierarchical relationship among attribute values for each attribute name corresponding to each data item stored in the second database, displays the hierarchically organized attribute values on the display unit as well, and, based on an attribute value selection operation input received via the input unit, outputs item conversion information to the data conversion unit so that records are output to which a data item with the attribute value appended, or a data item with a changed attribute value, has been added.

[0013] The data item processing unit also saves all operation inputs received via the input unit in the second database as an operation history and, based on an operation-history display operation input received via the input unit, reads the operation history from the second database and displays on the display unit the result of applying the operation history in place of user operations.

[0014] The data item processing unit also saves in the second database an operation history tagged with an identifier input via the input unit, reads from the second database the operation history selected by an identifier input via the input unit, and displays on the display unit the result of applying that operation history in place of user operations.

[0015] The post-processing unit comprises: a mining result holding unit that temporarily holds the mining result output from the data mining engine; a display unit that displays the mining result; an input unit that receives operation inputs for post-processing the displayed mining result; a mining result processing unit that processes the mining result held in the mining result holding unit based on the operation inputs received via the input unit and outputs the processed result; and an operation history holding unit that temporarily holds all operation inputs received via the input unit as an operation history. Outside the post-processing unit, a fourth database is provided that stores all of the operation history temporarily held by the operation history holding unit.

[0016] The mining result processing unit also deletes, based on a rule deletion operation input received via the input unit, the rules designated for deletion from the mining result displayed on the display unit.

[0017] The mining result processing unit also, based on a hierarchical distance designation operation input received via the input unit, refers to the hierarchical data structure of the data items stored in the second database and deletes in a single batch the similar rules formed from synonyms corresponding to that hierarchical distance.
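
One possible reading of deletion by hierarchical distance is sketched below. The publication does not define the distance measure in this passage, so the sketch assumes it is the number of edges between two items in the item tree, and that two rules are similar when their conclusions and all of their condition items lie within the designated distance of each other. The tree, the rule representation and all function names are hypothetical.

```python
# Hypothetical excerpt of the item hierarchy: child -> parent.
PARENT = {
    "metal allergy": "allergy",
    "pollen allergy": "allergy",
    "allergy": "disease",
    "hepatitis": "disease",
}


def path_to_root(item, parent):
    """Return the item followed by its higher-level items, nearest first."""
    path = [item]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def hierarchy_distance(a, b, parent=PARENT):
    """Number of tree edges between a and b, or None if they are unrelated."""
    up_a = path_to_root(a, parent)
    up_b = path_to_root(b, parent)
    for steps_a, node in enumerate(up_a):
        if node in up_b:  # first hit is the lowest common ancestor
            return steps_a + up_b.index(node)
    return None


def similar(rule_a, rule_b, max_distance, parent=PARENT):
    """Rules are (condition_items, conclusion) pairs."""
    def close(x, y):
        d = hierarchy_distance(x, y, parent)
        return d is not None and d <= max_distance

    (cond_a, concl_a), (cond_b, concl_b) = rule_a, rule_b
    return (
        close(concl_a, concl_b)
        and all(any(close(x, y) for y in cond_b) for x in cond_a)
        and all(any(close(x, y) for x in cond_a) for y in cond_b)
    )


def delete_with_similar(rules, target, max_distance, parent=PARENT):
    """Delete the target rule and every rule similar to it."""
    return [r for r in rules if not similar(target, r, max_distance, parent)]


r1 = (("metal allergy",), "drug A effective")
r2 = (("pollen allergy",), "drug A effective")
r3 = (("hepatitis",), "drug A effective")
remaining = delete_with_similar([r1, r2, r3], r1, max_distance=2)
```

With a distance of 2, deleting the "metal allergy" rule also removes the sibling "pollen allergy" rule, while the more distant "hepatitis" rule is kept.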

[0018] The mining result processing unit also, based on a priority designation operation input received via the input unit, moves the rules that match the priority designation among the rules displayed on the display unit to the top position and highlights them.

[0019] The mining result processing unit also, based on a hierarchical distance designation operation input received via the input unit, refers to the hierarchical data structure of the data items stored in the second database and causes the display unit to display, in accordance with the priority designation, the similar rules formed from synonyms corresponding to that hierarchical distance as well.

[0020] The mining result processing unit also refers to the hierarchical data structure of the data items stored in the second database and, when a rule with many items in its condition part is deleted or given priority, performs the same processing on any rule with fewer condition items whose conclusion part matches or is similar to that of the rule and all of whose condition items match or are similar to items in the condition part of the rule.
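
The propagation from a longer rule to shorter ones can be sketched as follows. This is a simplified illustration that handles only exact matches; the text above also allows similar items, which would require the item hierarchy. The rule representation (a pair of condition items and a conclusion) and the function names are hypothetical.

```python
def covered_by(small, large):
    """True if `small` has fewer condition items than `large`, the same
    conclusion, and all of its condition items also appear in `large`."""
    (cond_s, concl_s), (cond_l, concl_l) = small, large
    # A proper subset guarantees both "fewer items" and "all items match".
    return concl_s == concl_l and set(cond_s) < set(cond_l)


def delete_with_shorter(rules, target):
    """Delete the target rule together with every shorter rule it covers."""
    return [r for r in rules if r != target and not covered_by(r, target)]


target = (("male", "heavy drinker"), "hepatitis")
rules = [
    target,
    (("male",), "hepatitis"),                 # covered: same conclusion, subset
    (("heavy drinker",), "gastroenteritis"),  # kept: different conclusion
    (("female",), "hepatitis"),               # kept: condition not contained
]
remaining = delete_with_shorter(rules, target)
```

The same `covered_by` test could drive priority processing instead of deletion by moving the covered rules to the top rather than filtering them out.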

[0021] The mining result processing unit also saves an operation history tagged with an identifier input via the input unit in the fourth database via the operation history holding unit, reads from the fourth database the operation history selected by an identifier input via the input unit, and displays on the display unit the mining result reflecting that operation history.

[0022] Furthermore, a fifth database is provided that stores externally created selection rules tagged with identifiers. The post-processing unit further comprises an external selection rule holding unit that reads and temporarily holds the identifier-tagged selection rules stored in the fifth database, and the mining result processing unit displays on the display unit the mining result reflecting the selection rules held in the external selection rule holding unit.

[0023]

BEST MODE FOR CARRYING OUT THE INVENTION

Embodiment 1. The data mining apparatus according to Embodiment 1 of the present invention has the configuration shown in FIG. 1. In general, data mining can be described as generating association rules, as the mining result to be stored in a database 105, from a database 104 that stores a plurality of related data items for each record.

[0024] Basically, the data mining engine 101 picks up, over all records stored in the database 104, data items that frequently appear within the same record and generates association rules among those data items that satisfy thresholds on frequency of appearance and confidence. As a way of improving the quality of the generated rules, Embodiment 1 makes appropriate use of related information. Specifically, it uses a data item dictionary / attribute dictionary database 106 that stores a hierarchical data structure for each data item together with data including attribute names and attribute values, a database 107 that stores the history of user operations for rule selection, and a database 108 that stores externally created selection rules.

[0025] To make effective use of these databases, a preprocessing unit 102 and a post-processing unit 103 are added to the data mining engine 101. The preprocessing unit 102 uses the dictionaries in the database 106, and the post-processing unit 103 uses the rule selection rules in the databases 107 and 108. In the processing stages of the preprocessing unit 102 and the post-processing unit 103, the user interactively provides operation inputs in order to obtain the results most convenient for that user. Ultimately, a concise and highly convenient mining result is obtained for storage in the database 105.

[0026] That is, the preprocessing unit 102 reads the data items of each record stored in the database 104 and the hierarchical data structure of each data item stored in the database 106 and, in response to a data item selection operation input, outputs records in which the data items of each record stored in the database 104 have been converted with reference to that hierarchical data structure. The post-processing unit 103, based on a selection operation input for accepting or rejecting the mining result output from the data mining engine 101, refers to the hierarchical data structure of each data item stored in the database 106 and outputs a mining result from which rules have been deleted or in which rules have been reordered according to a priority designation. In FIG. 1, reference numeral 109 denotes the user who provides operation inputs interactively.

[0027] The preprocessing unit 102 has the configuration shown in FIG. 2. As illustrated, the preprocessing unit 102 includes: a read-data interpretation unit 102a that reads the data items of each record stored in the database 104 and, each time a data item is read, outputs frequency increment information for the corresponding data item; a data item processing unit 102b that, each time the frequency increment information is input, calculates and stores the frequency of appearance of the corresponding data item, compares it with a set reference value, and, when the frequency is higher than the reference value, causes the corresponding data item to be highlighted with a color or symbol; and a display unit 102c that displays the data items of the records read via the read-data interpretation unit 102a and highlights the corresponding data items based on the highlight display instruction. The information displayed on the display unit 102c is presented to the user as information on the items currently used for mining, so that the user can easily recognize that the highlighted data items are the ones subject to mining.
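
The cooperation between the read-data interpretation unit 102a and the data item processing unit 102b can be sketched as a running counter with a reference value. This is a minimal illustration; the class and function names, the toy records and the reference value are hypothetical, and the display unit 102c is reduced to a set of items to be highlighted.

```python
from collections import Counter


class DataItemCounter:
    """Keeps a running appearance count per data item (role of unit 102b)."""

    def __init__(self, reference_value):
        self.reference_value = reference_value
        self.counts = Counter()

    def add(self, item):
        """Handle one piece of frequency increment information."""
        self.counts[item] += 1

    def highlighted(self):
        """Items whose count is higher than the reference value."""
        return {i for i, n in self.counts.items() if n > self.reference_value}


def read_records(records, counter):
    """Role of unit 102a: report every data item as it is read."""
    for record in records:
        for item in record:
            counter.add(item)


records = [
    ["male", "hepatitis", "heavy drinker"],
    ["male", "gastroenteritis"],
    ["female", "hepatitis", "heavy drinker"],
]
counter = DataItemCounter(reference_value=1)
read_records(records, counter)
marked = counter.highlighted()
```

Here the items that appear twice exceed the reference value of 1 and would be shown highlighted, while "gastroenteritis" and "female" would be shown plainly.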

[0028] The preprocessing unit 102 further includes an input unit 102d that receives from the user selection operation inputs used for mining, and a data conversion unit 102e that converts the data items of the records input via the read-data interpretation unit 102a based on input item conversion information and outputs the converted records as the data to be sent to the data mining engine 101. For the data items of the records input via the read-data interpretation unit 102a, the data item processing unit 102b obtains the hierarchical data structure of the corresponding data items from the database 106 and displays it on the display unit 102c; selects, based on the selection operation input received via the input unit 102d, the data items to be used for mining from the hierarchical data structure; outputs a highlight display instruction for the selected data items to the display unit 102c; and outputs item conversion information for the selected data items to the data conversion unit 102e.

[0029] For example, suppose the record "male, 38 years old, hepatitis, gastroenteritis, metal allergy, lives along a national highway, slightly overweight, heavy drinker, drug A effective" is read from the database 104 and displayed on the display unit 102c, and the hierarchical structure (tree structure) of one of its data items, for example "metal allergy", is read from the database 106 and displayed on the display unit 102c. If the user, as a selection operation input, selects from the hierarchy of "metal allergy" the higher-level item "allergy" as the data item to be mined, the selected data item is highlighted on the display unit 102c, and the data conversion unit 102e converts the data and outputs the record "male, 38 years old, hepatitis, gastroenteritis, allergy, lives along a national highway, slightly overweight, heavy drinker, drug A effective". The user can thus interactively specify the data items to be used for mining, and have the data converted, in an easily understood form.
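
The conversion in this example can be sketched as a lookup along the item hierarchy. This is a minimal sketch assuming the hierarchy is held as a child-to-parent mapping; the mapping, the function names and the representation of the user's selection as a set are hypothetical.

```python
# Hypothetical excerpt of the item hierarchy: child -> parent.
PARENT = {"metal allergy": "allergy"}


def ancestors(item, parent=PARENT):
    """Return the higher-level items above `item`, nearest first."""
    chain = []
    while item in parent:
        item = parent[item]
        chain.append(item)
    return chain


def convert_record(record, selected, parent=PARENT):
    """Replace each item by a selected higher-level item, if it has one."""
    converted = []
    for item in record:
        target = next((a for a in ancestors(item, parent) if a in selected), item)
        converted.append(target)
    return converted


record = [
    "male", "38 years old", "hepatitis", "gastroenteritis", "metal allergy",
    "lives along a national highway", "slightly overweight",
    "heavy drinker", "drug A effective",
]
converted = convert_record(record, selected={"allergy"})
```

Only "metal allergy" has the selected item "allergy" above it, so it is the only item replaced; every other item passes through unchanged.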

[0030] The database 106 stores a dictionary from which hierarchical relationships are generated internally from binary or ternary relations such as "metal allergy, isa, allergy", "drug A effective, isa, antibiotic oral medication effective", and "hepatitis, incubation period, 3 years". In a ternary relation, the first element is the data item, the second is the attribute name, and the third is the attribute value. In a binary relation, isa indicates that the item before it is subordinate to the item after it.
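
A minimal sketch of how such dictionary entries might be turned into an internal hierarchy and an attribute table follows. It assumes every entry is written as three comma-separated fields, with the middle field `isa` marking a hierarchy relation; the function name and the two resulting mappings are hypothetical.

```python
def build_dictionary(relations):
    """Split dictionary entries into an item hierarchy and item attributes.

    Returns (parent, attributes), where parent maps a lower-level item to
    the item directly above it, and attributes maps a data item to a
    {attribute name: attribute value} table.
    """
    parent = {}
    attributes = {}
    for first, second, third in relations:
        if second == "isa":
            # Binary relation: `first` is subordinate to `third`.
            parent[first] = third
        else:
            # Ternary relation: data item, attribute name, attribute value.
            attributes.setdefault(first, {})[second] = third
    return parent, attributes


relations = [
    ("metal allergy", "isa", "allergy"),
    ("drug A effective", "isa", "antibiotic oral medication effective"),
    ("hepatitis", "incubation period", "3 years"),
]
parent, attributes = build_dictionary(relations)
```

Keeping the hierarchy as a child-to-parent mapping makes it cheap to walk upward from an item found in a record to the higher-level items the user can choose from.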

[0031] At that time, the data item processing unit 102b also causes the calculated and stored frequency information to be displayed together with the data items displayed on the display unit 102c, so that the user can judge from the frequency information which data items to use for mining.

[0032] The data item processing unit 102b can also select, based on a selection operation input for a plurality of data items received via the input unit 102d with respect to the hierarchical data structure displayed on the display unit 102c, a plurality of data items to be used for mining from that hierarchical data structure, and output item conversion information for the selected data items to the data conversion unit 102e so that records for the plurality of hierarchically related data items are output to the data mining engine 101.

[0033] For example, if "metal allergy" is selected as a data item to be mined together with "allergy", then in addition to the record "male, 38 years old, hepatitis, gastroenteritis, allergy, lives along a national highway, slightly overweight, heavy drinker, drug A effective", the record "male, 38 years old, hepatitis, gastroenteritis, metal allergy, lives along a national highway, slightly overweight, heavy drinker, drug A effective" can also be output to the data mining engine 101, so that a plurality of data items to be used for mining can be selected interactively. That is, when item B is subordinate to item A and item C is subordinate to item B, both the higher-level item A and the lower-level item C can be selected, for example within a partial hierarchy of the overall hierarchy.
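
Selecting several levels at once can be sketched as emitting one record per selected level. This is a minimal illustration under the assumption that the hierarchy is a child-to-parent mapping; the mapping and function names are hypothetical, and duplicate output records are dropped as a design choice of this sketch rather than something the text specifies.

```python
# Hypothetical excerpt of the item hierarchy: child -> parent.
PARENT = {"metal allergy": "allergy"}


def levels_of(item, parent=PARENT):
    """Return the item itself followed by its higher-level items."""
    chain = [item]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def expand_record(record, selected, parent=PARENT):
    """Return one converted record per selected level, without duplicates."""
    outputs = []
    for level in selected:
        converted = [level if level in levels_of(x, parent) else x for x in record]
        if converted not in outputs:
            outputs.append(converted)
    return outputs


record = ["male", "38 years old", "hepatitis", "metal allergy", "drug A effective"]
outputs = expand_record(record, selected=["allergy", "metal allergy"])
```

For this record the sketch yields two records, one carrying "allergy" and one carrying "metal allergy", so that rules at both levels can be mined.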

[0034] The data item processing unit 102b also reads the attribute values of each data item stored in the database 106, displays them alongside the corresponding data items in the hierarchical data structure shown on the display unit 102c, and, based on an attribute value selection operation input received via the input unit 102d, can output item conversion information to the data conversion unit 102e so that records are output to which a data item with the attribute value appended, or a data item with a changed attribute value, has been added. This makes it possible to specify which attribute values are to be mined.

[0035] The data item processing unit 102b also reads, for the hierarchical data structure shown on the display unit 102c, the hierarchical relationship among attribute values for each attribute name corresponding to each data item stored in the database 106, displays the hierarchically organized attribute values on the display unit 102c as well, and, based on an attribute value selection operation input received via the input unit 102d, can output item conversion information to the data conversion unit 102e so that records are output to which a data item with the attribute value appended, or a data item with a changed attribute value, has been added. This makes it possible to specify, from the hierarchical relationship among attribute values for each attribute name, which level of attribute value is to be mined.

[0036] The data item processing unit 102b can also temporarily hold all operation inputs received via the input unit 102d and save them in the database 106 as an operation history. In response to an operation-history display input received via the input unit 102d, it reads the operation history from the database 106 and displays on the display unit 102c the result of applying that history in place of user operations. The saved history can thus be reused at the next execution.

[0037] The data item processing unit 102b can also save in the database 106 an operation history tagged with an identifier entered via the input unit 102d, read from the database 106 the operation history selected by an identifier entered via the input unit 102d, and display on the display unit 102c the result of applying that history in place of user operations. At the next execution, the user can therefore pick the appropriate history from among several by its identifier and reuse it.
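As an illustration only, saving an operation history under an identifier and replaying it in place of user input might be sketched as follows. The names `HistoryStore` and `replay`, and the encoding of an operation as an `(action, item)` pair, are assumptions made for this sketch; the specification does not define a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryStore:
    """Hypothetical stand-in for the operation histories kept in a database."""
    saved: dict = field(default_factory=dict)

    def save(self, identifier, operations):
        # Store a copy so later edits to the live list do not alter the saved history.
        self.saved[identifier] = list(operations)

    def load(self, identifier):
        # Return a copy of the history selected by its identifier.
        return list(self.saved[identifier])


def replay(operations, selected=None):
    """Apply recorded ("select" | "deselect", item) operations in place of user input."""
    selected = set() if selected is None else set(selected)
    for action, item in operations:
        if action == "select":
            selected.add(item)
        elif action == "deselect":
            selected.discard(item)
        else:
            raise ValueError(f"unknown action: {action}")
    return selected
```

Replaying a loaded history yields the same set of selected data items that the original session ended with, which is the sense in which the history stands in for the user's operations.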

[0038] The post-processing unit 103 has the configuration shown in FIG. 3. As illustrated, the post-processing unit 103 comprises: a mining result holding unit 103a that temporarily holds the mining results output from the data mining engine 101; a mining result display unit 103b that displays those mining results; an input unit 103c that receives operation inputs for post-processing the displayed mining results; a mining result processing unit 103d that processes the mining results held in the mining result holding unit 103a based on operation inputs received via the input unit 103c and outputs them; a user operation history holding unit 103e that temporarily holds all user operation inputs received via the input unit 103c as an operation history and stores that history in the database 107; and an externally created selection rule holding unit 103f that reads and temporarily holds the selection rules stored in the database 108. With this configuration, the more useful rules are picked out from the association rules, which are inevitably generated in large numbers, so that high-value rules are handled with higher priority. Furthermore, because the operation history is stored in the database 107 and the selection rules stored in the database 108 can be read out, the saved history or selection rules can be reused for post-processing at the next execution without repeating tedious operations.

[0039] The post-processing unit 103 receives, as a mining result from the data mining engine 101, a rule such as "lives along a national highway, allergy, →, drug A effective, 0.024, 0.63". The left side of the arrow is the condition part of the rule and the right side is the conclusion part; the number of data items in the condition part is not fixed. In the conclusion part, the data item "drug A effective" is followed by two values. The first is the relative frequency, among all records, of records containing all three data items "lives along a national highway", "allergy", and "drug A effective" (the appearance frequency). The second is the relative frequency with which "drug A effective" appears given that all data items of the condition part (the two items "lives along a national highway" and "allergy") appear together (the confidence).
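The two figures attached to a rule can be computed directly from the records. The following is a minimal sketch, assuming each record is a collection of data item names; the function name `support_and_confidence` is hypothetical, and "support" is used here as the common name for what the text calls the appearance frequency.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (appearance frequency, confidence) for the rule antecedent -> consequent."""
    antecedent = set(antecedent)
    full = antecedent | {consequent}
    # Records in which every condition-part item appears.
    n_antecedent = sum(1 for r in records if antecedent <= set(r))
    # Records in which the condition-part items and the conclusion item all appear.
    n_full = sum(1 for r in records if full <= set(r))
    support = n_full / len(records)
    confidence = n_full / n_antecedent if n_antecedent else 0.0
    return support, confidence
```

For four records of which two contain both condition items and one of those two also contains the conclusion item, the function returns an appearance frequency of 0.25 and a confidence of 0.5.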

[0040] In the post-processing unit 103, the mining result processing unit 103d deletes, based on a rule-deletion input received via the input unit 103c, the rules designated for deletion from the mining results displayed on the display unit 103b, that is, from the group of rules described above. Unnecessary rules are thereby removed and a simplified set of association rules is obtained.

[0041] Based on a hierarchy-distance input received via the input unit 103c, the mining result processing unit 103d also refers to the hierarchical data structure of data items stored in the database 106 and deletes in one operation the similar rules made up of synonyms corresponding to that hierarchy distance. Because every rule containing a synonymous data item corresponding to the specified hierarchy distance is deleted together, the user is spared the trouble of designating data items one by one.
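One possible reading of bulk deletion by hierarchy distance is sketched below. It assumes that the hierarchy is a tree given as a child-to-parent map, that the hierarchy distance between two items is the number of edges on the path joining them, and that items within the specified distance of a target item count as its synonyms. None of these definitions is fixed by the text, and all names are hypothetical.

```python
def ancestors(item, parent):
    """Return [item, parent, grandparent, ...]; assumes the parent map has no cycles."""
    chain = [item]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def hierarchy_distance(a, b, parent):
    """Number of edges between a and b in the tree, or None if they share no ancestor."""
    up_a = ancestors(a, parent)
    depth_b = {node: i for i, node in enumerate(ancestors(b, parent))}
    for i, node in enumerate(up_a):
        if node in depth_b:
            # First shared node is the lowest common ancestor.
            return i + depth_b[node]
    return None


def delete_similar(rules, target, max_distance, parent):
    """Drop every rule containing an item within max_distance of target."""
    def near(item):
        d = hierarchy_distance(item, target, parent)
        return d is not None and d <= max_distance
    return [r for r in rules if not any(near(i) for i in r["items"])]
```

With a distance of 0 only rules containing the target item itself are removed; raising the distance widens the deletion to sibling and cousin items without the user naming each one.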

[0042] Based on a priority-designation input received via the input unit 103c, the mining result processing unit 103d also moves the rules that match the priority designation to the head of the group of rules shown on the display unit 103b and marks them with a color or symbol, so that prioritized rules can be identified.
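Moving prioritized rules to the head while marking them can be sketched as a stable partition. The function name `prioritize` and the use of a boolean flag in place of a color or symbol are assumptions for illustration.

```python
def prioritize(rules, is_priority):
    """Return (rule, flagged) pairs with flagged rules first, order otherwise unchanged."""
    tagged = [(rule, bool(is_priority(rule))) for rule in rules]
    # sorted() is stable, so the relative order inside each group is preserved.
    return sorted(tagged, key=lambda pair: not pair[1])
```

A display layer could then render the flagged pairs in a distinct color, which is left out of this sketch.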

[0043] Based on a hierarchy-distance input received via the input unit 103c, the mining result processing unit 103d also refers to the hierarchical data structure of data items stored in the database 106 and causes the display unit 103b to apply the priority-based display to similar rules made up of synonyms corresponding to that hierarchy distance. Priority can thus be designated for similar rules made up of synonyms as well, by specifying the range through a menu selection.

[0044] The mining result processing unit 103d also refers to the hierarchical data structure of data items stored in the database 106 and handles together rules that differ in the number of condition-part items. Given a rule with many condition-part items, for example "allergy, heavy drinker, lives along a national highway, →, drug A effective", consider a rule with fewer condition-part items whose conclusion part matches or is similar to that of the first rule and whose condition-part items all match or are similar to items of the first rule, for example "lives along a national highway, allergy, →, drug A effective". When the rule with more condition-part items is deleted or prioritized, the same processing is applied to the rule with fewer items. Deletion or prioritization of rules whose data items match or are similar is thereby made easy.
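The propagation from a longer rule to the shorter rules it covers might look like the sketch below. It handles exact matches only; the "similar" case would additionally need a similarity test against the item hierarchy, which is omitted here. Representing a rule as a `(condition_items, conclusion)` pair and the name `propagate_delete` are assumptions.

```python
def propagate_delete(rules, deleted):
    """Remove `deleted` and every rule with the same conclusion whose condition
    items are all contained in the condition part of `deleted`."""
    d_cond, d_concl = deleted
    d_cond = set(d_cond)
    kept = []
    for cond, concl in rules:
        covered = concl == d_concl and set(cond) <= d_cond
        if not covered:
            kept.append((cond, concl))
    return kept
```

The same containment test could drive prioritization instead of deletion by flagging the covered rules rather than dropping them.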

[0045] The mining result processing unit 103d also saves in the database 107, via the user operation history holding unit 103e, an operation history tagged with an identifier entered via the input unit 103c, reads from the database 107 the operation history selected by an identifier entered via the input unit 103c, and displays on the display unit 103b the mining results with that history applied. In subsequent sessions, the operation history held for each user, or an operation history tagged with an identifier, is thus available to the user as needed.

[0046] Furthermore, the mining result processing unit 103d displays on the display unit 103b the mining results with the selection rules held in the externally created selection rule holding unit 103f applied. That holding unit reads and temporarily holds the identifier-tagged selection rules stored in the database 108, which stores externally created selection rules tagged with identifiers. Externally created selection rules are thus kept under identifiers and made available to the user as needed, in the same way as operation histories.

[0047] By providing both the pre-processing and post-processing functions, it becomes possible to discover regularities that were difficult to find by simply applying the association rule method. Specifically, even rarely occurring events are kept from being pruned by lowering the frequency-based selection threshold, and the more useful rules are then picked out in post-processing from the association rules that are inevitably generated in large numbers. In addition, grouping overly detailed data items on the basis of similarity yields simpler association rules, and at the same time such items escape pruning by the frequency-based threshold, so association rules of greater interest can be expected to be less likely to be discarded partway through processing.

【００４８】[0048]

[Effects of the Invention] As described above, the data mining device according to the present invention comprises: a first database that stores a plurality of related data items for each record; a second database that stores data hierarchized for each data item and including attribute names and attribute values; a pre-processing unit that reads the data items of each record stored in the first database and the hierarchical data structure of each data item stored in the second database and, in response to a data item selection input, outputs records in which the data items of each record stored in the first database have been converted with reference to that hierarchical data structure; a data mining engine that, over all records received via the pre-processing unit, picks up data items that frequently appear in the same record, generates association rules among those data items that satisfy thresholds on appearance frequency and confidence, and outputs them as mining results; a post-processing unit that, based on a selection input, outputs the mining results from the data mining engine after deleting rules or reordering them by priority designation with reference to the hierarchical data structure of each data item stored in the second database; and a third database that stores the mining results output from the post-processing unit. Valuable regularities can therefore be discovered through the user's interactive use, the often conflicting goals of preventing the pruning of necessary data item candidates and suppressing the generation of unnecessary candidates can be satisfied at once, and association rules can be generated using data items at the level of detail most useful to the user.

[0049] The pre-processing unit comprises: a read data interpretation unit that reads the data items of each record stored in the first database and, each time an item is read, outputs frequency addition information for that data item; a data item processing unit that, each time frequency addition information is received, calculates and stores the appearance frequency of the corresponding data item, compares it with a set reference value, and outputs a highlight instruction when the frequency exceeds the reference value; and a display unit that displays the data items of the records read via the read data interpretation unit and highlights the corresponding data items in accordance with the highlight instruction. The user can therefore easily identify the data items to be mined.
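The frequency bookkeeping described here can be sketched in a few lines. The sketch assumes the appearance frequency is a relative frequency over records and that each item counts once per record; the text does not state either point, and `flag_frequent_items` is a hypothetical name.

```python
from collections import Counter


def flag_frequent_items(records, reference):
    """Return {item: (relative_frequency, highlighted)} over all records."""
    counts = Counter()
    for record in records:
        # Count each item once per record, mirroring one frequency update per read.
        counts.update(set(record))
    total = len(records)
    return {item: (n / total, n / total > reference) for item, n in counts.items()}
```

Items whose frequency is strictly above the reference value are flagged, matching the "higher than the reference value" condition; an item exactly at the reference value is not flagged.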

[0050] The pre-processing unit further comprises an input unit that receives selection inputs used for mining, and a data conversion unit that converts the data items of records received via the read data interpretation unit in accordance with the item conversion information it receives and outputs the converted records to the data mining engine. For the data items of records received via the read data interpretation unit, the data item processing unit obtains the hierarchical data structure of the corresponding data items from the second database and displays it on the display unit. Based on a selection input received via the input unit, it selects the data items to be used for mining from that hierarchical structure, outputs a highlight instruction for the selected data items to the display unit, and outputs item conversion information for the selected data items to the data conversion unit. The hierarchical structure of the data items appearing in the database is thus displayed in an intuitive, easily understood form, the user selects the data items to be used for mining from the displayed structure, and the selected items are apparent at a glance, so the user can interactively specify the group of data items to be mined in an easily understood way.

[0051] Because the data item processing unit displays the calculated and stored appearance frequency alongside each data item shown on the display unit, the user can judge from the frequency information what to use for mining.

[0052] Based on a selection input for a plurality of data items received via the input unit with respect to the hierarchical data structure displayed on the display unit, the data item processing unit selects from that structure a plurality of data items to be used for mining, and outputs item conversion information for the selected items to the data conversion unit so that records containing a plurality of hierarchically related data items are output to the data mining engine. A plurality of hierarchically related data items can therefore be selected for mining interactively.

[0053] For the hierarchical data structure of data items displayed on the display unit, the data item processing unit reads the attribute values of each data item stored in the second database and displays them alongside the corresponding data items on the display unit. Based on an attribute value selection input received via the input unit, it outputs item conversion information to the data conversion unit so that a record is output to which a data item with an attribute value added, or a data item with a changed attribute value, has been appended. The user can therefore specify the data items of which attribute values are to be mined.

[0054] For the hierarchical data structure of data items displayed on the display unit, the data item processing unit reads the hierarchical relationships among attribute values for each attribute name associated with each data item stored in the second database and displays the hierarchized attribute values on the display unit as well. Based on an attribute value selection input received via the input unit, it outputs item conversion information to the data conversion unit so that a record is output to which a data item with an attribute value added, or a data item with a changed attribute value, has been appended. The user can therefore decide, from the attribute value hierarchy of each attribute name, which level of attribute value to designate for mining.

[0055] The data item processing unit saves all operation inputs received via the input unit in the second database as an operation history and, in response to an operation-history display input received via the input unit, reads the operation history from the second database and displays on the display unit the result of applying that history in place of user operations. The retained history can therefore be reused at the next execution.

[0056] The data item processing unit saves in the second database an operation history tagged with an identifier entered via the input unit, reads from the second database the operation history selected by an identifier entered via the input unit, and displays on the display unit the result of applying that history in place of user operations. At the next execution, the appropriate history can therefore be selected from among several by its identifier and reused.

[0057] The post-processing unit comprises: a mining result holding unit that temporarily holds the mining results output from the data mining engine; a display unit that displays those mining results; an input unit that receives operation inputs for post-processing the displayed mining results; a mining result processing unit that processes the mining results held in the mining result holding unit based on operation inputs received via the input unit and outputs them; and an operation history holding unit that temporarily holds all operation inputs received via the input unit as an operation history. A fourth database, provided outside the post-processing unit, stores all the operation history temporarily held by the operation history holding unit. The more useful rules can therefore be picked out from the association rules that are inevitably generated in large numbers and high-value rules handled with priority, and the stored operation history can be reused and reflected in the mining results.

[0058] Because the mining result processing unit deletes, based on a rule-deletion input received via the input unit, the rules designated for deletion from the mining results displayed on the display unit, unnecessary rules are removed and a simplified set of association rules is obtained.

[0059] Based on a hierarchy-distance input received via the input unit, the mining result processing unit refers to the hierarchical data structure of data items stored in the second database and also deletes in one operation the similar rules made up of synonyms corresponding to that hierarchy distance. Every rule containing a synonymous data item corresponding to the specified hierarchy distance is thus deleted together, which removes the trouble of designating each data item individually.

[0060] Based on a priority-designation input received via the input unit, the mining result processing unit moves the rules that match the priority designation to the head of the group of rules shown on the display unit and highlights them, so the user can identify the prioritized rules.

[0061] Based on a hierarchy-distance input received via the input unit, the mining result processing unit refers to the hierarchical data structure of data items stored in the second database and causes the display unit to apply the priority-based display to similar rules made up of synonyms corresponding to that hierarchy distance as well. Priority can therefore be designated easily for similar rules made up of synonyms too.

[0062] The mining result processing unit refers to the hierarchical data structure of data items stored in the second database. For a rule in the rule group that has many condition-part items, and a rule with fewer condition-part items whose conclusion part matches or is similar to that of the first rule and whose condition-part items all match or are similar, it applies the same processing to the rule with fewer items whenever the rule with more items is deleted or prioritized. Deletion or prioritization of rules whose data items match or are similar can therefore be carried out easily.

[0063] The mining result processing unit saves in the fourth database, via the operation history holding unit, an operation history tagged with an identifier entered via the input unit, reads from the fourth database the operation history selected by an identifier entered via the input unit, and displays on the display unit the mining results with that history applied. In subsequent sessions, the user can therefore make use, as needed, of the operation history held for each user or of an operation history tagged with an identifying name.

[0064] Furthermore, a fifth database is provided that stores externally created selection rules tagged with identifiers, and the post-processing unit further comprises an externally created selection rule holding unit that reads and temporarily holds the identifier-tagged selection rules stored in the fifth database. The mining result processing unit displays on the display unit the mining results with the selection rules held in that holding unit applied. Externally created selection rules can therefore be kept under identifying names, in the same form as operation histories, and made available to the user as needed.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing a data mining device according to the present invention.

FIG. 2 is an internal configuration diagram of the pre-processing unit in FIG. 1.

FIG. 3 is an internal configuration diagram of the post-processing unit in FIG. 1.

[Explanation of symbols]

101 data mining engine, 102 pre-processing unit, 102a read data interpretation unit, 102b data item processing unit, 102c display unit, 102d input unit, 102e data conversion unit, 103 post-processing unit, 103a mining result holding unit, 103b display unit, 103c input unit, 103d mining result processing unit, 103e user operation history holding unit, 103f externally created selection rule holding unit, 104 to 108 databases.

Claims

[Claims]

1. A data mining device comprising: a first database that stores a plurality of related data items for each record; a second database that stores data hierarchized for each data item and including attribute names and attribute values; a pre-processing unit that reads the data items of each record stored in the first database and the hierarchical data structure of each data item stored in the second database and, in response to a data item selection input, outputs records in which the data items of each record stored in the first database have been converted with reference to the hierarchical data structure of each data item stored in the second database; a data mining engine that, over all records received via the pre-processing unit, picks up data items that frequently appear in the same record, generates association rules among those data items that satisfy thresholds on appearance frequency and confidence, and outputs them as mining results; a post-processing unit that, based on a selection input, outputs the mining results from the data mining engine after deleting rules or reordering them by priority designation with reference to the hierarchical data structure of each data item stored in the second database; and a third database that stores the mining results output from the post-processing unit.

2. The data mining device according to claim 1, wherein the pre-processing unit comprises: a read data interpretation unit that reads the data items of each record stored in the first database and, each time an item is read, outputs frequency addition information for the corresponding data item; a data item processing unit that, each time the frequency addition information is received, calculates and stores the appearance frequency of the corresponding data item, compares the appearance frequency with a set reference value, and outputs a highlight instruction when the frequency is higher than the reference value; and a display unit that displays the data items of the records read via the read data interpretation unit and highlights the corresponding data items in accordance with the highlight instruction.

3. The data mining device according to claim 2, wherein the preprocessing unit further comprises: an input unit that receives a selection operation input for the data items to be used for mining; and a data conversion unit that, for the data items of the records input via the read data interpretation unit, outputs to the data mining engine records in which the data items have been converted based on input item conversion information, and wherein the data item processing unit, for the data items of the records input via the read data interpretation unit, reads the hierarchical data structure of the corresponding data item from the second database and displays it on the display unit, selects the data items to be used for mining from the hierarchized data structure based on the selection operation input received via the input unit, outputs an identification display instruction for the selected data items to the display unit, and outputs the item conversion information of the selected data items to the data conversion unit.
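One way to picture the item conversion information is as a mapping from each data item to the hierarchy level the user selected. The sketch below assumes a simple child-to-parent table; the hierarchy contents and the names `PARENT`, `build_conversion`, and `convert_record` are hypothetical.

```python
# Hypothetical hierarchy: child -> parent (None marks the top level).
PARENT = {
    "lager": "beer", "ale": "beer", "beer": "alcohol",
    "red wine": "wine", "wine": "alcohol", "alcohol": None,
}

def ancestors(item):
    """Return the chain [item, parent, grandparent, ...]."""
    chain = [item]
    while PARENT.get(chain[-1]) is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def build_conversion(selected):
    """Map every known item to its nearest selected ancestor (or itself)."""
    conversion = {}
    for item in PARENT:
        target = next((a for a in ancestors(item) if a in selected), item)
        conversion[item] = target
    return conversion

def convert_record(record, conversion):
    """Apply the conversion and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(conversion.get(i, i) for i in record))

conversion = build_conversion(selected={"beer", "wine"})
converted = convert_record(["lager", "ale", "red wine", "chips"], conversion)
```

Items with no entry in the hierarchy, such as "chips" here, pass through unchanged.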

4. The data mining device according to claim 3, wherein the data item processing unit displays, together with each data item displayed on the display unit, frequency information on the appearance frequency calculated and stored for that data item.

5. The data mining device according to claim 4, wherein the data item processing unit selects a plurality of data items to be used for mining from the hierarchized data structure of data items displayed on the display unit, based on a selection operation input for a plurality of data items received via the input unit, and outputs the item conversion information of the selected plurality of data items to the data conversion unit so that records containing the plurality of data items having a hierarchical relationship are output to the data mining engine.

6. The data mining device according to claim 3, wherein the data item processing unit, for the hierarchized data structure of data items displayed on the display unit, reads the attribute value for each data item stored in the second database, displays the attribute value together with the corresponding data item on the display unit, and, based on an attribute value selection operation input received via the input unit, outputs item conversion information to the data conversion unit so that a record is output in which a data item with the attribute value attached has been added, or in which a data item has been changed to a data item with the attribute value attached.

7. The data mining device according to claim 3, wherein the data item processing unit, for the hierarchized data structure of data items displayed on the display unit, reads the hierarchical relationship of the attribute values for each attribute name corresponding to each data item stored in the second database, displays the hierarchized attribute values on the display unit as well, and, based on an attribute value selection operation input received via the input unit, outputs item conversion information to the data conversion unit so that a record is output in which a data item with the attribute value attached has been added, or in which a data item has been changed to a data item with the attribute value attached.

8. The data mining device according to claim 6, wherein the data item processing unit stores all operation inputs received via the input unit in the second database as an operation history, reads the operation history from the second database based on an operation history input received via the input unit, and displays on the display unit the result obtained by applying the operation history in place of a user operation.

9. The data mining device according to claim 8, wherein the data item processing unit stores in the second database the operation history with an identifier, input via the input unit, attached to it, reads from the second database the operation history selected by an identifier input via the input unit, and displays on the display unit the result obtained by applying that operation history in place of a user operation.

10. The data mining device according to claim 1, wherein the post-processing unit comprises: a mining result storage unit that temporarily stores the mining result output from the data mining engine; a display unit that displays the mining result; an input unit that receives operation inputs for post-processing of the displayed mining result; a mining result processing unit that processes the mining result in the mining result storage unit based on the operation inputs received via the input unit and outputs the processed result; and an operation history holding unit that temporarily holds all operation inputs received via the input unit as an operation history, and wherein the data mining device further comprises, outside the post-processing unit, a fourth database that stores all operation histories temporarily held by the operation history holding unit.

11. The data mining device according to claim 10, wherein the mining result processing unit deletes, from the mining result displayed on the display unit, the rule designated for deletion, based on an operation input designating rule deletion received via the input unit.

12. The data mining device according to claim 11, wherein the mining result processing unit, based on a hierarchical distance designation operation input received via the input unit, refers to the hierarchical data structure of data items stored in the second database and collectively deletes similar rules composed of synonyms that fall within the designated hierarchical distance.
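The collective deletion of similar rules can be sketched under one possible reading of "hierarchical distance". The sketch below assumes the distance is the number of edges on the path between two items in the hierarchy tree, and treats two rules as similar when their conclusions and condition items all lie within the designated distance. Both assumptions, the hierarchy contents, and all function names are hypothetical; the claim itself does not fix a definition.

```python
# Hypothetical hierarchy: child -> parent (None marks the top level).
PARENT = {"lager": "beer", "ale": "beer", "beer": "alcohol",
          "red wine": "wine", "wine": "alcohol", "alcohol": None}

def chain(item):
    """Return [item, parent, grandparent, ...]."""
    out = [item]
    while PARENT.get(out[-1]) is not None:
        out.append(PARENT[out[-1]])
    return out

def distance(a, b):
    """Edges on the tree path between a and b; None if they share no ancestor."""
    up_a, up_b = chain(a), chain(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return steps_a + up_b.index(node)
    return None

def close(a, b, max_distance):
    d = distance(a, b)
    return d is not None and d <= max_distance

def similar(rule, target, max_distance):
    """Rules are (condition_items, conclusion) pairs."""
    (cond, concl), (t_cond, t_concl) = rule, target
    return (close(concl, t_concl, max_distance)
            and len(cond) == len(t_cond)
            and all(any(close(c, t, max_distance) for t in t_cond) for c in cond))

def delete_similar(rules, target, max_distance):
    """Remove the target rule and every rule similar to it."""
    return [r for r in rules if not similar(r, target, max_distance)]

rules = [
    (("lager",), "chips"),
    (("ale",), "chips"),
    (("red wine",), "chips"),
    (("chips",), "lager"),
]
remaining = delete_similar(rules, target=(("lager",), "chips"), max_distance=2)
```

With a distance of 2, "ale" counts as a synonym of "lager" (both are children of "beer"), while "red wine" sits four edges away and its rule is kept.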

13. The data mining device according to claim 10, wherein the mining result processing unit, based on a priority designation operation input received via the input unit, moves the rule corresponding to the priority designation, among the group of rules displayed on the display unit, to the head position and displays it in a distinguishable manner.

14. The data mining device according to claim 13, wherein the mining result processing unit, based on a hierarchical distance designation operation input received via the input unit, refers to the hierarchical data structure of data items stored in the second database and displays on the display unit, in accordance with the priority designation, similar rules composed of synonyms that fall within the designated hierarchical distance.

15. The data mining device according to claim 11, wherein the mining result processing unit refers to the hierarchical data structure of data items stored in the second database and, when a rule in the rule group having a larger number of items in its condition part is deleted or given priority, executes the same processing on any rule having a smaller number of items in its condition part whose conclusion part matches or is similar to that of the larger rule and all of whose condition part items match or are similar to items of the larger rule.
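The propagation from a rule with more condition items to rules with fewer condition items might look like the following sketch. It handles only exact matches; the "similar" case, which would consult the item hierarchy, is left out for brevity. The names `covered_by` and `delete_with_propagation` and the sample rules are assumptions for illustration.

```python
def covered_by(small, large):
    """True if `small` has fewer condition items than `large`, the same
    conclusion, and every condition item of `small` appears in `large`."""
    (s_cond, s_concl), (l_cond, l_concl) = small, large
    return (s_concl == l_concl
            and len(s_cond) < len(l_cond)
            and set(s_cond) <= set(l_cond))

def delete_with_propagation(rules, target):
    """Delete `target` and every smaller rule it covers."""
    return [r for r in rules if r != target and not covered_by(r, target)]

# Rules are (condition_items, conclusion) pairs.
rules = [
    (("bread", "butter"), "milk"),
    (("bread",), "milk"),
    (("butter",), "milk"),
    (("bread",), "jam"),
]
remaining = delete_with_propagation(rules, target=(("bread", "butter"), "milk"))
```

Deleting the two-item rule also removes both one-item rules that conclude "milk", while the rule concluding "jam" is untouched because its conclusion differs.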

16. The data mining device according to claim 11, wherein the mining result processing unit stores the operation history, with an identifier input via the input unit attached to it, in the fourth database through the operation history holding unit, reads from the fourth database the operation history selected by an identifier input via the input unit, and displays on the display unit a mining result reflecting that operation history.
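Storing an operation history under an identifier and replaying it later could be sketched as below. The in-memory `OperationHistoryStore` merely stands in for the fourth database, and the operation names "delete" and "prioritize", the rule strings, and the identifier "session-1" are all assumptions made for this example.

```python
class OperationHistoryStore:
    """Stands in for the fourth database: identifier -> list of operations."""

    def __init__(self):
        self._histories = {}

    def save(self, identifier, operations):
        self._histories[identifier] = list(operations)

    def load(self, identifier):
        return list(self._histories[identifier])

def apply_operations(rules, operations):
    """Replay ("delete", rule) and ("prioritize", rule) operations in order."""
    result = list(rules)
    for kind, rule in operations:
        if kind not in ("delete", "prioritize"):
            raise ValueError(f"unknown operation: {kind}")
        if rule not in result:
            continue  # the rule was not produced by this mining run
        result.remove(rule)
        if kind == "prioritize":
            result.insert(0, rule)  # move to the head position
    return result

store = OperationHistoryStore()
store.save("session-1", [("delete", "A->B"), ("prioritize", "E->F")])

rules = ["A->B", "C->D", "E->F"]
replayed = apply_operations(rules, store.load("session-1"))
```

Operations that refer to rules absent from the current result are skipped, so a saved history can be replayed against a later mining run that produced a different rule set.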

17. The data mining device according to claim 16, further comprising a fifth database that stores externally created selection rules with identifiers attached, wherein the post-processing unit further comprises an externally created selection rule holding unit that reads and temporarily holds the selection rules with identifiers stored in the fifth database, and the mining result processing unit displays on the display unit a mining result reflecting the selection rules held in the externally created selection rule holding unit.