JP3212695B2

JP3212695B2 - Knowledge acquisition device for knowledge base system and knowledge correction device

Info

Publication number: JP3212695B2
Application number: JP16303892A
Authority: JP
Inventors: 智恵子小林; 利一田中
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1992-06-22
Filing date: 1992-06-22
Publication date: 2001-09-25
Anticipated expiration: 2016-09-25
Also published as: JPH064290A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a knowledge acquisition device for a knowledge base system, and to a knowledge correction device therefor, capable of acquiring knowledge from collected data and correcting that knowledge in fields such as diagnosis, estimation, prediction, control, and measurement.

[0002]

[Prior Art] Conventionally, a knowledge base system has a knowledge base that stores knowledge in IF (condition part)-THEN (execution part) form, known as production rules, and performs inference and the like by combining the knowledge in this knowledge base. Acquiring and correcting the knowledge in the knowledge base is critical, because it determines the capability of the knowledge base system.

[0003] However, even when sufficient data for knowledge acquisition was available, knowledge for the knowledge base was acquired by statistically analyzing the data with separate means and then using the results, so storing knowledge in the knowledge base required a great deal of labor. Knowledge correction likewise required a great deal of labor, because it did not take consistency with the already acquired knowledge into account.

[0004] For example, the rules of a knowledge base used in an expert system must be produced by eliciting expertise from experts, interpreting it, converting the acquired knowledge into rules in a format suited to the expert system, and storing those rules in the knowledge base. Building an adequate expert system therefore required a great deal of time and labor.

[0005]

[Problems to Be Solved by the Invention] As described above, in conventional knowledge base systems, acquiring the knowledge stored in the knowledge base and correcting that knowledge required a great deal of time and labor, even though these tasks determine the capability of the knowledge base system.

[0006] Accordingly, an object of the present invention is to eliminate these problems and to provide a knowledge acquisition device for a knowledge base system, and a knowledge correction device therefor, that acquire knowledge automatically and allow knowledge to be corrected easily.

[0007]

[Means for Solving the Problems] The first invention is a knowledge acquisition device for a knowledge base system that acquires knowledge to be stored in the knowledge base of the knowledge base system from a plurality of data groups, each having data corresponding to a plurality of attributes, the device comprising: data classification means for performing cluster analysis on the partial data groups of the plurality of data groups that consist of specific attributes among the plurality of attributes, thereby classifying the plurality of data groups into clusters; condition-part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, a logical expression of cluster attribute conditions obtained by a predetermined inductive learning method as knowledge of the condition part of a rule; execution-part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, an optimal numerical relational expression as knowledge of the execution part of the rule by a predetermined data analysis method, based on the plurality of partial data groups in the cluster that consist of the other attributes; and rule generation means for generating the rules of the knowledge base from the per-cluster knowledge obtained from the condition-part knowledge acquisition means and the execution-part knowledge acquisition means.

[0008] The second invention is a knowledge correction device for a knowledge base system. The knowledge base system has: data classification means for performing cluster analysis, based on a plurality of data groups each having data corresponding to a plurality of attributes, on the partial data groups of the plurality of data groups that consist of specific attributes among the plurality of attributes, thereby classifying the plurality of data groups into clusters; condition-part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, a logical expression of cluster attribute conditions obtained by a predetermined inductive learning method as knowledge of the condition part of a rule; execution-part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, an optimal numerical relational expression as knowledge of the execution part of the rule by a predetermined data analysis method, based on the plurality of partial data groups in the cluster that consist of the other attributes; and rule generation means for generating the rules of the knowledge base from the per-cluster knowledge obtained from the condition-part knowledge acquisition means and the execution-part knowledge acquisition means. Of the knowledge acquired by the condition-part knowledge acquisition means, knowledge whose use as knowledge of the knowledge base is held in reserve is stored in the knowledge base as sub-knowledge, and the knowledge of the knowledge base is corrected in response to a new data group. The knowledge correction device comprises: knowledge determination means for checking the new data group against the condition-part knowledge of the rules stored in the knowledge base and determining whether any of it is satisfied; sub-knowledge determination means for, when the knowledge determination means finds none satisfied, checking the new data group against the sub-knowledge stored in the knowledge base and determining whether any of it is satisfied; sub-knowledge addition means for, when the sub-knowledge determination means determines that some sub-knowledge is satisfied, cancelling the reservation of that sub-knowledge and adding it to the condition-part knowledge of the rule; allowable range determination means for, when the knowledge determination means or the sub-knowledge determination means determines that something is satisfied, performing processing according to the execution-part knowledge of the rule corresponding to the condition-part knowledge of the rule and determining whether the processing output is within a predetermined allowable range; partial knowledge correction means for, when the allowable range determination means determines that the output is within the allowable range, storing the condition-part knowledge of the rule and the corrected execution-part knowledge of the rule in the knowledge base as corrected knowledge; and overall knowledge correction means for, when the sub-knowledge determination means determines that nothing is satisfied or the allowable range determination means determines that the output is not within the allowable range, issuing an instruction to perform knowledge acquisition again from the plurality of data groups including the new data group.

[0009]

[0010]

[Operation] In the first invention, in a knowledge acquisition device for a knowledge base system that acquires knowledge to be stored in the knowledge base of the knowledge base system from a plurality of data groups each having data corresponding to a plurality of attributes, the data classification means performs cluster analysis on the partial data groups of the plurality of data groups that consist of specific attributes among the plurality of attributes and classifies the plurality of data groups into clusters. For each cluster classified by the data classification means, the condition-part knowledge acquisition means acquires a logical expression of cluster attribute conditions, obtained by a predetermined inductive learning method, as knowledge of the condition part of a rule. For each cluster classified by the data classification means, the execution-part knowledge acquisition means of the cluster-unit knowledge acquisition means acquires an optimal numerical relational expression as knowledge of the execution part of the rule by a predetermined data analysis method, based on the plurality of partial data groups in the cluster that consist of the other attributes. The rule generation means of the cluster-unit knowledge acquisition means then generates the rules of the knowledge base from the per-cluster knowledge obtained from the condition-part knowledge acquisition means and the execution-part knowledge acquisition means.

[0011] The second invention operates in a knowledge correction device for a knowledge base system that has the data classification means, the condition-part knowledge acquisition means, the execution-part knowledge acquisition means, and the rule generation means described above, that stores in the knowledge base, as sub-knowledge, the knowledge acquired by the condition-part knowledge acquisition means whose use as knowledge of the knowledge base is held in reserve, and that corrects the knowledge of the knowledge base in response to a new data group. The knowledge determination means checks the new data group against the condition-part knowledge of the rules stored in the knowledge base and determines whether any of it is satisfied. When the knowledge determination means finds none satisfied, the sub-knowledge determination means checks the new data group against the sub-knowledge stored in the knowledge base and determines whether any of it is satisfied. When the sub-knowledge determination means determines that some sub-knowledge is satisfied, the sub-knowledge addition means cancels the reservation of that sub-knowledge and adds it to the condition-part knowledge of the rule. When the knowledge determination means or the sub-knowledge determination means determines that something is satisfied, the allowable range determination means performs processing according to the execution-part knowledge of the rule corresponding to the condition-part knowledge of the rule and determines whether the processing output is within a predetermined allowable range. When the allowable range determination means determines that the output is within the allowable range, the partial knowledge correction means stores the condition-part knowledge of the rule and the corrected execution-part knowledge of the rule in the knowledge base as corrected knowledge. When the sub-knowledge determination means determines that nothing is satisfied, or the allowable range determination means determines that the output is not within the allowable range, the overall knowledge correction means issues an instruction to perform knowledge acquisition again from the plurality of data groups including the new data group.
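The decision flow of the knowledge correction device can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Rule` class, the `correct` function, its return values, and the comparison of the rule output against an observed value are all assumptions made for the example, since the text only says that the output is checked against "a predetermined allowable range".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Rule:
    # All names here are hypothetical; the patent does not define a data layout.
    condition: Callable[[Record], bool]   # active condition part
    action: Callable[[Record], float]     # execution part (numerical formula)
    sub_conditions: List[Callable[[Record], bool]] = field(default_factory=list)


def correct(rules: List[Rule], record: Record, observed: float, tolerance: float) -> str:
    """Return 'partial' (refit only the execution part) or 'full' (re-acquire everything)."""
    # Knowledge determination: does any active condition part match the new data?
    matched = next((r for r in rules if r.condition(record)), None)

    if matched is None:
        # Sub-knowledge determination: does any reserved sub-condition match?
        for rule in rules:
            hit = next((s for s in rule.sub_conditions if s(record)), None)
            if hit is not None:
                # Sub-knowledge addition: release the reservation and OR it
                # into the rule's condition part.
                rule.sub_conditions.remove(hit)
                old = rule.condition
                rule.condition = lambda rec, old=old, hit=hit: old(rec) or hit(rec)
                matched = rule
                break

    if matched is None:
        return "full"  # nothing matched: overall knowledge correction

    # Allowable range determination (assumed here to be an absolute error bound).
    if abs(matched.action(record) - observed) <= tolerance:
        return "partial"
    return "full"
```

A caller would react to `"partial"` by re-running only the data analysis for the matched rule, and to `"full"` by re-running the whole acquisition on the enlarged data set.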

[0012]

[0013]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of a knowledge base system according to one embodiment of the present invention. Here, the case in which this knowledge base system is applied to a case-based reasoning system is described. A case-based reasoning system accumulates past problem-solving experience as cases and, for a new problem, performs inference that reaches a conclusion by retrieving and adapting similar cases. The description below takes a sales forecasting system as a concrete application. Specifically, for home appliance retailers or for chain stores such as convenience stores and restaurants, past store-opening cases are accumulated, similar stores are retrieved from among them, and the sales of a store scheduled to open are predicted by referring to the information on the similar stores. The following describes how the rules that compute the sales forecast are acquired and how they are corrected.

[0014] In FIG. 1, the knowledge base system within the sales forecasting system comprises: a knowledge acquisition device 1 that acquires knowledge; a knowledge correction device 2 that corrects knowledge; a knowledge base 3 that stores knowledge in IF-THEN form; a case base 4 that stores the data from which knowledge is derived; a knowledge base control unit 5 that controls the knowledge acquisition device 1, the knowledge correction device 2, and the case base 4 and exercises overall control over the acquisition and correction of the knowledge in the knowledge base; an inference unit 8 that performs inference and the like based on the knowledge in the knowledge base 3; an input unit 6; and an output unit 7. The knowledge acquisition device 1 in turn comprises: a cluster analysis unit 10 that performs cluster analysis on the data stored in the case base 4; an inductive learning unit 11 that, for each cluster classified by the cluster analysis unit 10, inductively learns the condition part of a rule and acquires the condition-part knowledge of the rule; a data analysis unit 12 that, for each cluster classified by the cluster analysis unit 10, analyzes the data in the cluster and acquires the execution-part knowledge of the rule; and a rule generation unit 13 that generates IF-THEN rules from the knowledge acquired by the inductive learning unit 11 and the data analysis unit 12.

[0015] Further, so that newly input data can be incorporated into the knowledge, the knowledge correction device 2 comprises: a knowledge determination unit 20 that checks the abstract data of the newly input data against the condition parts of all rules stored in the knowledge base 3 and determines whether any rule is satisfied; a sub-knowledge determination unit 21 that, when the knowledge determination unit 20 finds no satisfied condition part, determines whether any rule condition part is satisfied once the sub-knowledge stored in the knowledge base 3 is taken together with the rule conditions stored in the knowledge base, where sub-knowledge is knowledge that satisfies the knowledge acquisition conditions of the inductive learning unit 11 but does not satisfy the knowledge application requirements for being used as knowledge in the knowledge base 3; a sub-knowledge addition unit 22 that, when the sub-knowledge determination unit 21 finds that a rule condition part is satisfied, adds the satisfied sub-knowledge to the condition part of the matched rule; an allowable range determination unit 23 that executes the execution part of the rule corresponding to the condition part determined to be satisfied by the knowledge determination unit 20, or to the condition part extended by the sub-knowledge addition unit 22, and determines whether the output value of the determined factor is within a predetermined allowable range; a partial knowledge correction unit 24 that, when the allowable range determination unit 23 determines that the allowable range is satisfied, corrects only the execution part of the rule by activating the data analysis unit 12; and an overall knowledge correction instruction unit 25 that, when the determination of the sub-knowledge determination unit 21 or of the allowable range determination unit 23 is not satisfied, instructs the knowledge acquisition device 1 to perform knowledge acquisition again on all data including the newly input data.

Next, the knowledge acquisition device 1 is described. FIG. 2 is an operation flowchart of the knowledge acquisition device 1. In FIG. 2, data is input from the case base 4 (step 201). The input data undergoes cluster analysis in the cluster analysis unit 10 (step 202). The cluster analysis is performed on a common subset of the data. As a result, the data is classified into groups, that is, clusters, according to a predetermined similarity (step 203).

[0016] Here, the cluster analysis in the sales forecasting system is described concretely. The acquired data is of two kinds. One is survey data, consisting of store information such as sales floor area, number of employees, and parking capacity, and regional information such as population density, family composition, and road conditions. The other is abstract item data extracted from that regional information. The abstract item data may be obtained by questionnaire or by some other extraction method. The regional information itself may also be used as the abstract item data. The acquired data is collected as data (a case) for each store.

[0017] Here, the following eight items are used as abstract item data, each taking the values shown in parentheses. The values could be continuous or discrete; in this embodiment they are discrete.

・Richness of residents' lifestyle (1. low, 2. somewhat low, 3. average, 4. high, 5. very high)
・Surrounding area conditions (1. poor, 2. somewhat poor, 3. average, 4. good, 5. very good)
・Traffic conditions (1. poor, 2. somewhat poor, 3. average, 4. good, 5. very good)
・Customer type (1. fixed, 2. both, 3. the more numerous (多い方))
・Seasonal variability (1. small, 2. medium, 3. large)
・Degree of OA (office automation) adoption (1. low, 2. lower-middle, 3. middle, 4. upper-middle, 5. high)
・Mania (enthusiast) orientation (1. low, 2. average, 3. high)
・Family orientation (1. weak, 2. average, 3. strong)

Cluster analysis is performed on these eight abstract items. Cluster analysis classifies samples into clusters based on how alike they are, that is, their similarity, or conversely on how they differ in character, that is, their distance.

[0018] FIGS. 3 and 4 show concrete abstract item data. They give the data of 68 stores for the eight abstract items above, and this data is stored in the case base 4. Data is entered into the case base 4 through the input unit 6.

[0019] FIGS. 5 and 6 show the weighted abstract item data. The data in FIGS. 5 and 6 is the data of FIGS. 3 and 4 after weighting: the value of each of the eight abstract items is multiplied by the following weight.

・Richness of residents' lifestyle × 0.58
・Surrounding area conditions × 0.66
・Traffic conditions × 0.85
・Customer type × 0.42
・Seasonal variability × 0.22
・Degree of OA adoption × 0.39
・Mania orientation × 0.27
・Family orientation × 0.34

Accordingly, the data input in step 201 is the weighted data shown in FIGS. 5 and 6. A dedicated weighting means may also be provided separately.
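As a small illustration of the weighting step, the sketch below multiplies each abstract item value by its weight. The weights are those listed in this paragraph; the dictionary keys and the sample store values are hypothetical and do not come from FIGS. 3 to 6.

```python
# Weights from paragraph [0019]; the key names are illustrative only.
WEIGHTS = {
    "resident_richness": 0.58,
    "surrounding_area": 0.66,
    "traffic": 0.85,
    "customer_type": 0.42,
    "seasonal_variability": 0.22,
    "oa_adoption": 0.39,
    "mania": 0.27,
    "family": 0.34,
}


def apply_weights(store):
    """Multiply every abstract item value of one store by its weight."""
    return {name: store[name] * weight for name, weight in WEIGHTS.items()}


# Hypothetical store (not taken from the figures).
store = {
    "resident_richness": 3,
    "surrounding_area": 4,
    "traffic": 4,
    "customer_type": 2,
    "seasonal_variability": 1,
    "oa_adoption": 3,
    "mania": 2,
    "family": 3,
}
weighted = apply_weights(store)
```

Items with larger weights, such as traffic conditions, contribute more to the distances computed in the subsequent cluster analysis.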

[0020] Cluster analysis is then performed on this weighted abstract item data. Various cluster analysis methods exist; in this embodiment, distances are computed as the "standard Euclidean distance" and clusters are merged by the "shortest distance method" (nearest-neighbor method). Cluster analysis may of course be performed with other calculation methods.
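The two ingredients named here can be sketched as follows: a Euclidean distance between two weighted item vectors, and one merge step under the shortest distance (single-linkage) rule. The function names are illustrative, and the plain Euclidean distance is an assumption about what the text calls the "standard Euclidean distance".

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage_distance(cluster_a, cluster_b):
    """Shortest distance method: distance between the closest pair of members."""
    return min(euclidean(p, q) for p in cluster_a for q in cluster_b)


def merge_closest(clusters):
    """One agglomeration step: merge the two clusters that are closest to each other."""
    pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
    i, j = min(pairs, key=lambda ij: single_linkage_distance(clusters[ij[0]], clusters[ij[1]]))
    merged = clusters[i] + clusters[j]
    # Keep all other clusters and append the merged one at the end.
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
```

Repeating `merge_closest` until one cluster remains, while recording the distance of each merge, yields the information needed to draw a dendrogram.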

[0021] FIG. 7 is a dendrogram showing the result of the cluster analysis. In the dendrogram of FIG. 7, the horizontal axis shows the clusters, with the data numbers of FIGS. 5 and 6 arranged appropriately, and the vertical axis shows the similarity between clusters. In this embodiment, specifying the similarity "a" for the cluster analysis yields 16 clusters. That is, the line La at similarity "a" crosses the branches of the dendrogram at 16 points, so the data of the 68 stores is classified into 16 clusters. The similarity can be set freely; for example, setting it to "b" makes the line Lb cross five branches, which yields five clusters. The clusters need not be formed with a single similarity applied to all data, as with lines La and Lb in FIG. 7; different similarities may be used for different clusters. For example, some clusters may be formed at the similarity of line La and others at the similarity of line Lb, as long as every data item ends up in some cluster.
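Cutting a single-linkage dendrogram at one level is equivalent to grouping points that are connected by chains of pairwise distances no larger than the cut value. The sketch below uses that equivalence with a small union-find structure. It works with a distance threshold rather than a similarity value, which is an assumption: the text does not say how similarities "a" and "b" map to distances. The function name is illustrative.

```python
import math


def cut_single_linkage(points, max_distance):
    """Return clusters (lists of point indices) for a single-linkage cut at max_distance."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join every pair of points that lies within the threshold.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

A smaller threshold corresponds to a stricter similarity requirement and therefore to more, smaller clusters, in the same way that line La gives 16 clusters and line Lb gives five.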

[0022] Here, assume that the similarity is set to "a" and 16 clusters are obtained, as follows.

Cluster number 1 = {data numbers 1, 35}
Cluster number 2 = {data numbers 2, 3, 5, 6, 7, 8, 10, 12, 13, 18, 20, 26, 27, 28, 32, 34, 36, 37, 40, 41, 42, 44, 46, 47, 52, 54, 60, 61, 62, 66, 68}
Cluster number 3 = {data numbers 4, 38}
Cluster number 4 = {data numbers 9, 33, 43, 67}
Cluster number 5 = {data numbers 11, 45}
Cluster number 6 = {data numbers 14, 48}
Cluster number 7 = {data numbers 15, 31, 49, 65}
Cluster number 8 = {data numbers 16, 50}
Cluster number 9 = {data numbers 17, 51, 58}
Cluster number 10 = {data numbers 19, 21, 30, 53, 55, 64}
Cluster number 11 = {data numbers 22, 56}
Cluster number 12 = {data number 23}
Cluster number 13 = {data number 24}
Cluster number 14 = {data numbers 25, 59}
Cluster number 15 = {data numbers 29, 63}
Cluster number 16 = {data number 57}

In this way, the cluster analysis unit 10 performs cluster analysis and classifies the input data into clusters based on the quantified data of the abstract items.

[0023] Returning to FIG. 2, the description of the flowchart continues. The inductive learning unit 11 takes one of the clusters produced by the cluster analysis unit 10 (step 204) and performs inductive learning by a predetermined inductive learning method (step 205). It then acquires the logical expression of attribute conditions obtained by that method as the condition-part knowledge of a rule (step 206).

[0024] Meanwhile, the data analysis unit 12 extracts, from the data of the cluster taken in step 204, the data group relevant to acquiring the execution part of the rule (step 207). It performs multiple regression analysis, a statistical method for finding the numerical relationship between one determined factor and a plurality of determining factors (step 208), and obtains the parameters of a calculation formula expressing that numerical relationship (step 209). It then acquires the calculation formula containing these parameters as the execution-part knowledge of the rule (step 210).
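Multiple regression as used in steps 208 and 209 can be illustrated with an ordinary least-squares fit of y = b0 + b1·x1 + ... + bk·xk. The sketch below solves the normal equations with Gauss-Jordan elimination using only the standard library. The function name is hypothetical, and which store attributes serve as determining factors is not specified in this part of the text, so the inputs are generic.

```python
def fit_multiple_regression(xs, ys):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    rows = [[1.0] + [float(v) for v in x] for x in xs]  # prepend the intercept column
    n = len(rows[0])

    # Augmented normal equations [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(n)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-12:
            raise ValueError("determining factors are collinear; cannot fit")
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]

    return [a[i][n] / a[i][i] for i in range(n)]
```

The returned coefficients play the role of the "parameters of the calculation formula" in step 209; the formula built from them would become the execution part of the cluster's rule.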

[0025] First, the operation of the inductive learning unit 11 is described in detail using the sales forecasting system as a concrete example. In this embodiment, the inductive learning unit 11 uses "ID3" as its inductive learning method. ID3 is a supervised (taught) inductive learning algorithm. For example, when inductive learning is performed with ID3 for cluster number "2", the data of the 32 stores belonging to cluster number "2" is treated as positive examples and the data of the other stores as negative examples.
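ID3 builds a decision tree by repeatedly choosing the attribute with the highest information gain. The sketch below shows only that selection step, with positive examples (stores in the target cluster) labeled `True` and negative examples labeled `False`. The function names and the attribute keys used in any call are illustrative, not from the patent.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute that ID3 would split on at this node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

Applying this choice recursively to each branch, and reading off the paths that end in purely positive leaves, gives conjunctions of attribute conditions of the kind the embodiment uses as rule condition parts.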

[0026] FIG. 8 shows the inductive learning result of the inductive learning unit 11. In FIG. 8, eight conditions S1 to S8 for which the inductive learning method ID3 succeeded are obtained. Here, "success" means that ID3, taking the data of the 32 stores belonging to cluster number "2" as positive examples and the data of the other stores as negative examples, succeeded in acquiring condition-part knowledge; "failure" means that ID3 failed to acquire condition-part knowledge. Among the successful conditions, one supported by only a single data item is judged to be an exception because the data are too few. It is stored in the knowledge base 3 as acquired knowledge, but it is not used as active knowledge in the knowledge base 3. Instead, it is used when the knowledge in the knowledge base 3 is corrected or extended, as described later. This exceptional knowledge is hereinafter called "sub-knowledge" or a "sub-condition".

[0027] Accordingly, the sub-conditions S1, S3, and S8, each supported by a single data item, are regarded as exceptions, and the condition-part knowledge of the rule acquired from cluster number "2" is summarized as follows.
・resident affluence (somewhat low) AND family orientation (average), OR
・resident affluence (average) AND traffic conditions {(average) OR (good) OR (very good)}, OR
・resident affluence (high) AND enthusiast orientation (low)
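
The separation of learned conditions into rule conditions and sub-conditions by the number of supporting cases can be sketched as below. This is a hedged illustration: only the single-case status of S1, S3, and S8 comes from the text; the case counts for the other conditions, the `min_cases` parameter, and the function name are hypothetical.

```python
def split_conditions(conditions, min_cases=2):
    """Separate learned conditions into rule conditions and sub-conditions.

    A condition supported by fewer than `min_cases` data items is treated
    as an exception (sub-condition) and kept aside for later correction.
    """
    rule = [c for c in conditions if c["cases"] >= min_cases]
    sub = [c for c in conditions if c["cases"] < min_cases]
    return rule, sub


# S1, S3, S8 have one case each (from the text); other counts are invented.
conditions = [
    {"id": "S1", "cases": 1}, {"id": "S2", "cases": 6},
    {"id": "S3", "cases": 1}, {"id": "S4", "cases": 7},
    {"id": "S5", "cases": 5}, {"id": "S6", "cases": 4},
    {"id": "S7", "cases": 7}, {"id": "S8", "cases": 1},
]

rule_conditions, sub_conditions = split_conditions(conditions)
```

Keeping the sub-conditions in a separate list, rather than discarding them, is what later allows a new data item to promote one of them into the rule.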

[0028] Next, the specific operation of the data analysis unit 12 is described. To acquire the execution-part knowledge of the rules, the data analysis unit 12 performs multiple regression analysis on each cluster. The variables used are survey data items that appear likely to affect sales. Data items known to have an effect can also be selected in advance.

[0029] Specifically, the following variables can be considered.

Objective variable: sales of the new store
Explanatory variables:
・sales floor area ratio (sales floor area of the new store / sales floor area of the similar store)
・employee ratio (number of employees of the new store / number of employees of the similar store)
・parking capacity ratio (parking capacity of the new store / parking capacity of the similar store)

The parameter for each variable is then determined. A parameter here is a weighting value, that is, a regression coefficient. In other words, the parameters a, b, c, and d of equation (1) are determined.

Ps = {(Us/Ur)·a + (Js/Jr)·b + (Cs/Cr)·c + d}·Pr   (1)

The symbols in equation (1) have the following meanings.
Ps: sales of the new store
Pr: sales of the similar store
Us: sales floor area of the new store
Ur: sales floor area of the similar store
Js: number of employees of the new store
Jr: number of employees of the similar store
Cs: parking capacity of the new store
Cr: parking capacity of the similar store
a: regression coefficient of the sales floor area ratio
b: regression coefficient of the employee ratio
c: regression coefficient of the parking capacity ratio
d: constant

Although the explanatory variables here are ratios between the data of the new store and the data of the similar store, a calculation formula such as equation (1-1), in which the data are substituted directly without being paired, may also be used. In equation (1-1), a', b', c', and d' are regression coefficients. In this embodiment the data are handled as pairs so that the effect of the similar store's data, that is, its influence, is taken into account.

Ps = Ur·a' + Jr·b' + Cr·c' + d'   (1-1)

Next, the parameters a, b, c, and d are determined. Because this is the acquisition of the initial rule, one case is picked up and assumed to be a new store, and multiple regression analysis is performed. For example, the cluster analysis shows that the case with data number 36 is the most similar to the case with data number 2, so the data of data number 2 are substituted into equation (1) as the new-store data and the data of data number 36 as the similar-store data. The data of data number 2 (new store) and data number 36 (similar store) have the following values.

Ps = 90 million yen, Pr = 79 million yen
Us = 40 square meters, Ur = 30 square meters
Js = 2 persons, Jr = 1 person
Cs = 10 cars, Cr = 10 cars

When the values of the other data pairs are substituted into equation (1) in the same way and multiple regression analysis is performed, the parameters take the values a = 0.0636, b = 0.2236, c = 0.0036, and d = 0.6984, and the following equation (2) is acquired as the execution-part knowledge of the rule.

Ps = {(Us/Ur)·0.0636 + (Js/Jr)·0.2236 + (Cs/Cr)·0.0036 + 0.6984}·Pr   (2)

FIG. 9 shows the error between actual sales and predicted sales based on the multiple regression analysis result for cluster number 2. From the left, FIG. 9 lists the data number of each case, its actual sales, its predicted sales, the error between actual and predicted sales, and the data number of the case picked up as its similar store. FIG. 9 shows that the errors are small overall, so reasonable sales prediction is possible. The other clusters are processed in the same way.
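
As a worked example, equation (2) can be evaluated for the pair quoted above (data number 2 as the new store, data number 36 as the similar store). The sketch below is illustrative: the function and dictionary key names are hypothetical, and the result is not taken from FIG. 9, whose values are not reproduced in the text.

```python
def predict_sales(new, similar, coeffs):
    """Evaluate Ps = {(Us/Ur)a + (Js/Jr)b + (Cs/Cr)c + d} * Pr."""
    a, b, c, d = coeffs
    factor = (new["area"] / similar["area"] * a
              + new["staff"] / similar["staff"] * b
              + new["parking"] / similar["parking"] * c
              + d)
    return factor * similar["sales"]


# Coefficients of equation (2) and the values given for data numbers 2 and 36.
COEFFS_CLUSTER_2 = (0.0636, 0.2236, 0.0036, 0.6984)
store_2 = {"sales": 90_000_000, "area": 40, "staff": 2, "parking": 10}
store_36 = {"sales": 79_000_000, "area": 30, "staff": 1, "parking": 10}

predicted = predict_sales(store_2, store_36, COEFFS_CLUSTER_2)
relative_error = (predicted - store_2["sales"]) / store_2["sales"]
```

The bracketed factor is 0.0848 + 0.4472 + 0.0036 + 0.6984 = 1.234, so the predicted sales are about 97.49 million yen against actual sales of 90 million yen, an overestimate of roughly 8 percent for this one pair.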

[0030] Returning again to FIG. 2, the description of the flowchart continues. For one cluster, the condition-part knowledge of the rule acquired by the inductive learning unit 11 and the execution-part knowledge of the rule acquired by the data analysis unit 12 are combined into a rule by the knowledge generation unit 13 and stored in the knowledge base 3 (step 211). If clusters remain, the process returns to step 204 and knowledge is acquired for the next cluster; this is repeated until no clusters remain (step 212).

[0031] For example, the knowledge acquired from cluster number 2 and stored in the knowledge base 3 is summarized as follows.

IF
・resident affluence (somewhat low) AND family orientation (average), OR
・resident affluence (average) AND traffic conditions {(average) OR (good) OR (very good)}, OR
・resident affluence (high) AND enthusiast orientation (low)
THEN
Ps = {(Us/Ur)·0.0636 + (Js/Jr)·0.2236 + (Cs/Cr)·0.0036 + 0.6984}·Pr

In this way, rules are acquired cluster by cluster and stored in the knowledge base 3.
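
One possible in-memory representation of such an IF-THEN rule is sketched below, purely as an assumption for illustration; the text does not specify how rules are stored. The class name `Rule`, the attribute keys, and the representation of the condition part as an OR-list of AND-terms are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # Condition part: OR-list of AND-terms, each mapping an attribute
    # to the set of values it may take.
    conditions: list
    # Execution part: regression parameters (a, b, c, d).
    coefficients: tuple

    def matches(self, record):
        """True if the record satisfies at least one AND-term."""
        return any(
            all(record.get(attr) in allowed for attr, allowed in term.items())
            for term in self.conditions
        )

    def predict(self, ratios, similar_sales):
        """Apply the execution-part formula to (area, staff, parking) ratios."""
        a, b, c, d = self.coefficients
        area, staff, parking = ratios
        return (area * a + staff * b + parking * c + d) * similar_sales


cluster_2_rule = Rule(
    conditions=[
        {"affluence": {"somewhat low"}, "family": {"average"}},
        {"affluence": {"average"}, "traffic": {"average", "good", "very good"}},
        {"affluence": {"high"}, "enthusiast": {"low"}},
    ],
    coefficients=(0.0636, 0.2236, 0.0036, 0.6984),
)

hit = cluster_2_rule.matches({"affluence": "average", "traffic": "good"})
miss = cluster_2_rule.matches({"affluence": "somewhat low", "family": "strong"})
```

Storing the condition part as data rather than code makes it straightforward to append a further AND-term later, which is what the correction procedure does when a sub-condition is promoted.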

[0032] Next, correction of the rules acquired as described above is explained. After rules have been acquired, they may need to be corrected in light of new data. The following describes correction of existing rules that predict the sales of a store scheduled to be newly opened.

[0033] FIG. 10 is an operation flowchart of the knowledge correction device 2. In FIG. 10, new data entered at the input unit 6 (step 301) is first passed through the knowledge base control unit 5 to the knowledge determination unit 20 of the knowledge correction device 2. The knowledge determination unit 20 matches the new data against all of the rule condition parts held in the knowledge base 3 (step 302) and determines whether matching against a rule condition part succeeded (step 303). If the new data fails to match the condition parts of the rules in the knowledge base 3, the sub-knowledge determination unit 21 matches the new data against the sub-knowledge stored in the knowledge base (step 304) and determines whether that matching succeeded (step 305). If the matching in step 305 succeeds, the sub-knowledge addition unit 22 adds the successfully matched sub-knowledge to the condition part of the rule (step 306).

[0034] If matching against a rule condition part succeeds in step 303, or if sub-knowledge has been added to the rule condition part in step 306, the tolerance determination unit 23 executes the execution part corresponding to the rule condition part that matched or to which the sub-knowledge was added, and determines whether the resulting output value satisfies a preset tolerance (step 307). If the tolerance is satisfied, it is further determined whether to correct the execution part of the rule (step 308). When the execution part is to be corrected, the data analysis unit 12 first performs multiple regression analysis with the new data added (step 309), acquires the parameters of the regression formula (step 310), and acquires the execution-part knowledge of the rule for each cluster (step 311). That is, only the execution part of the rule is corrected, and this is done by the data analysis unit 12 in the knowledge acquisition device 1. The partial knowledge correction unit 24 then combines the rule condition part with the corrected execution part corresponding to it (step 312).

[0035] If, on the other hand, the execution part of the rule is not to be corrected in step 308, the data, together with the fact that it satisfies the tolerance, is stored in the case base 4 (step 313). The data stored in the case base 4 is later used in batch processing by the knowledge base control unit 5, which activates the data analysis unit 12 to correct the execution parts of the rules.

[0036] If the tolerance is not satisfied in step 307, or if matching of the rule condition parts including the sub-knowledge fails in step 305, the overall knowledge correction instruction unit 25 determines whether to correct the entire set of rules stored in the knowledge base (step 314). If the correction is to be performed, the knowledge acquisition device 1 acquires the knowledge from the beginning (step 315); that is, everything from cluster analysis of all data through storage of the knowledge in the knowledge base is carried out. If the correction is not to be performed, the data is stored in the case base 4 as in step 313 (step 316). The data stored in the case base 4 is later used in batch processing by the knowledge base control unit 5 to correct the knowledge base 3 as a whole.

[0037] Next, correction of specific rules of the sales prediction system is described. As described above, rule correction falls broadly into three cases. In the first case, no applicable rule condition part exists in the knowledge base even when sub-knowledge is taken into account; here the knowledge acquisition device 1 starts again from cluster analysis and all rules are corrected. In the second case, an applicable rule condition part exists in the knowledge base but the result of executing the rule's execution part does not satisfy the tolerance; here too the knowledge acquisition device 1 starts again from cluster analysis. In the third case, an applicable rule condition part exists in the knowledge base and the result of executing the rule's execution part satisfies the tolerance; here only data analysis is performed, in order to correct the execution part corresponding to that rule condition part.
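
The three cases can be condensed into a small decision function. This is a simplified sketch, not the flowchart of FIG. 10 itself: it leaves out the option of deferring processing by storing the data in the case base, and the names `Action` and `choose_correction` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    FULL_REACQUISITION = "redo cluster analysis and re-acquire all rules"
    REGRESSION_ONLY = "re-fit only the execution part of the matching rule"


def choose_correction(matches_rule, matches_sub_knowledge, within_tolerance):
    """Map the outcome of the checks to the kind of correction required."""
    applicable = matches_rule or matches_sub_knowledge
    if not applicable:
        return Action.FULL_REACQUISITION   # first case
    if not within_tolerance:
        return Action.FULL_REACQUISITION   # second case
    return Action.REGRESSION_ONLY          # third case


# New data that only matches a sub-condition but predicts within tolerance.
action = choose_correction(matches_rule=False,
                           matches_sub_knowledge=True,
                           within_tolerance=True)
```

Only the third case avoids repeating the cluster analysis, which is why it is the inexpensive path and the one the following description concentrates on.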

[0038] The following description focuses on the third case, and in particular on the situation where matching against the rule condition parts fails but matching succeeds once sub-knowledge is taken into account. First, when weighted new data is input to the knowledge determination unit 20 of the knowledge correction device 2, the knowledge determination unit 20 matches the new data against all of the rule condition parts held in the knowledge base 3. Matching between the new data and the condition parts of the rules is repeated, and it is determined whether matching against a rule condition part succeeded. If matching between the new data and the rule condition parts in the knowledge base 3 fails, the sub-knowledge determination unit 21 matches the new data against the rule condition parts stored in the knowledge base as sub-knowledge. For example, FIG. 8, as described above, is the result of acquiring knowledge by inductive learning, and it shows that eight items of knowledge were acquired. However, the number of cases supporting each rule condition part is taken into account, and a condition supported by only one data item is regarded as an exception and becomes sub-knowledge. This is because a condition supported by a single case cannot be regarded as a typical rule at this stage.

[0039] If the sub-conditions are checked one after another and matching fails for all of them, the data belongs to none of the conditions, so processing must start over from cluster analysis. The timing for deciding whether to start over from cluster analysis is arbitrary: for example, it can be preset on the computer so that processing starts automatically once the number of data items reaches a predetermined number, or processing can be carried out when the user so specifies. Here, processing is carried out at the user's instruction. If the user chooses to process the data, processing starts over from cluster analysis; if not, the data is stored in the case base and processed when later instructed. If matching against a sub-condition succeeds, the satisfied sub-condition is added to the condition-part knowledge of the rule. For example, suppose the new data is as follows.

・resident affluence (2: somewhat low)
・surrounding-area conditions (3: average)
・traffic conditions (3: average)
・customer type (2: both)
・seasonal variability (1: small)
・office-automation (OA) penetration (2: lower-middle)
・enthusiast orientation (1: low)
・family orientation (3: strong)

This data does not satisfy the condition-part knowledge of the rule already acquired from cluster number 2 in FIG. 8, but it does satisfy the sub-condition "resident affluence (somewhat low) AND family orientation (strong)". The condition-part knowledge of FIG. 8 is therefore changed as follows.

・resident affluence (low) AND family orientation (average): sub-condition
・resident affluence (somewhat low) AND family orientation (average): rule condition
・resident affluence (somewhat low) AND family orientation (strong): rule condition
・resident affluence (average) AND traffic conditions (average): rule condition
・resident affluence (average) AND traffic conditions (good): rule condition
・resident affluence (average) AND traffic conditions (very good): rule condition
・resident affluence (high) AND enthusiast orientation (low): rule condition
・resident affluence (high) AND enthusiast orientation (average) AND OA penetration (medium): sub-condition

As a result, the sub-condition "resident affluence (somewhat low) AND family orientation (strong)" is changed to a rule condition and added to the condition part of the rule.
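
The promotion of a matched sub-condition into the rule can be sketched with the conditions and the new data item given above. The code is an illustration under assumptions: the function name, the dictionary keys, and the flat attribute-to-value representation of each condition are hypothetical.

```python
def promote_matching_sub_condition(rule_conditions, sub_conditions, record):
    """Move the first sub-condition satisfied by the record into the rule.

    Returns the promoted condition, or None if no sub-condition matches.
    """
    for cond in sub_conditions:
        if all(record.get(attr) == value for attr, value in cond.items()):
            sub_conditions.remove(cond)
            rule_conditions.append(cond)
            return cond
    return None


# State before the new data arrives: five rule conditions, three sub-conditions.
rule_conditions = [
    {"affluence": "somewhat low", "family": "average"},
    {"affluence": "average", "traffic": "average"},
    {"affluence": "average", "traffic": "good"},
    {"affluence": "average", "traffic": "very good"},
    {"affluence": "high", "enthusiast": "low"},
]
sub_conditions = [
    {"affluence": "low", "family": "average"},
    {"affluence": "somewhat low", "family": "strong"},
    {"affluence": "high", "enthusiast": "average", "oa": "medium"},
]

new_record = {
    "affluence": "somewhat low", "surroundings": "average",
    "traffic": "average", "customer_type": "both",
    "seasonality": "small", "oa": "lower-middle",
    "enthusiast": "low", "family": "strong",
}

promoted = promote_matching_sub_condition(rule_conditions, sub_conditions,
                                          new_record)
```

After the call the rule holds six conditions and two sub-conditions remain, matching the list given in the text.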

[0040] Next, when the new data satisfies the condition part of a rule, the new data is substituted into the calculation formula of the rule's execution part, and it is determined whether the calculated output value satisfies the tolerance. Here, the tolerance is a setting of how much error the output value is allowed to contain. The tolerance could be set and checked only on the value obtained from the calculation formula, that is, on the predicted value. In this embodiment, however, multiple regression analysis is used, so to make full use of the analysis result, confidence intervals are set for the regression coefficients, which are the parameters of the calculation formula, and these are used to determine whether the tolerance is satisfied. The intervals can be chosen according to the probability value of the correlation between the determined-factor data and each determining factor. Here, the probability value is the confidence derived from the data analysis result; in other words, it is the likelihood that a prediction will prove correct, that is, the significance level expressed as a percentage.

[0041] For example, suppose that, for the determined factor "sales ratio", the confidence levels for the determining factors "sales floor area ratio", "parking capacity ratio", and "employee ratio" are 99%, 90%, and 90% or less, respectively. The width of each range can then be chosen according to these confidence levels. A factor with a strong correlation has a large influence, so its data range is made narrow; conversely, a factor with a weak correlation has a small influence, so its data range can be made wider than that of a strongly correlated factor. Limits on the maximum and minimum values of the determining factors (generally called explanatory variables in multiple regression analysis) are also important. The question is whether the input determining-factor data falls between the maximum and minimum values of the determining-factor data that were used when the multiple regression analysis was performed on past data. If the input data does not fall within these limits, applying it to the calculation formula for the determined factor yields a meaningless prediction.
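
The minimum/maximum limit on the explanatory variables is easy to check mechanically. The sketch below is a hedged illustration: the training rows, key names, and function names are hypothetical, and it covers only the range check, not the confidence intervals on the regression coefficients.

```python
def observed_ranges(training_rows):
    """Minimum and maximum of each explanatory variable in the training data."""
    keys = training_rows[0].keys()
    return {k: (min(r[k] for r in training_rows),
                max(r[k] for r in training_rows))
            for k in keys}


def within_observed_ranges(record, ranges):
    """True if every explanatory variable lies inside its observed range."""
    return all(lo <= record[k] <= hi for k, (lo, hi) in ranges.items())


# Hypothetical ratios that were used when the regression was fitted.
training_rows = [
    {"area_ratio": 1.33, "staff_ratio": 2.0, "parking_ratio": 1.0},
    {"area_ratio": 0.80, "staff_ratio": 1.0, "parking_ratio": 0.5},
    {"area_ratio": 1.10, "staff_ratio": 1.5, "parking_ratio": 2.0},
]
ranges = observed_ranges(training_rows)

inside = within_observed_ranges(
    {"area_ratio": 1.0, "staff_ratio": 1.2, "parking_ratio": 1.5}, ranges)
outside = within_observed_ranges(
    {"area_ratio": 3.0, "staff_ratio": 1.2, "parking_ratio": 1.5}, ranges)
```

A record such as the second one, whose area ratio lies far outside anything seen during fitting, would be an extrapolation, which is the situation the text describes as producing a meaningless prediction.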

[0042] If the new data does not satisfy the tolerance, processing starts over from cluster analysis. This is handled in the same way as when no sub-condition is satisfied.

[0043] If the data satisfies the tolerance, the input data conforms to the existing rule and there is no problem, but multiple regression analysis can still be performed to refine the rule into a stricter one. The timing of this correction is decided by whether the execution part of the rule is to be corrected. If the data is not to be processed, it is stored in the case base and processed when later instructed; that is, batch processing is performed. If the data is to be processed, the stored data is added, multiple regression analysis is performed in the same way as when the rule was acquired, the parameters of the calculation formula are acquired, the execution part for each cluster is acquired, and the result is adopted as the corrected execution part and combined into the rule.

[0044] For example, for the existing rule of cluster number 2, initially acquired by the knowledge acquisition device 1, the following correction results are first obtained.
・The sub-condition that was successfully matched is added to the condition part of the rule.
・Result of the new multiple regression analysis: a = 0.0642, b = 0.2228, c = 0.0025, d = 0.7001

The existing rule is then corrected as follows.

IF
・resident affluence (somewhat low) AND family orientation {(average) OR (strong)}, OR
・resident affluence (average) AND traffic conditions {(average) OR (good) OR (very good)}, OR
・resident affluence (high) AND enthusiast orientation (low)
THEN
Ps = {(Us/Ur)·0.0642 + (Js/Jr)·0.2228 + (Cs/Cr)·0.0025 + 0.7001}·Pr

[0045] In this way, rules in IF-THEN form can be acquired and corrected.

[0046] As described above, in a first stage a large volume of data is classified into optimal clusters by the cluster classification means, and in a second stage rules in IF-THEN form are acquired for each of the classified clusters. Knowledge can therefore be acquired automatically. Correction of that knowledge takes into account consistency with the knowledge already acquired for each cluster, so it can be carried out easily. That is, because the acquired knowledge is organized by cluster, it can later be corrected, extended, verified, and otherwise put to use with little effort. Knowledge that is difficult to obtain from experts can also be acquired automatically.

【００４７】[0047]

EFFECTS OF THE INVENTION As described above, in the knowledge acquisition device of the present invention, which acquires knowledge to be stored in the knowledge base of a knowledge base system from a plurality of data groups having data corresponding to a plurality of attributes, the data classification means performs cluster analysis on the partial data groups of the plurality of data groups that consist of a specified first attribute group among the plurality of attributes, and classifies the plurality of data groups into clusters. The condition-part knowledge acquisition means acquires, for each cluster classified by the data classification means, a logical expression of cluster attribute conditions obtained by a predetermined inductive learning method as the condition-part knowledge of a rule. The execution-part knowledge acquisition means acquires, for each cluster classified by the data classification means, an optimal numerical relational expression as the execution-part knowledge of the rule by a predetermined data analysis method, based on the plurality of partial data groups within the cluster that consist of a specified second attribute group. The rule generation means generates the rules of the knowledge base from the per-cluster knowledge obtained from the condition-part knowledge acquisition means and the execution-part knowledge acquisition means. Furthermore, when no applicable condition-part knowledge exists in the knowledge base even when sub-knowledge is taken into account, processing starts from data classification by the data classification means, and all knowledge is corrected from all data including the new data. When applicable condition-part knowledge exists in the knowledge base but the result of processing by the execution-part knowledge does not satisfy the tolerance, processing likewise starts from data classification by the data classification means, and all knowledge is corrected from all data including the new data. When applicable condition-part knowledge exists in the knowledge base and the result of processing by the execution-part knowledge satisfies the tolerance, only the execution-part knowledge corresponding to that condition-part knowledge is corrected, by the execution-part knowledge acquisition means. The present invention can therefore acquire knowledge for the knowledge base automatically, and because the acquired knowledge is divided by cluster, the knowledge can also be corrected and updated easily. This has the advantage of reducing the time and labor required to acquire and correct the knowledge in the knowledge base.

[0048] In addition, knowledge can be acquired and corrected objectively and easily from a plurality of data groups without relying entirely on experts. This has the advantage that the knowledge in the acquired knowledge base can continue to be used afterwards through addition, correction, verification, and the like.

[0049] A further advantage is that even knowledge that would be difficult to obtain from experts in advance can be acquired.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a knowledge base system according to an embodiment of the present invention.

FIG. 2 is an operation flowchart of the knowledge acquisition device 1.

FIG. 3 is a diagram showing specific abstract item data.

FIG. 4 is a diagram showing specific abstract item data.

FIG. 5 is a diagram showing weighted abstract item data.

FIG. 6 is a diagram showing weighted abstract item data.

FIG. 7 is a dendrogram showing the result of the cluster analysis.

FIG. 8 is a diagram showing the inductive learning result of the inductive learning unit 11.

FIG. 9 is a diagram showing the error between actual sales and predicted sales based on the multiple regression analysis result for cluster number 2.

FIG. 10 is an operation flowchart of the knowledge correction device 2.

[Explanation of symbols]

1 knowledge acquisition device
2 knowledge correction device
3 knowledge base
4 case base
5 knowledge base control unit
6 input unit
7 output unit
8 inference unit
10 cluster analysis unit
11 inductive learning unit
12 data analysis unit
13 knowledge generation unit
20 knowledge determination unit
21 sub-knowledge determination unit
22 sub-knowledge addition unit
23 allowable range determination unit
24 partial knowledge correction unit
25 overall knowledge correction instruction unit

Continuation of the front page

(56) References: JP-A-2-59931 (JP, A); JP-A-3-224054 (JP, A); JP-A-1-259488 (JP, A); JP-A-61-72354 (JP, A); JP-A-4-596 (JP, A); Suri Kagaku (数理科学, Mathematical Sciences), Vol. 29, No. 3, Saiensu-sha (株式会社サイエンス社), 1991, pp. 27-31 (Japan Patent Office CSDB document number: CSNW199900287003)

(58) Fields searched (Int. Cl.⁷, DB name): G06F 9/44; G06F 17/60; G06F 19/00; G06F 17/30; JICST file (JOIS); CSDB (Japan Patent Office)

Claims

(57) [Claims]

1. A knowledge acquisition device for a knowledge base system, which acquires knowledge to be stored in a knowledge base of the knowledge base system from a plurality of data groups having data corresponding to a plurality of attributes, the device comprising:
data classification means for performing cluster analysis on partial data groups, of the plurality of data groups, that have a specific attribute among the plurality of attributes, and classifying the plurality of data groups into clusters;
condition part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, a logical expression of the attribute conditions of the cluster, obtained by a predetermined inductive learning method, as knowledge of the condition part of a rule;
execution part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, an optimal numerical relational expression as knowledge of the execution part of the rule by a predetermined data analysis method, based on a plurality of partial data groups in the cluster including other attributes; and
rule generation means for generating the rules of the knowledge base from the knowledge for each cluster obtained by the condition part knowledge acquisition means and the execution part knowledge acquisition means.
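For illustration only, the four means of claim 1 can be sketched as a small pipeline. This sketch makes several simplifying assumptions that are not taken from the specification: clustering is done on a single attribute by splitting the sorted values at gaps larger than `max_gap` (which is what single-linkage clustering with a distance cut-off reduces to in one dimension), the condition part is just the value interval covered by each cluster rather than the output of an inductive learning method, and the execution part is a one-variable least-squares line rather than a multiple regression. All function, field, and attribute names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    lo: float         # condition part: lo <= record[cluster_key] <= hi
    hi: float
    slope: float      # execution part: y = slope * x + intercept
    intercept: float


def cluster_by_gap(records: List[Record], key: str,
                   max_gap: float) -> List[List[Record]]:
    """Data classification: split the sorted values of `key` at large gaps."""
    if not records:
        return []
    ordered = sorted(records, key=lambda r: r[key])
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[key] - prev[key] > max_gap:
            clusters.append([])          # gap too large: start a new cluster
        clusters[-1].append(cur)
    return clusters


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Execution part: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:                         # all x equal: fall back to the mean of y
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def acquire_rules(records: List[Record], cluster_key: str,
                  x_key: str, y_key: str, max_gap: float) -> List[Rule]:
    """Rule generation: one IF (interval) THEN (linear expression) per cluster."""
    rules = []
    for cluster in cluster_by_gap(records, cluster_key, max_gap):
        keys = [r[cluster_key] for r in cluster]
        slope, intercept = fit_line([r[x_key] for r in cluster],
                                    [r[y_key] for r in cluster])
        rules.append(Rule(min(keys), max(keys), slope, intercept))
    return rules
```

Keeping one rule per cluster mirrors the division of knowledge by cluster: a later correction can refit the execution part of a single rule without touching the others.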

2. A knowledge correction device for a knowledge base system, the knowledge base system having:
data classification means for performing, based on a plurality of data groups having data corresponding to a plurality of attributes, cluster analysis on partial data groups, of the plurality of data groups, that have a specific attribute among the plurality of attributes, and classifying the plurality of data groups into clusters;
condition part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, a logical expression of the attribute conditions of the cluster, obtained by a predetermined inductive learning method, as knowledge of the condition part of a rule;
execution part knowledge acquisition means for acquiring, for each cluster classified by the data classification means, an optimal numerical relational expression as knowledge of the execution part of the rule by a predetermined data analysis method, based on a plurality of partial data groups in the cluster including other attributes; and
rule generation means for generating the rules of the knowledge base from the knowledge for each cluster obtained by the condition part knowledge acquisition means and the execution part knowledge acquisition means,
the knowledge correction device storing in the knowledge base, as sub-knowledge, that part of the knowledge acquired by the condition part knowledge acquisition means whose use as knowledge of the knowledge base is held in reserve, and correcting the knowledge of the knowledge base for a new data group, the knowledge correction device comprising:
knowledge determination means for comparing the new data group with the knowledge of the condition part of the rules stored in the knowledge base to determine whether any is satisfied;
sub-knowledge determination means for comparing the new data group with the sub-knowledge stored in the knowledge base to determine whether any is satisfied;
sub-knowledge addition means for, when the sub-knowledge determination means determines that some sub-knowledge is satisfied, cancelling the reservation of that sub-knowledge and adding it to the knowledge of the condition part of the rule;
allowable range determination means for, when the knowledge determination means or the sub-knowledge determination means determines that some knowledge is satisfied, performing processing based on the knowledge of the execution part of the rule corresponding to the knowledge of the condition part of the rule, and determining whether the processing output is within a predetermined allowable range;
partial knowledge correction means for, when the allowable range determination means determines that the output is within the allowable range, correcting the knowledge of the execution part of the rule and storing the knowledge of the condition part of the rule and the corrected knowledge of the execution part of the rule in the knowledge base; and
overall knowledge correction means for, when the sub-knowledge determination means determines that no sub-knowledge is satisfied, or when the allowable range determination means determines that the output is not within the allowable range, instructing that knowledge be acquired from the plurality of data groups including the new data group.