JPH07160658A

JPH07160658A - Method for classifying data

Info

Publication number: JPH07160658A
Application number: JP5306778A
Authority: JP
Inventors: Masato Togami; 正人戸上
Original assignee: Togami Electric Mfg Co Ltd
Current assignee: Togami Electric Mfg Co Ltd
Priority date: 1993-12-07
Filing date: 1993-12-07
Publication date: 1995-06-23

Abstract

PURPOSE: To provide a learning method that combines inductive machine learning with neural network learning so as to exploit the merits of both and compensate for their respective demerits. CONSTITUTION: In this method for classifying data, learning is first performed by inductive machine learning, and when a category cannot be discriminated by that learning, learning is performed by a neural network. Combining inductive learning with the neural network improves the recognition rate of the data. Even when the neural network produces an erroneous answer, the inductive learning limits the categories between which the error can have been made. The categories can be narrowed further by using the neural network's ability to discriminate attributes in multidimensional space nonlinearly, even in regions where attribute values overlap.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method of classifying data into several categories (classes) when the data are given as a set of attribute-value pairs. In particular, it relates to a method that combines inductive machine learning with neural network learning for the case where the attribute values have a distribution, and to a data classification method especially useful for pattern recognition and accident diagnosis.

[0002]

[Prior Art] Conventional inductive machine learning methods build an identification tree (decision tree) from discrete attribute values and do not handle attribute values that have a distribution. Japanese Patent Application No. 4-130083 proposed an inductive learning method that can handle cases where attribute values overlap.

[0003]

[Problems to Be Solved by the Invention] In the previously proposed Japanese Patent Application No. 4-130083, when attribute values overlap, the method indicates the situations in which classification is not possible by presenting two or more candidate categories, and uses the probability distribution to indicate which category the data more likely belong to. Even when two or more categories are presented, however, it is desirable to narrow the learning result down to a single category if possible.

[0004] A learning method using a neural network, on the other hand, may produce ambiguous or erroneous results when the distribution of attribute values is wide, or when the number of categories or the number of data is large. A neural network can, however, discriminate attributes in multidimensional space nonlinearly, which makes it possible to narrow the result down to a single category.

[0005] In contrast, the method of Japanese Patent Application No. 4-130083 can learn even when the attribute distribution is wide, but it cannot discriminate attributes in multidimensional space nonlinearly; it performs linear discrimination only.

[0006] The problem to be solved by the present invention is therefore to propose a learning method that combines inductive machine learning and neural network learning so as to exploit the advantages of both and compensate for their drawbacks.

[0007]

[Means for Solving the Problems] To solve the above problem, the first classification method of the present invention performs learning by inductive machine learning and, when a category cannot be discriminated by that learning, performs learning by a neural network.

[0008] In the second classification method of the present invention, the categories that remain after excluding those that can be completely discriminated by the inductive machine learning method are learned by a neural network.

[0009] In the third classification method of the present invention, machine learning is used for the attributes that discriminate all combinations of categories in which the distribution of at least one attribute value is separated, and a neural network is used to learn the categories that this machine learning cannot discriminate.

[0010] In the fourth classification method of the present invention, learning by machine learning is performed at a node above the one reached by the machine learning of the third classification method, and the categories that cannot be discriminated by that machine learning are learned by a neural network.

[0011] The fifth classification method of the present invention combines the third and fourth classification methods to perform learning by machine learning and a neural network.

[0012]

[Operation] In the present invention, when the distribution of attribute values is wide or the number of categories or data is large, the categories are first discriminated by the inductive learning method. When two or more categories cannot be discriminated, the candidates are narrowed further by using the neural network's ability to discriminate attributes in multidimensional space nonlinearly.
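The two-stage flow described in this paragraph can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the attribute ranges, the category names, and the `fallback` callable that stands in for a trained neural network are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical learned ranges: category -> attribute -> (min, max).
Ranges = Dict[str, Dict[str, Tuple[float, float]]]


def candidates(sample: Dict[str, float], ranges: Ranges) -> List[str]:
    """Inductive stage: keep the categories whose every range contains the sample."""
    return [
        cat for cat, attrs in ranges.items()
        if all(lo <= sample[a] <= hi for a, (lo, hi) in attrs.items())
    ]


def classify(sample: Dict[str, float], ranges: Ranges,
             fallback: Callable[[Dict[str, float], List[str]], str]
             ) -> Tuple[Optional[str], str]:
    """Return (category, stage). `fallback` stands in for a neural network
    trained only on the categories the inductive stage could not separate."""
    cands = candidates(sample, ranges)
    if not cands:
        return None, "error"            # no learned range matches the sample
    if len(cands) == 1:
        return cands[0], "inductive"    # uniquely identified by the ranges
    return fallback(sample, cands), "neural"


# Hypothetical ranges for two categories and two attributes.
ranges = {
    "insulator": {"A2": (0.0, 5.0), "AT": (1.0, 8.0)},
    "gap":       {"A2": (3.0, 9.0), "AT": (6.0, 20.0)},
}


def pick_first(sample, cands):
    """Placeholder for a trained network: simply returns the first candidate."""
    return cands[0]


print(classify({"A2": 1.0, "AT": 2.0}, ranges, pick_first))
print(classify({"A2": 4.0, "AT": 7.0}, ranges, pick_first))
```

The point of the design is that the fallback only ever has to choose among the candidates left by the first stage, which is what bounds the categories between which a wrong answer can occur.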

[0013]

[Embodiments] The present invention is described concretely below. To clarify its features, the accident cause diagnosis below is first worked through with the method disclosed in Japanese Patent Application No. 4-130083, showing the procedure and its results. Those results are then used to explain at which stage the neural network is combined.

[0014] FIG. 1 is an electrical system diagram showing an example of a test circuit for carrying out the accident cause diagnosis method, as an application of data classification. In the figure, 1 is a circuit breaker, 2 is a ZPD (zero-phase voltage detector), 3 is a power-source-side capacitor, 4 is a zero-phase current transformer, 5 is a load-side capacitor, 6 is an accident-generating switch, 7 is a high-voltage switch, 8 is a controller, and 9 is a transformer. The accident conditions tested were: insulator, cross-linked polyethylene (CV) cable, poultry meat, complete ground fault, resistance ground fault, and gap ground fault.

[0015] The zero-phase current at the time of the fault was recorded on a digital waveform recorder (sampling rate 48 kHz). A 200 ms portion of the recorded waveform was cut out, and from its FFT (fast Fourier transform) the distortion factors (%) of the 2nd to 8th harmonics, with 60 Hz as the fundamental, and the total harmonic distortion factor (%) (2nd to 20th harmonics) were obtained. The configuration of the measurement system is shown in the block diagram of FIG. 2, and representative waveforms and FFT analysis results are shown in FIG. 3.
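The feature extraction in this paragraph can be illustrated with a short sketch. The sampling rate, window length, fundamental, and harmonic orders come from the text; the definitions used here (each distortion factor as the harmonic amplitude divided by the fundamental amplitude, total harmonic distortion as the root-sum-square of orders 2 to 20 over the fundamental, no window function) are common conventions assumed for illustration and are not confirmed by the source. A single-bin DFT is used in place of a full FFT because 200 ms holds exactly 12 cycles of 60 Hz, so each harmonic falls on an exact bin.

```python
import cmath
import math

FS = 48_000        # sampling rate in Hz (from the text)
WINDOW_S = 0.2     # 200 ms analysis window (from the text)
F0 = 60.0          # fundamental frequency in Hz (from the text)


def harmonic_amplitude(x, order):
    """Amplitude of the `order`-th harmonic of F0 via a single-bin DFT."""
    n = len(x)
    w = -2j * math.pi * order * F0 / FS
    acc = sum(v * cmath.exp(w * k) for k, v in enumerate(x))
    return 2.0 * abs(acc) / n


def distortion_features(x):
    """Return ({order: distortion %} for orders 2..8, THD % over orders 2..20)."""
    a1 = harmonic_amplitude(x, 1)
    amps = {h: harmonic_amplitude(x, h) for h in range(2, 21)}
    per_order = {h: 100.0 * amps[h] / a1 for h in range(2, 9)}
    thd = 100.0 * math.sqrt(sum(a * a for a in amps.values())) / a1
    return per_order, thd


# Synthetic test signal (hypothetical): fundamental plus a 10 % third harmonic.
n = round(FS * WINDOW_S)   # 9600 samples
x = [math.sin(2 * math.pi * F0 * k / FS)
     + 0.1 * math.sin(2 * math.pi * 3 * F0 * k / FS) for k in range(n)]
per_order, thd = distortion_features(x)
print(per_order[3], thd)
```

The eight values `per_order[2]` to `per_order[8]` and `thd` would then play the role of the attributes $A_2, \dots, A_8, A_T$ used in the following paragraphs.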

[0016] a) Decision tree learning method

Let the m categories to be selected be $C_1, \dots, C_i, \dots, C_m$, and let the n attributes that each of these categories has be $A_1, \dots, A_i, \dots, A_n$. The categories to be selected in the accident cause diagnosis are $C_I$: insulator, $C_C$: CV cable, $C_N$: poultry, $C_K$: complete ground fault, $C_R$: resistance ground fault, and $C_G$: gap ground fault. About 400 data in total were collected for these; about 300 were used as learning data and about 100 were used to test the learning result. The attributes of each category are $A_2$: 2nd harmonic distortion factor, $A_3$: 3rd harmonic distortion factor, $A_4$: 4th harmonic distortion factor, $A_5$: 5th harmonic distortion factor, $A_6$: 6th harmonic distortion factor, $A_7$: 7th harmonic distortion factor, $A_8$: 8th harmonic distortion factor, and $A_T$: total harmonic distortion factor.

[0017] The range of each attribute value was taken as the interval between its minimum and maximum values. Table 1 shows the range of each attribute value.

[Table 1]

[0018] b) Creation of a decision tree from combinations of separated categories

To discriminate all categories, first consider the attribute-value distributions of any two categories. As shown in FIG. 4, three cases are possible. For the distribution of attribute $A_K$, the relationship of $C_j$ as seen from $C_i$ is one of the following.

State (i): the distribution of attribute values of $C_i$ does not overlap that of $C_j$.
State (ii): the distribution of attribute values of $C_i$ overlaps that of $C_j$ entirely.
State (iii): the distribution of attribute values of $C_i$ overlaps that of $C_j$ partially.

[0019] Only in state (i) can $C_i$ and $C_j$ be completely discriminated by attribute $A_K$. In other words, for $C_i$ and $C_j$ to be completely discriminable by $A_K$, it is a necessary condition that the attribute-value distributions of the two categories be in state (i).

[0020] Assign the Boolean value 1 to the case where $C_i$ and $C_j$ can be completely discriminated by attribute $A_K$ and 0 to the case where they cannot, and define the coefficient $b_K$ as in equation (1).

$$b_K = \begin{cases} 1 & \text{state (i): discriminable by attribute } A_K \\ 0 & \text{not completely discriminable by attribute } A_K \end{cases} \tag{1}$$
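The three states and the coefficient $b_K$ can be sketched for ranges given as (min, max) pairs, as in Table 1. This is only an illustration: state (ii) is read here as "the range of $C_i$ lies entirely inside the range of $C_j$", and ranges that merely touch at an endpoint are treated as overlapping; both readings are assumptions.

```python
def overlap_state(ci, cj):
    """State of C_i's range relative to C_j's range for one attribute.

    Ranges are (min, max) tuples; returns "i", "ii", or "iii".
    """
    (lo_i, hi_i), (lo_j, hi_j) = ci, cj
    if hi_i < lo_j or hi_j < lo_i:
        return "i"      # no overlap: fully separable on this attribute
    if lo_j <= lo_i and hi_i <= hi_j:
        return "ii"     # C_i lies entirely inside C_j (assumed reading)
    return "iii"        # partial overlap


def b(ci, cj):
    """Coefficient b_K of equation (1): 1 only in state (i)."""
    return 1 if overlap_state(ci, cj) == "i" else 0


print(overlap_state((0, 1), (2, 3)), b((0, 1), (2, 3)))
print(overlap_state((2, 3), (1, 5)), b((2, 3), (1, 5)))
print(overlap_state((0, 3), (2, 5)), b((0, 3), (2, 5)))
```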

[0021] Treating each $A_K$ as a Boolean variable, the attributes that can discriminate $C_i$ from $C_j$ are expressed with $b_K$ as a logical sum, $f(C_i, C_j)$, given by equation (2). Strictly, a separate symbol should be defined for the Boolean variable of attribute $A_K$, but because the result of the calculation shows directly which attributes to use for discrimination, no new symbol is introduced.

$$f(C_i, C_j) = b_1 \cdot A_1 + \dots + b_K \cdot A_K + \dots + b_n \cdot A_n \tag{2}$$

[0022] For example, suppose objects belonging to categories $C_1$ and $C_2$ have attributes $A_1$, $A_2$, $A_3$, and attributes $A_1$ and $A_2$ are in state (i). Clearly $C_1$ and $C_2$ can be discriminated using attribute $A_1$ or $A_2$, and in equation (2) this reads

$$f(C_1, C_2) = 1 \cdot A_1 + 1 \cdot A_2 + 0 \cdot A_3 = A_1 + A_2$$

where the word "or" corresponds to the logical sum. That is, in equation (2), $C_i$ and $C_j$ are discriminable when $f(C_i, C_j) = 1$, and they can be completely discriminated by using at least one of the attributes that appear as terms of $f(C_i, C_j)$.

[0023] Equation (2) gives the attributes that discriminate any two categories. For example, the attributes that discriminate $C_I$ from $C_C$ are given by

$$f(C_I, C_C) = A_2 + A_3 + A_4 + A_8 \tag{3}$$

[0024] The family of attribute sets that can discriminate the combinations of categories in which the distribution of at least one attribute value is completely separated is obtained by taking the logical product $E$ of $f(C_i, C_j)$ ($i = 1, \dots, n$; $j = 1, \dots, m$; $i \neq j$) over all combinations for which $f(C_i, C_j) = 1$.

$$E = \prod f(C_i, C_j), \quad i \in \{1, \dots, n\},\ j \in \{1, \dots, m\},\ i \neq j \tag{4}$$

[0025] The result of $E = \prod f(C_i, C_j)$ can be written in sum-of-products form. Writing each product term as $AS_x$ (a set of attributes) gives

$$E = AS_1 + \dots + AS_x + \dots + AS_p, \quad \text{where } AS_x = A_a A_b A_c \cdots \tag{5}$$

[0026] Each of the terms $AS_1, \dots, AS_x, \dots, AS_p$ is therefore an attribute set that can discriminate the combinations of categories in which the distribution of the attribute values of at least one attribute $A_K$ is completely separated.
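Equations (2), (4), and (5) amount to expanding a product of sums into a sum of products. A small sketch of that expansion is shown here, with attribute sets represented as `frozenset`s and redundant terms removed by Boolean absorption. The function names and the three-category range table are hypothetical, and dropping non-minimal terms is an assumption about how the result is simplified.

```python
from itertools import combinations


def separating_attributes(ranges, ci, cj):
    """f(C_i, C_j): the attributes whose ranges for the two categories do not overlap."""
    return frozenset(
        a for a in ranges[ci]
        if ranges[ci][a][1] < ranges[cj][a][0] or ranges[cj][a][1] < ranges[ci][a][0]
    )


def attribute_sets(ranges):
    """E of equation (4), expanded into the minimal product terms AS_x of equation (5)."""
    terms = {frozenset()}
    for ci, cj in combinations(ranges, 2):
        f = separating_attributes(ranges, ci, cj)
        if not f:
            continue  # f = 0: this pair is not separable and is left out of the product
        # Multiply the running sum of products by (A_a + A_b + ...).
        terms = {t | {a} for t in terms for a in f}
        # Absorption: drop any term that strictly contains another term.
        terms = {t for t in terms if not any(u < t for u in terms)}
    return terms


# Hypothetical ranges: category -> attribute -> (min, max).
ranges = {
    "C1": {"A1": (0.0, 1.0), "A2": (0.0, 1.0), "A3": (0.0, 5.0)},
    "C2": {"A1": (2.0, 3.0), "A2": (0.5, 2.0), "A3": (1.0, 4.0)},
    "C3": {"A1": (0.5, 2.5), "A2": (3.0, 4.0), "A3": (6.0, 7.0)},
}
for term in sorted(attribute_sets(ranges), key=sorted):
    print(sorted(term))
```

In this toy table $f(C_1, C_2) = A_1$ and $f(C_1, C_3) = f(C_2, C_3) = A_2 + A_3$, so $E = A_1 A_2 + A_1 A_3$: either attribute pair is enough to separate every separable pair of categories.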

[0027] Equation (5) thus selects the family of attribute sets that can discriminate all combinations of categories in which the distribution of at least one attribute value is completely separated. For the present data the result is

$$E = A_2 A_T + A_3 A_4 A_T \tag{6}$$

which is rewritten as follows.

[0028]

$$AS_1 = A_2 A_T, \quad AS_2 = A_3 A_4 A_T \tag{7}$$

[0029] Using the attributes of the two attribute sets obtained, $AS_1$ and $AS_2$, the categories whose combinations are separated can be discriminated. However, insulator and poultry, and insulator, gap, and CV cable, cannot be completely discriminated.

[0030] c) Placement of attributes at each node of the decision tree

An arbitrary set is chosen from the two attribute sets obtained in equation (7). Here $AS_1 = A_2 A_T$ is selected. Attributes are placed at the nodes of the decision tree as follows.

[0031] When the attribute set obtained from equation (7) contains two or more attributes, an arbitrary one of them is placed at the upper node; here attribute $A_T$ is placed. When there are two or more attribute sets obtained from equation (7), the attribute axis divides, according to the state of overlap, into regions where the attribute distributions do not overlap and regions where they do. When the attribute value falls in a non-overlapping region, classification is complete at the root node.

[0032] In an overlapping region the categories cannot be separated, so they are classified again using another attribute. The attribute sets needed to discriminate the categories in the overlapping region are obtained recursively from equations (2) and (4), and an arbitrary attribute is chosen from among those that belong to the attribute sets obtained in equation (7). This processing is repeated until no node requiring re-discrimination remains between a category $C_i$ and any category $C_j$ that is in state (i) with it.

[0033] d) Discrimination of categories that have no separated attribute

It can happen that no attribute $A_K$ puts the attribute-value distributions of two categories $C_i$ and $C_j$ in state (i), and that for every attribute $A_K$ the distributions are in state (ii) or state (iii); that is, $f(C_i, C_j) = 0$. When such categories exist, $C_i$ and $C_j$ cannot be discriminated by the procedure above. The following describes the method used when the attribute-value distributions of two categories are in state (ii) or state (iii). Concretely, insulator and poultry, which cannot be discriminated, require this category splitting.

[0034] (1) Splitting of categories

Even when s categories $C_1, \dots, C_i, \dots, C_s$ partially overlap with respect to the distribution of some attribute $A_K$, that is, when every combination of the s categories is in state (ii) or state (iii) of FIG. 4, there are considered to be ranges of values in which the category can still be discriminated. These ranges allow partial discrimination, so the categories are split by the following method.

[0035] For an attribute $A_K$, the range of an arbitrary category $C_i$ can be divided into: the part that overlaps no other category; the parts where $C_i$ overlaps one other category; the parts where $C_i$ overlaps two other categories; and so on, up to the parts where $C_i$ overlaps $s-2$ other categories and the part where it overlaps $s-1$ other categories. Each part obtained by this division can be made into a new category. The number of combinations of parts where $C_i$ overlaps $s-n$ other categories is given by ${}_{s}C_{s-n+1}$. If a split category is the empty set for all attributes $A_K$, no new category is created for it.
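For a single attribute, the splitting described above can be sketched by cutting the axis at every range endpoint and labelling each segment with the set of categories that cover it. The function name and the example ranges are hypothetical; the ranges are chosen only to resemble the situation described for FIG. 5, whose actual values are not given in the text.

```python
def split_categories(ranges):
    """Split one attribute axis into segments labelled by the categories covering them.

    `ranges` maps category -> (min, max).
    Returns a list of (lo, hi, frozenset_of_categories), ordered along the axis.
    """
    points = sorted({p for lo_hi in ranges.values() for p in lo_hi})
    segments = []
    for lo, hi in zip(points, points[1:]):
        mid = (lo + hi) / 2.0
        cover = frozenset(c for c, (a, b) in ranges.items() if a <= mid <= b)
        if cover:                       # stretches covered by no category are skipped
            segments.append((lo, hi, cover))
    return segments


# Hypothetical ranges of three categories on one attribute.
ranges = {"C1": (0, 10), "C2": (2, 4), "C3": (6, 8)}
for lo, hi, cover in split_categories(ranges):
    print(lo, hi, sorted(cover))
```

With these ranges the segments covered by `C1` alone come out as three separate pieces, the `C1`/`C2` and `C1`/`C3` overlaps form one new category each, and no segment is covered by `C2` alone or `C3` alone, so no new category would be created for those.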

[0036] As a concrete case, consider FIG. 5, in which a child node contains three categories $C_1$, $C_2$, $C_3$ that cannot be distinguished. The attributes here are $A_1$ and $A_2$.

[0037] Let $C_{i*}$ denote the category newly created from the part of category $C_i$ that overlaps no other category. The attribute-value distribution of such a category may be split into separate pieces, as with the $C_{i*}$ created for attribute $A_1$ in FIG. 5. Let $C_{ij*}$ denote the category newly created from the part where $C_i$ overlaps one other category $C_j$; an example is the $C_{13*}$ created for attribute $A_1$ in FIG. 5. The newly created $C_{13*}$ means "category $C_1$ or category $C_3$". Categories created from the parts where $C_i$ overlaps two other categories are defined in the same way. For attribute $A_1$ in FIG. 5, $C_{2*}$ and $C_{3*}$ are empty sets, so no new categories are created for them. Because every pair of the newly created categories satisfies state (i), the categories can be split with attribute $A_K$ by the method described above.

[0038] Insulator and poultry, which cannot be discriminated, require category splitting. With respect to attribute $A_K$, any category $C_i$ can be split off into a category consisting of the part that overlaps no other category; for insulator and poultry this yields the categories $C_{I*}$ and $C_{N*}$. In the same way, the category $C_{IN*}$ can be created from the part where $C_i$ overlaps one other category.

[0039] (2) Discrimination of categories that have no separated attribute

For a set of categories whose attribute-value distributions are not completely separated, new categories are generated by category splitting. Which attribute is used to generate the new categories is decided from the probability distributions of the attributes.

[0040] The probability distribution of an attribute can be expressed as follows. Let $Z$ be the random variable representing attribute $A_K$, and let $p_i$ be the probability at attribute value $z_K$. The probability distribution of attribute $A_K$ can then be written

$$P(Z = z_K) = p_i \tag{8}$$

and, for the attribute-value distribution of any attribute $A_K$, the probability of the range $a \le Z \le b$ is

[Equation 1]

The probability of the range $a \le Z \le b$ is obtained from the above expression. Using this probability, the attributes effective for discriminating the child nodes are selected.

[0041] In the attribute-value distributions of arbitrary categories $C_i$ and $C_j$, an attribute value that has a high probability of lying in a part not overlapped by the other distribution identifies its category with higher confidence. Therefore, for an attribute $A_K$, the probability of the attribute values of category $C_i$ in the region that does not overlap $C_j$ at all is obtained and denoted $p(DJ_{A_K}(C_i, C_j))$. It indicates how strongly $C_j$ interferes with the discrimination of $C_i$ by $A_K$.

[0042] Using the probability $p(DJ_{A_K}(C_i, C_j))$, the following evaluation function is defined. The child nodes are discriminated using the attribute with the highest evaluation value.

[Equation 2]

[0043] The probability distribution of the attributes should properly be taken into account next, but to simplify the calculation it is assumed here that the attribute values are uniformly distributed between the minimum and maximum values. The evaluation value $F^*(A_K)$ is then calculated with equation (10) and the child nodes are discriminated; the attribute with the largest $F^*(A_K)$ is used for discrimination. FIGS. 6 and 7 show the learning result as a decision tree.
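Under the uniform-distribution simplification, the pairwise probability $p(DJ_{A_K}(C_i, C_j))$ reduces to a ratio of interval lengths. The sketch below computes only that probability for one attribute; the evaluation function $F^*$ is not reproduced in this text, so no attempt is made to implement it. The function name and the example ranges are hypothetical.

```python
def p_nonoverlap(ci, cj):
    """p(DJ_AK(C_i, C_j)) for one attribute under the uniform assumption.

    `ci` and `cj` are (min, max) ranges; the result is the fraction of
    C_i's range that is not covered by C_j's range.
    """
    (lo_i, hi_i), (lo_j, hi_j) = ci, cj
    width = hi_i - lo_i
    if width <= 0:
        raise ValueError("the range of C_i must have positive width")
    overlap = max(0.0, min(hi_i, hi_j) - max(lo_i, lo_j))
    return (width - overlap) / width


print(p_nonoverlap((0.0, 10.0), (8.0, 12.0)))   # partial overlap
print(p_nonoverlap((0.0, 10.0), (20.0, 30.0)))  # state (i): fully separate
print(p_nonoverlap((2.0, 3.0), (0.0, 10.0)))    # state (ii): fully covered
```

A value near 1 means that most samples of $C_i$ fall where $C_j$ never does, so the attribute is a good candidate for separating the pair.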

[0044] Drawing up the accident cause diagnosis flowchart from the decision tree learning results of FIGS. 6 and 7 gives FIGS. 8 to 14.

[0045] When about 100 data were used as test data, 7.53% could not be diagnosed (specifically, they ended at "ERROR" in FIGS. 8 to 14), 50.54% were narrowed down to one category, 21.51% to two categories, and 20.43% to three categories.
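As a cross-check on the arithmetic, the four percentages are reproduced to two decimal places by counts of 7, 47, 20, and 19 out of 93 test data. These counts are inferred here from the percentages and are not stated in the source, which says only "about 100".

```python
# Counts inferred from the reported percentages (an assumption, not source data).
counts = {
    "not diagnosed": 7,
    "one category": 47,
    "two categories": 20,
    "three categories": 19,
}
total = sum(counts.values())
percentages = {k: round(100.0 * v / total, 2) for k, v in counts.items()}
print(total, percentages)
```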

[0046] Next, how the neural network is combined is explained concretely. Neural networks are described in detail in "Neural Network Information Processing" by Hideki Aso, published by Sangyo Tosho. Usable networks include backpropagation networks, Boltzmann machines, and perceptrons; an embodiment using backpropagation is described here. Naturally, the inputs are the attribute values, the number of inputs is the number of attribute types, the outputs are the categories, and the number of outputs is the number of categories to be discriminated.
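The input and output layout described above can be illustrated with a very small backpropagation network in plain Python. This is a sketch under assumptions, not the network used in the embodiment: the single hidden layer, sigmoid units, squared-error loss, learning rate, number of epochs, and the toy data (two attributes, two categories) are all chosen here for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer of sigmoid units, trained by backpropagation on squared error."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rnd = random.Random(seed)

        def init(rows, cols):
            # One row per unit; the last column holds the bias weight.
            return [[rnd.uniform(-0.5, 0.5) for _ in range(cols + 1)] for _ in range(rows)]

        self.w1 = init(n_hidden, n_in)    # inputs: one per attribute
        self.w2 = init(n_out, n_hidden)   # outputs: one per category

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Error terms for the output and hidden layers (computed before any update).
        d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
        d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v

    def loss(self, data):
        return sum((yk - tk) ** 2 for x, t in data
                   for yk, tk in zip(self.forward(x)[1], t))


# Toy data (hypothetical): attribute values as inputs, one-hot category as target.
data = [([0.1, 0.2], [1, 0]), ([0.2, 0.1], [1, 0]),
        ([0.8, 0.9], [0, 1]), ([0.9, 0.7], [0, 1])]
net = TinyMLP(n_in=2, n_hidden=3, n_out=2)
before = net.loss(data)
for _ in range(2000):
    for x, t in data:
        net.train_step(x, t)
after = net.loss(data)
print(before, after)
```

The predicted category for a sample is the index of the largest output; with one output per category, restricting the network to the few categories left undecided by the tree keeps it small.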

[0047] As to the stage at which the two are combined, there are the following three stages and their combination. As stated earlier, the method of Japanese Patent Application No. 4-130083 cannot completely discriminate insulator from poultry, or insulator, gap, and CV cable from one another. The four categories insulator, poultry, gap, and CV cable are therefore learned by a neural network using all of the above data. The attributes used for this may be all of the attributes, arbitrarily chosen attributes, or attributes taken in descending order of $F^*(A_K)$. This corresponds to the second method of the means for solving the problems.

[0048] The results of this method are as follows. Complete ground fault and resistance ground fault can be discriminated by machine learning. Of about 60 data tested, about 40 were answered correctly and about 20 incorrectly.

[0049] 2) In this machine learning, the attributes $A_T$ and $A_2$ discriminate every combination of two categories that does not overlap. The overlapping parts are then discriminated, as in (2) above for sets of categories that have no separated attribute, by the probabilistically most favorable attribute. The second approach is to perform that last discrimination with a neural network instead.

[0050] In terms of the flowchart, the groups are as follows.

Group  Flowchart nodes           Categories
1)     104, 105, 106             poultry, insulator
2)     112, 113, 114             poultry, insulator
3)     119, 120, 121             poultry, insulator
4)     127, 128, 129             insulator, gap
5)     134, 135, 136             insulator, gap
6)     138, 139, 140, 141, 142   insulator, gap, CV cable
7)     144, 145, 146             CV cable, insulator
8)     151, 152, 153             CV cable, insulator

[0051] A neural network is trained for each of the above eight groups.

[0052] Concretely, for group 5) for example, the learning data that fall to nodes 134, 135, and 136 are used to learn the two categories "insulator" and "gap", and test data that fall to nodes 134, 135, and 136 of the flowchart are likewise discriminated by the neural network. As a result, about 10% of the answers were wrong, but every datum was narrowed down to a single category. This corresponds to the third method of the means for solving the problems.
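The routing just described, from the flowchart node a sample falls to, to the network trained for that group, can be sketched as a lookup table. The node numbers and categories are those listed in paragraph [0050]; the names `GROUPS`, `NODE_TO_GROUP`, and `route`, and the stand-in "networks" that simply return the first category of their group, are hypothetical.

```python
# Leaf groups of approach 2), taken from paragraph [0050]: group -> (nodes, categories).
GROUPS = {
    1: ((104, 105, 106), ("poultry", "insulator")),
    2: ((112, 113, 114), ("poultry", "insulator")),
    3: ((119, 120, 121), ("poultry", "insulator")),
    4: ((127, 128, 129), ("insulator", "gap")),
    5: ((134, 135, 136), ("insulator", "gap")),
    6: ((138, 139, 140, 141, 142), ("insulator", "gap", "CV cable")),
    7: ((144, 145, 146), ("CV cable", "insulator")),
    8: ((151, 152, 153), ("CV cable", "insulator")),
}
NODE_TO_GROUP = {node: g for g, (nodes, _) in GROUPS.items() for node in nodes}


def route(node, sample, networks):
    """Apply the network of the group that owns `node`; None if no group owns it."""
    group = NODE_TO_GROUP.get(node)
    if group is None:
        return None          # node not in any group: no network is consulted
    return networks[group](sample)


# Stand-ins for trained networks (hypothetical): each returns its group's first category.
networks = {g: (lambda sample, cats=cats: cats[0]) for g, (_, cats) in GROUPS.items()}
print(route(135, {"A2": 1.0}, networks))
print(route(100, {"A2": 1.0}, networks))
```

Each network is trained only on the learning data that reach its own nodes, so it never has to choose among more than the two or three categories of its group.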

[0053] 3) The third approach stops the machine learning at a node above that of approach 2) and performs learning with a neural network from there.

[0054] Concretely, in the flowchart, a neural network is trained for each of the following six groups.

Group  Flowchart nodes                                                                  Categories
1)     103, 104, 105, 106, 107, 108                                                     poultry, insulator
2)     111, 112, 113, 114, 115, 116                                                     poultry, insulator
3)     118, 119, 120, 121, 122, 123                                                     poultry, insulator
4)     126, 127, 128, 129, 130, 131                                                     insulator, gap
5)     133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148   insulator, CV cable
6)     150, 151, 152, 153, 154, 155                                                     insulator, CV cable

The results were similar to those of the second approach. This corresponds to the fourth method of the present invention.

[0055] 4) The fourth approach combines the second and third approaches; for example, group 5) of the third approach is discriminated by the second approach. That is, it is divided into groups 5), 6), and 7) of the second approach:

5) 134, 135, 136
6) 138, 139, 140, 141, 142
7) 144, 145, 146

This approach gave better results than the second and third approaches. This corresponds to the fifth method of the present invention.

[0056]

[Effects of the Invention] As described above, the present invention provides the following effects.

[0057] Combining inductive learning with a neural network raises the data recognition rate.

[0058] Even if the neural network gives a wrong answer, inductive learning limits the categories between which the error can occur.
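The way the inductive stage bounds the error of the neural network can be sketched as a two-stage classifier. This is an illustrative sketch under stated assumptions: the function names, the attribute names `f1` and `f2`, the thresholds, and the stand-in scoring function are all hypothetical and are not taken from the patent.

```python
def classify(sample, tree_stage, networks):
    """Two-stage classification: inductive stage first, network second.

    tree_stage(sample) returns a frozenset of candidate categories.
    networks maps each ambiguous candidate set to a scoring function
    that returns a score for every category in that set.
    """
    candidates = tree_stage(sample)
    if len(candidates) == 1:
        # The inductive stage discriminated the category completely.
        (category,) = candidates
        return category
    # The network only chooses among the remaining candidates, so even
    # a wrong answer stays inside the set left by the inductive stage.
    scores = networks[candidates](sample)
    return max(candidates, key=lambda c: scores[c])


# Hypothetical inductive stage with made-up thresholds on attribute f1.
def toy_tree_stage(sample):
    if sample["f1"] < 1.0:
        return frozenset({"bird meat"})
    if sample["f1"] < 2.0:
        return frozenset({"insulator", "gap"})
    return frozenset({"CV cable"})


# Hypothetical stand-in for a network trained on two categories only.
toy_networks = {
    frozenset({"insulator", "gap"}): lambda s: {
        "insulator": s["f2"],
        "gap": 1.0 - s["f2"],
    },
}

if __name__ == "__main__":
    print(classify({"f1": 0.5, "f2": 0.9}, toy_tree_stage, toy_networks))
    print(classify({"f1": 1.5, "f2": 0.9}, toy_tree_stage, toy_networks))
```

Because the second stage is keyed by the candidate set, a misclassification in the ambiguous branch can only confuse "insulator" with "gap" in this toy setup, never with a category the inductive stage already excluded.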

[0059] Even where attribute values overlap, the categories can be narrowed further by exploiting the neural network's ability to discriminate attributes nonlinearly in a multidimensional space.
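A small synthetic example may make this effect concrete. In the sketch below, two categories have identical value ranges on each attribute taken alone, so no single-attribute threshold can separate them, yet a tiny two-layer network separates them using both attributes together. The data, the weights, and all names are assumptions chosen for illustration; the weights are set by hand rather than learned, and nothing here is taken from the patent's experiments.

```python
def step(x):
    """Hard-threshold activation."""
    return 1.0 if x > 0 else 0.0


def tiny_network(a, b):
    """Two hidden units and one output unit with hand-set weights.

    Outputs 1.0 for category "B" and 0.0 for category "A".
    """
    h_or = step(a + b - 0.5)    # fires if at least one attribute is high
    h_and = step(a + b - 1.5)   # fires only if both attributes are high
    return step(h_or - h_and - 0.5)


# Synthetic samples: (attribute 1, attribute 2) per category.
SAMPLES = {
    "A": [(0, 0), (1, 1)],
    "B": [(0, 1), (1, 0)],
}


def attribute_range(points, index):
    """Return (min, max) of one attribute over a list of samples."""
    values = [p[index] for p in points]
    return min(values), max(values)


if __name__ == "__main__":
    for index in (0, 1):
        print(index,
              attribute_range(SAMPLES["A"], index),
              attribute_range(SAMPLES["B"], index))
    for label, points in SAMPLES.items():
        print(label, [tiny_network(a, b) for a, b in points])
```

The per-attribute ranges coincide for both categories, which corresponds to the overlapping case, while the network output differs between the categories.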

[0060] When the attribute values of the data have a distribution, the method can be applied to a wide range of classification tasks such as diagnosis, pattern recognition, and image processing.

[0061] When attribute value distributions are obtained with a simulator or the like, the data classification can be re-created quickly even if the simulator parameters are changed, because the classification is relearned by machine learning to follow the change.

[0062] An algorithm free of human subjectivity can be created automatically.

[0063] An efficient algorithm can be created.

[Brief description of drawings]

FIG. 1 is an electrical system diagram showing an example of a test circuit for carrying out the accident cause diagnosis method based on data classification.

FIG. 2 is a block diagram showing the configuration of the measurement system in an embodiment of the present invention.

FIG. 3 is a graph showing a typical zero-phase current waveform at the time of a fault and the result of its FFT analysis.

FIG. 4 is an explanatory diagram showing the relationship between the attribute value distributions of two categories.

FIG. 5 is an explanatory diagram showing the division of categories according to the present invention.

FIG. 6 is an explanatory diagram showing a result of discrimination tree learning according to the present invention.

FIG. 7 is an explanatory diagram showing a result of discrimination tree learning according to the present invention.

FIG. 8 is flowchart (1) of the accident cause diagnosis according to the present invention.

FIG. 9 is flowchart (2) of the accident cause diagnosis according to the present invention.

FIG. 10 is flowchart (3) of the accident cause diagnosis according to the present invention.

FIG. 11 is flowchart (4) of the accident cause diagnosis according to the present invention.

FIG. 12 is flowchart (5) of the accident cause diagnosis according to the present invention.

FIG. 13 is flowchart (6) of the accident cause diagnosis according to the present invention.

FIG. 14 is flowchart (7) of the accident cause diagnosis according to the present invention.

[Explanation of symbols]

1: circuit breaker; 2: ZPD (zero-phase voltage detector); 3: power-supply-side capacitor; 4: zero-phase current transformer; 5: load-side capacitor; 6: accident-generating switch; 7: high-voltage switch; 8: controller; 9: transformer

Claims


1. A data classification method, characterized in that learning is performed by inductive machine learning, and when a category cannot be discriminated by the learning, learning is performed by a neural network.

2. A method for classifying data, characterized in that categories other than those that can be completely discriminated by an inductive machine learning method are learned by a neural network.

3. A method for classifying data, characterized in that, for all combinations of categories in which the distribution of at least one attribute value is separated, the attributes that discriminate them are learned by inductive machine learning, and categories that cannot be discriminated by that machine learning are learned by a neural network.

4. A method for classifying data, characterized in that learning by machine learning is performed at a node above those at which machine learning is performed in the classification method according to claim 3, and categories that cannot be learned by that machine learning are learned by a neural network.

5. A method for classifying data, characterized in that the classification method according to claim 3 and the classification method according to claim 4 are combined to perform learning by machine learning and a neural network.