KR102134324B1

KR102134324B1 - Apparatus and method for extracting rules of artificial neural network

Info

Publication number: KR102134324B1
Application number: KR1020180023503A
Authority: KR
Inventors: 박영택; 바트셀렘; 이완곤; 최현영
Original assignee: 숭실대학교산학협력단
Priority date: 2018-02-27
Filing date: 2018-02-27
Publication date: 2020-07-15
Also published as: KR20190102703A

Abstract

According to an embodiment of the present invention, there is provided an artificial neural network rule extraction apparatus including: a data learning unit that generates an artificial neural network model by learning an input data set; a first rule extraction unit that extracts a binary classification rule of the input data set according to a plurality of cubes in contact with a hyperplane separating the input data set; a binary classification unit that classifies the input data set into binary data according to the binary classification rule; and a second rule extraction unit that generates a logical rule of the artificial neural network by searching for a relationship between a hidden layer and an output layer with respect to the input data set classified into binary data.

Description

Apparatus and method for extracting artificial neural network rules{APPARATUS AND METHOD FOR EXTRACTING RULES OF ARTIFICIAL NEURAL NETWORK}

An embodiment of the present invention relates to an apparatus and method for extracting artificial neural network rules, and more particularly, to an apparatus and method for extracting rules of artificial neural network technology applied to the management, analysis, and pattern extraction of various data.

In today's knowledge society, where a vast and diverse amount of information exists, it is emphasized that knowledge is not merely one resource among many but the only meaningful resource. Knowledge acquisition, which extracts the necessary knowledge and information from various types of knowledge sources and organizes them structurally, is emerging as an important issue. In the past, knowledge was obtained mainly from experts or literature, but as problem domains have become more complex and broader, it has become difficult to acquire knowledge by such traditional methods. As an alternative, methods have been proposed that extract knowledge by analyzing data to find patterns and rules. Statistical techniques were traditionally used to extract knowledge from data, but recently artificial intelligence techniques such as artificial neural networks, decision trees, genetic algorithms, case-based reasoning systems, and fuzzy systems have been used.

In particular, artificial neural networks are used in a wide range of problem domains to solve classification and prediction problems. When analyzing financial data to predict outcomes, for example in corporate failure prediction models and bond rating evaluation, artificial neural networks are more accurate than statistical techniques such as logistic regression and discriminant analysis and than other artificial intelligence techniques such as decision trees, and therefore have the advantage of a wide range of applications. They are also insensitive to noise in the data, and their structure is robust. However, because the internal process of learning the data is produced by a complex mathematical model, it is difficult for users to understand the results, and the lack of explanatory power caused by the complex structure has been pointed out as the biggest problem of artificial neural networks.

The technical problem to be solved by the present invention is to provide an artificial neural network rule extraction apparatus and method capable of extracting rules of an artificial neural network to which data having continuous attributes, as well as binary attributes, is applied.

The first rule extraction unit may form a hyperplane that separates the input data set, sequentially form cubes each of which is in contact with the hyperplane and contains the largest number of input data having a first label, and extract the binary classification rule according to the ranges of the cubes.

The first rule extraction unit may find an optimal point that lies on the hyperplane and allows as many input data as possible to be included, and may form a cube according to the optimal point.

The first rule extraction unit may, according to the binary classification rule, replace input data included in a cube with the value "1" and replace input data not included in a cube with the value "0".

The input data set may include data having three or more attributes.

The data learning unit may generate the artificial neural network model using a backpropagation algorithm.

The second rule extraction unit may generate the logical rule using the NofM algorithm.

According to an embodiment of the present invention, there is provided an artificial neural network rule extraction method including: generating an artificial neural network model by learning an input data set; extracting a binary classification rule of the input data set according to a plurality of cubes in contact with a hyperplane separating the input data set; classifying the input data set into binary data according to the binary classification rule; and generating a logical rule of the artificial neural network by searching for a relationship between a hidden layer and an output layer with respect to the input data set classified into binary data.

The extracting of the binary classification rule includes: forming a hyperplane that separates the input data set; forming a first cube that is in contact with the hyperplane and contains the largest number of input data having a first label; forming a second cube that, among cubes other than the first cube, is in contact with the hyperplane and contains the largest number of input data having the first label; and extracting the binary classification rule according to the ranges of the first cube and the second cube.

The generating of the logical rule includes: classifying weights learned between an input layer and the hidden layer into clusters; forming clusters having similar weights into equivalent groups; replacing the weights in each equivalent group with the average weight of that equivalent group; deleting an equivalent group containing weights that cannot exceed a threshold for any input value; optimizing the thresholds of the hidden layer and the output layer by applying a backpropagation algorithm; and removing the connection weights and thresholds and extracting the logical rule.

The artificial neural network rule extraction apparatus and method according to the present invention can extract rules of an artificial neural network to which data having continuous attributes is applied.

FIG. 1 is a block diagram of an artificial neural network rule extraction apparatus according to an embodiment of the present invention.
FIG. 2 is a conceptual diagram of an artificial neural network model according to an embodiment of the present invention.
FIG. 3 is a conceptual diagram of a single node according to an embodiment of the present invention.
FIG. 4 is a conceptual diagram of a hyperplane according to an embodiment of the present invention.
FIG. 5 is a conceptual diagram of a cube according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining a binary classification rule according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining the NofM algorithm according to an embodiment of the present invention.
FIG. 8 is a flowchart of an artificial neural network rule extraction method according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described herein. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope.

Terms including ordinal numbers, such as second and first, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a second component may be referred to as a first component, and similarly a first component may be referred to as a second component. The term "and/or" includes a combination of a plurality of related listed items or any one of a plurality of related listed items.

When a component is referred to as being "connected" or "coupled" to another component, it should be understood that it may be directly connected or coupled to the other component, but that other components may be present in between. On the other hand, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no other component is present in between.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in this application.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. The same or corresponding components are assigned the same reference numerals regardless of the figure number, and redundant descriptions thereof are omitted.

FIG. 1 is a block diagram of an artificial neural network rule extraction apparatus according to an embodiment of the present invention, and FIG. 2 is a conceptual diagram of an artificial neural network model according to an embodiment of the present invention.

Referring to FIGS. 1 and 2, the artificial neural network rule extraction apparatus according to an embodiment of the present invention may include a data learning unit 110, a first rule extraction unit 120, a binary classification unit 130, and a second rule extraction unit 140.

First, the data learning unit 110 may generate an artificial neural network model by learning an input data set. Using the input data set, the data learning unit 110 may generate and analyze the artificial neural network model by repeating the process of applying weights to the next layer, from the input layer through to the output layer. The data learning unit 110 may learn using the value produced by the difference between the output value of the artificial neural network model and the target value, build a new model, and iteratively search for the weights that minimize the difference between the output value and the target value. The data learning unit 110 may stop learning when the difference between the output value and the target value is minimized.

The data learning unit 110 may generate the artificial neural network model by applying, for example, a backpropagation algorithm.

The data learning unit 110 obtains the input value net_pj of a hidden layer node using the value O_pj presented to the input layer, the offset θ_j, and the weight W_ji between the input layer and the hidden layer, and substitutes it into the activation function of the hidden layer to obtain the output value O_pj of hidden layer node j.

The data learning unit 110 obtains the input value net_pk of the output layer using the output value O_pj of the hidden layer node, the weight W_kj between the hidden layer and the output layer, and the offset θ_k of the output layer node, and substitutes it into the activation function of the output layer to obtain the output value O_pk of the output layer.

The data learning unit 110 calculates the error δ_pk for the connection strengths (weights) and offset connected to output layer node k from the difference between the target value t_pk of the learning pattern and the output value O_pk of the output layer.

The data learning unit 110 obtains the error δ_pj for the weight W_ji and the offset θ_j connected to hidden layer node j, using the error δ_pk, the weight W_kj between the hidden layer and the output layer, and the output value net_pk of the hidden layer node.

The data learning unit 110 corrects the weight W_ij and the offset θ_j connecting the hidden layer and the output layer using the error δ_pj.

The data learning unit 110 corrects the weight W_ji and the offset θ_j connecting the hidden layer and the input layer based on the errors δ_pk and δ_pj.

The data learning unit 110 trains the artificial neural network model with the corrected weights and offsets. The data learning unit 110 repeats this process for every input data set, and ends learning after repeating the training over all learning patterns for a predetermined number of iterations.

In this way, according to the backpropagation algorithm, the data learning unit 110 may compare the output value with the target value, adjust the weights in the direction that reduces the difference, and adjust the weights of the lower layers based on the result back-propagated from the upper layers of the model.
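The training steps above can be sketched for a network with a single hidden layer. This is a minimal illustration, not the patented implementation: the squared-error loss, the learning rate `lr`, the hidden-layer size, and all function and variable names are assumptions chosen for the example, and the update formulas are the conventional sigmoid backpropagation rules.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, t, W_ji, theta_j, W_kj, theta_k, lr=0.5):
    """One backpropagation update for a single pattern (illustrative names)."""
    # Forward pass: input layer -> hidden layer
    net_j = W_ji @ x + theta_j
    o_j = sigmoid(net_j)
    # Forward pass: hidden layer -> output layer
    net_k = W_kj @ o_j + theta_k
    o_k = sigmoid(net_k)
    # Error term of the output nodes (target minus output, times sigmoid slope)
    delta_k = (t - o_k) * o_k * (1.0 - o_k)
    # Error term of the hidden nodes, propagated back through W_kj
    delta_j = (W_kj.T @ delta_k) * o_j * (1.0 - o_j)
    # In-place correction of weights and offsets
    W_kj += lr * np.outer(delta_k, o_j)
    theta_k += lr * delta_k
    W_ji += lr * np.outer(delta_j, x)
    theta_j += lr * delta_j
    # Squared error measured before this update
    return 0.5 * float(np.sum((t - o_k) ** 2))

rng = np.random.default_rng(0)
W_ji = rng.normal(scale=0.5, size=(3, 2))   # input (2) -> hidden (3)
theta_j = np.zeros(3)
W_kj = rng.normal(scale=0.5, size=(1, 3))   # hidden (3) -> output (1)
theta_k = np.zeros(1)

x = np.array([0.2, 0.7])
t = np.array([1.0])
losses = [train_step(x, t, W_ji, theta_j, W_kj, theta_k) for _ in range(200)]
print(losses[0], losses[-1])
```

Repeating the step drives the output toward the target, so the recorded error shrinks over the iterations; a fixed iteration count stands in for the predetermined number of learning repetitions mentioned above.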

The first rule extraction unit 120 may extract a binary classification rule of the input data set according to a plurality of cubes in contact with a hyperplane that separates the input data set.

The first rule extraction unit 120 may form a hyperplane that separates the input data set, sequentially form cubes each of which is in contact with the hyperplane and contains the largest number of input data having the first label, and extract the binary classification rule according to the ranges of the cubes.

FIG. 3 is a conceptual diagram of a single node of an artificial neural network model according to an embodiment of the present invention. Referring to FIGS. 1 and 3, in the artificial neural network model, basic elements called nodes (artificial neurons) play the same role as nerve cells, and they are connected to one another like a mesh to form the artificial neural network. When a node receives values from outside, it multiplies each value by the weight of its connection, sums the products, and transforms the sum using an activation function. The activation function is a function used to transform the values presented to the hidden layer and the output layer before they are output. Activation functions commonly used in artificial neural networks include the sigmoid function, which maps each variable into the range [0, 1], and the hyperbolic tangent function, which maps it into [-1, 1]. In the embodiment of the present invention, the sigmoid activation function is described as an example. The inputs (x₁, x₂, ···, x_n) and the output (Output) of the artificial neural network model according to an embodiment of the present invention may be defined by the functional relationship shown in Equation 1 below.

In Equation 1, x_n is the input data set, and w_n is the weight of the neural network connected to each input data set.
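The single-node computation described above (multiply each input by its weight, sum, then apply the sigmoid) can be sketched as follows. The function name `node_output` and the optional `bias` argument are assumptions for illustration; this text does not reproduce Equation 1 itself, so the code shows only the commonly used form.

```python
import math

def node_output(x, w, bias=0.0):
    """Weighted sum of the inputs followed by a sigmoid activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Three inputs x_1..x_3 with their connection weights w_1..w_3
print(node_output([1.0, 0.0, 1.0], [0.5, -0.3, 0.2]))
```

The output always lies strictly between 0 and 1, which matches the sigmoid range stated above.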

The first rule extraction unit 120 may form a hyperplane according to the inputs of each node. The first rule extraction unit 120 may form the hyperplane according to Equation 2 below.

Equation 2 represents the weighted linear sum input to each node in the artificial neural network. When the input data set has n attributes or dimensions, the input value of the next layer is computed through multiplication with n weights. Bias is the bias value, an additional weight value, and is a scalar. Accordingly, f(x) has the characteristics of a regression model: it is a straight line when the input data set has one feature or dimension, a plane when it has two, and a hyperplane for higher numbers of features or dimensions.

FIGS. 3 and 4 are conceptual diagrams of a hyperplane according to an embodiment of the present invention. In the embodiment of the present invention, the artificial neural network model is configured as a multilayer neural network model. The first rule extraction unit 120 may form a hyperplane for the first hidden layer of the multilayer neural network model and extract binary rules.

The first rule extraction unit 120 may assign the second label to input data that satisfies f(X) < 0 and the first label to input data that does not. The first label may have a true or positive value, and the second label may have a false or negative value. That is, the first rule extraction unit 120 may form a hyperplane and assign the first label to input data located on one side of the hyperplane and the second label to input data located on the other side.
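The labeling rule above can be sketched with a weighted linear sum f(x) = w·x + Bias. The weights, the bias, and the encoding of the labels as the integers 1 and 2 are assumptions chosen for the example.

```python
import numpy as np

def label_by_hyperplane(X, w, bias):
    """Return 2 (second label) where f(x) < 0, otherwise 1 (first label)."""
    f = X @ w + bias
    return np.where(f < 0, 2, 1)

X = np.array([[2.0, 1.0],
              [0.0, 0.0],
              [-1.0, 3.0]])
w = np.array([1.0, -1.0])   # hypothetical learned weights
bias = -0.5                 # hypothetical bias

labels = label_by_hyperplane(X, w, bias)
print(labels)  # → [1 2 2]
```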

The first rule extraction unit 120 may sequentially form cubes each of which is in contact with the hyperplane and contains the largest number of input data having the first label. FIG. 5 is a conceptual diagram of a cube according to an embodiment of the present invention. Referring to FIGS. 1 and 5, the first rule extraction unit 120 first forms a first cube that is in contact with the hyperplane and contains the largest number of input data having the first label. Next, with the first cube excluded, the first rule extraction unit 120 forms a second cube that is in contact with the hyperplane and contains the largest number of input data having the first label. The first rule extraction unit 120 repeats the cube-forming process until all of the input data set is covered. The first rule extraction unit 120 may find an optimal point located on the hyperplane and form, according to the optimal point, a cube containing as many input data as possible.

The first rule extraction unit 120 may repeat the search for an optimal point at each cube-finding step. The first rule extraction unit 120 first selects the first cube, which can contain the most data of the first label, and selects the vertex of the first cube that lies on the hyperplane as the first optimal point. Next, with the input data contained in the first cube removed, the first rule extraction unit 120 selects the second cube, which can contain the most data of the first label, and selects the vertex of the second cube that lies on the hyperplane as the second optimal point. The first rule extraction unit 120 repeats the selection of cubes and optimal points until all of the input data set is contained in a cube.
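The repeated "pick the cube that covers the most remaining first-label data" loop can be sketched as a greedy cover. Several parts are assumptions made only for this illustration: each cube is taken to be an axis-aligned box with one vertex on the hyperplane, extending toward the data maximum where the weight is non-negative and toward the data minimum otherwise; and candidate vertices are obtained by orthogonally projecting each data point onto the hyperplane, which replaces the optimization-based search for the optimal point.

```python
import numpy as np

def cube_from_vertex(x_star, w, x_min, x_max):
    """Axis-aligned box with one vertex at x_star, extending along sign(w)."""
    lower = np.where(w >= 0, x_star, x_min)
    upper = np.where(w >= 0, x_max, x_star)
    return lower, upper

def in_cube(X, cube):
    lower, upper = cube
    return np.all((X >= lower) & (X <= upper), axis=1)

def greedy_cubes(X_pos, w, bias):
    """Repeatedly pick the cube covering the most still-uncovered points."""
    x_min, x_max = X_pos.min(axis=0), X_pos.max(axis=0)
    # Candidate vertices: projections of the points onto w.x + bias = 0
    f = X_pos @ w + bias
    candidates = X_pos - np.outer(f / (w @ w), w)
    cubes_all = [cube_from_vertex(c, w, x_min, x_max) for c in candidates]
    masks = [in_cube(X_pos, cube) for cube in cubes_all]
    remaining = np.ones(len(X_pos), dtype=bool)
    chosen = []
    while remaining.any():
        gains = [int((m & remaining).sum()) for m in masks]
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break  # guard: points with f(x) < 0 cannot be covered
        chosen.append(cubes_all[best])
        remaining &= ~masks[best]
    return chosen

# First-label points on the non-negative side of x + y - 1 = 0
X_pos = np.array([[3.0, 0.2], [0.2, 3.0], [2.0, 2.0]])
w = np.array([1.0, 1.0])
bias = -1.0
cubes = greedy_cubes(X_pos, w, bias)
for lower, upper in cubes:
    print(lower, upper)
```

Because every box starts at a vertex on the hyperplane and extends only in the direction of the weights, each box stays on the first-label side of the hyperplane under these assumptions.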

The first rule extraction unit 120 performs a process of minimizing the difference between the input data and the optimal point over N input data having m features, according to the objective function L(x) of Equation 3 below.

In Equation 3, x^*_j is the optimal point of the cube, x^i_j is the input data set, w is the weight, j is the attribute or dimension, i is the input data index, and λ is a regularization parameter that can be changed by configuration.

The minimization of L(x) may be performed through differentiation according to Equation 4 below.

The first rule extraction unit 120 extracts a rule such as Equation 5 below for each cube. Because the input data set has several features, the range of the cube is calculated for each feature as in Equation 5, and the case in which all of the ranges are satisfied is expressed as a range, as in Equation 6.

In Equations 5 and 6, C is the range of the cube, x^min_j is the minimum value of the input data set in dimension j, x^max_j is the maximum value of the input data set in dimension j, w_j is the weight in dimension j, h_j is the binary classification rule, x_i is the test input data set, and x^*_i is the optimal point of the test input data set.

FIG. 6 is a diagram for explaining a binary classification rule according to an embodiment of the present invention. Referring to FIGS. 1 and 6, the binary classification unit 130 may extract the binary classification rule according to the range of the cube. The binary classification unit 130 may replace the input data set with binary data according to Equation 6, which is the binary classification rule. According to the binary classification rule, the binary classification unit 130 may replace input data contained in a cube with the value "1" and input data not contained in a cube with the value "0".
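The replacement of continuous inputs by binary values can be sketched as a membership test against each cube's per-dimension range. Producing one binary column per cube, and the cube bounds used here, are assumptions for illustration.

```python
import numpy as np

def binarize(X, cubes):
    """One binary column per cube: 1 if the row lies inside the cube, else 0."""
    cols = [np.all((X >= lo) & (X <= hi), axis=1) for lo, hi in cubes]
    return np.stack(cols, axis=1).astype(int)

# Two hypothetical cubes given as (lower bounds, upper bounds)
cubes = [
    (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    (np.array([0.5, 0.5]), np.array([2.0, 2.0])),
]
X = np.array([[0.2, 0.3],
              [0.8, 0.9],
              [1.5, 1.5],
              [3.0, 0.0]])

B = binarize(X, cubes)
print(B)
```

A row is marked 1 for a cube only when every feature falls inside that cube's range, which mirrors the "all ranges satisfied" condition described for Equation 6.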

The second rule extraction unit 140 may generate a logical rule of the artificial neural network by searching for the relationship between the hidden layer and the output layer with respect to the input data set classified into binary data.

The second rule extraction unit 140 may generate the logical rule using the NofM algorithm.

FIG. 7 is a diagram for explaining the NofM algorithm according to an embodiment of the present invention. Referring to FIGS. 1 and 7, the second rule extraction unit 140 first classifies the weights learned between the input layer and the hidden layer into clusters, and forms clusters having similar weights into equivalent groups. If A to G are input nodes and Z is a hidden node, a weight is assigned to each connection between an input node and the hidden node. The second rule extraction unit 140 forms nodes A, C, and F, which have similar weights, into a first equivalent group, and nodes B, D, E, and G into a second equivalent group.

Next, the second rule extraction unit 140 replaces the weights in each equivalent group with the average weight of that equivalent group. The second rule extraction unit 140 replaces the weights of the first equivalent group with 6.1, the average value of the first equivalent group, and the weights of the second equivalent group with 1.1, the average value of the second equivalent group.

Next, the second rule extraction unit 140 deletes any equivalent group containing weights that cannot exceed the threshold for any input value, and optimizes the thresholds of the hidden layer and the output layer by applying the backpropagation algorithm. When the threshold is 10, the second equivalent group cannot exceed the threshold no matter what values are input to its input nodes. Accordingly, the second rule extraction unit 140 deletes the second equivalent group and, to compensate for the change in performance caused by the deletion, optimizes the threshold by applying the backpropagation algorithm.

Next, the second rule extraction unit 140 removes the connection weights and the threshold and extracts a logical rule. When two or more of the input nodes A, C, and F constituting the first equivalent group are true, the threshold is exceeded, so the logical rule of FIG. 7 can be expressed as "IF 6.1*NumberTrue(A, C, F) > 10.9 THEN Z" or "IF 2 of {A, C, F} THEN Z".
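The step from the weighted form to the N-of-M form can be illustrated with a short sketch. For a group with a positive shared weight w and threshold t, the smallest number of true nodes n satisfying n·w > t is floor(t / w) + 1. The helper names `n_of_m` and `rule_fires` are hypothetical; the values 6.1 and 10.9 are taken from the rule quoted above.

```python
import math


def n_of_m(weight, members, threshold):
    """Smallest n with n * weight > threshold, or None if no n <= M works."""
    if weight <= 0:
        return None
    n = math.floor(threshold / weight) + 1
    return n if n <= len(members) else None


def rule_fires(n, members, true_nodes):
    """Evaluate 'IF n of {members} THEN Z' for a set of true input nodes."""
    return sum(1 for m in members if m in true_nodes) >= n


members = ["A", "C", "F"]
n = n_of_m(6.1, members, 10.9)
rule_text = f"IF {n} of {{{', '.join(members)}}} THEN Z"
print(rule_text)
```

One true node contributes 6.1, which is below 10.9, while two contribute 12.2, which exceeds it, so the weighted inequality and the "2 of" form agree on every binary input.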

According to an embodiment of the present invention, even when the input data set consists of data with continuous attributes rather than binary data, the input data set is converted into binary data and applied to the NofM algorithm, so that the logical rules of the artificial neural network model can be generated accurately.

FIG. 8 is a flowchart of an artificial neural network rule extraction method according to an embodiment of the present invention.

Referring to FIG. 8, first, the data learning unit learns an input data set to generate an artificial neural network model. Using the input data set, the data learning unit generates and analyzes the artificial neural network model by repeating the process of assigning weights to the next layer, from the input layer through to the output layer. The data learning unit learns from the difference between the output value of the artificial neural network model and the target value, builds a new model, and repeatedly searches for the weights that minimize that difference. When the difference between the output value and the target value is minimized, the data learning unit stops learning (S801).
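The training loop of step S801 can be sketched as follows. To stay short, the sketch is deliberately reduced to a single sigmoid unit trained with the delta rule, not the multilayer network and backpropagation of the embodiment; the learning rate, the stopping tolerance, the epoch limit, and the AND data set are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(samples, lr=0.5, tol=1e-3, max_epochs=10000):
    """Adjust weights until the mean squared output/target difference is small."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        total_error = 0.0
        for inputs, target in samples:
            output = predict(weights, bias, inputs)
            error = target - output
            total_error += error * error
            # Gradient of the squared error through the sigmoid.
            delta = error * output * (1.0 - output)
            weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
            bias += lr * delta
        if total_error / len(samples) < tol:
            break  # stop once the difference is small enough
    return weights, bias


# Hypothetical binary data set (logical AND).
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_samples)
print([round(predict(weights, bias, x)) for x, _ in and_samples])
```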

Next, the first rule extraction unit extracts a binary classification rule of the input data set according to a plurality of cubes that touch a hyperplane separating the input data set. The first rule extraction unit forms a hyperplane that separates the input data set, sequentially forms cubes that touch the hyperplane and contain the most input data having a first label, and extracts the binary classification rule according to the ranges of the cubes (S802).
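One part of step S802, choosing the cube that contains the most first-label data and reading its range as a rule, can be sketched as below. How candidate cubes touching the hyperplane are generated is not shown here; the sketch simply assumes a given list of axis-aligned candidates, each written as (lower corner, upper corner). All names and data values are hypothetical.

```python
def count_inside(cube, points):
    """Count the points that fall inside an axis-aligned cube."""
    lower, upper = cube
    return sum(
        all(lo <= x <= hi for x, lo, hi in zip(p, lower, upper))
        for p in points
    )


def best_cube(candidates, first_label_points):
    """Pick the candidate cube containing the most first-label points."""
    return max(candidates, key=lambda c: count_inside(c, first_label_points))


def cube_to_rule(cube):
    """Express the range of a cube as a conjunction of interval tests."""
    lower, upper = cube
    return " AND ".join(
        f"{lo} <= x{i} <= {hi}" for i, (lo, hi) in enumerate(zip(lower, upper))
    )


# Hypothetical first-label points and candidate cubes.
points = [(0.2, 0.3), (0.4, 0.1), (0.9, 0.8), (0.3, 0.4)]
candidates = [((0.0, 0.0), (0.5, 0.5)), ((0.5, 0.5), (1.0, 1.0))]
chosen = best_cube(candidates, points)
rule = cube_to_rule(chosen)
print(rule)
```

Repeating the selection over the remaining candidates would give the second and later cubes of the sequence.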

Next, the binary classification unit classifies the input data set into binary data according to the binary classification rule. According to the binary classification rule, the first rule extraction unit replaces input data included in a cube with a value of "1" and replaces input data not included in a cube with a value of "0" (S803).
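The replacement of step S803 can be sketched as follows. The text states only that data inside a cube become "1" and data outside become "0"; producing one bit per cube for each data point, as done here, is an assumption, as are the function names and sample values.

```python
def inside(point, cube):
    """True if the point lies within the axis-aligned cube (lower, upper)."""
    lower, upper = cube
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


def binarize(points, cubes):
    """Map each point to a list of bits, one per cube (assumed encoding)."""
    return [[1 if inside(p, c) else 0 for c in cubes] for p in points]


# Hypothetical cubes and continuous-valued input data.
cubes = [((0.0, 0.0), (0.5, 0.5)), ((0.5, 0.5), (1.0, 1.0))]
points = [(0.2, 0.3), (0.9, 0.8), (0.6, 0.2)]
binary = binarize(points, cubes)
print(binary)
```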

Next, the second rule extraction unit generates the logical rules of the artificial neural network by searching the relationship between the hidden layer and the output layer for the input data set classified into binary data. The second rule extraction unit generates the logical rules using the NofM algorithm (S804).

The term 'unit' (~부) used in this embodiment refers to software or a hardware component such as a field-programmable gate array (FPGA) or an ASIC, and a 'unit' performs certain roles. However, a 'unit' is not limited to software or hardware. A 'unit' may be configured to reside in an addressable storage medium or may be configured to run on one or more processors. Thus, as an example, a 'unit' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functions provided within the components and 'units' may be combined into a smaller number of components and 'units' or further separated into additional components and 'units'. In addition, the components and 'units' may be implemented to run on one or more CPUs in a device or a secure multimedia card.

Although the present invention has been described above with reference to preferred embodiments, those skilled in the art will understand that the present invention may be variously modified and changed without departing from the spirit and scope of the present invention as set forth in the claims below.

10: data learning unit
20: first rule extraction unit
30: binary classification unit
40: second rule extraction unit

Claims

An artificial neural network rule extraction apparatus comprising:
a data learning unit that learns an input data set to generate an artificial neural network model;
a first rule extraction unit that extracts a binary classification rule of the input data set according to a plurality of cubes touching a hyperplane that separates the input data set;
a binary classification unit that classifies the input data set into binary data according to the binary classification rule; and
a second rule extraction unit that searches a relationship between a hidden layer and an output layer for the input data set classified into binary data to generate logical rules of the artificial neural network.

The apparatus of claim 1, wherein the first rule extraction unit
forms a hyperplane that separates the input data set, sequentially forms cubes that touch the hyperplane and contain the most input data having a first label, and extracts the binary classification rule according to the ranges of the cubes.

The apparatus of claim 2, wherein the first rule extraction unit
finds an optimal point located on the hyperplane and forms a cube containing the largest number of input data according to the optimal point.

The apparatus of claim 3, wherein the binary classification unit,
according to the binary classification rule, replaces input data included in a cube with a value of "1" and replaces input data not included in a cube with a value of "0".

The apparatus of claim 1,
wherein the input data set includes data having three or more attributes.

The apparatus of claim 1,
wherein the data learning unit generates the artificial neural network model using a backpropagation algorithm.

The apparatus of claim 1,
wherein the second rule extraction unit generates the logical rules using a NofM algorithm.

An artificial neural network rule extraction method comprising:
learning, by a data learning unit, an input data set to generate an artificial neural network model;
extracting, by a first rule extraction unit, a binary classification rule of the input data set according to a plurality of cubes touching a hyperplane that separates the input data set;
classifying the input data set into binary data according to the binary classification rule; and
generating, by a second rule extraction unit, logical rules of the artificial neural network by searching a relationship between a hidden layer and an output layer for the input data set classified into binary data.

The method of claim 8, wherein the step of extracting the binary classification rule comprises:
forming a hyperplane that separates the input data set;
forming a first cube that touches the hyperplane and contains the most input data having a first label;
forming a second cube that, among cubes other than the first cube, touches the hyperplane and contains the most input data having the first label; and
extracting the binary classification rule according to the ranges of the first cube and the second cube.

The method of claim 8, wherein the step of generating the logical rules comprises:
classifying the weights learned between the input layer and the hidden layer into clusters;
forming clusters having similar weights into equivalent groups;
replacing the weights in each equivalent group with the average weight of that equivalent group;
deleting any equivalent group whose weights cannot exceed the threshold regardless of the input values;
optimizing the thresholds of the hidden layer and the output layer by applying a backpropagation algorithm; and
removing the connection weights and the thresholds and extracting the logical rules.