KR20210088421A

KR20210088421A - Machine learning method for incremental learning and computing device for performing the same

Info

Publication number: KR20210088421A
Application number: KR1020200181204A
Authority: KR
Inventors: 김철호; 백옥기; 우영춘; 이성엽; 이정훈; 최인문
Original assignee: 한국전자통신연구원
Priority date: 2020-01-06
Filing date: 2020-12-22
Publication date: 2021-07-14
Also published as: KR102554626B1

Abstract

A machine learning method for incremental learning of the present invention constructs a model using training data, and the constructed model is incrementally updated by using only new weights generated based on the new training data. The machine learning method comprises: an encoding step; a generating step; a determining step; a constructing step; an encoding step; a normalizing step; and an updating step.

Description

Machine learning method for incremental learning and computing device for performing the same {MACHINE LEARNING METHOD FOR INCREMENTAL LEARNING AND COMPUTING DEVICE FOR PERFORMING THE SAME}

The present invention relates to machine learning and, more particularly, to machine learning techniques for incremental learning.

Various studies on incremental learning are being conducted to improve the adaptability and reliability of supervised machine learning models, which are widely used in the field of artificial intelligence (AI). Incremental learning makes a model more adaptable to a continuously changing environment.

It is well known that machine learning models based on artificial neural networks (ANNs), such as deep neural networks (DNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), suffer from catastrophic forgetting (CF), which limits their use for incremental or continual learning, and that their internal structure is so complex that the model and its results are difficult to explain.

When new training data is input to an ANN-based machine learning model, the model may leave the state that was optimized for the entire previous training data (the previously learned state) and forget what it learned before. Because of this CF problem, incremental expansion of the model (incremental updating or incremental performance improvement) is difficult.

Various methods are being studied to mitigate the CF problem, but most of them entail degradation of model performance, so an effective remedy is still lacking.

For non-image multivariate numeric data (or multivariate numeric heterogeneous data), gradient boosting (GB), one of the decision-tree-based ensemble techniques, is an algorithm that outperforms ANN-based algorithms. However, this technique also optimizes over the entire training data when building a model, so it does not readily support incremental learning.

An object of the present invention for solving the above problems is to provide a machine learning method, and a computing device therefor, that easily perform incremental learning without degrading the performance of the model.

The above and other objects, advantages, and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described below in detail in conjunction with the accompanying drawings.

A machine learning method for incremental learning according to one aspect of the present invention for achieving the above object includes: encoding training data labeled with a plurality of class labels; generating a plurality of feature networks classified by the plurality of class labels by configuring the variables included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights that indicate connection strength; determining, as significant feature networks, feature networks selected according to performance from among the generated feature networks; building a model by combining the determined significant feature networks; encoding new training data; calculating new weights using instances of the encoded new training data and then normalizing the calculated new weights; and incrementally updating the built model by updating the weights of each of the determined significant feature networks based on the normalized new weights.

A computing device that executes a machine learning method for incremental learning according to another aspect of the present invention includes: a processor; a storage that stores training data labeled with a plurality of class labels and new training data; and a machine learning module that, under the control of the processor, builds a model using the training data labeled with the plurality of class labels. The machine learning module includes: an encoder that encodes the training data labeled with the plurality of class labels and the new training data; a feature network generator that generates a plurality of feature networks classified by the plurality of class labels by configuring the variables included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights that indicate connection strength; a significant feature network determiner that determines, as significant feature networks, feature networks selected according to performance from among the generated feature networks, calculates new weights using instances of the encoded new training data, and normalizes the calculated new weights; a model builder that builds a model by combining the determined significant feature networks; and an update unit that incrementally updates the built model by updating the weights of each of the determined significant feature networks based on the normalized new weights.

According to the present invention, when new training data is input, the previously built model is retained and is trained using only the weights generated from the new training data. The model can therefore be updated without changing its structure, which makes incremental learning easy.

FIG. 1 is a flowchart illustrating a machine learning method for incremental learning according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a feature sequence selected in the feature sequence selecting step shown in FIG. 1.
FIG. 3 is a diagram schematically illustrating the model building step (S400) shown in FIG. 1.
FIG. 4 is a diagram illustrating the ensemble configuration of each submodel shown in FIG. 1.
FIG. 5 is a block diagram of a computing device implemented to perform a machine learning method for incremental learning according to an embodiment of the present invention.

The specific structural and functional descriptions of the embodiments according to the concept of the present invention disclosed in this specification are given only to explain those embodiments. The embodiments according to the concept of the present invention may be implemented in various forms and are not limited to the embodiments described herein.

Because the embodiments according to the concept of the present invention may be modified in various ways and may take various forms, the embodiments are illustrated in the drawings and described in detail in this specification. This is not intended to limit the embodiments to the specific forms disclosed; they include all modifications, equivalents, and substitutes that fall within the spirit and technical scope of the present invention.

The terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The present invention relates to a supervised learning algorithm that facilitates incremental learning, which conventional machine learning cannot implement efficiently. In a supervised learning method that predicts the label of a target variable for data consisting of a plurality of variables (features) and a target variable, the present invention discovers networks of significant features (Significant Feature Networks, SFNs), constructs a learning model from the training data using the interrelationships among the values contained in the variable combinations, and uses the constructed learning model for classification prediction on new data.

When a previously built model is further trained with a new data set, the present invention can build a new model that covers the new data set by applying incremental changes to the existing model, which makes incremental learning very easy.

Hereinafter, a machine learning method for incremental learning according to an embodiment of the present invention is described in detail with reference to the drawings. The following embodiment concerns supervised learning for the purpose of classification. However, the present invention is not limited thereto, and those skilled in the art will understand from the following description that it can also be applied to supervised learning for the purpose of regression.

FIG. 1 is a flowchart illustrating a machine learning method for incremental learning according to an embodiment of the present invention.

The machine learning method for incremental learning according to an embodiment of the present invention can be broadly divided into a step of performing learning and prediction on a single data set and a step of performing incremental learning on an additional data set.

First, the "step of performing learning and prediction on a single data set" is described, followed by the "step of performing incremental learning on an additional data set". In this specification, a single data set is

Step of performing learning and prediction on a single data set

Referring to FIG. 1, the step of performing learning and prediction on a single data set includes a step (S100) of preparing training data sets (101, 102) and validation data sets (test data sets) (110, 111), an encoding step (S200), a step (S300) of discovering significant feature networks (SFNs), a step (S400) of building a model, and a prediction step (S500).

A. Step of preparing the training data set and the validation data set (S100)

The training data set 101 includes a plurality of training data labeled with a plurality of class labels and is used to build a model 400 (400_1, 400_2, ..., 400_N).

Each item of training data consists of multi-dimensional feature variables and a class label that serves as the target variable (target feature). Each variable (feature) may take continuous or discrete numeric values or text values.

The validation data set 110 has the same structure as the training data set 101, but differs from the training data sets 101 and 102 in that it is used to validate the prediction performance of a previously built model.

The training data set 101 and the validation data set 110 can each be distinguished as before or after encoding. Before encoding, they may be referred to as the raw training data set and the raw test data set, respectively.

B. Encoding step (S200)

In the encoding step (S200), the encoder 200 encodes the training data set 101 and the validation data set 110. Encoding may process the training data set 101 into data suitable for training the model 400 and the validation data set 110 into data suitable for validating the model 400.

Encoding (S200) may convert a continuous variable value into a discrete, discontinuous, or categorical value, or convert a text value into an appropriate number.

The conversion of a continuous value into a discrete or categorical value, or of a text value into a number, may follow a predefined (or programmed) encoding rule. The encoding rule may be static or dynamic over the entire process of learning and prediction.

Encoding (S200) may also reset the intervals of discrete or categorical values, or convert an input value into another value. Resetting the intervals may mean, for example, regrouping values divided into 10 levels into 5 levels. Converting an input value into another value may mean, for example, converting values set as -2, -1, 0, 1, 2 into 1, 2, 3, 4, 5.
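As a non-limiting illustration, the encoding rules described above may be sketched in Python as follows. The function names, the cut points, and the text-to-number table are assumptions made for this example and are not specified in the disclosure.

```python
from bisect import bisect_right


def discretize(value, cut_points):
    """Map a continuous value to a bin index, given sorted cut points."""
    return bisect_right(cut_points, value)


def regroup(level, old_levels=10, new_levels=5):
    """Re-bin a level in 1..old_levels into 1..new_levels (equal-sized groups)."""
    return (level - 1) * new_levels // old_levels + 1


def shift_codes(value, offset=3):
    """Convert codes -2..2 into 1..5 by adding a constant offset."""
    return value + offset


# Hypothetical rule for converting text values into numbers.
TEXT_CODES = {"low": 0, "mid": 1, "high": 2}


def encode_text(value):
    """Look up the numeric code assigned to a text value."""
    return TEXT_CODES[value]
```

A static encoding rule corresponds to keeping `cut_points` and `TEXT_CODES` fixed for both learning and prediction; a dynamic rule would recompute them as data arrives.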

C. Step of discovering significant feature networks (SFNs) (S300)

The step (S300) of discovering SFNs may use the training data set 201 encoded by the encoder 200 to discover the significant feature networks (SFNs), which are the core components of the model 400. Here, discovering an SFN may mean detecting, extracting, or calculating the SFN using the encoded training data set 201.

Specifically, the step (S300) of discovering SFNs includes, for example, a feature sequence generation step (S301), a node and edge formation step (S302), a weight calculation step (S303), a weight normalization step (S304), a feature network assessment step (S305), a feature network ranking step (S306), and an SFN selection step (S307).

SFNs may be obtained (discovered, detected, extracted, or calculated) by iterating steps S301 to S306. In the SFN selection step (S307), the specific feature sequences ranked highest in the feature network ranking step (S306) are selected as SFNs, and the model is constructed from the selected SFNs. Each step for deriving the SFNs is described in detail below.

C-1. Feature sequence generation step (S301)

FIG. 2 is a diagram showing an example of a feature sequence generated by the feature sequence generation step (S301) shown in FIG. 1.

Referring to FIG. 2, a feature sequence is formed by selecting two or more variables (or two or more generated variables) from the multivariate encoded training data set 201 and arranging them in a specific order.

For example, when N variables are selected from all the variables and arranged in a specific order, a specific sequence consisting of f_1, f_2, f_3, ..., f_N is formed, as shown in FIG. 2.

Methods of generating a specific feature sequence can be divided into feature selection, which selects variables without transforming them, and feature generation, which creates new variables based on the variables included in the encoded training data set 201.

Feature selection methods for generating a feature sequence may include, for example, random selection, considering all combinations, obtaining the variables from another machine learning method, and using mutual information from information theory.
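Of the selection methods listed above, random selection is the simplest to illustrate. The following Python sketch is one possible form; the function name, the parameter names, and the use of a seeded generator are assumptions for this example and do not come from the disclosure.

```python
import random


def random_feature_sequence(feature_names, length, rng):
    """Pick `length` distinct variables and return them in a random order."""
    if not 2 <= length <= len(feature_names):
        raise ValueError("a feature sequence needs between 2 and all variables")
    # random.Random.sample draws without replacement, so no variable repeats.
    return rng.sample(feature_names, length)


# Usage: repeated calls yield candidate sequences for steps S302 to S306.
rng = random.Random(0)
candidate = random_feature_sequence(["f1", "f2", "f3", "f4"], 3, rng)
```

Passing an explicit `rng` keeps the iteration over candidate sequences reproducible, which is convenient when rankings from different runs are compared.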

Feature generation methods for generating a feature sequence may include, for example, linear discriminant analysis (LDA), principal component analysis (PCA), and deep-learning-based feature extraction such as an autoencoder.

C-2. Node and edge formation step (S302)

Once a specific feature sequence has been selected in the previous step (S301), nodes and edges are defined, and a feature network is constructed from them.

As shown in FIG. 2, the nodes (f_11, f_12, ..., f_1i, f_21, f_22, ..., f_N1, f_N2, ..., f_NP) are defined by the encoded values that each variable (f_1, f_2, f_3, ..., f_N) takes, and the edges (w_11, w_12, w_13, ..., w_1α, w_21, w_22, w_23, ..., w_2β, ...) are defined as connections between the nodes of adjacent variables. Connections between nodes within the same variable are not considered. For example, the variable f_2 has nodes f_21, f_22, ..., f_2j, and these are connected by edges (connecting lines that carry weights) to the nodes of the adjacent variables f_1 and f_3. Connecting nodes and edges in this way forms the feature network for the selected feature sequence.
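The structure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: nodes are represented as `(variable, encoded value)` pairs, edges as a dictionary keyed by node pairs, and every edge starts with a weight of 0.0. None of these representation choices is prescribed by the disclosure.

```python
from itertools import product


def build_feature_network(sequence, values):
    """Create zero-weight edges between the nodes of adjacent variables.

    sequence: ordered variable names, e.g. ["f1", "f2", "f3"]
    values:   mapping from variable name to its encoded values
    """
    edges = {}
    # Only neighbouring variables in the sequence are connected;
    # nodes of the same variable are never linked to each other.
    for left, right in zip(sequence, sequence[1:]):
        for a, b in product(values[left], values[right]):
            edges[((left, a), (right, b))] = 0.0
    return edges
```

With this representation, a sequence of three variables that take 2, 3, and 2 encoded values yields 2×3 + 3×2 = 12 edges.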

C-3. Weight calculation step (S303)

Each edge connecting nodes has a specific value, which is defined as a weight indicating the connection strength between the nodes. The weights can be obtained from the encoded training data 201. When an instance of the encoded training data 201 is input, the weights of the edges connecting the nodes activated by that instance are calculated. Here, an instance means each example or sample that makes up the data a machine learning model needs for learning or inference (prediction). An instance may therefore also be called a training example or training sample of the training data.

The weights may be calculated according to a predefined weight calculation rule. One weight calculation method separates the network by class and updates the weights of each class separately.

For example, given training data with three class labels, 1 to 3, three feature networks built from the same feature sequence are generated. Training data with class label 1 is used to calculate the weights of network 1, training data with class label 2 the weights of network 2, and training data with class label 3 the weights of network 3. In other words, for a single feature sequence, feature networks with different weights are generated according to the class label.
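The class-wise weight calculation can be illustrated with the following Python sketch. The disclosure leaves the weight calculation rule open, so the rule used here, in which each instance adds 1 to every edge whose two end nodes it activates, is an assumption, as are the function and variable names.

```python
from collections import defaultdict


def count_edge_weights(instances, labels, sequence):
    """Accumulate edge weights in a separate feature network per class label.

    instances: list of dicts mapping variable name to encoded value
    labels:    class label of each instance
    sequence:  ordered variable names forming the feature sequence
    """
    networks = defaultdict(lambda: defaultdict(float))
    for row, label in zip(instances, labels):
        # An instance activates one node per variable, hence one edge
        # between each pair of adjacent variables.
        for left, right in zip(sequence, sequence[1:]):
            edge = ((left, row[left]), (right, row[right]))
            networks[label][edge] += 1.0  # assumed rule: simple counting
    return networks
```

Because each instance touches only the network of its own class label, the networks share one structure but end up with different weights.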

C-4. Weight normalization step (S304)

After the edge weights have been calculated from the instances included in the encoded training data set 201, the calculated weights are normalized.

Normalization may be performed according to a predefined weight normalization rule. The rule may be, for example, one that makes the weights of the edges between two adjacent variables sum to 1.
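The example rule above, in which the edge weights between each pair of adjacent variables sum to 1, can be sketched in Python as follows. The edge representation (a dictionary keyed by pairs of `(variable, value)` nodes) and the handling of an all-zero group are assumptions of this example.

```python
from collections import defaultdict


def normalize_weights(edges):
    """Scale weights so that edges between each adjacent variable pair sum to 1."""
    totals = defaultdict(float)
    for ((left, _), (right, _)), weight in edges.items():
        totals[(left, right)] += weight

    normalized = {}
    for edge, weight in edges.items():
        total = totals[(edge[0][0], edge[1][0])]
        # Assumption: a variable pair with no observed weight stays at 0.
        normalized[edge] = weight / total if total else 0.0
    return normalized
```

For instance, weights of 3 and 1 on the two edges between f1 and f2 become 0.75 and 0.25, while a single edge between f2 and f3 becomes 1.0.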

C-5. Feature network assessment step (S305)

The feature network assessment step (S305) uses the feature network and weight information generated in the preceding steps to calculate a network evaluation index that indicates how well the feature network distinguishes between classes.

There are broadly two methods of assessing a feature network.

The first method mathematically extracts a figure of merit from the characteristics contained in the weight information of the feature network. The second method evaluates the performance of the feature networks by calculating class discrimination accuracy using the feature networks, the normalized weights, and instances labeled with class labels that were not used (or were used) in the weight calculation. Either method yields a numerical assessment of the feature network.
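The second assessment method can be illustrated with the Python sketch below. It assumes, for the purpose of the example only, that an instance's score on a class network is the sum of the normalized weights of the edges it activates and that the predicted class is the one with the highest score. The disclosure does not fix these details, and the names are hypothetical.

```python
def network_score(weights, row, sequence):
    """Sum the weights of the edges activated by one instance (assumed rule)."""
    return sum(
        weights.get(((left, row[left]), (right, row[right])), 0.0)
        for left, right in zip(sequence, sequence[1:])
    )


def class_accuracy(class_networks, instances, labels, sequence):
    """Fraction of instances whose best-scoring class network matches the label."""
    correct = 0
    for row, label in zip(instances, labels):
        predicted = max(
            class_networks,
            key=lambda c: network_score(class_networks[c], row, sequence),
        )
        correct += predicted == label
    return correct / len(labels)
```

The returned accuracy can serve as the network evaluation index for the feature sequence that the class networks share.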

C-6. Feature network ranking step (S306)

The rank of a feature network may be determined from the evaluation index calculated for it in the previous step (S305). In the first iteration, the first selected feature network ranks first, but when another feature network is selected in step S301 and the process through S306 is repeated, the ranking may change. Ranks are denoted by subscripts, as in SFN_1, SFN_2, SFN_3, and so on.

C-7. SFN selection step (S307)

A predetermined number of the feature networks ranked highest in the previous step (S306) are selected. The selected feature networks are used as significant feature networks (SFNs) to build the model.
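Ranking and selection can be sketched together in Python as follows, assuming that a higher evaluation index is better and that each candidate feature network is identified by its feature sequence. The function name and the dictionary layout are assumptions for illustration.

```python
def select_sfns(evaluations, k):
    """Rank feature sequences by evaluation index and keep the top k.

    evaluations: mapping from feature sequence (tuple of names) to its index
    k:           predetermined number of SFNs to keep
    """
    ranked = sorted(evaluations.items(), key=lambda item: item[1], reverse=True)
    # Position in the returned list corresponds to SFN_1, SFN_2, ...
    return [sequence for sequence, _ in ranked[:k]]


# Usage with three assessed candidates, keeping the best two.
sfns = select_sfns(
    {("f1", "f2"): 0.7, ("f2", "f3"): 0.9, ("f1", "f3"): 0.8}, k=2
)
```

Re-running `select_sfns` after each new candidate is assessed reproduces the behavior in which the ranking changes as steps S301 to S306 are iterated.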

D. 모델을 구축(building)하는 단계(S400)D. Step of building a model (S400)

도 3은 도 1에 도시한 모델 구축 단계(S400)를 도식적으로 설명하기 위한 도면이고, 도 4는 도 1에 도시한 각 서브 모델의 앙상블 구성을 설명하기 위한 도면이다. FIG. 3 is a diagram for schematically explaining the model building step S400 shown in FIG. 1 , and FIG. 4 is a diagram for explaining an ensemble configuration of each sub-model shown in FIG. 1 .

도 3을 참조하면, 모델 구축 단계(S400)는 전단계(S307)를 통해 선택된 주요변수 네트워크를 이용하여 모델을 구성하는 단계이다. 전체 모델(400)은 각 클래스별로 구별되는 다수의 서브모델들로 구성된다.Referring to FIG. 3 , the model building step ( S400 ) is a step of constructing a model using the main variable network selected through the previous step ( S307 ). The entire model 400 is composed of a plurality of sub-models that are distinguished for each class.

도1에 도시된 바와 같이, N개의 클래스를 구별하도록 구축된 모델은 N개의 서브모델(400_1, 400_2, ??, 400_N)을 포함하도록 구성된다. 그리고 각 서브 모델은, 도 4에 도시된 바와 같이, 전 단계(S307)에서 선택된 주요변수 네트워크들을 결합(combining)한 앙상블(ensemble)로 구성될 수 있다.As shown in FIG. 1, the model constructed to distinguish N classes is configured to include N submodels 400_1, 400_2, ??, and 400_N. And each sub-model may be composed of an ensemble in which the main variable networks selected in the previous step ( S307 ) are combined, as shown in FIG. 4 .

The most basic way to construct the ensemble is to build all sub-models from the same main variable networks. When the weights are updated using the training data, each training instance is used, as shown in FIG. 3, to calculate and update the weights of the main variable networks in the sub-model that corresponds to the instance's class label. After training, the sub-models consist of the same main variable networks but hold different weight information.

E. Prediction step (S500)

In the prediction step (S500), an instance of the verification data set 110 is input to all the sub-models (400_1, 400_2, …, 400_N) included in the built model 400, and the class of the sub-model with the largest weight score is selected as the predicted class of that instance.

The weight score that a specific sub-model assigns to an instance of the verification data set 110 can be derived from the corresponding weight scores of the main variable networks (SFNs) constituting that sub-model.

As shown in FIG. 4, the weight score of sub-model 1 can be calculated as a linear combination of the weight scores of the main variable networks constituting the sub-model, such as SFN1 (S311), SFN2 (S312), and SFN3 (S313).

Let W(D_i, SFN_j) be the weight score that SFN_j assigns to the i-th instance D_i of the verification data set 110. The weight score W_1(D_i) of sub-model 1 is then calculated as follows.
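The formula itself is not reproduced in this text. Given that the score is described as a linear combination of the SFN weight scores with a coefficient c_j per SFN, it presumably takes the following form, where K, the number of SFNs in the sub-model, is a symbol assumed here for illustration:

```latex
W_1(D_i) = \sum_{j=1}^{K} c_j \, W(D_i, \mathrm{SFN}_j)
```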

c_j is a coefficient representing the contribution of an SFN according to its rank. For example, if c_j = 1, every SFN contributes to the weight score in equal proportion regardless of rank. Alternatively, c_j may be set differently for each j (that is, for each SFN) so that contributions differ by rank.
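The prediction rule described above can be sketched in a few lines. This is an illustration under stated assumptions: the per-SFN weight scores W(D_i, SFN_j) are taken as already computed, and the function names, class labels, and numeric values are hypothetical.

```python
def submodel_score(sfn_scores, coeffs):
    """Linear combination of per-SFN weight scores with rank coefficients c_j."""
    return sum(c * w for c, w in zip(coeffs, sfn_scores))


def predict(scores_by_class, coeffs):
    """Return the class whose sub-model has the largest weight score, plus all totals."""
    totals = {label: submodel_score(s, coeffs) for label, s in scores_by_class.items()}
    return max(totals, key=totals.get), totals


# Hypothetical scores of one instance under SFN_1..SFN_3 in each sub-model.
scores_by_class = {"class_1": [0.8, 0.2, 0.3], "class_2": [0.3, 0.9, 0.4]}

equal_label, _ = predict(scores_by_class, [1, 1, 1])          # c_j = 1 for all j
ranked_label, _ = predict(scores_by_class, [1.0, 0.5, 0.25])  # rank-dependent c_j
```

With equal coefficients the second sub-model wins in this toy data, while coefficients that favor the top-ranked SFN flip the decision to the first sub-model, which shows why the choice of c_j matters.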

Step of performing incremental learning on an additional data set

One of the important features of the present invention is that incremental learning on newly added training data 102 can be performed easily. First, assume that the model 400 has been built based on training data set 1 (101). A new training data set 2 (102) is then input to the encoder 200.

The encoder 200 encodes the new training data set 2 (102) and generates an encoded training data set 2 (102).

Thereafter, the full series of processes (S301 to S307) in the SFN search step (S300) is not repeated for the encoded training data set 2 (102). Only the weight scoring process (S303) and the weight normalization process (S304) are performed in sequence, and incremental learning is carried out by updating the weights of the already built model 400 with the normalized weights obtained for the encoded training data set 2 (102).

Because this approach keeps the previously built model and updates only its state variables, namely the weights, when new training data is input, incremental learning can be performed easily.
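A minimal sketch of such a weight-only update is shown below. It assumes the new weights have already been scored and normalized, and it uses summation as the update rule, which the description of the update unit gives as one example. The nested-dictionary layout (class label, then SFN, then edge) and all names are hypothetical.

```python
def incremental_update(model, new_weights):
    """Add normalized new weights to the stored weights, in place.

    The set of sub-models and SFNs is left untouched; only edge weights change.
    """
    for label, sfns in new_weights.items():
        for sfn_id, edges in sfns.items():
            stored = model[label][sfn_id]
            for edge, w in edges.items():
                stored[edge] = stored.get(edge, 0.0) + w
    return model


# Hypothetical model built from training data set 1: one SFN per sub-model.
model = {
    "A": {"SFN_1": {("x=0", "y=1"): 0.5}},
    "B": {"SFN_1": {("x=1", "y=0"): 1.0}},
}
# Hypothetical normalized weights obtained from training data set 2 (class A only).
new_weights = {"A": {"SFN_1": {("x=0", "y=1"): 0.25, ("x=1", "y=1"): 0.75}}}

incremental_update(model, new_weights)
```

Since the original training data is never revisited and no network is re-selected, the cost of the update depends only on the size of the new data.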

FIG. 5 is a block diagram of a computing device implemented to perform a machine learning method for incremental learning according to an embodiment of the present invention.

Referring to FIG. 5, the computing device 600 may include a storage 610, a machine learning module 620, a processor 630, a memory 640, and a system bus 650 connecting these components (610, 620, 630, 640).

The storage 610 is a hardware device that stores training data (or a training data set) 102 labeled with a plurality of class labels and verification data (or verification data sets) 110 and 111 for building a model (400 in FIG. 1), and that stores new training data (or a new training data set) 102 for incrementally updating the model (400 in FIG. 1) through incremental learning.

The storage 610 is, for example, a computer-readable medium, which may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; and magneto-optical media such as floptical disks.

The machine learning module 620 may be a hardware module or a software module that, under the control or execution of the processor 630, builds a model (400 in FIG. 1) and incrementally updates (trains) the built model (400 in FIG. 1) using only the new weights generated based on the new training data 102.

The machine learning module 620 may include a plurality of sub-modules divided by function, for example an encoder 621, a variable network (feature network, FN) generator 622, a main variable network (significant feature network, SFN) determiner 623, a model builder 624, and an update unit 625.

The encoder 621 encodes training data labeled with a plurality of class labels and performs, for example, the process of step S200 described with reference to FIG. 1. According to a predefined encoding rule, the encoder 621 converts continuous values of variables included in the training data into discrete or categorical values.
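One simple form such an encoding rule could take is fixed cut points per variable. The sketch below is an assumption for illustration only: the text does not specify the rule, and the variable names and thresholds are hypothetical.

```python
from bisect import bisect_right


def encode_value(value, cut_points):
    """Map a continuous value to the index of the bin it falls into."""
    return bisect_right(cut_points, value)


def encode_instance(instance, rules):
    """Discretize every variable that has a rule; pass other variables through."""
    return {
        name: encode_value(v, rules[name]) if name in rules else v
        for name, v in instance.items()
    }


# Hypothetical predefined encoding rule: sorted cut points per continuous variable.
rules = {"age": [30, 60], "bmi": [18.5, 25.0]}
encoded = encode_instance({"age": 45, "bmi": 27.3, "sex": "F"}, rules)
```

Keeping the rule fixed matters for incremental learning: new training data encoded with the same cut points maps onto the same node values as the original data.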

The encoder 621 also encodes the new training data 102 so that new weights based on the new training data 102 can be generated.

The variable network (FN) generator 622 configures the variables included in the encoded training data as nodes and connects adjacent nodes with edges whose weights indicate connection strength, thereby generating a plurality of variable networks classified by the plurality of class labels. It may be the component that performs steps S301 and S302 described with reference to FIG. 1.

In performing step S301, the variable network (FN) generator 622 generates a variable permutation by arranging two or more variables included in the encoded training data in a specific order.

As one example, the variable network (FN) generator 622 randomly selects two or more variables from the encoded training data and then arranges the randomly selected variables in the specific order to generate the variable permutation.

As another example, the variable network (FN) generator 622 converts two or more variables included in the encoded training data into new variables using linear discriminant analysis (LDA), principal component analysis (PCA), and deep-learning-based variable extraction methods, and then arranges the new variables in a specific order to generate the variable permutation.

Once the variable permutation is generated, the variable network (FN) generator 622 configures the values of each of the arranged variables as nodes and connects nodes that are adjacent according to the specific order with the edges, thereby generating, for the generated variable permutation, a plurality of variable networks classified by the plurality of class labels.
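The generation of variable networks from a permutation can be sketched as below. Several points are assumptions made for illustration: every value of one variable is connected to every value of the next variable in the order, edge weights start at zero, and one copy of the network is kept per class label. The function names and the toy variables are hypothetical.

```python
import random
from itertools import product


def make_permutation(variables, size, seed=None):
    """Randomly pick `size` distinct variables; the sampled order is the permutation."""
    return random.Random(seed).sample(variables, size)


def build_feature_networks(values, order, labels):
    """Create one variable network per class label for a given variable order.

    Nodes are (variable, value) pairs; edges join nodes of variables that are
    adjacent in `order`. Weights are initialized to 0.0 (assumed starting state).
    """
    edges = [
        ((a, va), (b, vb))
        for a, b in zip(order, order[1:])
        for va, vb in product(values[a], values[b])
    ]
    return {label: {edge: 0.0 for edge in edges} for label in labels}


# Hypothetical encoded variables and their possible discrete values.
values = {"x": [0, 1], "y": [0, 1, 2], "z": [0, 1]}
nets = build_feature_networks(values, ["x", "z", "y"], ["A", "B"])
perm = make_permutation(list(values), 2, seed=0)
```

Different permutations yield different edge sets, which is why several candidate networks can be generated and compared before the main variable networks are chosen.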

The main variable network (SFN) determiner 623 determines, as main variable networks, the variable networks selected according to performance from among the generated plurality of variable networks.

For example, the main variable network (SFN) determiner 623 calculates the weights of each of the plurality of variable networks using instances of the encoded training data 201 (S303 in FIG. 1), normalizes the calculated weights (S304 in FIG. 1), and then evaluates the performance of each variable network using the plurality of variable networks and the normalized weights (S305 in FIG. 1).

In addition, the main variable network (SFN) determiner 623 calculates new weights in step S303 of FIG. 1 using instances of the new training data 202 encoded by the encoder 200, and normalizes the calculated new weights in step S304 of FIG. 1.

Thereafter, the main variable network (SFN) determiner 623 ranks the plurality of variable networks according to the evaluated performance (S306 in FIG. 1) and then determines a preset number of top-ranked variable networks among them as the main variable networks (S307 in FIG. 1).

The process of normalizing the calculated weights by the main variable network (SFN) determiner 623 may include, for example, when the plurality of class labels includes a first class label and a second class label and the plurality of variable networks includes a first variable network and a second variable network: calculating a weight of the first variable network using instances of the training data labeled with the first class label; calculating a weight of the second variable network, different from the weight of the first variable network, using instances of the training data labeled with the second class label; and normalizing the weight of the first variable network and the weight of the second variable network.
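The two-class case can be illustrated with the sketch below. The concrete formulas are assumptions, since this passage does not define them: the raw weight of an edge is taken to be the number of instances of the class that pass through it, and normalization divides by the number of instances of that class. Function names and data are hypothetical.

```python
def score_weights(instances, order):
    """Assumed S303: count, per edge, the instances whose values lie on that edge."""
    weights = {}
    for inst in instances:
        for a, b in zip(order, order[1:]):
            edge = ((a, inst[a]), (b, inst[b]))
            weights[edge] = weights.get(edge, 0.0) + 1.0
    return weights


def normalize(weights, n_instances):
    """Assumed S304: divide each raw weight by the class's instance count."""
    if n_instances == 0:
        return dict(weights)
    return {edge: w / n_instances for edge, w in weights.items()}


order = ["x", "y"]
# Hypothetical encoded instances: four of the first class, two of the second.
first_class = [{"x": 0, "y": 1}] * 3 + [{"x": 1, "y": 1}]
second_class = [{"x": 1, "y": 0}, {"x": 1, "y": 1}]

w_first = normalize(score_weights(first_class, order), len(first_class))
w_second = normalize(score_weights(second_class, order), len(second_class))
```

Under this assumed scheme, normalization keeps a class with more training instances from dominating the comparison simply because its raw counts are larger.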

The process of evaluating the performance of each variable network by the main variable network (SFN) determiner 623 may include calculating class discrimination accuracy using the plurality of variable networks, the normalized weights, and instances labeled with class labels, and evaluating the performance of each variable network based on the calculated class discrimination accuracy.
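A possible way to compute class discrimination accuracy for one variable network is sketched below. It assumes that an instance's score under a class is the sum of the normalized weights of the edges the instance passes through, and that the class with the largest score is the prediction. These scoring details, the function names, and the data are assumptions for illustration.

```python
def instance_score(weights, inst, order):
    """Sum the weights of the edges traversed by the instance (assumed scoring)."""
    return sum(
        weights.get(((a, inst[a]), (b, inst[b])), 0.0)
        for a, b in zip(order, order[1:])
    )


def class_accuracy(weights_by_class, instances, labels, order):
    """Fraction of labeled instances assigned to their own class by this network."""
    correct = 0
    for inst, label in zip(instances, labels):
        scores = {c: instance_score(w, inst, order) for c, w in weights_by_class.items()}
        if max(scores, key=scores.get) == label:
            correct += 1
    return correct / len(instances)


order = ["x", "y"]
# Hypothetical normalized weights of the same network for two class labels.
weights_by_class = {
    "A": {(("x", 0), ("y", 1)): 0.75, (("x", 1), ("y", 1)): 0.25},
    "B": {(("x", 1), ("y", 0)): 0.5, (("x", 1), ("y", 1)): 0.5},
}
instances = [{"x": 0, "y": 1}, {"x": 1, "y": 0}, {"x": 1, "y": 1}, {"x": 1, "y": 1}]
labels = ["A", "B", "A", "B"]

accuracy = class_accuracy(weights_by_class, instances, labels, order)
```

The resulting accuracy can serve as the evaluation index by which candidate networks are ranked.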

The model builder 624 builds the model 400 by combining the main variable networks determined by the main variable network (SFN) determiner 623.

The update unit 625 incrementally updates the model 400 built by the model builder 624 based on the new weights normalized by the main variable network (SFN) determiner 623.

For example, the update unit 625 may incrementally update the built model by adding the normalized new weights to the weights of each of the determined main variable networks.

The processor 630 controls and manages the operations of the storage 610, the machine learning module 620, and the memory 640 through the system bus 650, and may be at least one CPU, at least one GPU, or a combination thereof.

Although FIG. 5 shows the processor 630 and the machine learning module 620 as separate components, they may be integrated into one. For example, the machine learning module 620 may be integrated into the processor 630.

The memory 640 is a hardware device that temporarily or permanently stores intermediate data or result data processed by each component of the processor 630 or the machine learning module 620, and may include hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The scope of protection of the present invention is not limited to the description and expression of the embodiments explicitly described above. It is further noted that the scope of protection of the present invention cannot be limited by changes or substitutions that are obvious in the technical field to which the present invention pertains.

Claims

A machine learning method for incremental learning performed by a computing device, comprising:
encoding training data labeled with a plurality of class labels;
generating a plurality of variable networks classified by the plurality of class labels by configuring variables included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights indicating connection strength;
determining, as main variable networks, variable networks selected according to performance from among the generated plurality of variable networks;
building a model by combining the determined main variable networks;
encoding new training data;
calculating a new weight using an instance of the encoded new training data and then normalizing the calculated new weight; and
incrementally updating the built model by updating the weight of each of the determined main variable networks based on the normalized new weight.

The machine learning method of claim 1, wherein the encoding of the training data comprises:
converting continuous values of variables included in the training data into discrete or categorical values according to a predefined encoding rule.

The machine learning method of claim 1, wherein the generating of the plurality of variable networks classified by the plurality of class labels comprises:
generating a variable permutation by arranging two or more variables included in the encoded training data in a specific order; and
generating, for the generated variable permutation, the plurality of variable networks classified by the plurality of class labels by configuring values included in each of the arranged variables as nodes and connecting nodes adjacent according to the specific order among the configured nodes with the edges.

The machine learning method of claim 3, wherein the generating of the variable permutation comprises:
randomly selecting two or more variables from the encoded training data; and
generating the variable permutation by arranging the randomly selected two or more variables in the specific order.

The machine learning method of claim 3, wherein the generating of the variable permutation comprises:
converting two or more variables included in the encoded training data into new variables using linear discriminant analysis (LDA), principal component analysis (PCA), and deep-learning-based variable extraction methods; and
generating the variable permutation by arranging the new variables in a specific order.

The machine learning method of claim 1, wherein the determining of the selected variable networks as main variable networks comprises:
calculating the weight of each of the plurality of variable networks using an instance of the training data, and normalizing the calculated weight;
evaluating the performance of each variable network using the plurality of variable networks and the normalized weight;
ranking the plurality of variable networks according to the evaluated performance; and
determining a preset number of top-ranked variable networks among the plurality of variable networks as the main variable networks.

The machine learning method of claim 6, wherein the normalizing of the calculated weight comprises,
when the plurality of class labels includes a first class label and a second class label, and the plurality of variable networks includes a first variable network and a second variable network:
calculating a weight of the first variable network using an instance of the training data labeled with the first class label;
calculating a weight of the second variable network, different from the weight of the first variable network, using an instance of the training data labeled with the second class label; and
normalizing the weight of the first variable network and the weight of the second variable network.

The machine learning method of claim 6, wherein the evaluating of the performance of each variable network comprises:
calculating class discrimination accuracy using the plurality of variable networks, the normalized weight, and an instance labeled with a class label; and
evaluating the performance of each variable network based on the calculated class discrimination accuracy.

The machine learning method of claim 1, wherein the incrementally updating of the built model comprises:
incrementally updating the built model by adding the normalized new weight to the weight of each of the determined main variable networks.

A computing device for executing a machine learning method for incremental learning, comprising:
a processor;
a storage for storing training data labeled with a plurality of class labels and new training data; and
a machine learning module for building a model, under the control of the processor, using the training data labeled with the plurality of class labels,
wherein the machine learning module comprises:
an encoder for encoding the training data labeled with the plurality of class labels and the new training data;
a variable network generator for generating a plurality of variable networks classified by the plurality of class labels by configuring variables included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights indicating connection strength;
a main variable network determiner for determining, as main variable networks, variable networks selected according to performance from among the generated plurality of variable networks, calculating a new weight using an instance of the encoded new training data, and normalizing the calculated new weight;
a model builder for building a model by combining the determined main variable networks; and
an update unit for incrementally updating the built model by updating the weight of each of the determined main variable networks based on the normalized new weight.

The computing device of claim 10, wherein the variable network generator performs:
a first process of generating a variable permutation by arranging two or more variables included in the encoded training data in a specific order; and a second process of generating, for the generated variable permutation, the plurality of variable networks classified by the plurality of class labels by configuring values included in each of the arranged variables as nodes and connecting nodes adjacent according to the specific order among the configured nodes with the edges.

The computing device of claim 10, wherein the main variable network determiner performs:
a first process of calculating the weight of each of the plurality of variable networks using an instance of the training data and normalizing the calculated weight; a second process of evaluating the performance of each variable network using the plurality of variable networks and the normalized weight; a third process of ranking the plurality of variable networks according to the evaluated performance; and a fourth process of determining a preset number of top-ranked variable networks among the plurality of variable networks as the main variable networks.

The computing device of claim 10, wherein the update unit
performs a process of incrementally updating the built model by adding the normalized new weight to the weight of each of the determined main variable networks.