KR102554626B1

KR102554626B1 - Machine learning method for incremental learning and computing device for performing the same

Info

Publication number: KR102554626B1
Application number: KR1020200181204A
Authority: KR
Inventors: 김철호; 백옥기; 우영춘; 이성엽; 이정훈; 최인문
Original assignee: 한국전자통신연구원
Priority date: 2020-01-06
Filing date: 2020-12-22
Publication date: 2023-07-13
Also published as: KR20210088421A

Abstract

In the machine learning method for incremental learning according to the present invention, a model is built using training data, and the built model is incrementally updated using only new weights generated from new training data.

Description

Machine learning method for incremental learning and computing device for performing the same

The present invention relates to machine learning and, more particularly, to machine learning techniques for incremental learning.

Various studies on incremental learning have been attempted in order to improve the adaptability and reliability of supervised machine learning models, which are widely used in the field of artificial intelligence (AI). Incremental learning makes it possible to increase a model's adaptability to a constantly changing environment.

It is well known that machine learning models based on artificial neural networks (ANNs), such as deep neural networks (DNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs), suffer from catastrophic forgetting (CF), which limits their ability to implement incremental or continual learning, and that their internal structure is so complex that the model and its results are difficult to explain.

When new training data is input, an ANN-based machine learning model may leave the state optimized for the entire previous training data (the previously learned state) and forget what it learned before, which is the CF problem. This makes incremental extension of the model (incremental update or incremental performance improvement) difficult.

Various methods are being studied to mitigate the CF problem, but because most of them entail a degradation in model performance, an effective remedy for the CF problem is still lacking.

Gradient boosting (GB), one of the decision-tree-based ensemble techniques, is an algorithm that outperforms ANN-based algorithms on non-image multivariate numeric data (or multivariate numeric heterogeneous data). However, because this technique also optimizes over the entire training data when building a model, it does not readily support incremental learning.

An object of the present invention, devised to solve the above problems, is to provide a machine learning method, and a computing device therefor, that readily perform incremental learning without degrading model performance.

The foregoing and other objects, advantages, and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below in conjunction with the accompanying drawings.

To achieve the above object, a machine learning method for incremental learning according to one aspect of the present invention includes: encoding training data labeled with a plurality of class labels; generating a plurality of feature networks, classified by the plurality of class labels, by configuring the features included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights that represent connection strength; determining, as significant feature networks, feature networks selected from the generated feature networks according to performance; building a model by combining the determined significant feature networks; encoding new training data; calculating new weights using an instance of the encoded new training data and then normalizing the calculated new weights; and incrementally updating the built model by updating the weights of each of the determined significant feature networks based on the normalized new weights.

A computing device that executes a machine learning method for incremental learning according to another aspect of the present invention includes: a processor; a storage that stores training data labeled with a plurality of class labels and new training data; and a machine learning module that builds a model, under the control of the processor, using the training data labeled with the plurality of class labels. The machine learning module includes: an encoder that encodes the training data labeled with the plurality of class labels and the new training data; a feature network generator that generates a plurality of feature networks, classified by the plurality of class labels, by configuring the features included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights that represent connection strength; a significant feature network determiner that determines, as significant feature networks, feature networks selected from the generated feature networks according to performance, calculates new weights using an instance of the encoded new training data, and normalizes the calculated new weights; a model builder that builds a model by combining the determined significant feature networks; and an update unit that incrementally updates the built model by updating the weights of each of the determined significant feature networks based on the normalized new weights.

According to the present invention, when new training data is input, the previously built model is retained and is trained using only the weights generated from the new training data. The model can therefore be updated without changing its structure, which makes incremental learning easy.

FIG. 1 is a flowchart illustrating a machine learning method for incremental learning according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining a feature sequence selected in the feature sequence selecting step shown in FIG. 1.
FIG. 3 is a diagram schematically explaining the model building step (S400) shown in FIG. 1.
FIG. 4 is a diagram for explaining the ensemble configuration of each submodel shown in FIG. 1.
FIG. 5 is a block diagram of a computing device implemented to perform a machine learning method for incremental learning according to an embodiment of the present invention.

The specific structural or functional descriptions of the embodiments according to the concept of the present invention disclosed in this specification are given only to illustrate those embodiments; the embodiments may be implemented in various forms and are not limited to those described herein.

Because embodiments according to the concept of the present invention can be modified in various ways and can take many forms, the embodiments are illustrated in the drawings and described in detail in this specification. This is not intended to limit the embodiments to the specific forms disclosed; they include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

The terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The present invention relates to a supervised learning algorithm that facilitates incremental learning, which conventional machine learning cannot implement efficiently. In a supervised learning method that predicts the label of a target variable for data consisting of a plurality of features (or variables) and a target variable, the present invention discovers Significant Feature Networks (SFNs), constructs a learning model from the training data using the interrelationships among the values contained in a feature combination, and uses the constructed model for classification prediction on new data.

When a previously built model is trained further on a new data set, the present invention adds incremental changes to the existing model to build a new model that also covers the new data set, so incremental learning is very easy.

Hereinafter, a machine learning method for incremental learning according to an embodiment of the present invention is described in detail with reference to the drawings. The following embodiments concern supervised learning for the purpose of classification. The invention is not limited to this, however, and those skilled in the art will understand from the following description that it can also be applied to supervised learning for the purpose of regression.

FIG. 1 is a flowchart illustrating a machine learning method for incremental learning according to an embodiment of the present invention.

The machine learning method for incremental learning according to an embodiment of the present invention can be broadly divided into a step of performing learning and prediction on a single data set and a step of performing incremental learning on an additional data set.

First, the "step of performing learning and prediction on a single data set" is described, followed by the "step of performing incremental learning on an additional data set". In this specification, a single data set is

Step of performing learning and prediction on a single data set

Referring to FIG. 1, the step of performing learning and prediction on a single data set includes a step (S100) of preparing training data sets 101 and 102 and verification (test) data sets 110 and 111, an encoding step (S200), a step (S300) of discovering Significant Feature Networks (SFNs), a step (S400) of building a model, and a prediction step (S500).

A. Step of preparing the training data set and the verification data set (S100)

The training data set 101 includes a plurality of training data labeled with a plurality of class labels, used to build the model (400: 400_1, 400_2, …, 400_N).

Each training data item consists of multi-dimensional feature variables and a class label that serves as the target variable (or target feature). Each feature (or variable) may take continuous or discrete numeric values, or text values.

The verification data set 110 has the same composition as the training data set 101 but differs from the training data sets 101 and 102 in that it is used to verify the predictive performance of an already built model.

The training data set 101 and the verification data set 110 can each be distinguished as before or after encoding. Before encoding, they may be called the raw training data set and the raw test data set, respectively.

B. Encoding step (S200)

In the encoding step (S200), the encoder 200 encodes the training data set 101 and the verification data set 110. Encoding may mean processing the training data set 101 into data suitable for training the model 400 and processing the verification data set 110 into data suitable for verifying the model 400.

In the encoding (S200), a feature whose values are continuous may be converted into discrete, discontinuous, or categorical values, and a value consisting of text may be converted into a suitable number.

The conversion of a feature's continuous values into discrete or categorical values, or of text values into numbers, may follow a predefined (or programmed) encoding rule. The encoding rule may be static or dynamic over the entire course of learning and prediction.

The encoding (S200) may also reset the intervals of discrete or categorical values, or convert input values into other values. Resetting the intervals may mean, for example, regrouping values divided into 10 levels into 5 levels; converting input values may mean, for example, converting values set as -2, -1, 0, 1, 2 into 1, 2, 3, 4, 5.
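The encoding operations described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the cut points, and the sample row are hypothetical, and the patent text does not prescribe any particular encoding rule.

```python
from bisect import bisect_right

def discretize(value, cut_points):
    """Map a continuous value to a 1-based bin index, given sorted cut points."""
    return bisect_right(cut_points, value) + 1

def encode_text(value, vocabulary):
    """Map a text value to an integer code, assigned in order of first appearance."""
    return vocabulary.setdefault(value, len(vocabulary) + 1)

def regroup(level, old_levels=10, new_levels=5):
    """Collapse 1-based levels into fewer levels, e.g. ten levels into five."""
    return (level - 1) * new_levels // old_levels + 1

def shift(value, offset=3):
    """Convert values such as -2..2 into 1..5 by adding a fixed offset."""
    return value + offset

cut_points = [10.0, 20.0, 30.0]   # hypothetical bin boundaries (assumption)
vocabulary = {}                   # text-to-code table, filled as values appear
row = {"temperature": 23.4, "colour": "red", "grade": 7, "score": -2}
encoded = {
    "temperature": discretize(row["temperature"], cut_points),
    "colour": encode_text(row["colour"], vocabulary),
    "grade": regroup(row["grade"]),
    "score": shift(row["score"]),
}
```

Keeping `cut_points` and `vocabulary` fixed after the first data set corresponds to a static encoding rule; letting them change as new data arrives would be one form of a dynamic rule.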

C. Step of discovering Significant Feature Networks (SFNs) (S300)

The SFN discovery step (S300) uses the training data set 201 encoded by the encoder 200 (the encoded training data set) to discover the Significant Feature Networks (SFNs), which are the core components of the model 400. Here, discovering an SFN may mean detecting, extracting, or calculating the SFN using the encoded training data set 201.

Specifically, the SFN discovery step (S300) includes, for example, a feature sequence generation step (S301), a node and edge formation step (S302), a weight calculation step (S303), a weight normalization step (S304), a feature network assessment step (S305), a feature network ranking step (S306), and an SFN selection step (S307).

SFNs can be obtained (discovered, detected, extracted, or calculated) by iterating steps S301 to S306. In the SFN selection step (S307), the specific feature sequences ranked highest in the feature network ranking step (S306) are selected as SFNs. A model is constructed from the SFNs selected in this way. Each step for deriving the SFNs is described in detail below.

C-1. Feature sequence generation step (S301)

FIG. 2 shows an example of a feature sequence generated in the feature sequence generation step (S301) shown in FIG. 1.

Referring to FIG. 2, a feature sequence is formed by selecting two or more features (or two or more generated features) from the multivariate encoded training data set 201 and arranging them in a specific order.

For example, selecting N features from among all features and arranging them in a specific order yields a specific sequence f_1, f_2, f_3, …, f_N, as shown in FIG. 2.

A specific feature sequence can be generated either by feature selection, which selects features without transforming them, or by feature generation, which creates new features based on the features contained in the encoded training data set 201.

Feature selection methods for generating a feature sequence include, for example, random selection, considering all combinations, obtaining features from another machine learning method, and using mutual information from information theory.

Feature generation methods for generating a feature sequence include, for example, linear discriminant analysis (LDA), principal component analysis (PCA), and deep-learning-based feature extraction such as an autoencoder.
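Of the approaches listed above, random selection is the simplest to illustrate. The sketch below assumes hypothetical feature names and a hypothetical helper `random_feature_sequence`; it shows only one possible way to draw candidate feature sequences, not the method the patent mandates.

```python
import random

def random_feature_sequence(feature_names, length, rng):
    """Pick `length` distinct features and return them in a random order."""
    if not 2 <= length <= len(feature_names):
        raise ValueError("length must be between 2 and the number of features")
    return tuple(rng.sample(feature_names, length))

rng = random.Random(0)                      # fixed seed only for reproducibility
features = ["f1", "f2", "f3", "f4", "f5"]   # hypothetical feature names
candidates = [random_feature_sequence(features, 3, rng) for _ in range(4)]
```

Because the order matters in a feature sequence, the sample is kept in the order drawn rather than sorted.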

C-2. Node and edge formation step (S302)

Once a specific feature sequence has been selected in the previous step (S301), nodes and edges are defined, and a feature network is constructed from them.

As shown in FIG. 2, the nodes (f_11, f_12, …, f_1i, f_21, f_22, …, f_N1, f_N2, …, f_NP) are defined as the encoded values that each feature (f_1, f_2, f_3, …, f_N) can take, and the edges (w_11, w_12, w_13, …, w_1α, w_21, w_22, w_23, …, w_2β, …) are defined as the connections between adjacent nodes. Connections between nodes within the same feature are not considered. For example, the feature f_2 has nodes f_21, f_22, …, f_2j, and these are connected by edges (connecting lines that carry weights) to the nodes of the adjacent features f_1 and f_3. Connecting nodes and edges in this way constructs the feature network for the selected feature sequence.
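The node and edge structure can be pictured with a small data-structure sketch. The representation below (dictionaries keyed by feature name and encoded value) and the function name `build_feature_network` are assumptions made for illustration; the patent does not specify a storage format.

```python
from itertools import product

def build_feature_network(sequence, encoded_rows):
    """Return (nodes, edges) for one feature sequence.

    nodes: feature -> sorted encoded values observed for that feature
    edges: ((feature_a, value_a), (feature_b, value_b)) -> weight, initialised to 0.0
    """
    nodes = {f: sorted({row[f] for row in encoded_rows}) for f in sequence}
    edges = {}
    # Only features that are adjacent in the sequence are connected.
    for left, right in zip(sequence, sequence[1:]):
        for a, b in product(nodes[left], nodes[right]):
            edges[((left, a), (right, b))] = 0.0
    return nodes, edges

rows = [
    {"f1": 1, "f2": 2, "f3": 1},
    {"f1": 2, "f2": 1, "f3": 1},
    {"f1": 1, "f2": 1, "f3": 2},
]
nodes, edges = build_feature_network(("f1", "f2", "f3"), rows)
```

With two encoded values per feature, the sketch yields four edges between f1 and f2 and four between f2 and f3, and none inside a feature or between f1 and f3.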

C-3. Weight calculation step (S303)

Each edge connecting nodes has a specific value, defined as a weight that represents the connection strength between the nodes. The weights can be obtained from the encoded training data 201. When an instance of the encoded training data 201 is input, the weights of the edges connecting the nodes activated by that instance are calculated. Here, an instance means each example or sample that makes up the data a machine learning model needs for learning or inference (prediction). An instance may therefore also be called a training example or training sample constituting the training data.

The weights may be calculated according to a predefined weight calculation rule. One such method separates the network by class and updates the weights for each class.

For example, given training data with three class labels, 1 to 3, three feature networks consisting of the same feature sequence are generated. Training data with class label 1 is used to calculate the weights of network 1, training data with class label 2 the weights of network 2, and training data with class label 3 the weights of network 3. This means that, for a single feature sequence, feature networks with different weights are generated according to the class label.
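The per-class weight calculation can be sketched as follows. The text leaves the weight calculation rule open, so this sketch assumes the simplest rule, counting how many instances of a class activate each edge; the function name `class_edge_counts` and the sample rows are hypothetical.

```python
from collections import defaultdict

def class_edge_counts(sequence, encoded_rows, labels):
    """Return label -> {edge: count}: one weight table per class label."""
    weights = defaultdict(lambda: defaultdict(float))
    for row, label in zip(encoded_rows, labels):
        for left, right in zip(sequence, sequence[1:]):
            # The edge between the two nodes activated by this instance.
            edge = ((left, row[left]), (right, row[right]))
            weights[label][edge] += 1.0   # assumed rule: co-occurrence count
    return weights

rows = [{"f1": 1, "f2": 2}, {"f1": 1, "f2": 2}, {"f1": 2, "f2": 1}, {"f1": 1, "f2": 1}]
labels = [1, 1, 2, 3]
weights = class_edge_counts(("f1", "f2"), rows, labels)
```

Each instance touches only the table of its own class label, so the three networks share one feature sequence but accumulate different weights.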

C-4. Weight normalization step (S304)

Once the edge weights have been calculated from the plurality of instances contained in the encoded training data set 1 (201), the calculated weights are normalized.

The normalization may follow a predefined weight normalization rule, for example a rule that sets the sum of the weights of the edges between two adjacent features to 1.
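The example normalization rule, edge weights between two adjacent features summing to 1, can be sketched as below. The edge-key layout and the function name `normalize` are assumptions carried over for illustration only.

```python
from collections import defaultdict

def normalize(edge_weights):
    """Scale weights so the edges between each pair of adjacent features sum to 1."""
    totals = defaultdict(float)
    for ((left, _), (right, _)), w in edge_weights.items():
        totals[(left, right)] += w
    normalized = {}
    for edge, w in edge_weights.items():
        total = totals[(edge[0][0], edge[1][0])]
        normalized[edge] = w / total if total else 0.0   # guard against empty pairs
    return normalized

raw = {
    (("f1", 1), ("f2", 1)): 3.0,
    (("f1", 1), ("f2", 2)): 1.0,
    (("f2", 1), ("f3", 1)): 2.0,
    (("f2", 2), ("f3", 1)): 2.0,
}
normalized = normalize(raw)
```

Normalizing per feature pair keeps weights comparable across classes that have different numbers of training instances.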

C-5. Feature network assessment step (S305)

The feature network assessment step (S305) uses the feature network and weight information generated in the preceding steps to calculate a network evaluation index that indicates how well the feature network distinguishes between classes.

There are broadly two ways to assess a feature network.

The first mathematically extracts a figure of merit from the characteristics implicit in the feature network's weight information. The second evaluates the performance of the feature networks by computing the class discrimination accuracy using the plurality of feature networks, the normalized weights, and instances labeled with class labels that were not used (or were used) in the weight calculation. Either way, the feature network can be assessed numerically.
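The second assessment method, class discrimination accuracy, might look like the following sketch. It assumes that a network's score for an instance is the sum of the normalized weights on the edges the instance activates; that scoring rule, the function names, and the sample weights are assumptions, since the text does not fix them.

```python
def network_score(sequence, weights, row):
    """Sum the normalized weights of the edges activated by one instance."""
    return sum(
        weights.get(((left, row[left]), (right, row[right])), 0.0)
        for left, right in zip(sequence, sequence[1:])
    )

def class_accuracy(sequence, weights_by_class, rows, labels):
    """Fraction of instances whose best-scoring class network matches the true label."""
    hits = 0
    for row, label in zip(rows, labels):
        predicted = max(
            weights_by_class,
            key=lambda c: network_score(sequence, weights_by_class[c], row),
        )
        hits += predicted == label
    return hits / len(rows)

sequence = ("f1", "f2")
weights_by_class = {   # hypothetical normalized weights for two classes
    "A": {(("f1", 1), ("f2", 1)): 0.8, (("f1", 2), ("f2", 2)): 0.2},
    "B": {(("f1", 1), ("f2", 1)): 0.1, (("f1", 2), ("f2", 2)): 0.9},
}
rows = [{"f1": 1, "f2": 1}, {"f1": 2, "f2": 2}, {"f1": 1, "f2": 1}]
labels = ["A", "B", "B"]
accuracy = class_accuracy(sequence, weights_by_class, rows, labels)
```

In this toy data two of the three instances are classified correctly, so the evaluation index for this feature network would be 2/3.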

C-6. Feature network ranking step (S306)

The feature network ranking can be determined from the evaluation index derived numerically in the previous step (S305). On the first pass, the first selected feature network ranks first, but when another feature network is selected in step S301 and the process through S306 is iterated, the ranking may change. Ranks are denoted by subscripts, as in SFN_1, SFN_2, SFN_3, ….

C-7. SFN selection step (S307)

A predetermined number of the feature networks ranked highest in the previous step (S306) are selected. The selected feature networks are used for model building as Significant Feature Networks (SFNs).
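Ranking and selection reduce to sorting candidate feature networks by their evaluation index and keeping a fixed number. The sketch below uses hypothetical evaluation values and a hypothetical helper `select_sfns`.

```python
def select_sfns(evaluations, top_k):
    """Rank feature networks by evaluation index (descending) and keep the top_k."""
    ranked = sorted(evaluations.items(), key=lambda item: item[1], reverse=True)
    return [sequence for sequence, _ in ranked[:top_k]]

evaluations = {            # hypothetical evaluation indices, e.g. class accuracy
    ("f1", "f2", "f3"): 0.71,
    ("f2", "f4"): 0.83,
    ("f3", "f1"): 0.64,
    ("f4", "f5", "f1"): 0.79,
}
sfns = select_sfns(evaluations, top_k=2)
```

Here the two highest-ranked sequences, `("f2", "f4")` and `("f4", "f5", "f1")`, would play the roles of SFN_1 and SFN_2.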

D. Model building step (S400)

FIG. 3 is a diagram schematically explaining the model building step (S400) shown in FIG. 1, and FIG. 4 is a diagram for explaining the ensemble configuration of each submodel shown in FIG. 1.

Referring to FIG. 3, the model building step (S400) constructs a model using the SFNs selected in the previous step (S307). The overall model 400 consists of a plurality of submodels, one for each class.

As shown in FIG. 1, a model built to distinguish N classes includes N submodels (400_1, 400_2, …, 400_N). As shown in FIG. 4, each submodel may be an ensemble that combines the SFNs selected in the previous step (S307).

The most basic way to construct the ensemble is to build every submodel from the same SFNs. When the weights are updated using training data, each training instance is used, as shown in FIG. 3, to calculate and update the SFN weights of the submodel that corresponds to its class label. When training is complete, the submodels consist of the same SFNs but hold different weight information.
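The basic ensemble layout, identical SFNs in every submodel with class-specific weights, can be sketched with a small data class. The class `SubModel`, the helpers `build_model` and `train`, and the count-based weight update are all assumptions for illustration; normalization is omitted to keep the sketch short.

```python
from dataclasses import dataclass, field

@dataclass
class SubModel:
    label: object
    # SFN (feature sequence) -> {edge: weight}
    weights: dict = field(default_factory=dict)

def build_model(sfns, class_labels):
    """One submodel per class; every submodel holds the same SFNs."""
    return {label: SubModel(label, {sfn: {} for sfn in sfns}) for label in class_labels}

def train(model, rows, labels):
    """Update only the submodel that matches each instance's class label."""
    for row, label in zip(rows, labels):
        for sfn, table in model[label].weights.items():
            for left, right in zip(sfn, sfn[1:]):
                edge = ((left, row[left]), (right, row[right]))
                table[edge] = table.get(edge, 0.0) + 1.0   # assumed counting rule

sfns = [("f1", "f2"), ("f2", "f3")]
rows = [{"f1": 1, "f2": 1, "f3": 2}, {"f1": 2, "f2": 1, "f3": 1}]
labels = [1, 2]
model = build_model(sfns, class_labels=[1, 2])
train(model, rows, labels)
```

After training, both submodels contain the same two SFNs, but their weight tables differ because each saw only the instances of its own class.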

E. Prediction step (S500)

In the prediction step (S500), an instance of the verification data set (110) is input to all of the sub-models (400_1, 400_2, …, 400_N) included in the built model (400), and the class of the sub-model with the largest weight score is selected as the predicted class of that instance.

The weight score of a particular sub-model for an instance of the verification data set (110) can be derived from the corresponding weight scores of the main variable networks (SFNs) that make up the sub-model.

As shown in FIG. 4, the weight score of sub-model 1 can be calculated as a linear combination of the weight scores of the main variable networks that make up the sub-model, such as SFN1 (S311), SFN2 (S312), and SFN3 (S313).

Let W(D_i, SFN_j) be the weight score of SFN_j for the i-th instance D_i of the verification data set (110). The weight score W_1(D_i) of sub-model 1 is then calculated as the following linear combination over the main variable networks of the sub-model.

$$W_1(D_i) = \sum_{j} c_j \, W(D_i, \mathrm{SFN}_j)$$

c_j is a coefficient representing the contribution according to the rank of the SFN. For example, when c_j = 1, the weight score is calculated with equal proportions for all SFNs regardless of rank. Alternatively, c_j may be set to a different value for each j (that is, for each SFN) so that contributions differ by rank.
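The prediction rule can be sketched as follows. This section does not define how W(D_i, SFN_j) itself is obtained, so the sketch assumes it is the sum of the weights of the edges that the instance traverses in SFN_j. The function names and the dictionary layout (`model[label]` is a list of edge-weight dictionaries, one per SFN) are hypothetical.

```python
def path_edges(row, permutation):
    """Edges visited by one encoded instance along a variable permutation."""
    return [((a, row[a]), (b, row[b]))
            for a, b in zip(permutation, permutation[1:])]


def sfn_score(row, permutation, weights):
    """W(D_i, SFN_j): assumed here to be the sum of edge weights on the path."""
    return sum(weights.get(edge, 0.0) for edge in path_edges(row, permutation))


def submodel_score(row, sfns, submodel_weights, coeffs):
    """Linear combination sum_j c_j * W(D_i, SFN_j) for one sub-model."""
    return sum(c * sfn_score(row, permutation, weights)
               for c, permutation, weights in zip(coeffs, sfns, submodel_weights))


def predict(row, sfns, model, coeffs):
    """Return the class label whose sub-model has the largest weight score."""
    return max(model,
               key=lambda label: submodel_score(row, sfns, model[label], coeffs))
```

Passing `coeffs = [1.0] * len(sfns)` corresponds to the equal-contribution case c_j = 1; a decreasing list would weight higher-ranked SFNs more heavily.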

Performing incremental learning on an additional data set

One important feature of the present invention is that incremental learning on newly added training data (102) can be performed easily. First, assume that the model (400) has been built from training data set 1 (101). A new training data set 2 (102) is then input to the encoder (200).

The encoder (200) encodes the new training data set 2 (102) and generates an encoded training data set 2 (102).

Then, rather than performing the entire series of processes (S301 to S307) included in the SFN search step (S300) on the encoded training data set 2 (102), only the weight scoring process (S303) and the weight normalization process (S304) are performed in sequence. Incremental learning is carried out by updating the weights of the already built model (400) with the normalized weights obtained for the encoded training data set 2 (102).

Because this approach keeps the previously built model intact when new training data arrives and updates only the state variables called weights, incremental learning can be performed easily.
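The incremental update can be sketched as below. The exact weight scoring (S303) and normalization (S304) rules are not restated in this section, so the sketch assumes edge counts for scoring and division by the number of new instances per class for normalization; the update itself adds the normalized new weights to the existing ones. It also assumes the new data uses only class labels already present in the model. All function names are hypothetical.

```python
def path_edges(row, permutation):
    """Edges visited by one encoded instance along a variable permutation."""
    return [((a, row[a]), (b, row[b]))
            for a, b in zip(permutation, permutation[1:])]


def score_weights(rows, labels, sfns, classes):
    """S303 (assumed): count how often each edge is traversed, per class."""
    raw = {c: [dict() for _ in sfns] for c in classes}
    for row, label in zip(rows, labels):
        for j, permutation in enumerate(sfns):
            for edge in path_edges(row, permutation):
                raw[label][j][edge] = raw[label][j].get(edge, 0.0) + 1.0
    return raw


def normalize(raw, labels):
    """S304 (assumed): divide each count by the class's instance count."""
    counts = {c: labels.count(c) for c in raw}
    return {c: [{e: w / counts[c] for e, w in net.items()} for net in nets]
            for c, nets in raw.items()}


def incremental_update(model, new_rows, new_labels, sfns):
    """Update weights only; the set of SFNs and sub-models is left unchanged."""
    new = normalize(score_weights(new_rows, new_labels, sfns, list(model)),
                    new_labels)
    for c, nets in new.items():
        for j, net in enumerate(nets):
            for edge, w in net.items():
                model[c][j][edge] = model[c][j].get(edge, 0.0) + w
    return model
```

Note that no permutation generation, performance evaluation, or ranking is repeated here, which is the point of the scheme: steps S301, S302, and S305 to S307 are skipped for the new data.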

FIG. 5 is a block diagram of a computing device implemented to perform a machine learning method for incremental learning according to an embodiment of the present invention.

Referring to FIG. 5, the computing device (600) may include a storage (610), a machine learning module (620), a processor (630), a memory (640), and a system bus (650) connecting these components (610, 620, 630, 640).

The storage (610) is a hardware device that stores training data (or a training data set) (102) labeled with a plurality of class labels and verification data (or a verification data set) (110, 111) for building the model (400 in FIG. 1), and that stores new training data (or a new training data set) (102) for incrementally updating the model (400 in FIG. 1) through incremental learning.

The storage (610) is, for example, a computer-readable medium and may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; and magneto-optical media such as floptical disks.

Under the control or execution of the processor (630), the machine learning module (620) builds the model (400 in FIG. 1) and incrementally updates (trains) the built model (400 in FIG. 1) using only the new weights generated from the new training data (102). The machine learning module (620) may be a hardware module or a software module.

The machine learning module (620) may include a plurality of sub-modules divided by function, for example an encoder (621), a variable network (feature network, FN) generator (622), a main variable network (significant feature network, SFN) determiner (623), a model builder (624), and an update unit (625).

The encoder (621) encodes training data labeled with a plurality of class labels and performs, for example, step S200 described with reference to FIG. 1. According to predefined encoding rules, the encoder (621) converts continuous values of variables included in the training data into discrete or categorical values.
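As an illustration of such an encoding rule, the sketch below maps each continuous value to a bin index using sorted cut points. The patent only states that the rules are predefined; fixed cut points per variable are an assumption made here, and the names `encode_value`, `encode_instance`, and `rules` are hypothetical.

```python
from bisect import bisect_right


def encode_value(value, cut_points):
    """Map a continuous value to a discrete bin index.

    cut_points must be sorted; a value equal to a cut point falls in the
    bin above it.
    """
    return bisect_right(cut_points, value)


def encode_instance(instance, rules):
    """Encode every variable of one instance with its own cut points."""
    return {name: encode_value(value, rules[name])
            for name, value in instance.items()}


# Hypothetical rules: two cut points give three bins, one gives two bins.
rules = {"temp": [10.0, 20.0], "pressure": [1.0]}
encoded = encode_instance({"temp": 15.2, "pressure": 0.7}, rules)
```

Because the rules are fixed in advance, new training data arriving later can be encoded with the same rules and remains compatible with networks built from the earlier data.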

The encoder (621) also encodes the new training data (102) so that new weights based on the new training data (102) can be generated.

The variable network (FN) generator (622) configures the variables included in the encoded training data as nodes and connects adjacent nodes with edges whose weights represent connection strength, thereby generating a plurality of variable networks classified by the plurality of class labels. It may perform steps S301 and S302 described with reference to FIG. 1.

In step S301, the variable network (FN) generator (622) generates a variable permutation by arranging two or more variables included in the encoded training data in a specific order.

As one example, the variable network (FN) generator (622) randomly selects two or more variables from the encoded training data and then arranges the randomly selected variables in the specific order to generate the variable permutation.

As another example, the variable network (FN) generator (622) transforms two or more variables included in the encoded training data into new variables using linear discriminant analysis (LDA), principal component analysis (PCA), and deep-learning-based variable extraction, and then arranges the new variables in a specific order to generate the variable permutation.

Once the variable permutation is generated, the variable network (FN) generator (622) configures the values of each of the arranged variables as nodes and connects, with the edges, the nodes that are adjacent according to the specific order, thereby generating, for the generated variable permutation, a plurality of variable networks classified by the plurality of class labels.
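The random-selection variant of permutation generation and the subsequent network construction can be sketched as follows. A node is modeled as a (variable, value) pair and an edge joins the nodes of two variables that are adjacent in the permutation. Using the traversal count as the edge weight is an assumption, since the weight calculation belongs to step S303. The function names are hypothetical.

```python
import random
from collections import defaultdict


def make_permutation(variables, size, rng):
    """Randomly pick `size` variables; the sampled order is the permutation."""
    return rng.sample(variables, size)


def build_feature_networks(encoded_rows, labels, permutation):
    """Build one variable network per class label for a given permutation.

    Each edge connects the value of one variable to the value of the next
    variable in the permutation; its weight counts traversals (assumed).
    """
    networks = defaultdict(lambda: defaultdict(int))
    for row, label in zip(encoded_rows, labels):
        for a, b in zip(permutation, permutation[1:]):
            edge = ((a, row[a]), (b, row[b]))
            networks[label][edge] += 1
    return {label: dict(edges) for label, edges in networks.items()}


rng = random.Random(0)
permutation = make_permutation(["x", "y", "z"], 2, rng)
```

Repeating this with different permutations yields many candidate variable networks, from which the main variable networks are later chosen by performance.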

The main variable network (SFN) determiner (623) determines, as the main variable networks, the variable networks selected according to performance from among the generated variable networks.

For example, the main variable network (SFN) determiner (623) calculates the weights of each of the variable networks using instances of the encoded training data (201) (S303 in FIG. 1), normalizes the calculated weights (S304 in FIG. 1), and then evaluates the performance of each variable network using the variable networks and the normalized weights (S305 in FIG. 1).

In addition, the main variable network (SFN) determiner (623) calculates new weights in step S303 of FIG. 1 using instances of the new training data (202) encoded by the encoder (200), and normalizes the calculated new weights in step S304 of FIG. 1.

Thereafter, the main variable network (SFN) determiner (623) ranks the variable networks according to the evaluated performance (S306 in FIG. 1) and then determines, as the main variable networks, a preset number of the top-ranked variable networks (S307 in FIG. 1).

The process of normalizing the calculated weights by the main variable network (SFN) determiner (623) may include, for example when the plurality of class labels includes a first class label and a second class label and the plurality of variable networks includes a first variable network and a second variable network: calculating weights of the first variable network using instances of the training data labeled with the first class label; calculating weights of the second variable network, which differ from the weights of the first variable network, using instances of the training data labeled with the second class label; and normalizing the weights of the first variable network and the weights of the second variable network.

The process of evaluating the performance of each variable network by the main variable network (SFN) determiner (623) may include calculating a class discrimination accuracy using the variable networks, the normalized weights, and instances labeled with class labels, and evaluating the performance of each variable network based on the calculated class discrimination accuracy.
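The evaluation, ranking, and selection steps (S305 to S307) can be sketched as below. The sketch assumes each candidate is a pair of a variable permutation and a per-class dictionary of normalized edge weights, and that class discrimination accuracy is the fraction of labeled instances whose highest-scoring class network matches their label. These representations and the function names are hypothetical.

```python
def path_edges(row, permutation):
    """Edges visited by one encoded instance along a variable permutation."""
    return [((a, row[a]), (b, row[b]))
            for a, b in zip(permutation, permutation[1:])]


def network_score(row, permutation, edges):
    """Sum of normalized edge weights along the instance's path (assumed)."""
    return sum(edges.get(e, 0.0) for e in path_edges(row, permutation))


def accuracy(candidate, rows, labels):
    """S305: class discrimination accuracy of one candidate variable network."""
    permutation, per_class = candidate
    hits = 0
    for row, label in zip(rows, labels):
        predicted = max(
            per_class,
            key=lambda c: network_score(row, permutation, per_class[c]))
        hits += predicted == label
    return hits / len(rows)


def select_sfns(candidates, rows, labels, k):
    """S306 and S307: rank candidates by accuracy and keep the top k."""
    ranked = sorted(candidates,
                    key=lambda c: accuracy(c, rows, labels),
                    reverse=True)
    return ranked[:k]
```

The preset number k trades model size against coverage: a larger k keeps more permutations in every sub-model's ensemble.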

The model builder (624) builds the model (400) by combining the main variable networks determined by the main variable network (SFN) determiner (623).

The update unit (625) incrementally updates the model (400) built by the model builder (624) based on the new weights normalized by the main variable network (SFN) determiner (623).

For example, the update unit (625) may incrementally update the built model by adding the normalized new weights to the weights of each of the determined main variable networks.

The processor (630) controls and manages the operation of the storage (610), the machine learning module (620), and the memory (640) through the system bus (650), and may be at least one CPU, at least one GPU, or a combination thereof.

Although FIG. 5 shows the processor (630) and the machine learning module (620) as separate components, they may be integrated into one. For example, the machine learning module (620) may be integrated into the processor (630).

The memory (640) is a hardware device that temporarily or permanently stores intermediate data or result data processed by the processor (630) or by each component of the machine learning module (620), and may include hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The scope of protection of the present invention is not limited to the description and wording of the embodiments explicitly set out above. It is further noted that the scope of protection of the present invention is not limited by modifications or substitutions that are obvious in the technical field to which the present invention belongs.

Claims

A machine learning method for incremental learning performed by a computing device, the method comprising:
encoding training data labeled with a plurality of class labels;
generating a plurality of variable networks classified by the plurality of class labels by configuring variables included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights representing connection strength;
determining, as main variable networks, variable networks selected according to performance from among the plurality of generated variable networks;
building a model by combining the determined main variable networks;
encoding new training data;
calculating new weights using instances of the encoded new training data and then normalizing the calculated new weights; and
incrementally updating the built model by updating the weights of each of the determined main variable networks based on the normalized new weights.

The method of claim 1,
wherein encoding the training data comprises
converting continuous values of variables included in the training data into discrete values or categorical values according to predefined encoding rules.

The method of claim 1,
wherein generating the plurality of variable networks classified by the plurality of class labels comprises:
generating a variable permutation by arranging two or more variables included in the encoded training data in a specific order; and
configuring values included in each of the arranged variables as nodes and connecting, with the edges, nodes that are adjacent according to the specific order among the configured nodes, thereby generating, for the generated variable permutation, the plurality of variable networks classified by the plurality of class labels.

The method of claim 3,
wherein generating the variable permutation comprises:
randomly selecting two or more variables from the encoded training data; and
generating the variable permutation by arranging the two or more randomly selected variables in the specific order.

The method of claim 3,
wherein generating the variable permutation comprises:
transforming two or more variables included in the encoded training data into new variables using linear discriminant analysis (LDA), principal component analysis (PCA), and deep-learning-based variable extraction; and
generating the variable permutation by arranging the new variables in a specific order.

The method of claim 1,
wherein determining the selected variable networks as the main variable networks comprises:
calculating the weights of each of the plurality of variable networks using instances of the training data, and normalizing the calculated weights;
evaluating the performance of each variable network using the plurality of variable networks and the normalized weights;
ranking the plurality of variable networks according to the evaluated performance; and
determining, as the main variable networks, a preset number of top-ranked variable networks from among the plurality of variable networks.

The method of claim 6,
wherein normalizing the calculated weights comprises,
when the plurality of class labels includes a first class label and a second class label and the plurality of variable networks includes a first variable network and a second variable network:
calculating weights of the first variable network using instances of the training data labeled with the first class label;
calculating weights of the second variable network, which differ from the weights of the first variable network, using instances of the training data labeled with the second class label; and
normalizing the weights of the first variable network and the weights of the second variable network.

The method of claim 6,
wherein evaluating the performance of each variable network comprises:
calculating a class discrimination accuracy using the plurality of variable networks, the normalized weights, and instances labeled with class labels; and
evaluating the performance of each variable network based on the calculated class discrimination accuracy.

The method of claim 1,
wherein incrementally updating the built model comprises
incrementally updating the built model by adding the normalized new weights to the weights of each of the determined main variable networks.

A computing device executing a machine learning method for incremental learning, the computing device comprising:
a processor;
a storage for storing training data labeled with a plurality of class labels and new training data; and
a machine learning module for building a model using the training data labeled with the plurality of class labels under the control of the processor,
wherein the machine learning module comprises:
an encoder that encodes the training data labeled with the plurality of class labels and the new training data;
a variable network generator that generates a plurality of variable networks classified by the plurality of class labels by configuring variables included in the encoded training data as nodes and connecting adjacent nodes among the nodes with edges having weights representing connection strength;
a main variable network determiner that determines, as main variable networks, variable networks selected according to performance from among the plurality of generated variable networks, calculates new weights using instances of the encoded new training data, and normalizes the calculated new weights;
a model builder that builds a model by combining the determined main variable networks; and
an update unit that incrementally updates the built model by updating the weights of each of the determined main variable networks based on the normalized new weights.

The computing device of claim 10,
wherein the variable network generator
performs a first process of generating a variable permutation by arranging two or more variables included in the encoded training data in a specific order, and a second process of configuring values included in each of the arranged variables as nodes and connecting, with the edges, nodes that are adjacent according to the specific order among the configured nodes, thereby generating, for the generated variable permutation, the plurality of variable networks classified by the plurality of class labels.

The computing device of claim 10,
wherein the main variable network determiner
performs a first process of calculating the weights of each of the plurality of variable networks using instances of the training data and normalizing the calculated weights, a second process of evaluating the performance of each variable network using the plurality of variable networks and the normalized weights, a third process of ranking the plurality of variable networks according to the evaluated performance, and a fourth process of determining, as the main variable networks, a preset number of top-ranked variable networks from among the plurality of variable networks.

The computing device of claim 10,
wherein the update unit
incrementally updates the built model by adding the normalized new weights to the weights of each of the determined main variable networks.