KR102590793B1

KR102590793B1 - Method and apparatus of self-balancing online dataset for machine learning

Info

Publication number: KR102590793B1
Application number: KR1020230045026A
Authority: KR
Inventors: 서승우; 윤형석; 김찬
Original assignee: 대한민국(방위사업청장)
Priority date: 2023-04-05
Filing date: 2023-04-05
Publication date: 2023-10-30

Abstract

An embodiment of the present disclosure seeks to provide an apparatus and method for self-balancing a dataset that: collects new training data; selects the collected data when the estimated probability corresponding to the collected data, derived from a neural network trained on an existing dataset, is less than or equal to a predetermined reference value, and deletes the collected data when the estimated probability is greater than or equal to the reference value; exchanges data having a high estimated probability in the existing dataset for the selected data; generates, from the dataset after the exchange, a data batch whose balance indicator is greater than or equal to a predetermined reference value; and trains the neural network on the generated data batch. The balance indicator indicates the degree to which representations belonging to different labels have a similar degree of linear separability.

Description

Method and apparatus for self-balancing online dataset for machine learning {METHOD AND APPARATUS OF SELF-BALANCING ONLINE DATASET FOR MACHINE LEARNING}

The present disclosure relates to a method and apparatus for self-balancing an online dataset for machine learning. More specifically, the present disclosure relates to a method and apparatus for balancing a dataset based on a balance indicator of the dataset.

Imitation learning (IL) is one of the widely used frameworks in the field of artificial intelligence, and two main components determine the performance of an imitation learning algorithm: the dataset used for training and the training of the neural network. That is, collecting new data and maintaining a high-quality dataset are important to the performance of imitation learning, and efficient training of the neural network directly affects that performance.

However, in the process of acquiring a training dataset, data corresponding to relatively frequent situations may be collected disproportionately, so the resulting dataset is imbalanced and biased toward some scenarios. For example, when data is collected from human drivers for autonomous driving (AD), a few dominant scenarios, such as lane following or stopping, may be collected repeatedly, and the dataset obtained in this way is imbalanced and biased toward those scenarios. Moreover, a model trained by imitation learning on such an imbalanced dataset may perform well in dominant situations such as lane following, but is likely to malfunction in infrequent situations such as the moments just before a collision.

In addition, because the storage space for an imitation learning dataset is not unlimited, not all newly collected data can be added to the dataset. That is, when the maximum size of the dataset is fixed, some of the existing data must be removed before new data can be added, and adding and deleting data in an arbitrary manner may actually increase the imbalance of the dataset.

Furthermore, simply continuing to train a neural network that was trained on existing data with a new dataset may actually degrade its performance. For example, when the dataset is imbalanced and incremental learning is performed with data batches sampled at random from that dataset, the data batches inherit the imbalance of the dataset, and as a result the neural network may be trained in an undesirable direction.

In short, existing approaches do not provide a way to resolve the imbalance of an online dataset by extracting new data that can increase the balance of the dataset, deleting meaningless data and then adding meaningful data, and retraining the neural network with balanced data batches. The present invention is intended to solve this problem.

An embodiment of the present disclosure is intended to resolve the imbalance of an online dataset.

In addition, an embodiment of the present disclosure is intended to extract new data that can increase the balance of a dataset and add it to the dataset.

In addition, an embodiment of the present disclosure is intended to retrain a neural network using balanced data batches.

An embodiment of the present disclosure seeks to provide a method and apparatus for self-balancing an online dataset for machine learning.

An embodiment of the present disclosure seeks to provide a dataset self-balancing apparatus comprising: a data collector that collects new training data; a data selector that selects the collected data when the estimated probability corresponding to the collected data, derived from a neural network trained on an existing dataset, is less than or equal to a predetermined reference value, and deletes the collected data when the estimated probability is greater than or equal to the reference value; a data exchanger that exchanges data having a high estimated probability in the existing dataset for the selected data; a data batch generator that generates, from the dataset after the exchange, a data batch whose balance indicator is greater than or equal to a predetermined reference value; and an incremental learner that trains the neural network on the generated data batch, wherein the balance indicator indicates the degree to which representations belonging to different labels have a similar degree of linear separability.
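The selection and exchange behavior described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Sample` class, the function names, the example probabilities, and the threshold of 0.2 are all assumptions made for illustration, and the estimated probability is taken as already computed by the trained network.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    name: str           # hypothetical identifier, for illustration only
    probability: float  # estimated probability from the current network (assumed given)


def select(collected: List[Sample], threshold: float) -> List[Sample]:
    """Keep collected samples whose estimated probability is at or below the threshold."""
    return [s for s in collected if s.probability <= threshold]


def exchange(dataset: List[Sample], selected: List[Sample]) -> List[Sample]:
    """Swap the highest-probability existing samples for the selected ones.

    The dataset size is unchanged: one existing sample leaves per sample added.
    """
    ranked = sorted(dataset, key=lambda s: s.probability, reverse=True)
    n = min(len(selected), len(ranked))
    # Drop the n most probable existing samples, then append the newcomers.
    return ranked[n:] + selected[:n]


dataset = [Sample("lane_1", 0.9), Sample("lane_2", 0.8), Sample("stop_1", 0.5)]
collected = [Sample("near_crash", 0.05), Sample("lane_3", 0.85)]
selected = select(collected, threshold=0.2)  # threshold value is an assumption
balanced = exchange(dataset, selected)
print([s.name for s in balanced])  # ['lane_2', 'stop_1', 'near_crash']
```

In this toy run the common `lane_3` sample is discarded, the rare `near_crash` sample replaces the most probable existing sample, and the dataset keeps its original size.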

In one embodiment, the data selector may select the collected data when the probability of occurrence of the case corresponding to the input of the collected data is higher than a predetermined reference value and the conditional probability that the output of the collected data occurs given the input is higher than a predetermined reference value.

In one embodiment, the probability of occurrence of the case corresponding to the input of the collected data may be derived by classifying the data collected by the data collector into labels using semi-supervised learning and taking the proportion of data that corresponds to the label of the collected data.
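As a rough illustration of the label-proportion estimate, the sketch below assumes the semi-supervised classifier has already assigned a label to every collected item; only the counting step is shown. The label names and the function name `label_probabilities` are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def label_probabilities(labels: List[str]) -> Dict[str, float]:
    """Estimate the occurrence probability of each label as its share of the data."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Labels assumed to come from a semi-supervised classifier (not shown here).
labels = ["lane_following"] * 6 + ["stopping"] * 3 + ["near_collision"]
pr_x = label_probabilities(labels)
print(pr_x["near_collision"])  # 0.1
```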

In one embodiment, the conditional probability that the output of the collected data occurs given the input may be derived by a discretized action estimator, which is a neural network trained on the existing dataset, and the discretized action estimator may include a feature extraction layer, a control/action planning layer, and a classification layer.

In one embodiment, the data selector may select the collected data when an uncertainty indicator corresponding to the collected data is greater than or equal to a predetermined reference value.

In one embodiment, the uncertainty indicator may be calculated as the variance of Monte-Carlo samples obtained by applying Monte-Carlo Dropout to a continuous control estimator, which is a neural network trained on the existing dataset.
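The idea of taking the variance over Monte-Carlo Dropout samples can be sketched in plain Python. The toy linear model below merely stands in for the continuous control estimator, whose architecture is not reproduced here; the weights, dropout rate, sample count, and function names are all assumptions.

```python
import random
import statistics
from typing import Sequence


def dropout_forward(x: Sequence[float], weights: Sequence[float],
                    drop_rate: float, rng: random.Random) -> float:
    """One stochastic forward pass of a toy linear model with dropout on its inputs."""
    keep = 1.0 - drop_rate
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep:
            total += wi * xi / keep  # inverted-dropout scaling keeps the expected output unchanged
    return total


def mc_dropout_uncertainty(x: Sequence[float], weights: Sequence[float],
                           drop_rate: float = 0.5, n_samples: int = 200,
                           seed: int = 0) -> float:
    """Variance of repeated dropout-enabled predictions for the same input."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, drop_rate, rng) for _ in range(n_samples)]
    return statistics.pvariance(samples)


uncertainty = mc_dropout_uncertainty([1.0, 2.0, 3.0], [0.5, -0.2, 0.1])
is_selected = uncertainty >= 0.05  # reference value chosen arbitrarily for illustration
```

With dropout disabled (`drop_rate=0.0`) every pass returns the same value, so the variance is zero; a larger variance indicates inputs on which the model's prediction is less stable.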

In one embodiment, the balance indicator may have a large value when the difference in classification accuracy between labels is small for the data included in the data batch.
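The text above states only the qualitative behavior of the balance indicator, not its formula. The sketch below uses one hypothetical choice that has this behavior, namely one minus the gap between the best and worst per-label accuracy; this formula, the function names, and the reference value of 0.8 are assumptions and should not be read as the indicator actually defined by the disclosure.

```python
from collections import defaultdict
from typing import Dict, Sequence


def per_label_accuracy(true_labels: Sequence[str],
                       predicted: Sequence[str]) -> Dict[str, float]:
    """Classification accuracy computed separately for each true label."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for t, p in zip(true_labels, predicted):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / total[label] for label in total}


def balance_indicator(true_labels: Sequence[str], predicted: Sequence[str]) -> float:
    """Hypothetical indicator: 1 minus the spread of per-label accuracies."""
    accuracies = per_label_accuracy(true_labels, predicted).values()
    return 1.0 - (max(accuracies) - min(accuracies))


true_labels = ["lane", "lane", "crash", "crash"]
predicted = ["lane", "lane", "crash", "lane"]
indicator = balance_indicator(true_labels, predicted)  # lane: 1.0, crash: 0.5
accept_batch = indicator >= 0.8  # reference value is an assumption
print(indicator, accept_batch)  # 0.5 False
```

A batch generator along these lines would keep drawing candidate batches until one meets the reference value.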

In one embodiment, the apparatus may further include a visualization module that visualizes the dataset after the exchange with the selected data by placing highly similar data close together in a low-dimensional space.

In one embodiment, the visualization module may use t-SNE (t-distributed stochastic neighbor embedding).

In one embodiment, the data exchanger may sort the data of the existing dataset in order of high estimated probability and low uncertainty, and the sorting operation of the data exchanger may be performed in parallel with the data selection operation of the data selector.
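One possible reading of this ordering is a lexicographic sort: highest estimated probability first, with ties broken by lowest uncertainty. How the two criteria are actually combined is not specified above, so the sketch below is an assumption, as are the `Record` type and the example values.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    name: str           # hypothetical identifier
    probability: float  # estimated probability under the current network
    uncertainty: float  # uncertainty indicator for the same item


def replacement_order(dataset: List[Record]) -> List[Record]:
    """Sort so that the first items are the best candidates to be swapped out."""
    # Highest probability first; ties broken by lowest uncertainty.
    return sorted(dataset, key=lambda r: (-r.probability, r.uncertainty))


records = [Record("a", 0.9, 0.30), Record("b", 0.9, 0.05), Record("c", 0.4, 0.01)]
order = [r.name for r in replacement_order(records)]
print(order)  # ['b', 'a', 'c']
```

Because the sort touches only the existing dataset, it does not depend on the newly collected data, which is consistent with running it in parallel with selection.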

An embodiment of the present disclosure seeks to provide a dataset self-balancing method comprising: a data collection step of collecting new training data; a data selection step of selecting the collected data when the estimated probability corresponding to the collected data, derived from a neural network trained on an existing dataset, is less than or equal to a predetermined reference value, and deleting the collected data when the estimated probability is greater than or equal to the reference value; a data exchange step of exchanging data having a high estimated probability in the existing dataset for the selected data; a data batch generation step of generating, from the dataset after the exchange, a data batch whose balance indicator is greater than or equal to a predetermined reference value; and a data learning step of training the neural network on the generated data batch, wherein the balance indicator indicates the degree to which representations belonging to different labels have a similar degree of linear separability.

In one embodiment, the data selection step may select the collected data when the probability of occurrence of the case corresponding to the input of the collected data is higher than a predetermined reference value and the conditional probability that the output of the collected data occurs given the input is higher than a predetermined reference value.

In one embodiment, the probability of occurrence of the case corresponding to the input of the collected data may be derived by classifying the data collected by the data collector into labels using semi-supervised learning and taking the proportion of data that corresponds to the label of the collected data.

In one embodiment, the conditional probability that the output of the collected data occurs given the input may be derived by a discretized action estimator, which is a neural network trained on the existing dataset, and the discretized action estimator may include a feature extraction layer, a control/action planning layer, and a classification layer.

In one embodiment, the data selection step may select the collected data when an uncertainty indicator corresponding to the collected data is greater than or equal to a predetermined reference value.

In one embodiment, the method may further include a visualization step of visualizing the dataset after the exchange with the selected data by placing highly similar data close together in a low-dimensional space.

In one embodiment, the visualization step may use t-SNE (t-distributed stochastic neighbor embedding).

In one embodiment, the data exchange step may sort the data of the existing dataset in order of high estimated probability and low uncertainty, and the sorting operation of the data exchanger may be performed in parallel with the data selection operation of the data selector.

An embodiment of the present disclosure includes a program stored in a recording medium to cause a computer to execute a method according to an embodiment of the present disclosure.

An embodiment of the present disclosure includes a computer-readable recording medium on which a program for causing a computer to execute a method according to an embodiment of the present disclosure is recorded.

An embodiment of the present disclosure includes a computer-readable recording medium on which a database used in an embodiment of the present disclosure is recorded.

According to an embodiment of the present disclosure, a method and apparatus for self-balancing an online dataset for machine learning may be provided.

In addition, according to an embodiment of the present disclosure, a dataset self-balancing method and apparatus that operate fully automatically, without an expert to supervise the data, may be provided.

In addition, according to an embodiment of the present disclosure, a method and apparatus for self-balancing a dataset while keeping the dataset at a fixed size may be provided.

In addition, according to an embodiment of the present disclosure, a dataset self-balancing method and apparatus that numerically estimate the probability and novelty of data, without relying on randomness or assuming a specific form of distribution, may be provided.

FIG. 1 is a flowchart illustrating a method for self-balancing an online dataset for machine learning according to an embodiment of the present disclosure.
FIG. 2 is a conceptual diagram illustrating a neural network implementing the data classifier 110 according to an embodiment of the present disclosure.
FIG. 3 is a conceptual diagram illustrating a neural network implementing the discretized action estimator 120 according to an embodiment of the present disclosure.
FIG. 4 is a conceptual diagram illustrating a neural network implementing the continuous control estimator 130 according to an embodiment of the present disclosure.
FIG. 5 is a flowchart illustrating a method for self-balancing an online dataset according to an embodiment of the present disclosure.

To make the technical idea of the present disclosure clear, embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. In describing the present disclosure, detailed descriptions of related known functions or components are omitted where they could unnecessarily obscure the gist of the present disclosure. Components having substantially the same functional configuration are given the same reference numerals and symbols wherever possible, even when they appear in different drawings. Where convenient for the explanation, the apparatus and the method are described together. The operations of the present disclosure need not be performed in the order described and may be performed in parallel, selectively, or individually.

The terms used in the embodiments of the present disclosure are, as far as possible, general terms in wide current use, chosen in view of their function in the present disclosure; however, they may vary with the intention of those skilled in the art, precedent, the emergence of new technology, and the like. In certain cases a term has been chosen arbitrarily by the applicant, and its meaning is then described in detail in the description of the relevant embodiment. The terms used in this specification should therefore be defined on the basis of their meaning and the overall content of the present disclosure, not simply by their names.

Throughout the present disclosure, singular expressions may include plural expressions unless the context clearly indicates otherwise. Terms such as "include" or "have" specify the presence of a feature, number, step, operation, component, part, or combination thereof, and should be understood as not excluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. That is, when a part is said to "include" a component anywhere in the present disclosure, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

An expression such as "at least one of" modifies the entire list of elements and does not modify the individual elements of the list. For example, "at least one of A, B, and C" and "at least one of A, B, or C" refer to only A, only B, only C, both A and B, both B and C, both A and C, all of A, B, and C, or a combination thereof.

In addition, terms such as "...unit" and "...module" used in the present disclosure refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Throughout the present disclosure, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between. In addition, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

The expression "configured to (or set to)" used throughout the present disclosure may be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of," depending on the situation. The term "configured to (or set to)" does not necessarily mean only "specifically designed to" in hardware. Instead, in some situations, the expression "a system configured to" may mean that the system is "capable of" operating together with other devices or components. For example, the phrase "a processor configured (or set) to perform A, B, and C" may refer to a dedicated processor for performing those operations (e.g., an embedded processor), or to a generic-purpose processor (e.g., a CPU or application processor) that can perform those operations by executing one or more software programs stored in memory.

Functions related to artificial intelligence according to the present disclosure may be operated through a processor and memory. The processor may include a general-purpose processor such as a CPU, an AP, or a DSP (Digital Signal Processor), a graphics-dedicated processor such as a GPU or a VPU (Vision Processing Unit), or an artificial-intelligence-dedicated processor such as an NPU. The processor may control the processing of input data according to predefined operation rules or an artificial intelligence model stored in memory. An artificial-intelligence-dedicated processor may be designed with a hardware structure specialized for processing a specific artificial intelligence model.

The predefined operation rules or artificial intelligence model are characterized by being created through learning. Here, being created through learning may mean that a base artificial intelligence model is trained by a learning algorithm on a large amount of training data, thereby producing predefined operation rules or an artificial intelligence model set up to perform a desired purpose. Such learning may take place on the device on which the artificial intelligence according to the present disclosure runs, or through a separate server and/or system.

An artificial intelligence model may be composed of a plurality of neural network layers. Each of the neural network layers has a plurality of weight values and performs a neural network operation by combining the result of the previous layer with those weights. The weights of the neural network layers may be optimized by the training of the artificial intelligence model. For example, the weights may be updated so that a loss value or cost value obtained from the artificial intelligence model during training is reduced or minimized. The artificial neural network may include, for example, a CNN (Convolutional Neural Network), DNN (Deep Neural Network), RNN (Recurrent Neural Network), RBM (Restricted Boltzmann Machine), DBN (Deep Belief Network), BRDNN (Bidirectional Recurrent Deep Neural Network), or Deep Q-Networks, but is not limited thereto.

The present disclosure relates to a method and apparatus for self-balancing an online dataset for machine learning. More specifically, the present disclosure relates to a method and apparatus by which a dataset for machine learning balances itself based on a balance indicator of the dataset.

In one embodiment of the present disclosure, a dataset D for machine learning generally consists of tuples D(X, Y) of an input X = (x_1, x_2, …, x_n) made up of n elements and an output Y = (y_1, y_2, …, y_m) made up of m elements. Taking autonomous driving as an example, the input is the observed driving situation (observation, o) and the output is the control for autonomous driving (control, c), whose elements may include control of the throttle (a_t), the steering (a_s), and the brake (a_b). A single data item for machine learning of autonomous driving can therefore be written as (o, c) or (o, a_t, a_s, a_b).

In one embodiment of the present disclosure, balancing a dataset requires calculating the probability that the neural network trained on the current dataset estimates for each individual data item, and quantifying the overall distribution of the dataset. For example, the estimated probability Pr(X, Y) of an individual data item (X, Y) can be calculated, following Bayesian methodology, as Pr(Y|X)·Pr(X): Pr(Y|X) and Pr(X) are each calculated and then multiplied to estimate the final probability.

Taking autonomous driving as an example, Pr(o, c), the probability that the data item (o, c) is produced by the neural network trained on the current dataset, can be calculated. Control actions are inherently continuous, but to calculate Pr(o, c) the continuous control can be approximated by discretization. If, in addition, the control actions for the throttle (a_t), the steering (a_s), and the brake (a_b) are all assumed to be conditionally independent, the continuous control can be approximated through discretization as follows.

\[ \Pr(o, c) \approx \Pr(o, a_t, a_s, a_b) = \Pr(a_t, a_s, a_b \mid o)\,\Pr(o) \approx \Pr(a_t \mid o)\,\Pr(a_s \mid o)\,\Pr(a_b \mid o)\,\Pr(o) \tag{1} \]
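As a rough illustration of equation (1), the following Python sketch discretizes each continuous control into bins and multiplies the per-bin conditional probabilities by Pr(o). The function names, the equal-width binning, and the bin counts are assumptions made for illustration; the disclosure does not specify them.

```python
import numpy as np

def bin_index(value, low, high, n_bins):
    """Map a continuous control value in [low, high] to one of n_bins equal-width bins."""
    edges = np.linspace(low, high, n_bins + 1)
    # Only the interior edges are used, so the result is always in 0..n_bins-1.
    return int(np.digitize(value, edges[1:-1]))

def joint_probability(p_o, p_at, p_as, p_ab, a_t, a_s, a_b):
    """Approximate Pr(o, c) as Pr(a_t|o) * Pr(a_s|o) * Pr(a_b|o) * Pr(o).

    p_at, p_as, p_ab are per-bin probability vectors for one observation
    (hypothetical outputs of a discrete action estimator).
    """
    i_t = bin_index(a_t, 0.0, 1.0, len(p_at))    # throttle range 0 to 1
    i_s = bin_index(a_s, -1.0, 1.0, len(p_as))   # steering range -1 to 1
    i_b = bin_index(a_b, 0.0, 1.0, len(p_ab))    # brake range 0 to 1
    return p_at[i_t] * p_as[i_s] * p_ab[i_b] * p_o
```

A low value of `joint_probability` marks a sample that the current network considers rare, which is the kind of sample the selection stage described later is meant to keep.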

FIG. 1 is a conceptual diagram illustrating a self-balancing system for an online dataset according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the self-balancing system for an online dataset may include a data collector (100), a data selector (200), a data exchanger (300), a data batch generator (400), and an incremental learner (500).

According to an embodiment of the present disclosure, the data collector (100) collects new training data, gathering it in real situations from the expert who is the target of imitation learning. For example, to collect data for autonomous driving, data of a driver driving a car under actual road conditions can be collected.

According to an embodiment of the present disclosure, the data collector (100) may include a data classification unit (110), a discrete action estimator (120), and a continuous control estimator (130). Taking autonomous driving as an example, the data classification unit (110) classifies the observed driving situations (observations) of the collected data by type; the discrete action estimator (120) estimates the probability of each control action for an observed driving situation; and the continuous control estimator (130) predicts the continuous control produced by the neural network trained on the current dataset and, based on that prediction, estimates the epistemic uncertainty. The inclusion of the data classification unit (110), the discrete action estimator (120), and the continuous control estimator (130) in the data collector (100) is only one embodiment. According to another embodiment of the present disclosure, the data collector (100) only collects data, and the data classification unit (110), the discrete action estimator (120), and the continuous control estimator (130) are provided separately from the data collector (100) and each perform their operation on data already collected by the data collector (100).

According to an embodiment of the present disclosure, the data selector (200) considers, for each individual sample collected by the data collector (100), the probability estimated by the neural network trained on the current dataset (Ds) and the epistemic uncertainty. It selects samples whose estimated probability is low and whose uncertainty is high, and filters out samples whose estimated probability is higher, and whose uncertainty is lower, than predetermined thresholds. A high estimated probability indicates that the situation corresponding to the sample is already sufficiently represented in the current dataset (Ds), and a low uncertainty indicates that the network has already been sufficiently trained on it, so such samples can be excluded.

In one embodiment, the estimated probability Pr(X, Y) of a collected sample (X, Y) is derived by the data classification unit (110) and the discrete action estimator (120), and the uncertainty indicator can be computed in the continuous control estimator (130) using dropout. Taking autonomous driving as an example, a sample (o, a_t, a_s, a_b) collected while driving is selected as new data only when the estimated probability of each component is lower than its predetermined reference value and the uncertainty indicator is higher than its predetermined reference value. This can be expressed as follows.

(2)

Here, T_at, T_as, T_ab, T_o, and T_u are the predetermined reference values for the respective components, and I is the indicator function, which returns 1 if its condition is true and 0 if it is false.
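The selection rule described in words above can be sketched as a product of indicator functions. This is only one plausible reading of the rule: the exact form of the published equation is not reproduced in this text, and the names `Thresholds`, `indicator`, and `select` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    """Predetermined reference values T_at, T_as, T_ab, T_o, and T_u."""
    t_at: float
    t_as: float
    t_ab: float
    t_o: float
    t_u: float

def indicator(condition):
    """Indicator function I: 1 if the condition is true, otherwise 0."""
    return 1 if condition else 0

def select(p_at, p_as, p_ab, p_o, sigma_u, th):
    """Return 1 only when every probability is below its threshold
    and the uncertainty indicator is above its threshold (assumed form)."""
    return (indicator(p_at < th.t_at)
            * indicator(p_as < th.t_as)
            * indicator(p_ab < th.t_ab)
            * indicator(p_o < th.t_o)
            * indicator(sigma_u > th.t_u))
```

A sample for which `select` returns 0 would be filtered out rather than passed on to the data exchanger.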

According to an embodiment of the present disclosure, the data exchanger (300) sorts the samples of the current dataset Ds in order of high estimated probability and low uncertainty. In one embodiment, this sorting of the current dataset can be performed in parallel with the selection operation of the data selector (200). As explained above, a high estimated probability and low uncertainty indicate that the neural network has been sufficiently trained on a sample; such samples are common in the dataset Ds and are of little further use. The data exchanger (300) therefore swaps the low-probability, high-uncertainty samples chosen by the data selector (200) for the high-probability, low-uncertainty samples of the dataset Ds. Through this procedure, the dataset Ds can be updated with new online data while keeping a fixed size.
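A minimal sketch of the fixed-size exchange follows. It assumes each sample is a dictionary carrying its estimated probability under the key "p" and its uncertainty under "u", and it assumes a lexicographic ordering (probability first, uncertainty as tie-breaker); the disclosure does not state how the two criteria are combined.

```python
def familiarity_key(sample):
    # Assumed ordering: higher probability first, ties broken by lower uncertainty.
    return (-sample["p"], sample["u"])

def exchange(dataset, selected):
    """Swap the most familiar samples of `dataset` for `selected`, keeping the size fixed."""
    k = min(len(selected), len(dataset))
    ranked = sorted(dataset, key=familiarity_key)  # most familiar samples come first
    kept = ranked[k:]                              # drop the k most familiar samples
    return kept + list(selected[:k])
```

Because exactly as many samples are dropped as are added, the length of the returned list always equals the length of the input dataset.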

According to an embodiment of the present disclosure, the data batch generator (400) generates data batches for incremental learning. The data batch generator (400) randomly samples the dataset Ds updated by the data exchanger (300) to form a data batch Db, computes the balance indicator of the samples in that batch, and outputs only batches whose balance indicator is higher than a predetermined reference value.
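The batch generation described above amounts to rejection sampling, which might be sketched as follows. The balance computation is passed in as a callable, and the retry limit `max_tries` is an assumption added so that the loop always terminates; neither the function name nor the retry limit comes from the disclosure.

```python
import random

def generate_batch(dataset, batch_size, balance_fn, threshold, max_tries=1000, rng=None):
    """Randomly sample batches until one exceeds the balance threshold.

    Returns the accepted batch, or None if no batch qualified within max_tries.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        batch = rng.sample(dataset, batch_size)   # sampling without replacement
        if balance_fn(batch) > threshold:
            return batch
    return None
```

Returning `None` instead of raising lets the caller decide whether to skip a training round or to lower the threshold.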

According to an embodiment of the present disclosure, a dataset is balanced when the representations belonging to different classes (or labels) in the feature space (S) of the dataset have a similar degree of linear separability, and the extent to which this holds is called the balance indicator. For example, the balance indicator (I) can be expressed mathematically as follows (the related paper [Bingyi Kang, EXPLORING BALANCED FEATURE SPACES FOR REPRESENTATION LEARNING, ICLR 2022] is incorporated into the present disclosure by reference).

(3)

Here, k is a predetermined constant, and a_j denotes the classification accuracy for the representations belonging to class j in the feature space (S).
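To make the idea concrete, the sketch below computes one possible balance score from per-class accuracies a_j and a constant k. The Gaussian-kernel form and the normalization to the range (0, 1] are assumptions chosen so that the score grows as the accuracies become more alike; the published formula is not reproduced in this text and may differ.

```python
import numpy as np

def balance_indicator(accuracies, k):
    """Assumed form: mean over all class pairs of exp(-(a_i - a_j)^2 / k).

    Equals 1.0 when every class has the same accuracy and decreases
    as the per-class accuracies spread apart.
    """
    a = np.asarray(accuracies, dtype=float)
    diff = a[:, None] - a[None, :]          # pairwise accuracy differences
    return float(np.exp(-diff ** 2 / k).mean())
```

With this form, a smaller k penalizes accuracy gaps between classes more sharply.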

According to an embodiment of the present disclosure, the incremental learner (500) incrementally trains the neural networks using the data batch produced by the data batch generator (400). For example, the incremental learner (500) can incrementally train the neural networks of the data classification unit (110), the discrete action estimator (120), and the continuous control estimator (130) on the data batch.

According to an embodiment of the present disclosure, the self-balancing system for an online dataset may further include a visualization module (600). In one embodiment, the visualization module (600) visualizes the dataset Ds, enabling qualitative evaluation of the balancing results and real-time monitoring. For example, the visualization module (600) can use t-SNE (t-distributed stochastic neighbor embedding), which visualizes complex high-dimensional data in a low-dimensional space by placing highly similar data close together. In another embodiment, the visualization module (600) can visualize the data batch Db.

According to an embodiment of the present disclosure, the data visualized by the visualization module (600) can be qualitatively evaluated by a user, who can control or terminate the training process while monitoring the visualized balancing results in real time.

FIG. 2 is a conceptual diagram illustrating a neural network implementing the data classification unit (110) according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the data classification unit (110) classifies the inputs (X) of the collected data by type, and the resulting distribution of inputs can be used to compute the probability Pr(X) of each input. For example, for autonomous driving, the data classification unit (110) classifies the observed driving situations (observations) collected while driving by type, and Pr(o) is derived from their distribution. That is, the probability Pr(o) of a particular driving situation o is the number of samples classified into that situation (n_o) divided by the size of the entire observed population (N), i.e., n_o/N.
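The ratio n_o/N can be computed directly from the class labels assigned to the observations. In the sketch below, the function name and the example label strings are hypothetical.

```python
from collections import Counter

def observation_probabilities(cluster_labels):
    """Return Pr(o) = n_o / N for every driving-situation label o."""
    counts = Counter(cluster_labels)   # n_o for each label
    total = len(cluster_labels)        # N, the size of the observed population
    return {label: n / total for label, n in counts.items()}
```

For instance, if 6 of 10 observations are labeled "straight", the returned dictionary maps "straight" to 0.6.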

In one embodiment, the data classification unit (110) can use a neural network to classify the collected data and can use data clustering through unsupervised learning. Unsupervised learning here means training a neural network on data without labels. For example, data collected in a driving environment for autonomous driving can be clustered in an unsupervised manner using SwAV (the disclosure of the paper [M. Caron and 5 others, "Unsupervised learning of visual features by contrasting cluster assignments," in Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2020] is incorporated herein by reference).

In another embodiment, the data classification unit (110) can use semi-supervised learning to classify the collected data. Referring to FIG. 2, semi-supervised learning uses both labeled data and unlabeled data: the neural network is trained on the labeled data, and pseudo labels are estimated for the unlabeled data and used for training. In unsupervised learning, the centroids of the clusters are set by a neural network trained on unlabeled data, whereas in semi-supervised learning the centroids are determined by the labeled data, so the data can be classified in a more reasonable way. In addition, the classification criteria of clusters formed by unsupervised learning are difficult for humans to interpret, so classifying the data by semi-supervised learning is the more desirable approach. Accordingly, the present disclosure does not exclude data classification by unsupervised learning but preferably classifies data by semi-supervised learning. For example, data classification can be performed using one of the semi-supervised learning methods disclosed in the paper [Yang, Xiangli and 3 others, "A survey on deep semi-supervised learning." IEEE Transactions on Knowledge and Data Engineering, 2022; the contents of the paper are incorporated herein by reference].

FIG. 3 is a conceptual diagram illustrating a neural network implementing the discrete action estimator (120) according to an embodiment of the present disclosure.

Referring to FIG. 3, the discrete action estimator (120) according to an embodiment of the present disclosure may include a feature extraction layer (121, feature extractor), a control/action planning layer (122, control/action planner), and a classification layer (123). For example, the feature extraction layer (121) can be implemented as a convolutional neural network, and the control/action planning layer (122) can be implemented as a multi-layer perceptron. The classification layer (123) at the end of the discrete action estimator (120) can be implemented as a softmax layer capable of multi-class classification. For example, referring to FIG. 3, the discrete action estimator (120) can set the control range of the throttle and the brake to 0 to 1 and the control range of the steering to -1 to 1, discretize these intervals, and compute the probability of each action.

Referring to FIG. 3, let r_i ∈ S denote the intermediate output of the feature extraction layer (121), located in the first half of the discrete action estimator (120), and let c_i ∈ C denote the final output of the control/action planning layer (122), located in the second half. If the network structure of the control/action planning layer (122) is configured so that r_i and c_i are in one-to-one correspondence, the r_i term in equation (3) above can be replaced with c_i. For example, the control/action planning layer (122) can be configured to satisfy this one-to-one correspondence by using linear summation. The balance indicator in the feature space can thereby be extended to a balance indicator in the label space, which can serve as a quantitative indicator for the label-space balancing proposed in the present disclosure.

(4)

Here, k is a predetermined constant, and a_j denotes the classification accuracy for the representations belonging to class j in the label space (S).

FIG. 4 is a conceptual diagram illustrating a neural network implementing the continuous control estimator (130) according to an embodiment of the present disclosure.

Referring to FIG. 4, the continuous control estimator (130) according to an embodiment of the present disclosure may include a feature extraction layer (131, feature extractor) and a control/action planning layer (132, control/action planner). For example, the feature extraction layer (131) can be implemented as a convolutional neural network, and the control/action planning layer (132) can be implemented as a multi-layer perceptron. In one embodiment, the continuous control estimator (130) takes the observed driving situation (o) as input and outputs continuous control actions for the throttle, brake, and steering.

According to an embodiment of the present disclosure, the epistemic uncertainty can be estimated by applying dropout, which omits part of the network, to the neural network of the continuous control estimator (130) trained on the current dataset. For example, Monte-Carlo Dropout can be applied to the neural network, and the uncertainty indicator σ_uncertainty can be computed as the variance of N Monte Carlo samples, as in the formula below (the related paper [A. Loquercio, M. Segu, and D. Scaramuzza, "A general framework for uncertainty estimation in deep learning," IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 3153-3160, 2020] is incorporated into the present disclosure by reference).

(5)

Here, y_n is the output of the neural network with the n-th dropout applied, and ȳ is the mean of the y_n.
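The variance-of-samples computation can be illustrated with a toy NumPy network. Everything below is a sketch under stated assumptions: the two-layer network, the inverted-dropout scaling, and the 1/N normalization of the variance are illustrative choices and are not taken from the disclosure.

```python
import numpy as np

def make_toy_network(w1, w2, drop_p):
    """Build a two-layer ReLU network whose hidden units are randomly dropped."""
    def forward(x, rng):
        h = np.maximum(w1 @ x, 0.0)
        mask = rng.random(h.shape) >= drop_p   # keep each unit with prob 1 - drop_p
        h = h * mask / (1.0 - drop_p)          # inverted-dropout rescaling
        return w2 @ h
    return forward

def mc_dropout_uncertainty(forward, x, n_samples, rng):
    """Variance over n_samples stochastic forward passes, per output dimension."""
    outputs = np.stack([forward(x, rng) for _ in range(n_samples)])  # y_1 .. y_N
    mean = outputs.mean(axis=0)                                      # mean of the y_n
    return ((outputs - mean) ** 2).mean(axis=0)
```

With `drop_p` set to 0 every forward pass is identical, so the returned variance is zero; larger disagreement between passes signals higher epistemic uncertainty.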

FIG. 5 is a flowchart illustrating a method for self-balancing an online dataset according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the method for self-balancing an online dataset may include an initialization step (S100), a balance indicator comparison step (S200), a data collection step (S300), a data selection step (S400), and a data exchange step (S500).

According to an embodiment of the present disclosure, in the initialization step (S100), the neural networks of the data classification unit (110), the discrete action estimator (120), and the continuous control estimator (130) are trained on a given initial dataset D_S, and the initial values of P_at, P_as, P_ab, and P_o can be computed on that basis.

According to an embodiment of the present disclosure, in the balance indicator comparison step (S200), the balance indicator of the current dataset is computed. If it is smaller than a predetermined threshold, updating of the dataset continues; if it is larger than the threshold, the method can terminate.

According to an embodiment of the present disclosure, in the data collection step (S300), new training data can be collected in real situations from the expert who is the target of imitation learning. For example, to collect data for autonomous driving, data of a driver driving a car under actual road conditions can be collected.

According to an embodiment of the present disclosure, the data selection step (S400) considers, for each collected sample, the probability estimated by the neural network trained on the current dataset (Ds) and the uncertainty. Samples whose estimated probability is low and whose uncertainty is high are selected, and samples whose estimated probability is higher, and whose uncertainty is lower, than predetermined thresholds can be filtered out. A high estimated probability indicates that the situation corresponding to the sample is already sufficiently represented in the current dataset (Ds), and a low uncertainty indicates that the network has already been sufficiently trained on it, so such samples can be excluded. Taking autonomous driving as an example, a sample (o, a_t, a_s, a_b) collected while driving can be selected as new data only when the estimated probability of each component is lower than its predetermined reference value and the uncertainty indicator is higher than its predetermined reference value.

According to an embodiment of the present disclosure, in the data exchange step (S500), the samples of the current dataset Ds are sorted in order of high estimated probability and low uncertainty, and the high-probability, low-uncertainty samples of the sorted dataset can be exchanged for newly collected samples with low estimated probability and high uncertainty. In one embodiment, the sorting of the current dataset can be performed in parallel with the data selection step (S400), after which the samples of the current dataset are exchanged for the newly collected samples (S500). Through this procedure, the dataset Ds can be updated with new online data while keeping a fixed size.

According to an embodiment of the present disclosure, a data batch for incremental learning can be generated in the data batch generation step (S600). The dataset Ds updated by the exchange in the data exchange step (S500) is randomly sampled to form a data batch Db, the balance indicator of the data batch Db is computed, and only a data batch whose balance indicator is higher than a predetermined reference value is output.

According to an embodiment of the present disclosure, the data learning step (S700) incrementally trains the neural networks using the data batch generated in the data batch generation step (S600). For example, in the data learning step (S700), the neural networks of the data classification unit (110), the discrete action estimator (120), and the continuous control estimator (130) can be incrementally trained on the data batch.
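Steps S100 through S700 can be read as a single loop, sketched below with every stage passed in as a callable. The function name, the callable signatures, and the `max_rounds` safety limit are assumptions made for illustration and do not come from the disclosure.

```python
def self_balance(dataset, balance_of, collect, select, exchange,
                 make_batch, train, threshold, max_rounds):
    """One possible driver loop for the self-balancing method (illustrative only)."""
    train(dataset)                                     # S100: initial training
    for _ in range(max_rounds):
        if balance_of(dataset) >= threshold:           # S200: balanced enough, stop
            break
        collected = collect()                          # S300: gather new expert data
        chosen = [d for d in collected if select(d)]   # S400: keep rare, uncertain samples
        dataset = exchange(dataset, chosen)            # S500: fixed-size swap
        batch = make_batch(dataset)                    # S600: balanced batch, or None
        if batch is not None:
            train(batch)                               # S700: incremental training
    return dataset
```

Passing the stages as callables keeps the loop independent of any particular network or data format, so each stage can be swapped out for experimentation.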

An embodiment of the present disclosure may also be implemented in the form of a recording medium containing computer-executable instructions, such as program modules executed by a computer. Computer-readable media can be any available media that can be accessed by a computer and include both volatile and non-volatile media and both removable and non-removable media. Computer-readable media may also include both computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically contain computer-readable instructions, data structures, or program modules, and include any information delivery medium.

The foregoing description of the present disclosure is provided for illustration, and a person of ordinary skill in the art to which the present disclosure pertains will understand that it can readily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in combined form.

The scope of the present disclosure is indicated by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present disclosure.

Claims

A dataset self-balancing device, comprising:
a data collector that collects new training data;
a data selector that selects the collected data if an estimated probability corresponding to the collected data, derived from a neural network trained on an existing dataset, is less than or equal to a predetermined reference value, and deletes the collected data if the estimated probability is greater than the reference value;
a data exchanger that exchanges data having a high estimated probability in the existing dataset with the selected data;
a data batch generator that generates, from the dataset in which the exchange with the selected data has been made, a data batch whose balance indicator is greater than or equal to a predetermined reference value; and
an incremental learner that trains the neural network on the generated data batch,
wherein the balance indicator indicates the degree to which representations belonging to different labels have a similar degree of linear separability, and takes a larger value as the differences in classification accuracy between labels become smaller for the data included in the data batch.

The dataset self-balancing device of claim 1, wherein the data selector selects the collected data when the probability of occurrence of the case corresponding to the input of the collected data is higher than a predetermined reference value and the conditional probability of the output of the collected data given the input is higher than a predetermined reference value.

The dataset self-balancing device of claim 2, wherein the probability of occurrence of the case corresponding to the input of the collected data is derived by classifying the data collected by the data collector into labels using a semi-supervised learning method and taking the proportion of data that carry the label corresponding to the collected data.

The dataset self-balancing device of claim 2, wherein the conditional probability of the output of the collected data occurring given the input is derived by a discrete action estimator, which is a neural network trained on the existing dataset, and the discrete action estimator includes a feature extraction layer, a control/action planning layer, and a classification layer.

The dataset self-balancing device of claim 1, wherein the data selector selects the collected data when an uncertainty index corresponding to the collected data is greater than or equal to a predetermined reference value.

The dataset self-balancing device of claim 5, wherein the uncertainty index is calculated as the variance of Monte-Carlo samples obtained by applying Monte-Carlo Dropout to a continuous control estimator, which is a neural network trained on the existing dataset.
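As an illustration of an uncertainty index computed as the variance of Monte-Carlo Dropout samples, the following sketch uses a toy linear model in place of the continuous control estimator. The model, its weights, the dropout rate, and the function names are all hypothetical; only the idea of keeping dropout active at inference and taking the variance over repeated forward passes comes from the claim language.

```python
import random
from statistics import pvariance


def predict_with_dropout(weights, x, drop_rate, rng):
    """One stochastic forward pass of a toy linear model with dropout left active."""
    keep = 1.0 - drop_rate
    total = 0.0
    for w, xi in zip(weights, x):
        if rng.random() < keep:
            total += w * xi / keep  # inverted-dropout scaling keeps the expected output unchanged
    return total


def mc_dropout_uncertainty(weights, x, drop_rate=0.5, n_samples=100, seed=0):
    """Variance of the outputs over n_samples stochastic forward passes."""
    rng = random.Random(seed)
    samples = [predict_with_dropout(weights, x, drop_rate, rng) for _ in range(n_samples)]
    return pvariance(samples)


weights = [0.5, -1.0, 2.0]   # hypothetical trained weights
x = [1.0, 2.0, 3.0]          # hypothetical input

uncertainty = mc_dropout_uncertainty(weights, x)
no_dropout = mc_dropout_uncertainty(weights, x, drop_rate=0.0)
print(uncertainty > 0.0, no_dropout)
```

With dropout disabled every pass returns the same value, so the variance collapses to zero; a data selector in the sense of the preceding claim would compare `uncertainty` against its reference value.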

(Deleted)

The dataset self-balancing device of claim 1, further comprising a visualization module that visualizes the selected data and the exchanged dataset by arranging highly similar data adjacent to each other in a low-dimensional space.

The dataset self-balancing device of claim 8, wherein the visualization module uses the t-SNE (t-distributed stochastic neighbor embedding) method.

The dataset self-balancing device of claim 1, wherein the data exchanger sorts the data in the existing dataset in order of high estimated probability and low uncertainty, and the sorting operation of the data exchanger is performed in parallel with the data selection operation of the data selector.
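A sorting operation of this kind, run alongside data selection, might be sketched as follows. The lexicographic key (estimated probability descending, then uncertainty ascending), the dictionary fields, and the use of a thread pool are assumptions made for illustration; the claim does not specify how the two criteria are combined or how the parallelism is realized. In CPython, threads mainly show that the two operations are independent rather than delivering true CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def sort_for_exchange(existing):
    """Order existing data so the best replacement candidates come first:
    highest estimated probability, ties broken by lowest uncertainty (assumed key)."""
    return sorted(existing, key=lambda d: (-d["est_prob"], d["uncertainty"]))


def select_new(collected, threshold):
    """Keep collected data whose estimated probability is at or below the threshold."""
    return [d for d in collected if d["est_prob"] <= threshold]


existing = [
    {"id": "a", "est_prob": 0.9, "uncertainty": 0.30},
    {"id": "b", "est_prob": 0.9, "uncertainty": 0.10},
    {"id": "c", "est_prob": 0.4, "uncertainty": 0.05},
]
collected = [{"id": "x", "est_prob": 0.2}, {"id": "y", "est_prob": 0.7}]

# The two operations touch disjoint inputs, so they can be submitted concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    sort_future = pool.submit(sort_for_exchange, existing)
    select_future = pool.submit(select_new, collected, 0.5)
    ordered = sort_future.result()
    selected = select_future.result()

print([d["id"] for d in ordered], [d["id"] for d in selected])
```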

A dataset self-balancing method comprising:
a data collection step of collecting new training data;
a data selection step of selecting the collected data when an estimated probability corresponding to the collected data, derived from a neural network trained on an existing dataset, is less than or equal to a predetermined reference value, and deleting the collected data when the estimated probability is greater than or equal to the reference value;
a data exchange step of exchanging data having a high estimated probability in the existing dataset with the selected data;
a data batch generation step of generating, from the dataset in which the selected data have been exchanged, a data batch whose balance index is greater than or equal to a predetermined reference value; and
a data learning step of training the neural network on the generated data batch,
wherein the balance index indicates the degree to which representations belonging to different labels have a similar degree of linear separability, and takes a large value when the difference in classification accuracy between labels is small for the data included in the data batch.
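The accuracy-related property of the balance index (a larger value when per-label classification accuracies are close to one another) can be illustrated with a simple stand-in formula. The expression used here, one minus the gap between the best and worst per-label accuracy, is a hypothetical proxy chosen for this sketch and is not the index defined in the disclosure; it also ignores the linear-separability aspect of the index. The labels and predictions are invented.

```python
from collections import defaultdict


def per_label_accuracy(labels, predictions):
    """Classification accuracy computed separately for each ground-truth label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, predictions):
        total[y] += 1
        correct[y] += int(y == p)
    return {y: correct[y] / total[y] for y in total}


def balance_proxy(labels, predictions):
    """Hypothetical proxy: 1.0 when all labels are classified equally well,
    approaching 0.0 as the gap between best and worst label grows."""
    acc = per_label_accuracy(labels, predictions)
    return 1.0 - (max(acc.values()) - min(acc.values()))


labels = ["A", "A", "B", "B", "C", "C"]
even_preds = ["A", "B", "B", "A", "C", "A"]    # every label is 50% accurate
skewed_preds = ["A", "A", "B", "A", "A", "A"]  # A: 100%, B: 50%, C: 0%

even_score = balance_proxy(labels, even_preds)
skewed_score = balance_proxy(labels, skewed_preds)
print(even_score, skewed_score)
```

A batch generator following the claim would keep a candidate batch only when its index meets the reference value; with this proxy, the evenly classified batch would pass any threshold up to 1.0 while the skewed batch would not.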

The dataset self-balancing method of claim 11, wherein the data selection step includes selecting the collected data when the probability of occurrence of the case corresponding to the input of the collected data is higher than a predetermined reference value and the conditional probability of the output of the collected data occurring given the input is higher than a predetermined reference value.

The dataset self-balancing method of claim 12, wherein the probability of occurrence of the case corresponding to the input of the collected data is derived by classifying the collected data into labels using a semi-supervised learning method and computing the proportion of data belonging to the label that corresponds to the collected data.

The dataset self-balancing method of claim 12, wherein the conditional probability of the output of the collected data occurring given the input is derived by a discrete action estimator, which is a neural network trained on the existing dataset, and the discrete action estimator includes a feature extraction layer, a control/action planning layer, and a classification layer.

The dataset self-balancing method of claim 11, wherein the data selection step includes selecting the collected data when an uncertainty index corresponding to the collected data is greater than or equal to a predetermined reference value.

The dataset self-balancing method of claim 15, wherein the uncertainty index is calculated as the variance of Monte-Carlo samples obtained by applying Monte-Carlo Dropout to a continuous control estimator, which is a neural network trained on the existing dataset.

(Deleted)

The dataset self-balancing method of claim 11, further comprising a visualization step of visualizing the selected data and the exchanged dataset by arranging highly similar data adjacent to each other in a low-dimensional space.

The dataset self-balancing method of claim 18, wherein the visualization step uses the t-SNE (t-distributed stochastic neighbor embedding) method.

The dataset self-balancing method of claim 11, wherein the data exchange step includes a data sorting step of sorting the data of the existing dataset in order of high estimated probability and low uncertainty, and the data sorting step is performed in parallel with the data selection step.

A program stored in a computer-readable recording medium for causing a computer to execute the method of any one of claims 11 to 16 and 18 to 20.