KR20210000072A

KR20210000072A - Lightweight multilayer random forests classifier for real-time operation under low-specification and classification method using thereof

Info

Publication number: KR20210000072A
Application number: KR1020190074864A
Authority: KR
Inventors: 고병철; 정미라
Original assignee: 계명대학교 산학협력단
Priority date: 2019-06-24
Filing date: 2019-06-24
Publication date: 2021-01-04
Also published as: KR102238271B1

Abstract

The present invention relates to a lightweight multilayer random forest (LMRF) classifier for a low-specification real-time operation, and a classification method using the same. According to the present invention, the classification method comprises the steps of: (A) generating an LMRF classifier; and (B) performing classification using the generated LMRF classifier. According to the present invention, performance similar to that of a DNN can be provided even with a small number of hyper-parameters.

Description

Lightweight multilayer random forest classifier for low-spec real-time operation and classification method using it {LIGHTWEIGHT MULTILAYER RANDOM FORESTS CLASSIFIER FOR REAL-TIME OPERATION UNDER LOW-SPECIFICATION AND CLASSIFICATION METHOD USING THEREOF}

The present invention relates to a random forest classifier and a classification method using the same, and more particularly, to a lightweight multilayer random forest classifier for low-specification real-time operation and a classification method using the same.

Deep neural networks (DNNs) are powerful algorithms for many classification applications, but they require a very large number of parameters, careful parameter tuning, a huge amount of training data, and pretrained architectures. These requirements are a heavy burden for current DNN models and make them particularly difficult to apply to fields that require real-time processing. In addition, because a DNN is a black-box model, its decisions cannot be explained.

Meanwhile, a random forest in machine learning is a type of ensemble learning method used for classification, regression analysis, and the like. It operates by outputting a class (classification) or a mean prediction (regression analysis) from a number of decision trees constructed during training. A random forest is an ensemble method that trains multiple decision trees with randomization. The random forest method broadly consists of a training stage, in which a number of decision trees are constructed, and a test stage, in which an incoming input vector is classified or predicted. Random forests are used in a variety of applications such as detection, classification, and regression.

The key characteristic of a random forest is that, owing to randomness, it is composed of trees with slightly different properties. This decorrelates the predictions of the individual trees and, as a result, improves generalization performance. Randomization also makes the forest robust to noisy data. Randomization takes place during the training of each tree, and the two most widely used methods are bagging, an ensemble learning method based on random sampling of the training data, and randomized node optimization. The two methods can be used together to further strengthen the randomization.
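The two randomization methods named above can be illustrated with a minimal sketch. It assumes that bagging draws a bootstrap sample of the training data with replacement and that randomized node optimization examines only a random subset of feature indices at a node. The function names and the toy sizes are hypothetical and do not come from the patent.

```python
import random

def bootstrap_sample(samples, rng):
    """Bagging: draw len(samples) items with replacement (assumed scheme)."""
    return [rng.choice(samples) for _ in samples]

def random_feature_subset(n_features, n_candidates, rng):
    """Randomized node optimization: only a random subset of the
    feature indices is considered when splitting a node."""
    return sorted(rng.sample(range(n_features), n_candidates))

rng = random.Random(0)           # fixed seed so the sketch is repeatable
data = list(range(10))           # toy training set of 10 sample ids
bag = bootstrap_sample(data, rng)
candidates = random_feature_subset(n_features=8, n_candidates=3, rng=rng)
```

Each tree would receive its own bag and its own candidate features, which is what makes the trees differ from one another.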

The parameters with the greatest influence on a random forest are the forest size (the number of trees) and the maximum allowed depth. The forest size determines how many trees make up the whole forest. If the forest is small, that is, if there are few trees, the trees take little time to build and test, but generalization ability is poor and the probability of a wrong result for an arbitrary input data point is high. Conversely, a large forest with many trees ensures high performance, but training and test times become longer and memory usage increases. There is therefore a need for an improved random forest method that reduces processing time and memory usage while still ensuring high performance.

Prior patents related to random forest classification methods include Patent No. 10-1237089 (title of invention: Method for detecting forest fire smoke using a random forest classification technique) and Patent No. 10-1697183 (title of invention: Automatic river detection system and method using a combination of satellite images and a random forest classifier).

The present invention is proposed to solve the above problems of previously proposed methods. Its object is to provide a lightweight multilayer random forest classifier for low-specification real-time operation, and a classification method using the same, in which a non-neural-network-type deep model with a layer-by-layer structure is constructed, each layer being composed of random forests (RFs) and having no more than a predetermined number of trees. The classifier thereby provides performance similar to that of a DNN with fewer hyperparameters than existing DNN models and, because its processing time is shorter than that of a DNN under the same conditions, it can be applied to fields that require real-time processing.

To achieve the above object, a lightweight multilayer random forest classifier for low-specification real-time operation according to a feature of the present invention is

a lightweight multilayer random forest (LMRF) classifier,

characterized in that it is a non-neural-network-type deep model with a layer-by-layer structure in which each layer is composed of random forests (RFs), and in that each layer is composed of no more than a predetermined number of trees.

Preferably,

the classifier may have a two-layer structure in which each layer is composed of a plurality of RFs.

Preferably,

each layer may be composed of randomly generated heterogeneous RFs.

More preferably,

each layer may be composed of two types of RF: RF and complete-RF (CRF).

Preferably,

only the output features of the previous layer may be used as the new input features of the next layer, without concatenating the transformed feature vector generated in the previous layer.

Preferably,

20 decision trees may be assigned to each RF to reduce the number of parameters and the computational load.

Preferably,

the number of layers and parameters may be determined using K-fold validation.

More preferably,

the number of layers and parameters may be determined by performing the steps of:

(1) increasing the layer number l;

(2) randomly partitioning the training dataset A into k groups;

(3) calculating an error for each of the k folds using the k partitioned groups;

(4) searching for the k-th error having the minimum value for the l-th layer; and

(5) if the minimum error found is less than a threshold and the minimum number of layers (ML) is 2 or more, stopping layer generation and outputting an LMRF classifier composed of l-1 layers; and if the minimum error found is greater than or equal to the threshold or the minimum number of layers (ML) is less than 2, generating a new layer by repeating the process from step (1).
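Steps (1) to (5) can be read as a loop that keeps adding layers until the best fold error is small enough. The following is a minimal sketch under several assumptions: the "minimum number of layers (ML)" condition is interpreted as "the current layer number l has reached ML = 2", the per-fold errors are supplied by a caller-provided function, and `max_layers` is a safety cap added only for the sketch. The sketch reports the layer number at which the stop condition fired; the choice to output l-1 layers is stated in the text and is not modeled here.

```python
def grow_layers(fold_errors_for_layer, threshold, min_layers=2, max_layers=10):
    """Return (l, min_errors): the layer number at which growth stopped
    and the minimum fold error recorded for each layer."""
    min_errors = []
    l = 0
    while l < max_layers:                  # max_layers: hypothetical safety cap
        l += 1                             # step (1): increase the layer number
        errors = fold_errors_for_layer(l)  # steps (2)-(3): supplied by the caller
        best = min(errors)                 # step (4): minimum error over the folds
        min_errors.append(best)
        if best < threshold and l >= min_layers:  # step (5), assumed reading
            break
    return l, min_errors

# Toy per-layer fold errors (made-up numbers, three folds per layer).
toy_errors = {1: [0.30, 0.28, 0.35], 2: [0.20, 0.22, 0.25], 3: [0.08, 0.12, 0.10]}
stopped_at, history = grow_layers(lambda l: toy_errors[l], threshold=0.1)

# Even if the first layer is already below the threshold, growth continues to layer 2.
early_stop, _ = grow_layers(lambda l: [0.01, 0.02], threshold=0.1)
```

Passing the fold-error computation in as a function keeps the stopping rule separate from how the forests are trained.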

Even more preferably, step (3) may include:

(3-1) assigning (k-1) of the k partitioned groups as training folds and the remaining group as the validation fold;

(3-2) generating layer l by training a plurality of RFs using the (k-1) training folds; and

(3-3) calculating an error by summing a loss function over the N samples of the validation fold,

and the error may be calculated by performing steps (3-1) to (3-3) for each of the k folds.

Still more preferably,

each layer is composed of two types of RF, RF and complete-RF (CRF), and

step (3-2) may include

selecting a subset from the training folds, and

(3-2-1) for the RFs, growing random trees using the subset samples and information gain; and

(3-2-2) for the CRFs only, growing completely random trees using the subset samples without information gain.
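The difference between steps (3-2-1) and (3-2-2) lies in how a node's split is chosen. The sketch below assumes an entropy-based information gain for the RF case and, for the CRF case, a uniformly random feature and threshold chosen without looking at the labels. The patent text does not specify these details, so the functions and the toy data are illustrative only.

```python
import math
import random
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature, threshold):
    left = [y for x, y in zip(rows, labels) if x[feature] < threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] >= threshold]
    if not left or not right:
        return 0.0                         # degenerate split: nothing gained
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def rf_split(rows, labels, candidate_features):
    """RF-style node: choose the (feature, threshold) with the highest gain."""
    best = None
    for f in candidate_features:
        for t in sorted({x[f] for x in rows}):
            gain = information_gain(rows, labels, f, t)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best[1], best[2]

def crf_split(rows, rng):
    """CRF-style node: random feature and threshold, labels are ignored."""
    f = rng.randrange(len(rows[0]))
    values = [x[f] for x in rows]
    return f, rng.uniform(min(values), max(values))

rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]   # toy samples
labels = ["a", "a", "b", "b"]
rf_choice = rf_split(rows, labels, candidate_features=[0, 1])
crf_feature, crf_threshold = crf_split(rows, random.Random(1))
```

In the toy data, feature 0 separates the two classes perfectly at threshold 3.0, so the gain-based split finds it, whereas the random split may land anywhere.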

To achieve the above object, a classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to a feature of the present invention is

a classification method using a lightweight multilayer random forest (LMRF) classifier,

characterized by comprising the steps of:

(A) generating an LMRF classifier that is a non-neural-network-type deep model with a layer-by-layer structure in which each layer is composed of random forests (RFs), each layer being composed of no more than a predetermined number of trees; and

(B) performing classification using the generated LMRF classifier.

Preferably, the LMRF classifier,

More preferably, the LMRF classifier,

Preferably, the LMRF classifier,

Preferably, in step (A),

the LMRF classifier may be generated by determining the number of layers and parameters using K-fold validation.

More preferably, step (A) may include

(5) if the minimum error found is less than a threshold and the minimum number of layers (ML) is 2 or more, stopping layer generation and outputting an LMRF classifier composed of l-1 layers; and if the minimum error found is greater than or equal to the threshold or the minimum number of layers (ML) is less than 2, generating a new layer by repeating the process from step (1).

Still more preferably,

in the LMRF classifier, each layer is composed of two types of RF, RF and complete-RF (CRF), and

step (3-2) may include

(3-2-2) for the CRFs only, growing completely random trees using the subset samples without information gain.

According to the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same proposed in the present invention, a non-neural-network-type deep model with a layer-by-layer structure is constructed in which each layer is composed of random forests (RFs), and each layer is composed of no more than a predetermined number of trees. This provides performance similar to that of a DNN with fewer hyperparameters than existing DNN models and, because the processing time is shorter than that of a DNN under the same conditions, the invention can be applied to fields that require real-time processing.

FIG. 1 is a diagram showing the configuration of a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention.
FIG. 2 is a diagram showing a classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention.
FIG. 3 is a diagram showing the detailed flow of step S10 in the classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention.
FIG. 4 is a diagram showing the detailed flow of step S300 in the classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention.
FIG. 5 is a diagram showing the detailed flow of step S320 in the classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention.
FIG. 6 is a diagram showing the algorithm for generating the LMRF classifier in the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention.
FIG. 7 is a diagram showing the overall process of facial expression recognition using the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention.
FIG. 8 is a diagram showing the FER accuracy obtained when the trees are distributed evenly while the number of RFs is increased, in the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention.
FIG. 9 is a diagram comparing FER accuracy for the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention.
FIG. 10 is a diagram comparing the emotion classification accuracy of the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention with that of other DRF-based methods.
FIG. 11 is a diagram comparing the accuracy, number of parameters, and number of operations of the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention with those of DNN model compression algorithms and DRF-based algorithms.
FIG. 12 is a diagram showing the results of recognizing facial expressions using the lightweight multilayer random forest classifier for low-specification real-time operation and the classification method using the same according to an embodiment of the present invention.

Hereinafter, preferred embodiments are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. In describing the preferred embodiments, a detailed description of a related known function or configuration is omitted where it would unnecessarily obscure the gist of the present invention. The same reference numerals are used throughout the drawings for parts having similar functions and operations.

In addition, throughout the specification, when a part is said to be connected to another part, this includes not only the case in which it is directly connected but also the case in which it is indirectly connected with another element interposed between them. Furthermore, saying that a part includes a certain component means that it may further include other components, not that other components are excluded, unless specifically stated otherwise.

FIG. 1 is a diagram showing the configuration of a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention. As shown in FIG. 1, the lightweight multilayer random forest classifier according to an embodiment of the present invention is a non-neural-network-type deep model with a layer-by-layer structure in which each layer is composed of random forests (RFs), and each layer may be composed of no more than a predetermined number of trees.

More specifically, the LMRF classifier of the present invention may have a two-layer structure in which each layer is composed of a plurality of RFs. In addition, each layer may be composed of randomly generated heterogeneous RFs, and each layer may be composed of two types of RF: RF and complete-RF (CRF).

The role of the first layer is to transform the individual features into class probabilities, and these probability outputs can be concatenated into a single transformed feature vector that serves as the new input to the next layer. For example, the LMRF classifier can be used for facial expression recognition: facial landmarks are detected in an input image, the spatial relationships between the landmarks are extracted as geometric features, and the LMRF classifier is then generated by training on the extracted geometric features. Here, angle features and distance features between landmarks are extracted from the facial landmarks as geometric features, and all angle features and distance features may be applied to separate sub-layers, each composed of 16 RFs and 16 complete RFs (CRFs).
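As an illustration of the geometric features mentioned above, the following sketch computes one distance feature and one angle feature from 2D landmark coordinates. The text does not define how the angle and distance features are calculated, so the Euclidean distance and the angle at a middle landmark used here are assumptions, as are the function names and the toy coordinates.

```python
import math

def distance_feature(p, q):
    """Euclidean distance between two landmarks (assumed definition)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_feature(a, b, c):
    """Unsigned angle in radians at landmark b, formed by a-b-c (assumed definition)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))

# Three toy landmarks forming a 3-4-5 right triangle.
p0, p1, p2 = (0.0, 0.0), (3.0, 4.0), (3.0, 0.0)
d = distance_feature(p0, p1)
theta = angle_feature(p0, p2, p1)   # right angle at p2
```

A full feature vector would collect such values over many landmark pairs and triples before being passed to the first layer.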

From the second layer, each layer generates a new feature vector for the next layer; when the second layer is the final layer, it performs classification. In the example described above, the final layer can be used to predict the final facial expression class.

In the LMRF classifier of the present invention, each neuron of a DNN layer is replaced by an RF, and each layer may be composed of several types of RF. To increase diversity and maintain generality, the layers of the LMRF classifier may be composed of randomly generated heterogeneous RFs rather than uniform RFs.

As shown in FIG. 1, a layer of the LMRF classifier may be composed of two types of RF: RF and complete-RF (CRF). That is, two different types of RF can be used in one layer of the LMRF classifier. Using the two different types, RF and CRF, can improve performance compared with using a single type of RF.

In the present invention, unlike the existing Deep Forest (DF) method, the model is designed to use only the output features of the previous layer as the new input features of the next layer, without concatenating the transformed feature vector generated in the previous layer. Convergence is therefore fast, and performance degradation during testing can be prevented.
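The difference in how features are passed between layers can be sketched as follows. The sketch assumes that each forest emits one class-probability vector, that a layer's output is the concatenation of those vectors, and that the DF-style cascade appends a transformed feature vector to it. The function names and the toy sizes (4 forests, 3 classes, 5 transformed features) are hypothetical.

```python
def concat_forest_outputs(forest_probs):
    """Flatten per-forest class-probability vectors into one layer output."""
    return [p for probs in forest_probs for p in probs]

def next_input_lmrf(layer_output):
    """LMRF-style: only the previous layer's output is passed on."""
    return list(layer_output)

def next_input_df(layer_output, transformed_features):
    """DF-style: previous output concatenated with the transformed feature vector."""
    return list(layer_output) + list(transformed_features)

# Toy layer: 4 forests, 3 classes each (made-up probabilities).
forest_probs = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.3, 0.3, 0.4], [0.25, 0.5, 0.25]]
layer_output = concat_forest_outputs(forest_probs)
transformed = [0.7, 0.1, 0.9, 0.4, 0.2]    # toy transformed feature vector

lmrf_input = next_input_lmrf(layer_output)
df_input = next_input_df(layer_output, transformed)
```

The LMRF-style input grows only with the number of forests and classes, while the DF-style input also carries the transformed feature vector into every subsequent layer.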

In addition, 20 decision trees may be assigned to each RF to reduce the number of parameters and the computational load. When there are three classes to be classified and a total of 8 RFs per layer, the size of the output vector of the LMRF classifier is 96 (3×32). DF, by contrast, concatenates the output of the layer (3×8) with the transformed feature vector (1,806), giving 1,818 dimensions. In the present invention, it was demonstrated experimentally that increasing the number of RFs is better than increasing the number of trees per RF or the number of trees per layer (see the experimental results described in detail later and FIG. 8).

Meanwhile, in the present invention, the number of layers and parameters can be determined automatically using K-fold validation; more specifically, five-fold validation can be used. The process of determining the number of layers and parameters of the LMRF classifier through training with K-fold validation is described in detail later with reference to FIGS. 3 to 6.

FIG. 2 is a diagram showing a classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention. As shown in FIG. 2, the classification method according to an embodiment of the present invention may be implemented to include generating an LMRF classifier (S10) and performing classification using the generated LMRF classifier (S20). Steps S10 and S20 may be performed on a computer or a classification device that generates the classifier and performs classification.

In step S10, an LMRF classifier may be generated that is a non-neural-network-type deep model with a layer-by-layer structure in which each layer is composed of random forests (RFs), each layer being composed of no more than a predetermined number of trees. In step S10, K-fold validation, and more specifically five-fold validation, may be used to determine the number of layers and parameters automatically while reducing the risk of overfitting. The detailed flow of step S10 is described later with reference to FIG. 3.

In step S20, classification may be performed using the generated LMRF classifier. More specifically, in step S20, the output probabilities of the RFs in the last layer of the LMRF classifier are averaged, and the input is assigned to the class with the maximum probability.
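The averaging and maximum-probability selection in step S20 can be sketched in a few lines. The sketch assumes that every RF in the last layer returns a probability vector over the same ordered list of classes. The class names and the probabilities are made up for illustration.

```python
def classify(last_layer_probs, class_names):
    """Average the per-forest probability vectors and return the
    class with the highest mean probability together with the means."""
    n_forests = len(last_layer_probs)
    mean = [sum(p[c] for p in last_layer_probs) / n_forests
            for c in range(len(class_names))]
    best = max(range(len(class_names)), key=lambda c: mean[c])
    return class_names[best], mean

# Toy outputs of three forests in the last layer (hypothetical classes).
probs = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.3, 0.3, 0.4]]
label, mean_probs = classify(probs, ["angry", "happy", "neutral"])
```

Although the third forest favors a different class, the averaged vector is dominated by the second class, which is the one returned.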

FIG. 3 is a diagram showing the detailed flow of step S10 in the classification method using a lightweight multilayer random forest classifier for low-specification real-time operation according to an embodiment of the present invention. As shown in FIG. 3, step S10 may be implemented to include increasing the layer number l (S100), randomly partitioning the training dataset A into k groups (S200), calculating an error for each of the k folds using the k partitioned groups (S300), searching for the k-th error having the minimum value for the l-th layer (S400), and, if the minimum error found is less than the threshold and the minimum number of layers is 2 or more, stopping layer generation and outputting an LMRF classifier composed of l-1 layers (S500).

In step S100, the layer number l may be increased.

In step S200, the training dataset A may be randomly partitioned into k groups, where k is the number of folds.
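A random partition into k groups, as in step S200, can be sketched by shuffling the samples and dealing them out in turn. The fixed seed and the round-robin assignment are choices made for this sketch only; the text requires only that the partition be random.

```python
import random

def split_into_folds(dataset, k, seed=0):
    """Shuffle the samples and deal them into k groups of near-equal size."""
    items = list(dataset)
    random.Random(seed).shuffle(items)     # seed fixed only for repeatability
    return [items[i::k] for i in range(k)]

folds = split_into_folds(range(23), k=5)   # toy dataset of 23 sample ids
fold_sizes = [len(f) for f in folds]
```

With 23 samples and k = 5, three groups receive five samples and two receive four, and every sample appears in exactly one group.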

In step S300, an error may be calculated for each of the k folds using the k partitioned groups. That is, one of the k folds obtained by partitioning the training dataset A is used as the validation fold and the rest as training folds, and the error is calculated for every case in which each of the k folds in turn serves as the validation fold. In step S300, the mean squared error (MSE) may be calculated. The detailed flow of step S300 is described below with reference to FIG. 4.
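The fold rotation in step S300 can be sketched as below. Two assumptions are made: the MSE is taken between the predicted class-probability vector and a one-hot encoding of the true label, and training is abstracted as a caller-supplied `train_fn` that returns a prediction function. The stand-in model `prior_model`, which always predicts the class frequencies of its training data, exists only to make the sketch runnable.

```python
def mean_squared_error(pred_probs, true_labels, n_classes):
    """MSE between probability vectors and one-hot labels (assumed definition)."""
    total = 0.0
    for probs, y in zip(pred_probs, true_labels):
        total += sum((probs[c] - (1.0 if c == y else 0.0)) ** 2
                     for c in range(n_classes))
    return total / len(true_labels)

def fold_errors(folds, train_fn, n_classes):
    """Use each fold once as the validation fold and the rest for training."""
    errors = []
    for i, val in enumerate(folds):
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        predict = train_fn(train)          # hypothetical training hook
        probs = [predict(x) for x, _ in val]
        errors.append(mean_squared_error(probs, [y for _, y in val], n_classes))
    return errors

def prior_model(train):
    """Stand-in model: always predicts the class frequencies of the training set."""
    counts = [0, 0]
    for _, y in train:
        counts[y] += 1
    priors = [c / len(train) for c in counts]
    return lambda x: priors

# Three toy folds of (feature, label) pairs with two classes.
toy_folds = [[(0.1, 0), (0.9, 1)], [(0.2, 0), (0.8, 1)], [(0.3, 0), (0.7, 1)]]
errors = fold_errors(toy_folds, prior_model, n_classes=2)
```

Because every training set here is balanced, the stand-in model predicts 0.5 for both classes and each fold yields the same error.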

FIG. 4 is a diagram showing the detailed flow of step S300 in the classification method using a lightweight multilayer random forest classifier for low-spec real-time operation according to an embodiment of the present invention. As shown in FIG. 4, step S300 may be implemented to include: assigning (k-1) of the k partitioned groups as training folds and the remaining group as the validation fold (S310); training a plurality of RFs using the (k-1) training folds to generate layer l (S320); and calculating an error by summing a loss function over the N samples of the validation fold (S330). In step S300, steps S310 to S330 may be performed for every one of the k folds to calculate the errors.

In step S310, (k-1) of the k groups partitioned in step S200 may be assigned as training folds, and the remaining group may be assigned as the validation fold.
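Steps S200 and S310 can be illustrated with a minimal sketch. The function names, the round-robin dealing of shuffled samples, and the fixed seed are assumptions made for this illustration and are not specified in the description.

```python
import random


def split_into_folds(samples, k, seed=0):
    """Randomly partition the samples into k groups (step S200)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled samples round-robin so group sizes differ by at most one.
    return [shuffled[i::k] for i in range(k)]


def train_validation_pairs(groups):
    """Yield (training samples, validation samples), one pair per fold (step S310)."""
    for held_out, validation in enumerate(groups):
        training = [s for j, g in enumerate(groups) if j != held_out for s in g]
        yield training, validation


# Hypothetical toy dataset of 10 sample identifiers, split into k = 5 folds.
groups = split_into_folds(range(10), k=5)
pairs = list(train_validation_pairs(groups))
```

Each of the k groups is held out exactly once, so the errors computed in step S300 cover every sample of the training dataset A.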

In step S320, layer l may be generated by training a plurality of RFs using the (k-1) training folds. More specifically, in step S320, layer l may be generated by training RFs and CRFs using the (k-1) training folds. That is, a subset A'_k is selected from the k-th training fold A_k, and the trees of the RFs constituting the layer are grown using the samples of subset A'_k.

FIG. 5 is a diagram showing the detailed flow of step S320 in the classification method using a lightweight multilayer random forest classifier for low-spec real-time operation according to an embodiment of the present invention. As shown in FIG. 5, step S320 may be implemented to include: for the RFs, growing random trees using the subset samples and information gain (S321); and, for the CRFs only, growing completely random trees using the subset samples without information gain (S322).

That is, in step S321, random trees are grown for the RFs using the samples of subset A'_k and information gain, and in step S322, for the CRFs only, completely random trees are grown using the samples of subset A'_k without information gain.
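The difference between the two tree types can be sketched at the level of a single node split. This is a simplified illustration, not the patented procedure: the helper names, the use of Shannon entropy for the information gain, and the sampling of a few candidate (feature, threshold) pairs are all assumptions.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature, threshold):
    """Entropy reduction obtained by splitting on rows[feature] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    children = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - children


def candidate_splits(rows):
    """Every (feature index, threshold) pair observed in the data."""
    return [(f, x[f]) for f in range(len(rows[0])) for x in rows]


def rf_split(rows, labels, rng, n_candidates=4):
    """RF-style node (S321): sample candidate splits, keep the highest-gain one."""
    candidates = candidate_splits(rows)
    sampled = rng.sample(candidates, min(n_candidates, len(candidates)))
    return max(sampled, key=lambda s: information_gain(rows, labels, *s))


def crf_split(rows, rng):
    """CRF-style node (S322): pick one split at random, with no information gain."""
    return rng.choice(candidate_splits(rows))


# Hypothetical toy subset with one feature and two classes.
rows = [[0.0], [1.0], [2.0], [3.0]]
labels = ["a", "a", "b", "b"]
best = rf_split(rows, labels, random.Random(0), n_candidates=4)
arbitrary = crf_split(rows, random.Random(0))
```

Skipping the gain evaluation makes the CRF trees cheaper to grow and decorrelates them from the RF trees, which is one plausible reason for mixing the two types within a layer.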

In step S330, an error may be calculated by summing a loss function over the N samples of the validation fold. In step S330, the mean squared error (MSE) for each of the k folds may be calculated from Equation 1 below.

Here, p_i and the second symbol of Equation 1 are, respectively, the predicted and actual class probabilities of the i-th sample in the validation fold.
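Equation 1 is not reproduced in this text. Assuming the standard definition of the mean squared error over the N samples of the validation fold, and writing q_i as a placeholder (an assumption) for the actual class probability whose symbol is missing above, the error of the k-th fold would take the following form:

```latex
\mathrm{MSE}_k = \frac{1}{N} \sum_{i=1}^{N} \left( p_i - q_i \right)^2
```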

In step S400, the k-th error having the minimum error of the l-th layer may be searched for. That is, the minimum is found among the errors calculated in step S300: the smallest fold MSE of the l-th layer is taken as minMSE^l.

In step S500, if the minimum error found is less than the threshold (τ) and the minimum number of layers (ML) is 2 or more, the generation of layers is stopped and an LMRF classifier composed of l-1 layers is output. If the minimum error found is greater than or equal to the threshold (τ), or the minimum number of layers (ML) is less than 2, the process is repeated from step S100 to generate a new layer. The threshold (τ) is an important parameter that controls the number of layers of the LMRF classifier: by adjusting τ according to the application, the complexity of the model can be determined adaptively. ML is the minimum number of layers, used to ensure that at least two layers of the LMRF are generated.
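The stopping test of steps S400 and S500 can be sketched as a small predicate. The sketch reads the ML condition as "at least min_layers layers have been generated so far", which is an interpretation of the text above; the function name, parameter names, and example values are hypothetical.

```python
def should_stop(fold_errors, tau, num_layers, min_layers=2):
    """Return True when layer generation should stop.

    fold_errors: the k validation MSEs of the current layer (step S300).
    tau:         the error threshold controlling model depth.
    num_layers:  number of layers generated so far (the counter l).
    min_layers:  ML, assumed here to be 2.
    """
    min_mse = min(fold_errors)  # S400: smallest fold error of this layer
    return min_mse < tau and num_layers >= min_layers  # S500


# Hypothetical fold errors and threshold, for illustration only.
stop_now = should_stop([0.04, 0.06, 0.05], tau=0.05, num_layers=2)
too_shallow = should_stop([0.04, 0.06, 0.05], tau=0.05, num_layers=1)
error_too_high = should_stop([0.07, 0.09, 0.08], tau=0.05, num_layers=3)
```

A larger τ stops growth earlier and yields a shallower, lighter model; a smaller τ allows more layers to be added.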

FIG. 6 is a diagram showing the algorithm for generating the LMRF classifier in the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention. As shown in FIGS. 3 to 6, the LMRF classifier is generated according to the algorithm of FIG. 6, and classification may be performed using the generated LMRF classifier.

To verify the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention, they were applied to a classification application for facial expression recognition.

FIG. 7 is a diagram showing the overall process of facial expression recognition using the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention. As shown in FIG. 7, to apply the classifier and the classification method, (a) facial landmarks are detected from an input image, (b) the spatial relationships between landmarks are extracted from the facial landmarks as geometric features, (c) the lightweight multilayer random forest classifier according to an embodiment of the present invention is trained using the extracted geometric features, and (d) facial expressions are recognized using the trained LMRF. The facial expression recognition process is described in detail below.

First, as shown in (a) of FIG. 7, facial landmarks may be detected from an input image. More specifically, face region detection and regression-based landmark detection are applied to predict the positions of 68 (x, y) coordinates in the face region. The input image may be an ordinary still image, a video, an IR image, or the like; any image that contains a face region and requires facial expression recognition may be used as the input image, regardless of its specific image features or capture characteristics.

Next, as shown in (b) of FIG. 7, the spatial relationships between landmarks may be extracted from the facial landmarks as geometric features. More specifically, angle features and distance features between landmarks may be extracted from the facial landmarks as geometric features.

Unlike deep learning algorithms for facial expression recognition, which use the entire image, the present invention extracts angle features and distance features between landmarks as geometric features from a limited set of facial landmarks. That is, as shown in (b) of FIG. 7, distance ratios and angle ratios are obtained from the limited landmarks and used as features.

The geometric features may be calculated from two vectors: the vector v_{i,j} for the landmark pair {i, j} and the vector v_{j,k} for the landmark pair {j, k}. The distance ratio may be calculated from these two vectors by Equation 2 below, in order to compensate for spatial relationships that can change as a result of face rotation or scaling.

The angle feature among the three landmarks {i, j, k} may be modeled by Equation 3 below.

Here, v_{i,j} and v_{j,k} are the vectors from landmark i to landmark j and from landmark j to landmark k, respectively.
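Equations 2 and 3 are not reproduced in this text. As a rough sketch only, the following assumes that the distance feature is the ratio of the two vector lengths and that the angle feature is the angle between the two vectors; both forms, and all function names, are assumptions rather than the equations of the invention.

```python
import math


def vector(a, b):
    """Vector pointing from landmark a to landmark b, both given as (x, y)."""
    return (b[0] - a[0], b[1] - a[1])


def norm(v):
    return math.hypot(v[0], v[1])


def distance_ratio(p_i, p_j, p_k):
    """Ratio |v_ij| / |v_jk|; unchanged by uniform scaling or in-plane rotation."""
    return norm(vector(p_i, p_j)) / norm(vector(p_j, p_k))


def angle(p_i, p_j, p_k):
    """Angle in radians between v_ij and v_jk."""
    v1, v2 = vector(p_i, p_j), vector(p_j, p_k)
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (norm(v1) * norm(v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


# Hypothetical landmark coordinates, and the same landmarks scaled by 3.
p_i, p_j, p_k = (0.0, 0.0), (2.0, 0.0), (2.0, 1.0)
scaled = [(3 * x, 3 * y) for x, y in (p_i, p_j, p_k)]
ratio = distance_ratio(p_i, p_j, p_k)
theta = angle(p_i, p_j, p_k)
```

Because both quantities are relative, scaling the face leaves them unchanged, which matches the stated motivation for using ratios. Coincident landmarks would cause a division by zero and would need separate handling.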

Extracting features from a limited set of landmarks in this way has two advantages. First, because multiple convolution operations for feature extraction are not required, the number of parameters and operations is reduced, which improves the computation speed of the deep model. Second, because the geometric features use the relative distances and angles of the landmarks, they are less sensitive to large rotations or changes in the size of the face, which can improve facial expression recognition accuracy.

Next, as shown in (c) of FIG. 7, the LMRF classifier for facial expression recognition may be trained using the extracted geometric features. As described above, the lightweight multilayer random forest classifier for low-spec real-time operation according to an embodiment of the present invention may be configured with a two-layer structure and fewer than a predetermined number of trees per layer.

As shown in (c) of FIG. 7, the extracted angle features and distance features may be organized into an angle feature vector and a distance feature vector, respectively. That is, rather than feeding all feature values in as a single feature vector, the angle features and the distance features are organized into separate feature vectors. The angle feature vector and the distance feature vector are each input to the first layer of the LMRF classifier to obtain class probabilities. More specifically, the angle feature vector and the distance feature vector are applied to different sub-layers of the first layer. That is, a sub-layer for the angle features and a sub-layer for the distance features are configured separately, so that the independence between the two feature types is preserved as far as possible.

Here, each sub-layer may be composed of 16 random forests (RFs) and 16 complete RFs (Complete-RF; CRF). Performance can be better when two different types of RF (RF and CRF) are used than when a single type is used. In addition, according to FIG. 7 and the experimental results described in detail later, performance is best with 32 RFs, so 16 RFs and 16 CRFs are used to obtain good performance.

The class probabilities obtained in the first layer may be input to the next layer of the LMRF classifier for training. That is, as shown in (c) of FIG. 7, the output of the first layer is connected to the next layer, and the class probabilities obtained in the previous layer are converted into a feature vector and input to the next layer.

In this way, during training, the output vector of one layer successively becomes the input vector of the next layer. In the present invention, the model is designed to use only the output features of the previous layer as the new input features of the next layer, without concatenating them with the transformed feature vectors generated in earlier layers, so that convergence is fast and good performance is maintained.
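The data flow between layers can be sketched with stand-in forests. In this sketch each forest is a hypothetical callable that maps a feature vector to a class-probability vector, and the outputs of the forests are combined by concatenation; the concatenation and all names are assumptions made for illustration.

```python
def run_layer(forests, features):
    """Concatenate the class-probability vectors produced by every forest."""
    output = []
    for forest in forests:
        output.extend(forest(features))
    return output


def forward(angle_forests, distance_forests, later_layers, angle_vec, distance_vec):
    # First layer: separate sub-layers keep the angle and distance features independent.
    hidden = run_layer(angle_forests, angle_vec) + run_layer(distance_forests, distance_vec)
    # Each later layer sees only the previous layer's output, not the raw features.
    for forests in later_layers:
        hidden = run_layer(forests, hidden)
    return hidden


# Stand-in forests for a two-class toy problem (hypothetical values).
angle_forests = [lambda x: [0.2, 0.8]]
distance_forests = [lambda x: [0.6, 0.4]]
# The second layer here simply averages the per-class inputs it receives.
second_layer = [lambda h: [sum(h[0::2]) / 2, sum(h[1::2]) / 2]]

result = forward(angle_forests, distance_forests, [second_layer], [30.0], [1.5])
```

Passing only the previous layer's output forward keeps the input dimension of later layers small, since it depends on the number of forests and classes rather than on the number of raw features.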

Finally, as shown in (d) of FIG. 7, facial expressions may be recognized using the trained LMRF classifier. That is, once the LMRF classifier has been trained and a test image is given, geometric features are extracted from the detected landmarks and input to the first layer. The output of the first layer is connected to the next layer, and the transformed feature vector, augmented with the class vector generated by the first layer, is passed on from layer to layer until it reaches the final layer.

The output probabilities of the RFs in the last layer of the LMRF classifier are then averaged, and the class with the maximum probability is recognized as the final facial expression. That is, the final layer averages the probability values of each class and selects the class with the highest probability as the final expression class. More specifically, the facial expression is recognized as one of six categories: happiness, fear, surprise, anger, disgust, and sadness.
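The final decision rule can be sketched in a few lines. The ordering of the six classes and the probability values are assumptions chosen for the example.

```python
# Class order is an assumption; the text only names the six categories.
EXPRESSIONS = ["happiness", "fear", "surprise", "anger", "disgust", "sadness"]


def predict_expression(forest_probabilities):
    """Average the per-class probabilities of all forests and take the argmax."""
    n = len(forest_probabilities)
    averaged = [sum(column) / n for column in zip(*forest_probabilities)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return EXPRESSIONS[best], averaged


# Hypothetical outputs of two forests in the last layer.
probabilities = [
    [0.1, 0.1, 0.5, 0.1, 0.1, 0.1],
    [0.2, 0.1, 0.3, 0.2, 0.1, 0.1],
]
label, averaged = predict_expression(probabilities)
```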

Experimental results

An experiment was conducted to evaluate the facial expression recognition (FER) performance of the LMRF classifier generated through training as described above. Many benchmark databases are available for evaluating FER. Here, the performance of the present invention was evaluated using the Keimyung University Facial Expression of Drivers (KMU-FED), CK+, and MMI databases.

CK+ is the most widely used database in FER and contains 327 image sequences from 118 subjects, with expression labels based on the Facial Action Coding System. The MMI database contains 213 image sequences; in this experiment, 205 sequences showing the frontal faces of 31 subjects were used. The KMU-FED database consists of various driver expressions and contains 55 image sequences from 12 subjects. It includes various illumination conditions (front, left, right, and back lighting) and partial occlusions caused by hair or sunglasses. An NIR camera was installed on the dashboard or steering wheel of the vehicle to capture the driver's face. For performance evaluation, five-fold cross-validation was performed on CK+, and person-independent 10-fold cross-validation was performed on the MMI database. For the KMU-FED database, five-fold cross-validation was performed.

The LMRF was trained on the CK+ database, and cross-validation during training was measured by dividing the training data into five parts. Performance was evaluated by applying the LMRF structure and parameters learned on the CK+ database to each database.

The system environment for the experiments consisted of Microsoft Windows 10 and an Intel Core i7 processor with 8 GB of RAM. The LMRF of the present invention runs on the CPU, whereas the state-of-the-art DNN-based algorithms used in the comparative experiments were tested using a single Titan-X GPU. As the performance measure, the usual accuracy was used, that is, the ratio of true positives and true negatives to the total number of cases examined.
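Written out, the accuracy measure described above corresponds to the standard definition below. The symbols FP and FN (false positives and false negatives) do not appear in the text and are introduced here only to express the total number of cases.

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```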

A. Evaluation of the number of forests and trees

The experiments demonstrate that increasing the number of RFs is more effective than increasing the number of trees per RF or the number of trees per layer.

A total of 640 trees were generated, and the appropriate number of RFs was estimated by increasing the number of RFs among which these trees were distributed. The maximum number of layers was set to 2; the number of layers and the number of trees per layer are determined with real-time operation in mind. The experiment was conducted on the CK+ database with six basic emotions.

FIG. 8 is a diagram showing the FER accuracy obtained when the trees are distributed evenly while the number of RFs is increased, in the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention. As shown in FIG. 8, recognition accuracy improves as the number of RFs increases and the full set of trees is distributed evenly among them. However, if the number of RFs becomes too large, too few trees are assigned to each RF and recognition accuracy drops. Therefore, the present invention uses the best-performing configuration (RF32), in which 20 trees are assigned to each RF.
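The even distribution of a fixed tree budget can be checked with simple arithmetic. Only the total of 640 trees and the RF32 configuration with 20 trees per RF come from the text; the other forest counts below are hypothetical values for illustration.

```python
TOTAL_TREES = 640  # fixed tree budget stated in the text


def trees_per_forest(num_forests, total_trees=TOTAL_TREES):
    """Number of trees each RF receives when the budget is split evenly."""
    if total_trees % num_forests:
        raise ValueError("trees cannot be distributed evenly")
    return total_trees // num_forests


# Forest counts other than 32 are assumed; the text does not list them.
allocation = {n: trees_per_forest(n) for n in (1, 2, 4, 8, 16, 32, 64)}
```

With 32 RFs each forest holds 20 trees, matching the RF32 configuration; doubling the forest count again would leave only 10 trees per RF, which illustrates how an excessive number of RFs thins out each forest.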

B. Comparison with state-of-the-art methods

To verify the FER performance of the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention, the following methods were compared: (1) an AlexNet-based FER approach using a conventional CNN layer structure; (2) a 3D CNN-based approach with deformable facial action part constraints (CDCNN-DAP); (3) a DNN using multiple Inception layers; (4) a 2D Inception-ResNet module with LSTM; (5) identity-aware FER using adaptive deep metric learning (ADML); (6) a hierarchical weighted RF (HWRF) designed for fast FER; and three DRF-based methods, namely (7) deep forest (DF), (8) forward-thinking deep random forest (FTDRF), and (9) the two-layer LMRF of the present invention (Proposed LMRF).

Here, DF consists of four forests per layer, and each forest consists of 500 trees. The network consists of five layers in total, including the input layer. FTDRF consists of two layers, and one layer includes 2,000 trees.

FIG. 9 is a diagram comparing FER accuracy for the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention. As can be seen in FIG. 9, the LMRF classifier of the present invention (Proposed LMRF) provides 0.4% higher accuracy than the Inception-based methods (Multiple Inception and Inception-ResNet with LSTM), which show the best performance among the DNN-based methods.

On the MMI database, the ADML method shows the best performance among the DNN-based methods at 78.5%, about 1.1% higher than the present invention. However, DNN-based methods are poorly suited to low-end systems such as intelligent vehicles, which require a lightweight algorithm that can run in real time on a CPU rather than on a high-end GPU. Among the DRF-based methods, FTDRF performs slightly better than the present invention, by about 1.5%. However, considering that the present invention uses 2,600 fewer decision trees than FTDRF, a gap of about 1.5% could be closed by adding trees or layers. The LMRF model of the present invention therefore shows high performance relative to other state-of-the-art DNN-based and DRF-based approaches, despite its comparatively light structure.

To verify the effectiveness of the present invention, the classification accuracy for six basic emotions was compared on the KMU-FED database for the DRF-based approaches: DF, FTDRF, and the present invention. The model configurations are the same as in the experiment described above.

FIG. 10 is a diagram comparing the emotion classification accuracy of the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention with that of other DRF-based methods. As shown in FIG. 10, the accuracy of the LMRF of the present invention is 4.6% higher than that of DF and 1.5% higher than that of FTDRF. This result indicates that the present invention increases the reliability of the classification result by increasing the number of RFs rather than by relying on the number of trees.

C. Comparison of the number of parameters and operations

Real-time processing is essential for applications such as monitoring a driver's emotional state. This experiment therefore compared the number of parameters and operations required by two DNN model compression algorithms and by the DRF-based algorithms. Using the CK+ dataset, the LMRF model of the present invention was compared with MobileNet and SqueezeNet, two recent and popular model compression methods. The two other DRF-based methods, DF and FTDRF, were also compared. The DRF-based methods, including the LMRF of the present invention, were run on a CPU, and the two DNN model compression methods were run on a GPU.

FIG. 11 is a diagram comparing the accuracy, number of parameters, and number of operations of the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention with those of the DNN model compression algorithms and the DRF-based algorithms. As shown in FIG. 11, the DNN-based model compression methods have a similar number of parameters to the three DRF-based methods but require far more operations. The LMRF of the present invention is similar to MobileNet and SqueezeNet in the number of parameters but is better in both accuracy and number of operations. The present invention can therefore run well in a CPU environment without model compression. FTDRF, the better of the two other DRF-based methods, requires about 2.8 times as many parameters and twice as many operations as the LMRF of the present invention. In terms of accuracy, memory, and operations, the LMRF method of the present invention is therefore well suited to embedded systems such as intelligent vehicles.

FIG. 12 is a diagram showing the results of recognizing facial expressions using the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same according to an embodiment of the present invention. Here, (a), (b), and (c) show results for the CK+, MMI, and KMU-FED databases, respectively, and (d) shows incorrect recognition results caused by ambiguous expressions and sudden changes in lighting. As can be seen in (a) and (b) of FIG. 12, expressions are recognized correctly in images with relatively simple backgrounds, such as those of the public CK+ and MMI datasets. As can be seen in (c), in the experiment using KMU-FED, the present invention recognizes the driver's emotions relatively accurately despite the various background changes, lighting changes, and driver movements that occur during driving. However, as shown in (d), incorrect recognition caused by sudden vehicle shaking, ambiguous expressions, and lighting changes remains a problem to be solved.

As described above, according to the lightweight multilayer random forest classifier for low-spec real-time operation and the classification method using the same proposed in the present invention, a non-neural-network-type deep model is constructed with a layer-by-layer structure in which each layer is composed of random forests (RFs), and each layer is composed of no more than a predetermined number of trees. The model thereby provides performance similar to that of a DNN with fewer hyperparameters than existing DNN models, and its processing time under the same conditions is shorter than that of a DNN, so it can be applied to fields that require real-time processing.

The present invention described above may be modified or applied in various ways by those of ordinary skill in the art to which the present invention pertains, and the scope of the technical idea of the present invention should be determined by the following claims.

S10: generating an LMRF classifier
S20: performing classification using the generated LMRF classifier
S100: increasing the layer number l
S200: randomly partitioning the training dataset A into k groups
S300: using the k partitioned groups, calculating an error for each of the k folds
S310: allocating (k-1) of the k partitioned groups as training folds and the remaining group as the validation fold
S320: generating layer l by training a plurality of RFs using the (k-1) training folds
S321: for the RFs, growing random trees using subset samples and information gain
S322: for the CRF only, growing completely random trees using subset samples without information gain
S330: calculating an error by summing the loss function over the N samples of the validation fold
S400: searching for the k-th error having the minimum error for the l-th layer
S500: if the retrieved minimum error value is less than the threshold value and the minimum number of layers is 2 or more, stopping the generation of layers and outputting an LMRF classifier consisting of l-1 layers
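
The control flow of steps S100 to S500 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `train_layer` and `loss` are hypothetical caller-supplied functions standing in for RF/CRF training (S320) and the per-sample loss (S330), `max_layers` is an added safety bound that does not appear in the description, and the condition "minimum number of layers is 2 or more" is read here as "the current layer number l is at least 2".

```python
import random


def k_fold_split(data, k, rng):
    """S200: randomly partition the training dataset into k groups."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def grow_lmrf(data, k, threshold, train_layer, loss, max_layers=10, seed=0):
    """Grow layers until the stopping rule of S500 is met.

    train_layer(previous_layers, train_samples) -> new layer   (hypothetical)
    loss(model_layers, sample) -> float                        (hypothetical)
    """
    rng = random.Random(seed)
    layers = []
    for l in range(1, max_layers + 1):            # S100: increase layer number l
        folds = k_fold_split(data, k, rng)        # S200
        best_error, best_layer = None, None
        for i in range(k):                        # S300: one error per fold
            valid = folds[i]                      # S310: validation fold
            train = [s for j, fold in enumerate(folds) if j != i for s in fold]
            candidate = train_layer(layers, train)                     # S320
            error = sum(loss(layers + [candidate], s) for s in valid)  # S330
            if best_error is None or error < best_error:               # S400
                best_error, best_layer = error, candidate
        if best_error < threshold and l >= 2:     # S500: stop, output l-1 layers
            return layers
        layers.append(best_layer)
    return layers  # assumption: fall back to max_layers layers if never stopped


# Toy stand-ins so the sketch runs; they do not train real forests.
def dummy_train(previous_layers, train_samples):
    return "layer%d" % (len(previous_layers) + 1)


def dummy_loss(model_layers, sample):
    return 1.0 / len(model_layers)


data = [([float(i)], i % 2) for i in range(6)]
model = grow_lmrf(data, k=3, threshold=1.5,
                  train_layer=dummy_train, loss=dummy_loss)
```

With these stand-ins the first layer's fold error (2.0) is above the threshold, so a second layer is evaluated. Its error (1.0) falls below the threshold, and the procedure returns the single layer built so far, matching the "l-1 layers" output of S500.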

Claims

A lightweight multilayer random forest (LMRF) classifier for low-spec real-time operation,
characterized in that the classifier is a non-neural-network-type deep model having a layer-by-layer structure in which each layer is composed of a random forest (RF), and each layer is composed of no more than a predetermined number of trees.

The classifier of claim 1,
characterized by having a two-layer structure in which each layer is composed of a plurality of RFs.

The classifier of claim 1,
characterized in that each layer is composed of randomly generated heterogeneous RFs.

The classifier of claim 3,
characterized in that each layer is composed of two types of RF: RF and CRF (Complete-RF).

The classifier of claim 1,
characterized in that the transformed feature vectors generated in the previous layer are not combined, and only the output features of the previous layer are used as new input features of the next layer.

The classifier of claim 1,
characterized in that 20 decision trees are allocated to each RF to reduce the number of parameters and the computational load.

The classifier of claim 1,
characterized in that the number of layers and parameters is determined using K-fold validation.

The classifier of claim 7,
characterized in that the number of layers and parameters is determined by performing the steps of:
(1) increasing the layer number l;
(2) randomly partitioning the training dataset A into k groups;
(3) using the k partitioned groups, calculating an error for each of the k folds;
(4) searching for the k-th error having the minimum error for the l-th layer; and
(5) if the retrieved minimum error value is less than a threshold value and the minimum number of layers (ML) is 2 or more, stopping the generation of layers and outputting an LMRF classifier consisting of l-1 layers, and if the retrieved minimum error value is greater than or equal to the threshold value or the minimum number of layers (ML) is less than 2, generating a new layer by performing the steps again from step (1).

The classifier of claim 8, wherein step (3) comprises:
(3-1) allocating (k-1) of the k partitioned groups as training folds, and allocating the remaining group as a validation fold;
(3-2) generating layer l by training a plurality of RFs using the (k-1) training folds; and
(3-3) calculating an error by summing a loss function over the N samples of the validation fold,
characterized in that the error is calculated by performing steps (3-1) to (3-3) for each of the k folds.

The classifier of claim 9,
wherein each layer is composed of two types of RF, RF and CRF (Complete-RF), and
step (3-2) comprises
selecting a subset from the training folds, and
(3-2-1) for the RFs, growing random trees using the subset samples and information gain; and
(3-2-2) for the CRF, growing completely random trees using the subset samples without information gain.

A classification method using a lightweight multilayer random forest (LMRF) classifier for low-spec real-time operation, comprising the steps of:
(A) generating an LMRF classifier that is a non-neural-network-type deep model having a layer-by-layer structure in which each layer is composed of a random forest (RF), each layer being composed of no more than a predetermined number of trees; and
(B) performing classification using the generated LMRF classifier.

The method of claim 11,
characterized in that the LMRF classifier has a two-layer structure in which each layer is composed of a plurality of RFs.

The method of claim 11,
characterized in that each layer of the LMRF classifier is composed of randomly generated heterogeneous RFs.

The method of claim 13,
characterized in that each layer of the LMRF classifier is composed of two types of RF: RF and CRF (Complete-RF).

The method of claim 11,
characterized in that, in the LMRF classifier, the transformed feature vectors generated in the previous layer are not combined, and only the output features of the previous layer are used as new input features of the next layer.

The method of claim 11,
characterized in that, in the LMRF classifier, 20 decision trees are allocated to each RF to reduce the number of parameters and the computational load.

The method of claim 11,
characterized in that, in step (A), the LMRF classifier is generated by determining the number of layers and parameters using K-fold validation.

The method of claim 17, wherein step (A) comprises:
(1) increasing the layer number l;
(2) randomly partitioning the training dataset A into k groups;
(3) using the k partitioned groups, calculating an error for each of the k folds;
(4) searching for the k-th error having the minimum error for the l-th layer; and
(5) if the retrieved minimum error value is less than a threshold value and the minimum number of layers (ML) is 2 or more, stopping the generation of layers and outputting an LMRF classifier consisting of l-1 layers, and if the retrieved minimum error value is greater than or equal to the threshold value or the minimum number of layers (ML) is less than 2, generating a new layer by performing the steps again from step (1).

The method of claim 18, wherein step (3) comprises:
(3-1) allocating (k-1) of the k partitioned groups as training folds, and allocating the remaining group as a validation fold;
(3-2) generating layer l by training a plurality of RFs using the (k-1) training folds; and
(3-3) calculating an error by summing a loss function over the N samples of the validation fold,
characterized in that the error is calculated by performing steps (3-1) to (3-3) for each of the k folds.

The method of claim 19,
wherein, in the LMRF classifier, each layer is composed of two types of RF, RF and CRF (Complete-RF), and
step (3-2) comprises
selecting a subset from the training folds, and
(3-2-1) for the RFs, growing random trees using the subset samples and information gain; and
(3-2-2) for the CRF, growing completely random trees using the subset samples without information gain.