KR102039146B1

KR102039146B1 - Method for performing multi-domain ensemble learning and apparatus thereof

Info

Publication number: KR102039146B1
Application number: KR1020190032120A
Authority: KR
Inventors: 김효은; 이현재
Original assignee: 주식회사 루닛
Priority date: 2019-03-21
Filing date: 2019-03-21
Publication date: 2019-10-31

Abstract

Provided are a method for multi-domain ensemble learning capable of guaranteeing the performance improvement effect of multi-domain learning, and an apparatus therefor. According to some embodiments of the present disclosure, the method may train a first output layer specific to a first domain using a first dataset belonging to the first domain, train a second output layer specific to a second domain using a second dataset belonging to the second domain, and train a third output layer using both the first and second datasets. In addition, the method may guarantee the performance improvement effect of multi-domain ensemble learning by additionally using the prediction value of the third output layer when performing prediction for each domain.

Description

Multi-domain ensemble learning method and apparatus therefor {METHOD FOR PERFORMING MULTI-DOMAIN ENSEMBLE LEARNING AND APPARATUS THEREOF}

The present disclosure relates to a multi-domain ensemble learning method and an apparatus therefor. More specifically, it relates to a method that can ensure the performance improvement expected from multi-domain ensemble learning when performing multi-domain learning on a neural network including a plurality of output layers, and to an apparatus for performing the method.

Multi-domain learning is a technique for training a single model on a plurality of domains simultaneously.

The goal of multi-domain learning is to learn shared features that work well across multiple domains simultaneously, thereby reducing the number of learnable parameters and improving model performance compared with building a separate model for each domain.

In general, multi-domain learning can be used when the similarity between domains is high. However, when the dataset of the target domain is small, it is not easy to improve the prediction performance of the learning model for the target domain, even if multi-domain learning is performed using the dataset of a source domain.

Korean Patent Application Publication No. 10-2018-0099119 (published 2018-09-05)

A technical problem to be solved through some embodiments of the present disclosure is to provide a method that can ensure the performance improvement effect of multi-domain ensemble learning when performing multi-domain learning on a neural network including a plurality of output layers, and an apparatus for performing the method.

Another technical problem to be solved through some embodiments of the present disclosure is to provide a method that can ensure the performance improvement effect of multi-domain ensemble learning by utilizing the similarity between domains when performing multi-domain learning on a neural network including a plurality of output layers, and an apparatus for performing the method.

Yet another technical problem to be solved through some embodiments of the present disclosure is to provide a method that can ensure the performance improvement effect of multi-domain ensemble learning by reflecting the inference result of each domain as a weight when performing multi-domain learning on a neural network including a plurality of layers, and an apparatus for performing the method.

The technical problems of the present disclosure are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the above technical problem, a multi-domain ensemble learning method according to an embodiment of the present disclosure is a method of performing, in a computing device, multi-domain ensemble learning on a neural network including a plurality of output layers, and may include: training a first output layer among the plurality of output layers using a first dataset belonging to a first domain; training a second output layer among the plurality of output layers using a second dataset belonging to a second domain; and training a third output layer among the plurality of output layers using the first dataset and the second dataset.

In some embodiments, the first dataset may include a larger number of data items than the second dataset.

In some embodiments, the first dataset may include 2D images and the second dataset may include 3D images. In this case, the first dataset may include FFDM images and the second dataset may include DBT images.

In some embodiments, the first dataset may include single-layer images and the second dataset may include multi-layer images. In this case, training the second output layer may include: extracting a first layer image from a multi-layer image and training the second output layer using the first layer image; and extracting a second layer image from the multi-layer image and training the second output layer using the second layer image.

In some embodiments, the neural network may further include a feature extraction layer shared by the first to third output layers, and the feature extraction layer may be a layer trained with the first dataset and the second dataset.

In some embodiments, the method may further include predicting a label of prediction data belonging to the first domain using the neural network, and the predicting may include predicting the label based on a first prediction value output from the first output layer and a second prediction value output from the third output layer.

In some embodiments, predicting the label based on the first prediction value output from the first output layer and the second prediction value output from the third output layer may further include: assigning a weight to each of the first prediction value and the second prediction value; combining the first prediction value and the second prediction value by applying the assigned weights; and predicting the label based on the combined prediction value.
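
The weighted combination described in the preceding paragraph can be illustrated with a minimal sketch. The disclosure does not specify how the weights are chosen or how the combined value is turned into a label, so the normalized weighted average, the default weights, the 0.5 threshold, and the function name `ensemble_predict` are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def ensemble_predict(
    domain_pred: float,
    shared_pred: float,
    weights: Sequence[float] = (0.5, 0.5),  # assumed default: equal weights
    threshold: float = 0.5,                 # assumed decision threshold
) -> Tuple[float, int]:
    """Combine two head outputs and derive a binary label.

    domain_pred: first prediction value (domain-specific output layer).
    shared_pred: second prediction value (third, shared output layer).
    """
    w_domain, w_shared = weights
    total = w_domain + w_shared
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalized weighted average keeps the result in the range of the inputs.
    combined = (w_domain * domain_pred + w_shared * shared_pred) / total
    label = 1 if combined >= threshold else 0
    return combined, label
```

For instance, `ensemble_predict(0.8, 0.4, weights=(3, 1))` gives the domain-specific head three times the influence of the shared head.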

To solve the above technical problem, a multi-domain ensemble learning apparatus according to another embodiment of the present disclosure may include a memory that stores one or more instructions, and a processor that, by executing the stored one or more instructions, trains a first output layer among a plurality of output layers using a first dataset belonging to a first domain, trains a second output layer among the plurality of output layers using a second dataset belonging to a second domain, and trains a third output layer among the plurality of output layers using the first dataset and the second dataset.

To solve the above technical problem, a computer program according to yet another embodiment of the present disclosure may be a program stored in a computer-readable recording medium that, in combination with a computing device, executes: training a first output layer of a neural network including a plurality of output layers using a first dataset belonging to a first domain; training a second output layer among the plurality of output layers using a second dataset belonging to a second domain; and training a third output layer among the plurality of output layers using the first dataset and the second dataset.

FIGS. 1 and 2 are diagrams for describing a machine learning apparatus and a learning environment according to some embodiments of the present disclosure.
FIG. 3 is a flowchart of a multi-domain ensemble learning method according to some embodiments of the present disclosure.
FIG. 4 is a flowchart illustrating the detailed process of the dataset acquisition step S100 shown in FIG. 3.
FIG. 5 is a flowchart illustrating the detailed process of the output layer training step S300 shown in FIG. 3.
FIG. 6 is a flowchart of a method of predicting a label through multi-domain ensemble learning according to some embodiments of the present disclosure.
FIG. 7 is a flowchart illustrating the detailed process of the label prediction step S800 shown in FIG. 6.
FIG. 8 is a hardware configuration diagram illustrating an exemplary computing device that can implement an apparatus according to various embodiments of the present disclosure.

Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. The advantages and features of the present disclosure, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the technical idea of the present disclosure is not limited to the following embodiments and may be implemented in various different forms. The following embodiments are provided only to make the technical idea of the present disclosure complete and to fully convey the scope of the present disclosure to those of ordinary skill in the art to which the present disclosure belongs, and the technical idea of the present disclosure is defined only by the scope of the claims.

In assigning reference numerals to the components of each drawing, the same components are given the same reference numerals wherever possible, even when they appear in different drawings. In addition, in describing the present disclosure, a detailed description of a related known configuration or function is omitted when it is judged that it could obscure the gist of the present disclosure.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification have the meaning commonly understood by those of ordinary skill in the art to which the present disclosure belongs. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are explicitly and specifically defined. The terms used in this specification are for describing the embodiments and are not intended to limit the present disclosure. In this specification, the singular form also includes the plural form unless the context specifically states otherwise.

In addition, terms such as first, second, A, B, (a), and (b) may be used in describing the components of the present disclosure. These terms serve only to distinguish one component from another, and do not limit the nature, sequence, or order of the components. When a component is described as being "connected", "coupled", or "linked" to another component, the component may be directly connected or linked to the other component, but it should be understood that yet another component may also be "connected", "coupled", or "linked" between them.

As used in this specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components, steps, operations, and/or elements other than those mentioned.

Before describing the embodiments, some terms used in this specification are clarified.

In this specification, a task refers to a problem to be solved, or work to be performed, through machine learning. For example, when face recognition, facial expression recognition, gender classification, pose classification, and the like are performed on face data, each of face recognition, facial expression recognition, gender classification, and pose classification may correspond to an individual domain. As another example, when recognition, classification, prediction, and the like of an abnormality are performed on medical image data, each of abnormality recognition, abnormality classification, and abnormality prediction may correspond to an individual task. A task may also be referred to as a target task.

In this specification, multi-domain learning means training on different domains using a single model.

In this specification, a neural network is a term encompassing all kinds of machine learning models designed to imitate neural structures. For example, the neural network may include all kinds of neural-network-based models, such as an artificial neural network (ANN) and a convolutional neural network (CNN).

In this specification, an instruction refers to a series of computer-readable commands grouped by function, which is a component of a computer program and is executed by a processor.

Hereinafter, some embodiments of the present disclosure are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for describing a machine learning apparatus 10 and a learning environment according to some embodiments of the present disclosure.

As shown in FIG. 1, the machine learning apparatus 10 is a computing device that performs multi-domain ensemble learning on a neural network. Accordingly, in various embodiments of the present disclosure, the machine learning apparatus 10 may also be referred to as a multi-domain ensemble learning apparatus 10. Hereinafter, for convenience of description, the machine learning apparatus 10 is abbreviated as the learning apparatus 10.

The computing device may be a notebook, a desktop, a laptop, a server, or the like, but is not limited thereto and may include any kind of device equipped with a computing function. For an example of the computing device, refer to FIG. 8.

Although FIG. 1 shows, as an example, the learning apparatus 10 implemented as a single computing device, in an actual physical environment a first function of the learning apparatus 10 may be implemented in a first computing device and a second function of the learning apparatus 10 may be implemented in a second computing device. In addition, the learning apparatus 10 may consist of a plurality of computing devices, and the plurality of computing devices may divide the first function and the second function among themselves.

The datasets 12 and 13 are training datasets provided with ground-truth label information. Specifically, the first dataset 12 may be a dataset composed of a plurality of training samples belonging to a first domain, and the second dataset 13 may be a dataset composed of a plurality of training samples belonging to a second domain different from the first domain. Here, a training sample means a unit of data used for training and may be any of various kinds of data. For example, a training sample may be a single image or, depending on the training target or task, various data other than an image.

According to various embodiments of the present disclosure, the learning apparatus 10 may perform multi-domain ensemble learning on a neural network using the datasets 12 and 13 belonging to a plurality of domains. The neural network may include a plurality of output layers and at least one feature extraction layer shared by the plurality of output layers. The plurality of output layers may include a first type of layer specialized for a specific domain and a second type of layer associated with a plurality of domains. The first type of layer may be trained using the dataset belonging to that specific domain, and the second type of layer may be trained using the datasets belonging to each of the plurality of domains.

For example, the neural network may be configured as illustrated in FIG. 2, which shows a neural network that can be used for multi-domain ensemble learning on two domains 12 and 13. As illustrated in FIG. 2, the neural network may include three output layers 15, 16, and 17 and a shared feature extraction layer 14. The first output layer 15 may be a layer specialized for the first domain, and the second output layer 16 may be a layer specialized for the second domain. The third output layer 17 may be a layer associated with both the first domain and the second domain.

Accordingly, the first output layer 15 may be trained based on the first dataset 12 belonging to the first domain, and the second output layer 16 may be trained based on the second dataset 13 belonging to the second domain. The third output layer 17 may be trained using the datasets 12 and 13 of both domains together. Because the feature extraction layer 14 must extract features for multiple domains, it may be trained based on both the first dataset 12 and the second dataset 13. A more detailed description of the method of performing multi-domain ensemble learning on the neural network is given below with reference to FIGS. 3 to 8.
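
The structure of FIG. 2 can be sketched in a few lines of plain Python to show which output layers a sample of each domain passes through. This is only a toy forward pass under stated assumptions: the class name `MultiHeadNet`, the head names, the single linear layer with `tanh` standing in for the feature extraction layer 14, and the sigmoid outputs are all hypothetical, and no gradient update is shown.

```python
import math
import random
from typing import Dict, List

FIRST_DOMAIN, SECOND_DOMAIN = "first", "second"

# Output layers that receive a training signal from a sample of each domain.
# head_1 ~ layer 15, head_2 ~ layer 16, head_3 ~ layer 17 (names are hypothetical).
HEADS_TRAINED_BY = {
    FIRST_DOMAIN: ("head_1", "head_3"),
    SECOND_DOMAIN: ("head_2", "head_3"),
}


class MultiHeadNet:
    """Toy network: one shared feature layer followed by three output heads."""

    def __init__(self, n_inputs: int, n_features: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Shared feature extraction weights, used by samples of both domains.
        self.shared = [
            [rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_features)
        ]
        self.heads = {
            name: [rng.uniform(-1, 1) for _ in range(n_features)]
            for name in ("head_1", "head_2", "head_3")
        }

    def features(self, x: List[float]) -> List[float]:
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.shared]

    def forward(self, x: List[float], domain: str) -> Dict[str, float]:
        """Return the outputs of the heads that a sample of `domain` would train."""
        feats = self.features(x)
        outputs = {}
        for name in HEADS_TRAINED_BY[domain]:
            z = sum(w * f for w, f in zip(self.heads[name], feats))
            outputs[name] = 1.0 / (1.0 + math.exp(-z))  # sigmoid score
        return outputs
```

In a real training loop, the loss from each returned head would be backpropagated through that head and the shared layer, which is how the shared layer ends up being trained by both datasets.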

So far, the operation and learning environment of the learning apparatus 10 according to some embodiments of the present disclosure have been described with reference to FIGS. 1 and 2. Hereinafter, methods according to various embodiments of the present disclosure are described.

Each step of the methods may be performed by a computing device. In other words, each step may be implemented as one or more instructions executed by a processor of the computing device. All steps included in the methods may be executed by a single physical computing device, or first steps of the methods may be performed by a first computing device and second steps by a second computing device. The description below assumes that each step of the methods is performed by the learning apparatus 10. Accordingly, where the subject of an operation is omitted in the description of the methods, the operation can be understood to be performed by the apparatus exemplified above. In addition, the order of the operations in the methods described below may of course be changed as needed, to the extent that the order can logically be changed.

FIG. 3 is a flowchart of a multi-domain ensemble learning method according to some embodiments of the present disclosure. However, this is only a preferred embodiment for achieving the object of the present disclosure, and some steps may of course be added or deleted as needed. The following description assumes that there are two domains, but this is only for ease of understanding, and the number of domains may vary depending on the embodiment. It is also assumed that the neural network to be trained includes a shared feature extraction layer and three output layers, as illustrated in FIG. 2.

Referring to FIG. 3, in step S100, datasets for multi-domain ensemble learning are obtained. For example, the datasets may include a first dataset belonging to a first domain and a second dataset belonging to a second domain. Hereinafter, the first dataset continues to refer to the dataset belonging to the first domain, and the second dataset to the dataset belonging to the second domain.

In some embodiments, the first dataset may consist of images generated by a first imaging method, and the second dataset may consist of images generated by a second imaging method. That is, the multiple domains may be defined by imaging method. For example, the first imaging method may be FFDM (Full-Field Digital Mammography) and the second imaging method may be DBT (Digital Breast Tomosynthesis). In this case, the neural network may be trained so that it can perform a prediction task (e.g., cancer diagnosis) on FFDM images and DBT images.

In some embodiments, the first dataset may include a larger number of data items (i.e., training samples) than the second dataset. In this case, a process of increasing the number of samples of the second domain through oversampling may additionally be performed.

In some embodiments, the first dataset may include data of a different form (or format) from the second dataset. For example, the first dataset may consist of 2D images (e.g., FFDM images) and the second dataset may consist of 3D images (e.g., DBT images). As another example, the first dataset may consist of single-channel or single-layer images (e.g., FFDM images) and the second dataset may consist of multi-channel or multi-layer images (e.g., DBT images). In this case, before the data is input to the neural network, a process such as adjusting (or converting) the form of the input data to match the input format of the neural network may additionally be performed. This is described in detail with reference to FIG. 4.

FIG. 4 assumes that the input format of the neural network is implemented according to the format of the first dataset.

Referring to FIG. 4, in step S101, it is determined whether the first dataset and the second dataset include data of different formats. If the formats differ, in step S102, each data item included in the second dataset may be adjusted (or converted) to have the same input format as the first dataset. The specific adjustment process may vary depending on the embodiment.

In some embodiments, the first dataset may be FFDM data and the second dataset may be DBT data. DBT data can also be provided as multi-channel or 3D input, but the neural network may be implemented to receive a single-channel image as input, the same as for FFDM data. In this case, a single-channel image may be extracted (or sampled) from the multi-channel image, and the extracted single-channel image may be input to the neural network.

In some embodiments, the first dataset may include single-layer images and the second dataset may include multi-layer images. In addition, the neural network may be implemented to receive a single-layer image as input. In this case, a single-layer image may be extracted (or sampled) from the multi-layer image, and the extracted image may be input to the neural network.
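
As a rough illustration of step S102, the sketch below turns one multi-layer image into several single-layer training samples. It assumes a volume is represented as a list of 2D layers, that every extracted layer inherits the label of the whole volume, and that layers are sampled at a fixed stride; none of these choices is stated in the disclosure, and the name `iter_layer_samples` is hypothetical.

```python
from typing import Iterator, List, Tuple

Image2D = List[List[float]]  # one single-layer image as rows of pixel values


def iter_layer_samples(
    volume: List[Image2D], label: int, step: int = 1
) -> Iterator[Tuple[Image2D, int]]:
    """Yield (single-layer image, label) pairs extracted from a multi-layer image.

    step: assumed sampling stride; step=1 extracts every layer.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    for layer in volume[::step]:
        # Assumption: each layer reuses the label of the whole volume.
        yield layer, label
```

Each yielded pair has the same form as a sample from the single-layer dataset, so it can be fed to a network whose input format follows the first dataset.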

In step S103, it is determined whether the adjusted data satisfies a condition. The condition is a criterion for determining whether the adjusted data is suitable as sample data for training. For example, sharpness, the ratio relative to the first dataset, whether a specific color is included, or the size of the data may serve as the condition. The condition may be set through user input or set automatically according to the type of task.

In some embodiments, it may be determined whether the number of samples in the adjusted second dataset is in an appropriate ratio to the number of samples in the first dataset belonging to the first domain. Here, the ratio between the domains may be adjusted so that it is not excessively skewed. For example, if the ratio between the first domain and the second domain differs greatly, such as 10:1, the second domain may be oversampled to reach a desired ratio. The ratio may be a preset value or a value that varies with the type of task and domain.
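One way to picture the oversampling described above is the following sketch. The function name `oversample`, the `target_ratio` parameter, and the strategy of duplicating randomly chosen samples are assumptions for illustration; the disclosure does not specify how the oversampling is carried out.

```python
import math
import random


def oversample(minority, majority_size, target_ratio=1.0, seed=None):
    """Duplicate random minority samples until majority_size : len(result) <= target_ratio : 1.

    Returns a new list; the original samples come first, followed by duplicates.
    """
    if not minority:
        raise ValueError("minority dataset is empty")
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    rng = random.Random(seed)
    needed = math.ceil(majority_size / target_ratio)
    result = list(minority)
    while len(result) < needed:
        result.append(rng.choice(minority))
    return result


# Hypothetical 10:1 imbalance: 100 first-domain samples, 10 second-domain samples.
second_domain = [f"dbt_{i}" for i in range(10)]
balanced = oversample(second_domain, majority_size=100, target_ratio=2.0, seed=0)
```

With `target_ratio=2.0`, the second domain grows from 10 to 50 samples, turning the 10:1 imbalance into 2:1.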

The description continues with reference to FIG. 3. In step S200, it is determined which domain the dataset belongs to. If the dataset is determined to belong to the first domain, step S300 is performed; otherwise, step S400 may be performed. Step S200 may be omitted depending on the embodiment.

In step S300, the first output layer is trained using the first dataset belonging to the first domain. The first output layer may refer to the layer, among the three output layers, that is specific to the first domain. The feature extraction layer shared by the plurality of output layers may also be trained using the first dataset. The detailed training process of step S300 is shown in FIG. 5.

Referring to FIG. 5, in step S301, a prediction value output from the first output layer is obtained. In step S302, an error for the obtained prediction value is calculated. In step S303, the error is backpropagated to update the weights of the first output layer and the feature extraction layer. That is, the error of the obtained prediction value is propagated backward, and the weights are updated so as to minimize the error.
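Steps S301 to S303 can be illustrated with a deliberately tiny model in which a single scalar weight stands in for each layer. Everything here is an assumption for the sake of the example: the squared-error loss, the plain gradient-descent update, the learning rate, and the names `train_step`, `feature`, and `head`.

```python
def train_step(weights, x, target, lr=0.1):
    """One update of a scalar 'feature extraction layer' and a scalar 'output layer'."""
    # S301: forward pass through the feature layer, then the output layer.
    feature = weights["feature"] * x
    prediction = weights["head"] * feature
    # S302: error of the prediction (squared-error loss, chosen for illustration).
    error = prediction - target
    loss = 0.5 * error ** 2
    # S303: backpropagate the error and update both layers.
    grad_head = error * feature
    grad_feature = error * weights["head"] * x
    weights["head"] -= lr * grad_head
    weights["feature"] -= lr * grad_feature
    return loss


weights = {"feature": 0.5, "head": 0.5}
losses = [train_step(weights, x=1.0, target=1.0) for _ in range(50)]
```

Repeating the step drives the loss toward zero in this toy setting, which is the sense in which the weights are updated to minimize the error.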

Referring again to FIG. 3, in step S400, the second output layer is trained using the second dataset belonging to the second domain. The second output layer may refer to the layer, among the three output layers, that is specific to the second domain. The feature extraction layer shared by the plurality of output layers may also be trained using the second dataset. Because the detailed training process of step S400 is similar to that of step S300 described above, further description is omitted.

In step S500, the third output layer is trained using the first dataset and the second dataset. The third output layer is a layer associated with both the first domain and the second domain, and may be used to improve prediction performance in each domain when predicting a label. This is described in detail later with reference to FIG. 6.

Although FIG. 3 shows step S500 being performed after steps S300 and S400, this is only for ease of understanding. Part of step S500 (i.e., the training associated with the first domain) may be performed together with step S300, and another part (i.e., the training associated with the second domain) may be performed together with step S400.
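The interleaving described above, in which each sample updates its own domain-specific output layer together with the third output layer, might look like the following sketch. Scalar weights again stand in for whole layers, and the names `make_model`, `train_on_sample`, `head1`, `head2`, and `head3`, the squared-error loss, and the learning rate are hypothetical choices, not details from the disclosure.

```python
def make_model():
    # One scalar per layer: a shared feature layer and three output layers.
    return {"feature": 0.5, "head1": 0.5, "head2": 0.5, "head3": 0.5}


def train_on_sample(model, x, target, domain, lr=0.05):
    """Update the domain-specific head, the common head (head3) and the shared feature layer."""
    own_head = "head1" if domain == 1 else "head2"
    feature = model["feature"] * x
    grad_feature = 0.0
    total_loss = 0.0
    head_grads = {}
    # The sample contributes to its own domain's head (S300 or S400) and to head3 (S500).
    for head in (own_head, "head3"):
        error = model[head] * feature - target
        total_loss += 0.5 * error ** 2
        head_grads[head] = error * feature
        grad_feature += error * model[head] * x
    for head, grad in head_grads.items():
        model[head] -= lr * grad
    model["feature"] -= lr * grad_feature
    return total_loss


model = make_model()
data = [(1.0, 1.0, 1), (1.0, 1.0, 1), (1.0, 0.8, 2)]  # (input, target, domain)
for _ in range(200):
    for x, target, domain in data:
        train_on_sample(model, x, target, domain)
```

A first-domain sample never touches `head2`, and a second-domain sample never touches `head1`, while `head3` and the shared feature layer receive updates from both domains.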

In addition, in step S500, the feature extraction layer may also be trained using the second dataset.

So far, the multi-domain ensemble learning method according to some embodiments of the present disclosure has been described with reference to FIGS. 3 to 5. According to the method described above, the neural network to be trained includes, in addition to the layers specific to each domain (e.g., the first output layer and the second output layer), an additional output layer (e.g., the third output layer), and the additional output layer is trained on a plurality of domains. The additional output layer may be used to assist the prediction process in each domain, which may improve prediction performance in every domain.

In particular, the additional output layer may be used to improve prediction performance in a domain with little data, and the higher the similarity between the two domains, the greater the improvement may be.

The above description assumes that the neural network to be trained is based on a convolutional neural network. However, the technical idea of the present disclosure may be applied equally to other types of neural networks, such as artificial neural networks, so the technical scope of the present disclosure is not limited to a specific type of neural network.

Hereinafter, a method of predicting a label using the trained neural network is described with reference to FIGS. 6 and 7.

FIG. 6 is a flowchart of a label prediction method according to some embodiments of the present disclosure.

Referring to FIG. 6, in step S600, unlabeled data for prediction is obtained.

In step S700, it is determined whether the obtained prediction data belongs to the first domain. If the prediction data is determined to belong to the first domain, step S800 is performed; otherwise, step S900 is performed. Step S700 may be omitted depending on the embodiment.

In step S800, label prediction for the first domain is performed. The detailed process of step S800 is shown in FIG. 7 and is described below with reference to FIG. 7.

In step S801, a first prediction value is obtained through the first output layer of the neural network. Specifically, the prediction data is input to the neural network, and the value output from the first output layer, among the plurality of output layers constituting the neural network, is obtained as the first prediction value. As described above, the first output layer is the layer specific to the first domain.

In step S802, a second prediction value is obtained through the third output layer of the neural network. Specifically, the prediction data is input to the neural network, and the value output from the third output layer, among the plurality of output layers constituting the neural network, is obtained as the second prediction value. As described above, the third output layer is the layer that has been trained on both the first domain and the second domain.

In step S803, a label is predicted based on the obtained first prediction value and second prediction value. A weight may be assigned to each of the first prediction value and the second prediction value, and the label may be predicted by taking the assigned weights into account (e.g., by a weighted sum). The weights may be assigned equally, or they may be assigned differently according to various criteria such as the type of task, the similarity between domains, the degree to which each output layer has been trained, or the type of domain.
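A weighted combination of the two prediction values, as in step S803, could be sketched like this. The function name `combine_predictions`, the use of per-class probability lists, the normalization by the weight sum, and the argmax decision are assumptions for illustration; the text only requires that the assigned weights be reflected, for example by a weighted sum.

```python
def combine_predictions(domain_pred, common_pred, domain_weight=0.5, common_weight=0.5):
    """Weighted average of two per-class score lists; the label is the argmax.

    `domain_pred` comes from the domain-specific output layer and `common_pred`
    from the third (multi-domain) output layer.
    """
    if len(domain_pred) != len(common_pred):
        raise ValueError("prediction values must have the same number of classes")
    total = domain_weight + common_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = [
        (domain_weight * d + common_weight * c) / total
        for d, c in zip(domain_pred, common_pred)
    ]
    label = max(range(len(combined)), key=combined.__getitem__)
    return label, combined


# Hypothetical scores for two classes from each output layer.
equal_label, equal_scores = combine_predictions([0.6, 0.4], [0.2, 0.8])
skewed_label, skewed_scores = combine_predictions([0.6, 0.4], [0.2, 0.8], 0.9, 0.1)
```

With equal weights the combined scores favor the second class, whereas weighting the domain-specific layer at 0.9 favors the first class, which shows how the choice of weights can change the predicted label.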

Referring again to FIG. 6, in step S900, the label of prediction data belonging to the second domain is predicted. In this case, a prediction value is obtained from each of the second output layer and the third output layer, and the label may be predicted based on the prediction values of the two layers.

So far, the label prediction method according to some embodiments of the present disclosure has been described with reference to FIGS. 6 and 7. According to the method described above, a label for a specific domain may be predicted by using together the prediction values of the output layer specific to that domain and of the additional output layer trained on all of the plurality of domains. Accordingly, prediction performance in the specific domain may be improved. In particular, even when the specific domain has little data, the additional output layer may improve prediction performance in that domain.

Hereinafter, an exemplary computing device 100 capable of implementing an apparatus (e.g., the learning apparatus 10) according to various embodiments of the present disclosure is described.

FIG. 8 is a hardware configuration diagram of an exemplary computing device 100 capable of implementing an apparatus (e.g., the learning apparatus 10) according to various embodiments of the present disclosure.

As shown in FIG. 8, the computing device 100 may include one or more processors 110, a bus 150, a communication interface 170, a memory 130 that loads a computer program executed by the processor 110, and a storage 190 that stores the computer program 191. However, FIG. 8 shows only the components related to the embodiments of the present disclosure. Accordingly, those of ordinary skill in the art to which the present disclosure pertains will appreciate that other general-purpose components may be included in addition to those shown in FIG. 8.

The processor 110 controls the overall operation of each component of the computing device 100. The processor 110 may include a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), a GPU (Graphic Processing Unit), or any form of processor well known in the technical field of the present disclosure. The processor 110 may also perform operations for at least one application or program for executing the methods/operations according to embodiments of the present disclosure. The computing device 100 may include one or more processors.

The memory 130 stores various data, commands, and/or information. The memory 130 may load one or more programs 191 from the storage 190 in order to execute the methods/operations according to various embodiments of the present disclosure. The memory 130 may be implemented as a volatile memory such as RAM, but the technical scope of the present disclosure is not limited thereto.

The bus 150 provides a communication function between the components of the computing device 100. The bus 150 may be implemented as various types of buses, such as an address bus, a data bus, and a control bus.

The communication interface 170 supports wired and wireless Internet communication of the computing device 100. The communication interface 170 may also support various communication methods other than Internet communication. To this end, the communication interface 170 may include a communication module well known in the technical field of the present disclosure. In some cases, the communication interface 170 may be omitted.

The storage 190 may non-transitorily store the one or more computer programs 191, various data (e.g., training datasets), machine learning models, and the like. The storage 190 may include non-volatile memory such as ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), or flash memory, a hard disk, a removable disk, or any form of computer-readable recording medium well known in the art to which the present disclosure pertains.

The computer program 191 may include one or more instructions that, when loaded into the memory 130, cause the processor 110 to perform the methods/operations according to various embodiments of the present disclosure. That is, the processor 110 may perform the methods/operations according to various embodiments of the present disclosure by executing the one or more instructions.

For example, the computer program 191 may include instructions for performing an operation of training a first output layer of a neural network including a plurality of output layers using a first dataset belonging to a first domain, an operation of training a second output layer among the plurality of output layers using a second dataset belonging to a second domain, and an operation of training a third output layer among the plurality of output layers using the first dataset and the second dataset. The computer program 191 may further include instructions for performing an operation of predicting a label of prediction data belonging to the first domain using the trained neural network. In this case, a multi-domain ensemble learning apparatus (e.g., the learning apparatus 10) according to some embodiments of the present disclosure may be implemented through the computing device 100.

So far, an exemplary computing device 100 capable of implementing an apparatus according to various embodiments of the present disclosure has been described with reference to FIG. 8.

So far, various embodiments of the present disclosure and the effects of those embodiments have been described with reference to FIGS. 1 to 8. The effects of the technical idea of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

The technical idea of the present disclosure described so far with reference to FIGS. 1 to 8 may be implemented as computer-readable code on a computer-readable medium. The computer-readable recording medium may be, for example, a removable recording medium (CD, DVD, Blu-ray disc, USB storage device, or removable hard disk) or a fixed recording medium (ROM, RAM, or a hard disk built into a computer). The computer program recorded on the computer-readable recording medium may be transmitted to another computing device through a network such as the Internet and installed on that device, and may thereby be used on that device.

Even though all of the components constituting the embodiments of the present disclosure have been described above as being combined into one or operating in combination, the technical idea of the present disclosure is not necessarily limited to such embodiments. That is, within the scope of the purpose of the present disclosure, one or more of the components may be selectively combined and operated.

Although operations are shown in a specific order in the drawings, it should not be understood that the operations must be performed in the specific order shown or in sequential order, or that all of the illustrated operations must be performed, to obtain the desired results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of the various components in the embodiments described above should not be understood as being required, and it should be understood that the described program components and systems may generally be integrated into a single software product or packaged into multiple software products.

Although embodiments of the present disclosure have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present disclosure pertains will understand that the present disclosure may be implemented in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of protection of the present disclosure should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be interpreted as falling within the scope of rights of the technical idea defined by the present disclosure.

Claims

A method for performing multi-domain ensemble learning on a neural network including a plurality of output layers in a computing device, the method comprising:
training a first output layer of the plurality of output layers and a shared feature extraction layer using a first dataset belonging to a first domain;
training a second output layer of the plurality of output layers and the shared feature extraction layer using a second dataset belonging to a second domain; and
training a third output layer of the plurality of output layers and the shared feature extraction layer using the first dataset and the second dataset.

The method of claim 1,
wherein the first dataset includes more data than the second dataset.

The method of claim 1,
wherein the first dataset comprises a 2D image, and
the second dataset comprises a 3D image.

The method of claim 3,
wherein the first dataset includes a full-field digital mammography (FFDM) image, and
the second dataset includes a digital breast tomosynthesis (DBT) image.

The method of claim 1,
wherein the first dataset comprises a single-layer image, and
the second dataset comprises a multi-layer image.

The method of claim 5,
wherein training the second output layer comprises:
extracting a first layer image from the multi-layer image and training the second output layer using the first layer image; and
extracting a second layer image from the multi-layer image and training the second output layer using the second layer image.

The method of claim 1,
wherein the shared feature extraction layer is shared by the first output layer, the second output layer, and the third output layer.

The method of claim 1,
further comprising predicting a label of prediction data belonging to the first domain by using the neural network,
wherein the predicting comprises
predicting the label based on a first prediction value output from the first output layer and a second prediction value output from the third output layer.

The method of claim 8,
wherein predicting the label based on the first prediction value output from the first output layer and the second prediction value output from the third output layer comprises:
assigning a weight to each of the first prediction value and the second prediction value;
combining the first prediction value and the second prediction value by reflecting the assigned weights; and
predicting the label based on the combined prediction value.

A multi-domain ensemble learning apparatus comprising:
a memory storing one or more instructions; and
a processor configured, by executing the stored one or more instructions, to:
train a first output layer, among a plurality of output layers included in a neural network, and a shared feature extraction layer using a first dataset belonging to a first domain,
train a second output layer among the plurality of output layers and the shared feature extraction layer using a second dataset belonging to a second domain, and
train a third output layer among the plurality of output layers and the shared feature extraction layer using the first dataset and the second dataset.

The apparatus of claim 10,
wherein the first dataset includes more data than the second dataset.

The apparatus of claim 10,
wherein the first dataset comprises a 2D image, and
the second dataset comprises a 3D image.

The apparatus of claim 12,
wherein the first dataset includes a full-field digital mammography (FFDM) image, and
the second dataset includes a digital breast tomosynthesis (DBT) image.

The apparatus of claim 10,
wherein the first dataset comprises a single-layer image, and
the second dataset comprises a multi-layer image.

The apparatus of claim 14,
wherein the processor is configured to:
extract a first layer image from the multi-layer image and train the second output layer using the first layer image, and
extract a second layer image from the multi-layer image and train the second output layer using the second layer image.

The apparatus of claim 10,
wherein the shared feature extraction layer is shared by the first output layer, the second output layer, and the third output layer.

The apparatus of claim 10,
wherein the processor is further configured to
predict a label of prediction data belonging to the first domain by using the neural network, the label being predicted based on a first prediction value output from the first output layer and a second prediction value output from the third output layer.

The apparatus of claim 17,
wherein the processor is configured to
combine the first prediction value and the second prediction value by reflecting weights assigned to each of the first prediction value and the second prediction value, and predict the label based on the combined prediction value.

A computer program stored in a computer-readable recording medium which, in combination with a computing device, executes the steps of:
training a first output layer, among a plurality of output layers included in a neural network, and a shared feature extraction layer using a first dataset belonging to a first domain;
training a second output layer among the plurality of output layers and the shared feature extraction layer using a second dataset belonging to a second domain; and
training a third output layer among the plurality of output layers and the shared feature extraction layer using the first dataset and the second dataset.

The computer program of claim 19,
further executing the step of predicting a label of prediction data belonging to the first domain by using the neural network,
wherein the predicting step comprises
predicting the label based on a first prediction value output from the first output layer and a second prediction value output from the third output layer.