
KR20190092217A - Device for ensembling data and operating method thereof

Info

Publication number: KR20190092217A
Application number: KR1020180081304A
Authority: KR
Inventors: 임명은; 박흰돌; 정호열; 최재훈; 한영웅
Original assignee: 한국전자통신연구원
Priority date: 2018-01-30
Filing date: 2018-07-12
Publication date: 2019-08-07

Abstract

The present invention relates to an operating method of a device for ensembling data received from a plurality of health prediction devices. The method comprises the following steps: providing primitive learning data to first and second health prediction devices; receiving first and second learning result data generated from the first and second health prediction devices; generating a target relation model based on a correlation between characteristic data having the same feature in characteristic data included in each of the first and second learning result data; generating a characteristic relation model based on a correlation between characteristic data having different features included in the learning result data; and merging the target relation model and the characteristic relation model and forming an ensembling model. According to the present invention, performance of the ensembling model can be improved.

Description

DEVICE FOR ENSEMBLING DATA AND OPERATING METHOD THEREOF

The present invention relates to the processing of data for future health prediction and, more particularly, to a device for ensembling data received from a plurality of health prediction devices and an operating method thereof.

To lead a healthy life, there is a demand not only for treating current diseases but also for predicting future health conditions. To predict future health conditions, demand is growing for analyzing big data to diagnose diseases or to predict future disease risk. Advances in industrial technology and in information and communication technology support the construction of big data. In addition, technologies such as artificial intelligence, which provide various services by training electronic devices such as computers on such big data, are emerging. In particular, to predict future health conditions, methods of building learning models from various medical data or health data have been proposed.

A larger volume of data is advantageous for accurate prediction, but sharing data among medical institutions can be difficult in practice for a variety of reasons, including ethical issues, legal issues, and personal privacy issues. For this reason, building a single integrated big-data set of medical data is difficult in practice. As a way to address this problem specific to medical data, instead of building a single predictor on integrated multi-institution big data, approaches are being explored in which individual prediction models are trained on data built separately at the various medical institutions, and their prediction results are used to predict a patient's future health condition.

The present invention can provide a device for ensembling data received from a plurality of health prediction devices, and an operating method thereof, so as to secure the reliability, accuracy, and efficiency of future health condition prediction.

According to an operating method of an ensemble prediction device according to an embodiment of the present invention, data received from a plurality of health prediction devices is ensembled. The operating method of the ensemble prediction device includes: providing raw training data to a first health prediction device and a second health prediction device; receiving, from the first health prediction device, first learning result data generated based on the raw training data; receiving, from the second health prediction device, second learning result data generated based on the raw training data; generating a target relationship model that provides, for each feature, a weight for each of the first and second health prediction devices, based on a correlation between feature data having the same feature among the feature data included in each of the first and second learning result data; generating a feature relationship model that provides a weight for each of the different features, based on a correlation between feature data having different features among the feature data included in the first or second learning result data; and building an ensemble model by merging the target relationship model and the feature relationship model.

An ensemble prediction device according to an embodiment of the present invention includes a network interface, an ensemble model learner, a health predictor, and a processor. The network interface provides raw training data or medical data to a plurality of health prediction devices, and receives learning result data generated by the plurality of health prediction devices based on the raw training data and the medical data, together with a plurality of pieces of meta information about the plurality of health prediction devices. The ensemble model learner builds ensemble training data by selecting target training data based on the similarity between the pieces of meta information, and generates an ensemble model based on the correlation between the target training data and the correlation between the feature data included in each piece of target training data. The health predictor predicts the health state of a user by inputting result data generated from the medical data into the ensemble model. The processor controls the network interface, the ensemble model learner, and the health predictor.

The device for ensembling data and the operating method thereof according to an embodiment of the present invention can mitigate overfitting caused by many similar pieces of target training data, by selecting target training data before generating the ensemble model.

In addition, the device for ensembling data and the operating method thereof according to an embodiment of the present invention learn the correlation between the target training data separately from the correlation between the feature data included in each piece of target training data when training the ensemble model. As a result, the characteristics of the plurality of health prediction devices and the characteristics of the training data are considered together, which can improve the performance of the ensemble model.

FIG. 1 is a diagram illustrating a health state prediction system according to an embodiment of the present invention.
FIG. 2 is an exemplary block diagram of the ensemble prediction device of FIG. 1.
FIG. 3 is a flowchart of an operating method of the ensemble prediction device of FIG. 2.
FIG. 4 is a flowchart detailing operation S130 of FIG. 3.
FIG. 5 is a diagram for describing operation S131 of FIG. 4 in detail.
FIG. 6 is a diagram for describing operation S132 of FIG. 4 in detail.
FIG. 7 is a diagram for describing operation S133 of FIG. 4 in detail.
FIG. 8 is a diagram for describing operation S134 of FIG. 4 in detail.

In the following, embodiments of the present invention are described clearly and in detail so that those of ordinary skill in the art can readily practice the present invention.

FIG. 1 is a diagram illustrating a health state prediction system according to an embodiment of the present invention. Referring to FIG. 1, the health state prediction system 100 includes first to nth health prediction devices 111 to 11n, a terminal 120, an ensemble prediction device 130, and a network 140. For convenience of description, the number of health prediction devices is shown as n, but the number is not limited and any plurality of devices may be provided.

Each of the first to nth health prediction devices 111 to 11n may predict a user's health state based on an individually built prediction model. Here, the prediction model may be a structure modeled to predict the health state at a future time point using time series medical data. Each of the first to nth health prediction devices 111 to 11n may generate and train its prediction model using the first to nth training data 11 to 1n, respectively. Each of the first to nth health prediction devices 111 to 11n may be provided at a different medical institution or public institution. The first to nth training data 11 to 1n may be kept in separate databases so that each institution can generate and train its own prediction model. The different medical institutions or public institutions may each train a prediction model individually and apply a user's time series medical data to the resulting prediction model to predict the user's health state at a future time point.

Each of the first to nth health prediction devices 111 to 11n may receive raw training data 31 from the ensemble prediction device 130 through the network 140. Here, the raw training data 31 may be understood as data for training the ensemble model built in the ensemble prediction device 130. Each of the first to nth health prediction devices 111 to 11n may generate first to nth learning result data, respectively, by applying the raw training data 31 to its already built prediction model. Here, the first to nth learning result data may be understood as the result data obtained when each of the first to nth health prediction devices 111 to 11n predicts a future health state from the raw training data 31. The first to nth learning result data may be provided to the ensemble prediction device 130 through the network 140.
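The exchange described above, in which the same raw training data 31 is sent to every health prediction device and one set of learning result data comes back per device, can be sketched roughly as follows. This is only an illustration: the function name `collect_learning_results`, the device identifiers, and the lambda "models" are hypothetical stand-ins, since the document does not specify the devices' internal prediction models or the network protocol.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a health prediction device: it maps one raw
# record (a list of feature values) to a list of predicted feature values.
Predictor = Callable[[Sequence[float]], List[float]]


def collect_learning_results(
    raw_training_data: Sequence[Sequence[float]],
    predictors: Dict[str, Predictor],
) -> Dict[str, List[List[float]]]:
    """Apply every device's model to the same raw training data."""
    results: Dict[str, List[List[float]]] = {}
    for device_id, predict in predictors.items():
        # Each device sees the identical raw records but returns its own predictions.
        results[device_id] = [predict(record) for record in raw_training_data]
    return results


# Toy raw training data: two records with two features each (assumed values).
raw_training_data = [[120.0, 190.0], [135.0, 210.0]]

# Toy "models"; real devices would use models trained on their own private data.
predictors: Dict[str, Predictor] = {
    "device_1": lambda r: [r[0] + 1.0, r[1] + 2.0],
    "device_2": lambda r: [r[0] * 1.5, r[1] * 0.5],
}

learning_results = collect_learning_results(raw_training_data, predictors)
```

Because each device applies a different model, `learning_results` holds different values per device for the same input records, which is the property the ensemble step relies on.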

Because the first to nth learning result data are generated by different prediction models, they may have different data values. This is because each of the first to nth health prediction devices 111 to 11n trains and builds its prediction model on different medical data, that is, on the different first to nth training data 11 to 1n. Owing to characteristics of medical data such as ethical issues, legal issues, and personal privacy issues, it is difficult for medical institutions to share data and difficult to assemble the data into big data. Therefore, the first to nth health prediction devices 111 to 11n each build a prediction model individually, and the ensemble prediction device 130 ensembles the result data predicted by the first to nth health prediction devices 111 to 11n, so that future health can be predicted in a way that reflects learning from diverse data.

The terminal 120 may provide a request signal for predicting a user's future health. The terminal 120 may be any electronic device capable of providing the request signal, such as a smartphone, a desktop, a laptop, or a wearable device. For example, the terminal 120 may provide the request signal to the ensemble prediction device 130 through the network 140, and the health state prediction system 100 may diagnose the user's health state or predict a future health state using the first to nth health prediction devices 111 to 11n and the ensemble prediction device 130. To this end, the terminal 120 may provide time series medical data to the ensemble prediction device 130 together with the request signal. The time series medical data may refer to data representing the user's health state generated by diagnosis, treatment, examination, medication prescription, or the like, and may be, for example, EMR (Electronic Medical Record) data or PHR (Personal Health Record) data.

The ensemble prediction device 130 trains the ensemble model using the first to nth learning result data. Here, the ensemble model may be a structure modeled to ensemble the learning result data, in which each of the first to nth health prediction devices 111 to 11n has predicted a health state, and to produce the final prediction of the future health state. As described above, the ensemble prediction device 130 receives the first to nth learning result data, which are the results obtained when each of the first to nth health prediction devices 111 to 11n processes the raw training data 31. The ensemble prediction device 130 may generate ensemble training data 32 by integrating the first to nth learning result data. The ensemble prediction device 130 trains the ensemble model based on the ensemble training data 32.

The greater the diversity of the first to nth health prediction devices 111 to 11n, the higher the performance the ensemble model may achieve. This diversity may be determined by the diversity of the algorithms of the prediction models built in the respective health prediction devices, the diversity of the data values provided to each prediction model, and the diversity of the features (for example, blood pressure or cholesterol level) included in the data. However, the ensemble prediction device 130 cannot directly intervene in the prediction models built in the first to nth health prediction devices 111 to 11n. Therefore, if the prediction models were produced by training on mutually similar data, algorithms, or features, overfitting may occur, in which accuracy drops sharply on dissimilar data.

To mitigate overfitting, the ensemble prediction device 130 may select target training data. The target training data may be the training data selected from among the first to nth learning result data for training the ensemble model. To select the target training data, the ensemble prediction device 130 may receive first to nth meta information from the first to nth health prediction devices 111 to 11n, respectively. Each piece of meta information may include information about the features, the algorithm, the scale, and the like that the corresponding health prediction device learns from. The ensemble prediction device 130 may select the target training data based on the similarity between the first to nth meta information and generate the ensemble training data 32 by integrating the selected target training data. Here, integration may be understood as a simple listing or concatenation of data. The specific process of selecting the target training data is described later.
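One way to picture similarity-based selection is the minimal sketch below. The document defers the actual selection procedure to a later passage, so everything here is an assumption made for illustration: the `MetaInfo` fields, the similarity measure (an equal-weight mix of Jaccard feature overlap and algorithm match), the 0.9 threshold, and the greedy largest-scale-first order.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class MetaInfo:
    """Hypothetical container for one device's meta information."""
    features: FrozenSet[str]  # features the device's model was trained on
    algorithm: str            # name of the learning algorithm
    scale: int                # number of training records


def similarity(a: MetaInfo, b: MetaInfo) -> float:
    """Assumed similarity: half feature overlap (Jaccard), half algorithm match."""
    union = a.features | b.features
    jaccard = len(a.features & b.features) / len(union) if union else 1.0
    same_algorithm = 1.0 if a.algorithm == b.algorithm else 0.0
    return 0.5 * jaccard + 0.5 * same_algorithm


def select_targets(meta: Dict[str, MetaInfo], threshold: float = 0.9) -> List[str]:
    """Greedily keep devices that are not too similar to any device already kept."""
    selected: List[str] = []
    # Visit devices with the largest training scale first (an assumed tie-break rule).
    for device_id in sorted(meta, key=lambda d: meta[d].scale, reverse=True):
        if all(similarity(meta[device_id], meta[s]) < threshold for s in selected):
            selected.append(device_id)
    return selected


meta = {
    "device_1": MetaInfo(frozenset({"blood_pressure", "cholesterol"}), "lstm", 1000),
    "device_2": MetaInfo(frozenset({"blood_pressure", "cholesterol"}), "lstm", 500),
    "device_3": MetaInfo(frozenset({"blood_pressure", "glucose"}), "gru", 800),
}

targets = select_targets(meta)
```

In this toy input, `device_2` duplicates the features and algorithm of `device_1` and is dropped, while `device_3` differs enough to be kept; the learning result data of the kept devices would then be concatenated into the ensemble training data.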

Using the ensemble training data 32, the ensemble prediction device 130 may generate the ensemble model based on the correlation between the target training data and the correlation between the feature data (hereinafter, features) included in the target training data. The ensemble prediction device 130 may generate and train a target relationship model by grouping the ensemble training data 32 by feature and inputting it into the target relationship model. The target relationship model may be used to analyze the correlation between the target training data. The ensemble prediction device 130 may generate and train a feature relationship model by grouping the ensemble training data 32 by training data and inputting it into the feature relationship model. The feature relationship model may be used to analyze the correlation between the features. The ensemble prediction device 130 may then merge the target relationship model and the feature relationship model and tune the result to optimize the ensemble model. The specific process of generating the ensemble model is described later.
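The two groupings mentioned above (by feature for the target relationship model, by training data for the feature relationship model) amount to two views of the same ensemble training data. The sketch below shows only that regrouping, not the models themselves; the data layout (a dictionary from device identifier to a list of feature-name-to-value records) and the function names are assumptions for illustration.

```python
from typing import Dict, List

# Assumed layout: device id -> list of records, each record mapping
# a feature name to the value that device predicted for it.
EnsembleData = Dict[str, List[Dict[str, float]]]


def group_by_feature(data: EnsembleData) -> Dict[str, Dict[str, List[float]]]:
    """View for a target relationship model: one feature, compared across devices."""
    by_feature: Dict[str, Dict[str, List[float]]] = {}
    for device_id, records in data.items():
        for record in records:
            for feature, value in record.items():
                by_feature.setdefault(feature, {}).setdefault(device_id, []).append(value)
    return by_feature


def group_by_device(data: EnsembleData) -> Dict[str, List[List[float]]]:
    """View for a feature relationship model: one device, all features side by side."""
    by_device: Dict[str, List[List[float]]] = {}
    for device_id, records in data.items():
        # Features are ordered by name so every row has the same column order.
        by_device[device_id] = [[record[f] for f in sorted(record)] for record in records]
    return by_device


ensemble_data: EnsembleData = {
    "device_1": [
        {"blood_pressure": 121.0, "cholesterol": 192.0},
        {"blood_pressure": 136.0, "cholesterol": 212.0},
    ],
    "device_3": [
        {"blood_pressure": 119.0, "cholesterol": 188.0},
        {"blood_pressure": 133.0, "cholesterol": 207.0},
    ],
}

by_feature = group_by_feature(ensemble_data)
by_device = group_by_device(ensemble_data)
```

Here `by_feature["blood_pressure"]` lines up the blood pressure predictions of every device, which is the input shape needed to learn per-feature device weights, while `by_device["device_1"]` lines up all features predicted by one device.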

The ensemble prediction device 130 may predict and analyze the user's future health state based on the ensemble model. In response to a request from the terminal 120, the ensemble prediction device 130 may provide the time series medical data to the first to nth health prediction devices 111 to 11n. The ensemble prediction device 130 may receive first to nth prediction result data from the first to nth health prediction devices 111 to 11n, respectively. The ensemble prediction device 130 may ensemble the first to nth prediction result data based on the ensemble model to predict the user's future health state.
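As a rough illustration of ensembling the per-device prediction results, the sketch below combines them with per-feature device weights of the kind the target relationship model is said to provide. It is a simplification under stated assumptions: the weights are hard-coded rather than learned, the feature relationship model is left out entirely, and the name `ensemble_predict` and the data layout are hypothetical.

```python
from typing import Dict


def ensemble_predict(
    predictions: Dict[str, Dict[str, float]],
    device_weights: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Weighted average of device predictions, computed separately per feature."""
    final: Dict[str, float] = {}
    for feature, weights in device_weights.items():
        weighted_sum = 0.0
        weight_total = 0.0
        for device_id, weight in weights.items():
            # Skip devices that did not report this feature.
            if device_id in predictions and feature in predictions[device_id]:
                weighted_sum += weight * predictions[device_id][feature]
                weight_total += weight
        if weight_total > 0.0:
            final[feature] = weighted_sum / weight_total
    return final


# Prediction result data returned by two devices for one user (assumed values).
predictions = {
    "device_1": {"blood_pressure": 120.0, "cholesterol": 200.0},
    "device_2": {"blood_pressure": 130.0, "cholesterol": 180.0},
}

# Per-feature device weights; in the described system these would come from
# the trained ensemble model rather than being written by hand.
device_weights = {
    "blood_pressure": {"device_1": 0.75, "device_2": 0.25},
    "cholesterol": {"device_1": 0.5, "device_2": 0.5},
}

final_prediction = ensemble_predict(predictions, device_weights)
```

Weighting per feature rather than per device lets a device that predicts blood pressure well but cholesterol poorly contribute differently to each, which matches the per-feature weighting described for the target relationship model.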

The network 140 may be configured to carry data communication among the first to nth health prediction devices 111 to 11n, the terminal 120, and the ensemble prediction device 130. The first to nth health prediction devices 111 to 11n, the terminal 120, and the ensemble prediction device 130 may exchange data over the network 140 by wire or wirelessly. Unlike what is shown in FIG. 1, the network for data communication between the first to nth health prediction devices 111 to 11n and the ensemble prediction device 130 may be separate from the network for data communication between the terminal 120 and the ensemble prediction device 130.

FIG. 2 is an exemplary block diagram of the ensemble prediction device of FIG. 1. The block diagram of FIG. 2 is to be understood as an exemplary configuration for generating and training the ensemble model and for predicting or analyzing a future health state using the ensemble model; the structure of the ensemble prediction device 130 is not limited to it. Referring to FIG. 2, the ensemble prediction device 130 may include a network interface 131, a processor 132, a memory 133, a storage 136, and a bus 137. As an example, the ensemble prediction device 130 may be implemented as a server, but is not limited thereto. For convenience of description, FIG. 2 is described with reference to the reference numerals of FIG. 1.

The network interface 131 is configured to communicate with the first to nth health prediction devices 111 to 11n and the terminal 120 through the network 140 of FIG. 1. For example, to generate the ensemble model, the network interface 131 may provide the raw training data 31 to the first to nth health prediction devices 111 to 11n. The network interface 131 may receive the first to nth learning result data, which are the analysis results of the first to nth health prediction devices 111 to 11n, and provide them to the processor 132, the memory 133, or the storage 136 through the bus 137.

To predict or analyze a user's future health, the network interface 131 may receive the request signal and the time series medical data from the terminal 120 and provide the time series medical data to the first to nth health prediction devices 111 to 11n. The network interface 131 may receive the first to nth prediction result data from the first to nth health prediction devices 111 to 11n and provide them to the processor 132, the memory 133, or the storage 136 through the bus 137. The network interface 131 may provide the terminal 120 with the final prediction result of the future health state, generated by ensembling the first to nth prediction result data.

The processor 132 may function as the central processing unit of the ensemble prediction device 130. The processor 132 may perform the control operations and computations required for generating and training the ensemble model and for predicting and analyzing future health based on the ensemble model. For example, under the control of the processor 132, the network interface 131 may provide the raw training data 31 or the time series medical data to the first to nth health prediction devices 111 to 11n and receive the learning result data or the prediction result data. Under the control of the processor 132, the operation of selecting target training data for generating the ensemble model, the operations of training the target relationship model and the feature relationship model, and the like may be performed. The processor 132 may operate using the computation space of the memory 133 and may read, from the storage 136, the files for running the operating system and the executable files of applications. The processor 132 may execute the operating system and various applications.

The memory 133 may store data and process codes that have been processed or are to be processed by the processor 132. For example, the memory 133 may store the raw training data 31, the first to nth learning result data, information for selecting the target training data, the ensemble training data 32, or information for building the ensemble model. The memory 133 may also store the time series medical data, the prediction result data provided by the health prediction devices, or information about the final prediction of future health produced by the ensemble. The memory 133 may be used as the main memory of the ensemble prediction device 130. The memory 133 may include DRAM (Dynamic RAM), SRAM (Static RAM), PRAM (Phase-change RAM), MRAM (Magnetic RAM), FeRAM (Ferroelectric RAM), RRAM (Resistive RAM), and the like.

The memory 133 may include an ensemble model learner 134 and a health predictor 135. The ensemble model learner 134 and the health predictor 135 may be part of the computation space of the memory 133. In this case, the ensemble model learner 134 and the health predictor 135 may be implemented as firmware or software. For example, the firmware may be stored in the storage 136 and loaded into the memory 133 when it is executed. The processor 132 may execute the firmware loaded into the memory 133. The ensemble model learner 134 may operate under the control of the processor 132 to generate and train the ensemble model. The health predictor 135 may operate under the control of the processor 132 to predict and analyze the user's future health state using the ensemble model. The specific operations of the ensemble model learner 134 and the health predictor 135 are described later.

Unlike what is shown in FIG. 2, the ensemble model learner 134 and the health predictor 135 may be implemented as separate hardware for building the ensemble model and predicting the user's future health state. For example, the ensemble model learner 134 and the health predictor 135 may be implemented as a neuromorphic chip or the like that builds the ensemble model by learning through an artificial neural network, or as a dedicated logic circuit such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).

The storage 136 may store data generated by an operating system or applications for long-term storage, files for running the operating system, executable files of applications, and the like. For example, the storage 136 may store the files needed to execute the ensemble model learner 134 and the health predictor 135. The storage 136 may be used as an auxiliary storage device of the ensemble prediction device 130. The storage 136 may include a flash memory, a phase-change RAM (PRAM), a magnetic RAM (MRAM), a ferroelectric RAM (FeRAM), a resistive RAM (RRAM), and the like.

The bus 137 may provide a communication path between the components of the ensemble prediction device 130. The network interface 131, the processor 132, the memory 133, and the storage 136 may exchange data with one another through the bus 137. The bus 137 may be configured to support the various communication formats used in the ensemble prediction device 130.

FIG. 3 is a flowchart of a method of operating the ensemble prediction device of FIG. 2. Referring to FIG. 3, the operating method may be divided into a step of training the ensemble model (S100) and a step of predicting a future health state (S200). Each step of FIG. 3 may be executed by the processor 132 of FIG. 2. Step S100 may be processed by the ensemble model learner 134 under the control of the processor 132, and step S200 may be processed by the health predictor 135 under the control of the processor 132. For convenience of description, FIG. 3 is described with reference to the reference numerals of FIGS. 1 and 2.

In step S110, the ensemble prediction device 130 provides the raw learning data 31 to the health prediction devices 111 to 11n. The raw learning data 31 is time series data and may include feature data that changes over time. For example, the raw learning data 31 may include time data indicating when a measurement or diagnosis was made. The raw learning data 31 may include feature data representing various health indicators such as blood pressure, cholesterol level, and body weight.

In step S120, the ensemble prediction device 130 receives learning result data and meta information from the health prediction devices 111 to 11n. The learning result data may be the result produced when each of the health prediction devices 111 to 11n predicts a health state using the raw learning data 31. The meta information corresponding to each of the health prediction devices 111 to 11n may include information about the feature data learned by the corresponding prediction model, the learning algorithm, and the size of the learning data, among the first to n-th learning data 11 to 1n, that corresponds to that health prediction device. The ensemble model learner 134 may be provided with the meta information and the learning result data.

In step S130, the ensemble prediction device 130 may generate an ensemble model based on the received meta information and learning result data. Step S130 may be performed by the ensemble model learner 134 under the control of the processor 132. The ensemble model learner 134 may select part of the learning result data based on the similarity between the items of meta information. The selected learning result data may be determined as the target learning data, that is, the ensemble learning data 32. Based on the ensemble learning data 32, the ensemble model learner 134 may generate a target relation model and a feature relation model, and may merge and tune them to generate the ensemble model. The generated ensemble model may be built in the storage 136, but is not limited thereto and may be built in a separate server or storage medium. The detailed process of step S130 is described later with reference to FIG. 4.

In step S200, the ensemble prediction device 130 may predict the user's future health state based on the generated ensemble model. To this end, in step S210, the ensemble prediction device 130 may receive time series medical data from the terminal 120. Under the control of the processor 132, the network interface 131 may receive the time series medical data through the network 140. The time series medical data may include feature data representing various health indicators of the user.

In step S220, the ensemble prediction device 130 may request health predictions from the first to n-th health prediction devices 111 to 11n. To this end, the ensemble prediction device 130 may provide the time series medical data received from the terminal 120 to the first to n-th health prediction devices 111 to 11n. The user's time series medical data may be input to the prediction model that each of the first to n-th health prediction devices 111 to 11n has built individually. As a result, the first to n-th health prediction devices 111 to 11n may generate first to n-th prediction result data using their individual prediction models. The first to n-th prediction result data may be provided to the ensemble prediction device 130 through the network 140.

In step S230, the ensemble prediction device 130 may ensemble the first to n-th prediction result data received from the first to n-th health prediction devices 111 to 11n. Step S230 may be performed by the health predictor 135 under the control of the processor 132. Based on the ensemble model generated in step S130, the health predictor 135 may ensemble the first to n-th prediction result data and predict and analyze the user's future health state. Information about the predicted and analyzed future health state may be provided to the terminal 120 through the network 140.

With the ensemble prediction device 130, time series medical data may be analyzed using prediction models trained at various institutions (the first to n-th health prediction devices 111 to 11n). Because time series medical data carries information about features as they change over time, the trend of a health state over time can be analyzed, and from that trend the health state at a specific future point in time can be analyzed. However, each health prediction device generates its prediction model from limited learning data, so the ensemble prediction device 130 may integrate the prediction result data output by the various institutions to increase the accuracy of the health state prediction. Because this integration takes into account both the correlations between the institutions' prediction models and the correlations between features, the accuracy of predicting the health state at a specific future point in time may be increased.

FIG. 4 is a flowchart detailing step S130 of FIG. 3. That is, FIG. 4 details the step in which the ensemble prediction device 130 generates the ensemble model. Each step of FIG. 4 may be processed by the ensemble model learner 134 under the control of the processor 132 of FIG. 2. For convenience of description, FIG. 4 is described with reference to the reference numerals of FIGS. 1 and 2.

In step S131, the ensemble model learner 134 may select target prediction devices. As described above, after transmitting the raw learning data 31, the ensemble prediction device 130 receives meta information and learning result data from the first to n-th health prediction devices 111 to 11n in response. The ensemble model learner 134 may cluster the meta information into one or more groups according to the similarity between the items of meta information, and may select one representative for each clustered group. In other words, the devices to be used for generating and training the ensemble model, that is, the target prediction devices, may be selected from among the first to n-th health prediction devices 111 to 11n. The learning result data corresponding to the selected representatives may be determined as the target learning data, that is, the ensemble learning data 32. This is illustrated in FIG. 5.

In step S132, the ensemble model learner 134 may generate and train a target relation model based on the ensemble learning data 32, using the correlations between the plurality of target learning data that make up the ensemble learning data 32. The ensemble model learner 134 may regroup the ensemble learning data 32 by feature. For example, it may regroup the ensemble learning data 32 feature by feature so that the correlation between the blood-pressure feature data of the first to n-th target learning data can be analyzed. By analyzing the correlation between feature data that correspond to the same feature in each target learning data, the ensemble model learner 134 may analyze the correlation between the target learning data. The target relation model is built to analyze the prediction accuracy of each institution (health prediction device) for the various features (health indicators), and may determine a weight for each institution accordingly. This is illustrated in FIG. 6.

In step S133, the ensemble model learner 134 may generate and train a feature relation model based on the ensemble learning data 32, using the correlations between the plurality of feature data included in each of the plurality of target learning data. The ensemble model learner 134 may split the ensemble learning data 32 by target learning data. For example, it may split the ensemble learning data 32 per target learning data so that the correlation between the first to x-th feature data included in a single target learning data can be analyzed. The ensemble model learner 134 may then analyze the correlations between different features within the target learning data. The feature relation model is built to analyze the associations and similarities between the various features (health indicators), and may determine a weight for each feature accordingly. This is illustrated in FIG. 7.

In step S134, the ensemble model learner 134 may build the ensemble model by merging the target relation model and the feature relation model. The ensemble model learner 134 may merge the two models by connecting the output of the target relation model to the input of the feature relation model, thereby generating the ensemble model. The ensemble model learner 134 may then input the ensemble learning data 32 again into the merged ensemble model. By analyzing the output of the ensemble model, the ensemble model learner 134 may perform a tuning process that adjusts the weights for the health prediction devices (institutions) that generated the target learning data and the weights for the features. This is illustrated in FIG. 8.
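The merge described in step S134 can be pictured as function composition. The following is a minimal sketch only, assuming each target relation model is a callable that maps the values of one feature across institutions to a single value, and that the feature relation model is a callable over the resulting per-feature values. The name `merge_models` and both callable signatures are hypothetical and do not come from the specification.

```python
from typing import Callable, List, Sequence


def merge_models(
    target_models: Sequence[Callable[[Sequence[float]], float]],
    feature_model: Callable[[Sequence[float]], List[float]],
) -> Callable[[Sequence[Sequence[float]]], List[float]]:
    """Connect the outputs of the target relation models to the
    input of the feature relation model (hypothetical interfaces)."""

    def ensemble(per_feature_inputs: Sequence[Sequence[float]]) -> List[float]:
        if len(per_feature_inputs) != len(target_models):
            raise ValueError("one input group is required per target relation model")
        # Stage 1: each target relation model combines one feature
        # across all institutions.
        intermediate = [
            model(values) for model, values in zip(target_models, per_feature_inputs)
        ]
        # Stage 2: the feature relation model relates the combined features.
        return feature_model(intermediate)

    return ensemble


# Toy stand-ins: averaging per feature, then doubling every feature.
mean = lambda values: sum(values) / len(values)
double_all = lambda values: [2 * v for v in values]
toy_ensemble = merge_models([mean, mean], double_all)
```

Tuning as described in the text would then adjust the parameters inside both stages while the merged callable is evaluated on the ensemble learning data; that part is not shown here.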

In step S135, the ensemble model learner 134 evaluates the performance of the built ensemble model. The ensemble model learner 134 may compare the result data output by the ensemble model with the result data expected for the raw learning data 31, and may evaluate the performance of the ensemble model based on this comparison. The raw learning data 31 and its expected result data, that is, the expected prediction of the future health state for the raw learning data 31, may be set in advance for building the ensemble model and may be stored in the memory 133.

In step S136, the ensemble model learner 134 may compare the evaluated performance of the ensemble model with a reference performance. The reference performance may be set in advance and stored in the memory 133. If the performance of the ensemble model is at least (or exceeds) the reference performance, the built ensemble model is determined to be the final ensemble model, and the step of generating the ensemble model may end. If the performance of the ensemble model is below (or does not exceed) the reference performance, step S131 is performed again. In this case, the ensemble model learner 134 may reselect the target learning data: it may cluster the meta information again, or it may reselect the representative within a clustered group. The ensemble model learner 134 may repeat steps S132 to S135 until the performance of the ensemble model satisfies the reference performance.
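The control flow of steps S131 to S136 can be sketched as a loop. This is an illustrative outline, not the patented implementation: `select_targets` and `build_and_score` are hypothetical callables standing in for steps S131 and S132 to S135, and the `max_rounds` cap is an added assumption (the text only says the steps repeat until the reference performance is met).

```python
from typing import Callable, List, Optional, Sequence, Tuple


def build_until_reference_met(
    select_targets: Callable[[int], List[int]],
    build_and_score: Callable[[Sequence[int]], float],
    reference_performance: float,
    max_rounds: int = 10,  # assumption: safety cap, not in the source
) -> Tuple[Optional[List[int]], Optional[float], int]:
    """Repeat target selection and model building until the score
    reaches the reference performance or the round cap is hit."""
    for round_index in range(max_rounds):
        targets = select_targets(round_index)   # S131: (re)select targets
        score = build_and_score(targets)        # S132-S135: build and evaluate
        if score >= reference_performance:      # S136: compare with reference
            return targets, score, round_index + 1
    return None, None, max_rounds


# Toy stand-ins: later rounds pick different devices and score higher.
toy_select = lambda round_index: [round_index, round_index + 1]
toy_score = lambda targets: sum(targets) / 4
result = build_until_reference_met(toy_select, toy_score, reference_performance=0.7)
```

Returning the number of rounds used makes it easy to see how often reselection was needed; a real system would also keep the trained model rather than only its score.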

FIG. 5 is a diagram illustrating step S131 of FIG. 4 in detail. That is, FIG. 5 details the step in which the ensemble prediction device 130 selects the target prediction devices. Each step of FIG. 5 may be processed by the ensemble model learner 134 under the control of the processor 132 of FIG. 2. For convenience of description, FIG. 5 is described with reference to the reference numerals of FIGS. 1 and 2.

In step S131a, the ensemble model learner 134 calculates the similarity of the meta information. The ensemble model learner 134 receives the meta information for each of the first to n-th health prediction devices 111 to 11n. By way of example, the items of meta information are drawn as circles on the target pool. From the meta information, the ensemble model learner 134 may analyze the features learned by the prediction model built in each of the health prediction devices 111 to 11n, the algorithms of the prediction models, and the size of the learning data 11 to 1n. The ensemble model learner 134 may place the items of meta information on the target pool based on the result of this analysis, and may calculate the similarity of the meta information based on the vector values between the items placed on the target pool.
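One way to picture the similarity calculation of step S131a is cosine similarity between numeric vectors. The specification does not name a similarity measure, so the choice of cosine similarity and the encoding of meta information as a list of numbers are both assumptions made for this sketch.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equally long vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise similarity of every meta-information vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical encoding: [uses blood pressure, uses cholesterol, scaled data size]
meta_vectors = [[1.0, 0.0, 0.5], [1.0, 0.0, 0.6], [0.0, 1.0, 0.1]]
matrix = similarity_matrix(meta_vectors)
```

In this toy input the first two devices learned the same feature with similar data sizes, so their similarity is close to 1, while the third device is far from both.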

In step S131b, the ensemble model learner 134 may cluster the meta information into one or more groups based on the similarity of the meta information. By way of example, in the target pool of FIG. 5, the meta information is shown clustered into first to third groups C1 to C3 based on similarity. The ensemble model learner 134 clusters the meta information so that items with similar meta information fall into the same group. Accordingly, health prediction devices whose meta information belongs to the same group can be understood to contain prediction models built through similar learning.
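The grouping in step S131b could be realized in many ways, and the specification leaves the clustering algorithm open. The sketch below uses simple threshold-based single-linkage grouping with a union-find structure; the algorithm, the function name `cluster_by_threshold`, and the threshold value are assumptions chosen for illustration.

```python
from typing import List, Sequence


def cluster_by_threshold(
    similarity: Sequence[Sequence[float]], threshold: float
) -> List[List[int]]:
    """Group indices whose pairwise similarity reaches the threshold,
    directly or through a chain of similar items (single linkage)."""
    n = len(similarity)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                parent[find(j)] = find(i)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


toy_similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
toy_groups = cluster_by_threshold(toy_similarity, threshold=0.8)
```

Lowering the threshold merges more devices into one group, which is one way the re-clustering on a failed performance check could be varied.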

In step S131c, the ensemble model learner 134 selects the target learning data. To this end, the ensemble model learner 134 evaluates the accuracy of the learning result data obtained for the raw learning data 31. As described above, the ensemble prediction device 130 may set in advance the result data expected for the raw learning data 31, that is, the expected prediction of the future health state. Based on this preset result data, the ensemble model learner 134 may evaluate the accuracy of the learning result data corresponding to the meta information in each group. Within each group, the ensemble model learner 134 may determine the learning result data with the highest evaluated accuracy to be the target learning data.

For example, the ensemble model learner 134 may compare the learning result data corresponding to the three items of meta information in the first group C1 with the expected result data. If the learning result data corresponding to the first target meta information T1 is the most accurate of these, the ensemble model learner 134 may select it as target learning data. In the same way, the ensemble model learner 134 may select, as target learning data, the learning result data corresponding to the second target meta information T2 and the third target meta information T3, which have the highest accuracy within the second group C2 and the third group C3, respectively. That is, the ensemble learning data 32 may include the learning result data corresponding to the first to third target meta information T1 to T3.
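The per-group selection can be sketched as follows, assuming for illustration that the learning result data and the expected result data are equal-length sequences of class labels and that accuracy is the fraction of matching labels. The specification does not fix the accuracy measure, and the device identifiers and function names here are hypothetical.

```python
from typing import Dict, List, Sequence


def accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of positions where the prediction matches the expectation."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("sequences must be non-empty and equally long")
    matches = sum(1 for p, e in zip(predicted, expected) if p == e)
    return matches / len(expected)


def select_representatives(
    groups: Sequence[Sequence[str]],
    results: Dict[str, Sequence[int]],
    expected: Sequence[int],
) -> List[str]:
    """Pick, in every group, the device whose result is most accurate.
    Ties go to the device listed first in the group."""
    return [
        max(group, key=lambda device: accuracy(results[device], expected))
        for group in groups
    ]


expected_result = [1, 0, 1, 1]
device_results = {
    "d1": [1, 0, 1, 1],
    "d2": [1, 1, 1, 1],
    "d3": [0, 0, 0, 0],
    "d4": [1, 0, 0, 0],
}
chosen = select_representatives([["d1", "d2"], ["d3", "d4"]], device_results, expected_result)
```

Only the results of the chosen devices would then form the ensemble learning data, which keeps one model per group of similarly trained models.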

The ensemble model is generated based on the target learning data selected by performing steps S131a to S131c. If the performance of the ensemble model then fails to reach the reference performance in step S136 of FIG. 4, steps S131a to S131c may be performed again. In this case, in step S131b, the ensemble model learner 134 may cluster the meta information again. For example, the ensemble model learner 134 may remove from a group an item of meta information whose similarity to the rest of the group is relatively low, or may move it into another group. In step S131c, the ensemble model learner 134 may then reselect the target learning data. For example, it may re-evaluate the accuracy of the learning result data in the groups changed by the repeated clustering and select the target learning data again.

By way of example, steps S131a to S131c may be carried out using a machine-learning model, which may be implemented to perform clustering based on similarity calculation. Selecting the target learning data through similarity-based clustering may mitigate overfitting of the ensemble model. The similarity calculation, clustering, and accuracy evaluation of steps S131a to S131c may be configured in various ways depending on the type of meta information that is input, the clustering algorithm, the accuracy evaluation method, and the like.

FIG. 6 is a diagram illustrating step S132 of FIG. 4 in detail. That is, FIG. 6 shows the process in which the ensemble prediction device 130 trains the target relation model TM using the ensemble learning data 32. The target relation model TM may be trained by the ensemble model learner 134 under the control of the processor 132 of FIG. 2. For convenience of description, FIG. 6 is described with reference to the reference numerals of FIGS. 1 and 2.

The ensemble learning data 32 includes first to n-th target learning data Ha to Hn, which means that n items of learning result data have been selected from the learning result data generated by the plurality of health prediction devices. Each of the first to n-th target learning data Ha to Hn includes various feature data. For example, the first target learning data Ha includes first to x-th feature data a1 to ax, the second target learning data Hb includes first to x-th feature data b1 to bx, and the n-th target learning data Hn includes first to x-th feature data n1 to nx.

Each item of feature data may correspond to a feature, that is, an item that was diagnosed, examined, or prescribed in order to generate the raw learning data 31. A feature may represent any of various health indicators such as blood pressure, cholesterol level, and body weight. Feature data in FIG. 6 that carry the same number are assumed to represent the same feature. For example, the first feature data a1 of the first target learning data Ha and the first feature data b1 of the second target learning data Hb are understood to represent the same feature.

To train the target relation model TM, the ensemble model learner 134 may regroup the first to n-th target learning data Ha to Hn by feature. For example, the first feature data a1 of the first target learning data Ha, the first feature data b1 of the second target learning data Hb, and the first feature data n1 of the n-th target learning data Hn may be regrouped so that they are input to the same layer of the target relation model TM. That is, feature data for the same feature may be provided to the same input layer.
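The regrouping by feature amounts to transposing a table whose rows are target learning data and whose columns are features. The short sketch below assumes every target learning data lists its x features in the same order; the function name `regroup_by_feature` is hypothetical, and the string labels merely mirror the reference signs used for FIG. 6.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def regroup_by_feature(target_data: Sequence[Sequence[T]]) -> List[List[T]]:
    """Turn rows of (institution -> features) into rows of
    (feature -> values from every institution)."""
    if len({len(row) for row in target_data}) > 1:
        raise ValueError("every target learning data must have the same features")
    return [list(column) for column in zip(*target_data)]


# Rows: Ha, Hb, Hn. Columns: first and second feature.
by_institution = [["a1", "a2"], ["b1", "b2"], ["n1", "n2"]]
by_feature = regroup_by_feature(by_institution)
# by_feature[0] is what the first target relation model would receive.
```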

The target relation model TM may derive, for each feature, a prediction result for a future point in time while taking the relationships between the target learning data into account. The target relation model TM may include first to x-th target relation models TM1 to TMx, and the number of target relation models TM1 to TMx may correspond to the number of feature data. Each of the first to x-th target relation models TM1 to TMx receives one kind of feature data from among the feature data included in each of the first to n-th target learning data Ha to Hn. For example, the first target relation model TM1 may receive the first feature data a1 to n1 included in the first to n-th target learning data Ha to Hn, respectively.

Each of the first to x-th target relation models TM1 to TMx may learn the correlation among one kind of feature data drawn from each of the first to n-th target learning data Ha to Hn. For example, the first target relation model TM1 may analyze the correlation among the first feature data a1 to n1 included in the first to n-th target learning data Ha to Hn, respectively. Through this analysis, the first target relation model TM1 may assign a weight to each of the first feature data a1 to n1.

For example, the medical institution that provides the first health prediction device 111 of FIG. 1 may be more specialized in cardiovascular disease than other medical institutions, and the medical institution that provides the second health prediction device 112 may be more specialized in respiratory disease than other medical institutions. When the first health prediction device 111 generates the first target learning data Ha and the second health prediction device 112 generates the second target learning data Hb, the target relation model TM may give the cardiovascular-related feature data of the first target learning data Ha a larger weight than the cardiovascular-related feature data of the other target learning data. Likewise, the target relation model TM may give the respiratory-related feature data of the second target learning data Hb a larger weight than the respiratory-related feature data of the other target learning data. In this way, the future health state may be predicted using the prediction models of various medical institutions, and the accuracy of that prediction may be increased.
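The effect of such per-institution weights can be illustrated with a simple stand-in. In the specification the weights are learned by the target relation model; the sketch below instead derives them from each institution's prediction error on one feature (inverse-error weighting), which is purely an assumption made to show how a more accurate institution ends up with more influence. All names and numbers are hypothetical.

```python
from typing import List, Sequence


def inverse_error_weights(errors: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Normalized weights that grow as an institution's error shrinks."""
    inverse = [1.0 / (error + eps) for error in errors]  # eps avoids division by zero
    total = sum(inverse)
    return [value / total for value in inverse]


def weighted_prediction(values: Sequence[float], weights: Sequence[float]) -> float:
    """Combine one feature's predictions from all institutions."""
    if len(values) != len(weights):
        raise ValueError("one weight is required per value")
    return sum(value * weight for value, weight in zip(values, weights))


# Hypothetical errors on a cardiovascular feature: institution 1 is the specialist.
cardio_weights = inverse_error_weights([1.0, 3.0])
combined = weighted_prediction([120.0, 140.0], cardio_weights)
```

With these toy errors the specialist receives about three quarters of the total weight, so the combined value lies closer to its prediction than to the other institution's.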

Each of the first to x-th target relationship models (TM1 to TMx) may be organized into a plurality of layers. Although the first to x-th target relationship models (TM1 to TMx) are illustrated as neural network models, they are not limited to a specific model, and various learning models capable of performing machine learning may be applied.

FIG. 7 is a diagram for describing operation S133 of FIG. 4 in detail. That is, FIG. 7 illustrates a process in which the ensemble prediction device (130) trains the feature relationship model (FM) using the ensemble learning data (32). The feature relationship model (FM) may be trained by the ensemble model learner (134) under the control of the processor (132) of FIG. 2. For convenience of description, FIG. 7 is described with reference to the reference numerals of FIGS. 1 and 2.

The ensemble learning data (32) includes a plurality of target learning data, and each of the plurality of target learning data includes a plurality of feature data. For example, the first target learning data includes first to x-th feature data (a1 to ax). The feature data may correspond to the items that were diagnosed, tested, or prescribed in order to generate the raw learning data (31). To train the feature relationship model (FM), the ensemble model learner (134) may separate the ensemble learning data (32) by target learning data. The ensemble learning data (32) is input to the feature relationship model (FM) one target learning data at a time.

The feature relationship model (FM) may derive a prediction result for a future time point by considering the relationships among the feature data (a1 to ax) within a target learning data. The feature relationship model (FM) receives data in units of target learning data. For example, the feature relationship model (FM) may receive the first target learning data and then receive the second to n-th target learning data in sequence. The feature relationship model (FM) may analyze the correlation among the first to x-th feature data (a1 to ax) of one target learning data. Through this, the feature relationship model (FM) may assign a weight to each of the first to x-th feature data (a1 to ax).

For example, with respect to cardiovascular disease, the first feature data (a1) may be a more important health indicator than the other feature data. In this case, the feature relationship model (FM) may assign a larger weight to the first feature data (a1) than to the other feature data. In addition, with respect to respiratory disease, the second feature data (a2) and the x-th feature data (ax) may serve as similar health indicators. In this case, the feature relationship model (FM) may set the weight applied to the operation between the second feature data (a2) and the x-th feature data (ax) to be larger than the weights applied to operations between other feature data. Through this, a future health condition may be predicted with various features considered in combination, and the accuracy of the prediction may increase.
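
The two kinds of weights in this example, one per feature and one per operation between a pair of features, can be sketched as follows. The second-order form used here (a weighted sum plus weighted pairwise products) is an assumption chosen for illustration and is not stated in the description; the name `feature_relation_score` and all numeric values are hypothetical.

```python
def feature_relation_score(features, weights, pair_weights):
    """Score one target learning data from its feature data a1..ax.

    features:     list of feature values [a1, a2, ..., ax].
    weights:      one weight per feature.
    pair_weights: dict mapping an index pair (i, j) to the weight applied
                  to the product of features i and j.
    """
    # Per-feature contribution: more important indicators carry larger weights.
    score = sum(w * a for w, a in zip(weights, features))
    # Pairwise contribution: related indicators carry a larger pair weight.
    for (i, j), v in pair_weights.items():
        score += v * features[i] * features[j]
    return score


# Hypothetical target learning data with x = 4 features.
features = [0.5, 0.2, 0.1, 0.4]          # a1, a2, a3, ax
weights = [2.0, 0.5, 0.5, 0.5]           # a1 weighted most heavily
pair_weights = {(1, 3): 1.5, (0, 2): 0.1}  # a2-ax pair weighted most heavily

score = feature_relation_score(features, weights, pair_weights)
```

Here the index pair `(1, 3)` stands for the second and x-th feature data, whose interaction receives the largest pair weight, while the first feature data receives the largest individual weight.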

The feature relationship model (FM) may be organized into a plurality of layers. Although the feature relationship model (FM) is illustrated as a neural network model, it is not limited to a specific model, and various learning models capable of performing machine learning may be applied.

FIG. 8 is a diagram for describing operation S134 of FIG. 4 in detail. That is, FIG. 8 illustrates a process in which the ensemble prediction device (130) builds the ensemble model (EM) using the ensemble learning data (32). The ensemble model (EM) may be built by the ensemble model learner (134) under the control of the processor (132) of FIG. 2. For convenience of description, FIG. 8 is described with reference to the reference numerals of FIGS. 1 and 2.

The ensemble model (EM) is first generated by merging the target relationship model (TM) generated in FIG. 6 with the feature relationship model (FM) generated in FIG. 7. The ensemble model learner (134) may connect the output of the target relationship model (TM) to the input of the feature relationship model (FM). Through this, the ensemble model (EM) may comprehensively consider both the relationships among the health prediction devices corresponding to the target learning data and the relationships among the features included in each target learning data.

Because the target relationship model (TM) and the feature relationship model (FM) are trained separately, an ensemble model (EM) generated by simply merging the two models may perform below the reference performance. Therefore, after the target relationship model (TM) and the feature relationship model (FM) are merged, the ensemble learning data (32) is input again to the ensemble model (EM). The ensemble learning data (32) may include the first to n-th target learning data (Ha to Hn). The ensemble learning data (32) may be regrouped by feature, as in the data input method of FIG. 6. For example, the first feature data (a1) of the first target learning data (Ha), the first feature data (b1) of the second target learning data (Hb), and the first feature data (n1) of the n-th target learning data (Hn) may be regrouped so that they are input to the same layer of the target relationship model (TM).
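
The regrouping by feature described above amounts to a transpose of the ensemble learning data. The following minimal sketch assumes that each target learning data is a flat list of feature values in the same feature order; the dictionary layout, the name `group_by_feature`, and the values are hypothetical.

```python
# Hypothetical ensemble learning data: one entry per target learning data,
# each holding its feature data in the same feature order.
ensemble_learning_data = {
    "Ha": [0.9, 0.3, 0.5],  # a1, a2, a3
    "Hb": [0.7, 0.4, 0.6],  # b1, b2, b3
    "Hn": [0.2, 0.8, 0.1],  # n1, n2, n3
}


def group_by_feature(data):
    """Regroup per-device rows into per-feature groups (a transpose)."""
    rows = list(data.values())
    # zip(*rows) pairs the k-th value of every row, so each group holds the
    # same feature as reported by every health prediction device.
    return [list(column) for column in zip(*rows)]


by_feature = group_by_feature(ensemble_learning_data)
# by_feature[0] holds a1, b1, n1: the input for the first feature group.
```

Each resulting group collects one feature across all devices, which is the shape the target relationship model expects, whereas the original per-device rows are the shape used when training the feature relationship model.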

The ensemble model (EM) may be optimized based on the result of inputting the ensemble learning data (32) into the ensemble model (EM). That is, the weights of the ensemble model (EM) may be updated. For example, the weights of the ensemble model (EM) may be changed by comparing the output of the ensemble model (EM) with a preset prediction result of the future health condition. The weight update may be confined to the feature relationship model (FM), but is not limited thereto. In addition, to minimize distortion arising while merging the target relationship model (TM) and the feature relationship model (FM), and to smooth the data, an aggregation layer (AL) may be provided between the output of the target relationship model and the input of the feature relationship model.
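
The optimization of the merged model can be sketched for the case in which only the feature relationship weights are updated. This is an illustrative assumption, not the disclosed implementation: both stages are reduced to linear weightings, the loss is the squared error against the preset result, the update is plain stochastic gradient descent, and the aggregation layer is omitted. The names `forward`, `fine_tune`, `mean_loss` and all values are hypothetical.

```python
def forward(sample, tm, fm):
    """sample[d][k] is the value of feature k reported by device d."""
    # Target relationship stage: combine the devices for each feature.
    z = [
        sum(tm[k][d] * sample[d][k] for d in range(len(sample)))
        for k in range(len(fm))
    ]
    # Feature relationship stage: combine the merged features.
    return sum(w * v for w, v in zip(fm, z)), z


def fine_tune(samples, labels, tm, fm, lr=0.1, epochs=200):
    """Update only the feature relationship weights; tm stays fixed."""
    fm = list(fm)
    for _ in range(epochs):
        for sample, label in zip(samples, labels):
            pred, z = forward(sample, tm, fm)
            err = pred - label
            # Gradient of 0.5 * err**2 with respect to fm[k] is err * z[k].
            fm = [w - lr * err * v for w, v in zip(fm, z)]
    return fm


def mean_loss(samples, labels, tm, fm):
    return sum(
        (forward(s, tm, fm)[0] - y) ** 2 for s, y in zip(samples, labels)
    ) / len(labels)


# Hypothetical setup: two devices, two features, four samples.
tm = [[0.7, 0.3], [0.4, 0.6]]  # tm[k][d]: weight of device d for feature k
fm_initial = [0.5, 0.5]
samples = [
    [[1.0, 0.0], [0.8, 0.2]],
    [[0.0, 1.0], [0.2, 0.9]],
    [[0.9, 0.1], [1.0, 0.0]],
    [[0.1, 0.8], [0.0, 1.0]],
]
labels = [1.0, 0.0, 1.0, 0.0]  # preset prediction results

loss_before = mean_loss(samples, labels, tm, fm_initial)
fm_tuned = fine_tune(samples, labels, tm, fm_initial)
loss_after = mean_loss(samples, labels, tm, fm_tuned)
```

Keeping `tm` fixed preserves what the target relationship model learned about the devices, while the feature relationship weights adapt to the merged inputs. Updating both sets of weights would also be consistent with the description, which does not confine the update to the feature relationship model.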

The foregoing describes specific examples for practicing the present invention. The present invention includes not only the embodiments described above but also embodiments obtainable by simple or straightforward design changes. The present invention further includes techniques that can readily be derived by modifying the embodiments described above.

100: health status prediction system
130: ensemble prediction device

Claims

An operating method of a device for ensembling data received from a plurality of health prediction devices, the method comprising:
Providing raw learning data to a first health prediction device and a second health prediction device;
Receiving, from the first health prediction device, first learning result data generated based on the raw learning data;
Receiving, from the second health prediction device, second learning result data generated based on the raw learning data;
Generating a target relationship model that provides, for each feature, a weight for each of the first and second health prediction devices, based on a correlation between feature data having the same feature among the feature data included in each of the first learning result data and the second learning result data;
Generating a feature relationship model that provides a weight for each of the different features, based on a correlation between feature data having different features among the feature data included in the first learning result data or the second learning result data; and
Merging the target relationship model and the feature relationship model to build an ensemble model.