KR20200010991A

KR20200010991A - Apparatus and method for determining blood glucose concentration

Info

Publication number: KR20200010991A
Application number: KR1020190033931A
Authority: KR
Inventors: 고리쉬 아가뤌; 키란 바이남; 이소영; 수지트 호세; 에이 이브라힘; 아로라 라할
Original assignee: 삼성전자주식회사
Priority date: 2018-07-23
Filing date: 2019-03-25
Publication date: 2020-01-31

Abstract

The present invention relates to a non-invasive method for determining blood glucose concentration. The method for determining blood glucose concentration according to an embodiment of the present invention can comprise the steps of: processing a near infrared spectrum and a pure spectrum to obtain a preprocessed near infrared spectrum; extracting a key feature set from the preprocessed near infrared spectrum; obtaining a plurality of homogenized feature sets of validation data; and determining the blood glucose concentration of a subject by using homogenized feature sets of training data and validation data for a plurality of subjects.

Description

Apparatus and method for determining blood glucose concentration

The present invention relates to blood glucose monitoring and, more particularly, to a technique for determining blood glucose concentration non-invasively by performing universal calibration.

Continuous glucose monitoring (CGM) is used to check a subject's blood glucose concentration at regular intervals. It can be performed invasively or non-invasively. In invasive methods, the skin is pierced to obtain a blood sample for testing.

Non-invasive methods, by contrast, do not require a blood sample to obtain the blood glucose concentration. Common techniques for non-invasive glucose monitoring include mid-infrared (mid-IR), near-infrared (NIR), and Raman spectroscopy. Recently, NIR methods have come into general use for continuous glucose monitoring. Here, infrared (IR) light passes through the skin, and the absorption of the IR wavelengths by the subcutaneous tissue helps determine the glucose level.

The absorption A of a given wavelength by a sample is defined by the Beer-Lambert law:

$$A = \varepsilon \, c \, d \qquad (1)$$

where $\varepsilon$ is the absorption coefficient, $c$ is the concentration of the component in the sample, and $d$ is the penetration depth.

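If the sample consists of several components, the total absorption is given by

$$A = \sum_{i} \varepsilon_i \, c_i \, d \qquad (2)$$

where $\varepsilon_i$ and $c_i$ are the absorption coefficient and concentration of the $i$-th component and $d$ is the penetration depth.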
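As a rough numerical illustration of equations (1) and (2), the following sketch sums per-component absorbances at a single wavelength. The absorption coefficients, the penetration depth, and the function names are hypothetical; only the relative concentrations follow the orders of magnitude quoted in this document's concentration table.

```python
# Sketch of equations (1) and (2). All numeric values are made up for
# illustration; they are not measured absorption coefficients.

def absorbance(epsilon: float, concentration: float, depth: float) -> float:
    """Equation (1): absorbance of a single component."""
    return epsilon * concentration * depth


def total_absorbance(components: dict[str, tuple[float, float]], depth: float) -> float:
    """Equation (2): sum of per-component absorbances at one wavelength.

    `components` maps a name to (absorption coefficient, concentration).
    """
    return sum(absorbance(eps, conc, depth) for eps, conc in components.values())


# Hypothetical mixture: water ~1, fat ~1e-1, glucose ~1e-4 (relative units).
mixture = {
    "water": (1.0, 1.0),
    "fat": (1.0, 1e-1),
    "glucose": (1.0, 1e-4),
}
total = total_absorbance(mixture, depth=1.0)
glucose_share = absorbance(*mixture["glucose"], 1.0) / total
```

With equal (assumed) absorption coefficients, glucose contributes less than a tenth of a percent of the total absorbance, which is why its signal is easily masked by noise and drift.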

The NIR absorption spectrum of skin is made up of the absorption of several constituents, such as water, fat, proteins (collagen and keratin), amino acids, elastin, and glucose.

Monitoring blood glucose concentration non-invasively is very difficult because the concentration of glucose in blood is many times lower than that of the other constituents. The glucose information can be buried under the noise and drift components of the NIR spectrum. The approximate order of magnitude of each constituent's concentration is shown in the table below.

Constituent | Water | Fat | Protein | Elastin/Acid | Glucose
Order of concentration (~) | 10^0 | 10^-1 | 10^-3 | 10^-3 | 10^-4

Existing blood glucose prediction mechanisms measure blood glucose using only specific IR spectroscopy that includes a water absorption peak, and use electromagnetic (EM) radiation transmitted from the skin to a measurement region such as a blood vessel. They analyze the collected light/EM radiation and compare it with a stored reference calibration curve to calculate blood glucose. However, these mechanisms assume that background interference is common to the entire NIR range. They also use a different reference calibration curve for each subject, so universality is not guaranteed.

Another existing mechanism measures blood constituents by NIR spectroscopy. It uses a spectrum subtraction generator to produce a plurality of difference spectra from spectra measured by a spectrum analyzer at different times, and applies a multiple regression model based on partial least squares regression (PLS) or principal component regression (PCR). However, this mechanism cannot distinguish between different blood constituents such as glucose, fat, and lipids.

Yet another existing mechanism uses NIR spectroscopy together with Monte Carlo simulation to remove noise from the data of a single subject. It gives good results for same-day validation but performs poorly for validation on a different day.

An apparatus and method for determining blood glucose concentration non-invasively by performing calibration are disclosed.

According to one aspect, a method of determining blood glucose concentration may include: obtaining, by a preprocessor, a preprocessed NIR spectrum by processing a pure spectrum and an NIR spectrum that includes at least one of training data of a plurality of subjects, calibration data of a test subject, and validation data of the test subject; extracting, by a feature set extractor, a dominant feature set from the preprocessed NIR spectrum; obtaining, by a feature homogenizer, a plurality of homogenized feature sets of the validation data; and determining, by an ensemble learner, the blood glucose concentration of the test subject using the training data of the plurality of subjects and the homogenized feature sets of the validation data.

Obtaining the preprocessed NIR spectrum may include: filtering, by a frequency-domain filtering unit, the NIR spectrum to remove unwanted components in the frequency domain; obtaining, by a data whitening unit, an orthogonal pure spectrum by transforming the pure spectrum; removing, by an extended multiplicative scatter correction (EMSC) unit, components other than glucose and water from the NIR spectrum using the orthogonal pure spectrum; and removing, by a drift removal unit, drift present in the time domain of the NIR spectrum.

Filtering the NIR spectrum may include filtering noise present in the NIR spectrum using a Savitzky-Golay (SG) filter, and removing linear drift present in the wavelength domain of the NIR spectrum by differentiating the spectrum with respect to wavelength.
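The filtering step can be sketched with a small NumPy-only Savitzky-Golay routine, which fits a local polynomial in a sliding window and returns either the smoothed value or a derivative. The window length, polynomial order, derivative order, and the synthetic spectrum are assumptions; the text does not specify them. Note that a first derivative removes a constant offset and reduces a linear baseline to a constant, while a second derivative removes a linear baseline entirely, which is what the sketch demonstrates.

```python
import math
import numpy as np


def savgol(y, window=7, order=2, deriv=0, delta=1.0):
    """Savitzky-Golay filter: fit a local polynomial, evaluate its derivative.

    window: odd number of points per local fit; order: polynomial degree;
    deriv: derivative order (0 = smoothing only); delta: wavelength spacing.
    """
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**order.
    design = np.vander(offsets, order + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse maps a window of samples to the
    # deriv-th polynomial coefficient at the window centre.
    kernel = np.linalg.pinv(design)[deriv] * math.factorial(deriv) / delta**deriv
    padded = np.pad(np.asarray(y, dtype=float), half, mode="edge")
    # Reverse the kernel so that convolution performs a sliding dot product.
    return np.convolve(padded, kernel[::-1], mode="valid")


# Synthetic spectrum over 129 wavelengths: one absorption band plus a
# linear baseline standing in for wavelength-domain drift.
wavelengths = np.linspace(0.0, 1.0, 129)
band = np.exp(-((wavelengths - 0.5) ** 2) / 0.01)
drifted = band + 5.0 + 3.0 * wavelengths

step = wavelengths[1] - wavelengths[0]
smoothed = savgol(drifted, deriv=0)
second_derivative = savgol(drifted, deriv=2, delta=step)
```

Away from the edges, `second_derivative` is the same whether or not the linear baseline is present. In practice, SciPy's `scipy.signal.savgol_filter` provides an equivalent, better-tested routine.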

Extracting the dominant feature set from the preprocessed NIR spectrum may include: calculating the correlation between the glucose concentration of the training data and each of the preprocessed NIR spectra in the training data; and selecting, based on the maximum of the correlation calculated from the training data, the dominant feature set, which includes at least one preprocessed NIR spectrum corresponding to each of the training data, the calibration data, and the validation data.
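A minimal sketch of correlation-based feature selection follows, assuming each wavelength is one feature and that the strongest features are ranked by absolute Pearson correlation. The function name, the `top_k` parameter, and the synthetic data are hypothetical; the text only states that selection is based on the maximum of the correlation.

```python
import numpy as np


def select_dominant_features(spectra, glucose, top_k=10):
    """Return indices of the top_k wavelengths most correlated with glucose.

    spectra: (n_samples, n_wavelengths) preprocessed training spectra.
    glucose: (n_samples,) reference glucose concentrations.
    """
    x = spectra - spectra.mean(axis=0)
    g = glucose - glucose.mean()
    # Pearson correlation of every wavelength column with glucose.
    denom = np.sqrt((x ** 2).sum(axis=0) * (g ** 2).sum())
    corr = (x * g[:, None]).sum(axis=0) / denom
    # Rank by absolute correlation (an assumption) and keep the strongest.
    order = np.argsort(-np.abs(corr))
    return np.sort(order[:top_k]), corr


rng = np.random.default_rng(0)
glucose = rng.uniform(80.0, 200.0, size=60)
spectra = rng.normal(size=(60, 129))
spectra[:, 108] += 0.1 * glucose      # plant a glucose-dependent wavelength
dominant, corr = select_dominant_features(spectra, glucose, top_k=5)
```

The indices in `dominant` are computed from the training data only; the same indices would then be used to pick the corresponding columns from the calibration and validation spectra.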

Obtaining the plurality of homogenized feature sets of the validation data may include: obtaining a single homogenized feature set of the validation data, homogenized with respect to the dominant feature sets of the training data of a single subject and of the calibration data; and repeating this step to obtain the plurality of homogenized feature sets for the validation data.

Obtaining the single homogenized feature set of the validation data may include: obtaining a linear approximation for each feature in the dominant feature set with respect to the corresponding glucose concentrations of the training data and the calibration data; calculating an adjustment factor for each feature in the dominant feature set for the calibration data, with the training data kept as the reference; and mapping each feature in the dominant feature set of the validation data by subtracting it from the corresponding adjustment factor, thereby obtaining the single homogenized feature set of the validation data.

Calculating the adjustment factor for each feature in the dominant feature set may include: calculating a mean glucose value for the training data; calculating, using the linear approximation, the NIR feature values at the mean glucose value for each feature in the dominant feature sets of the training data and the calibration data; and subtracting the NIR feature value of the calibration data from the NIR feature value of the training data to obtain the adjustment factor for each feature in the dominant feature set.
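The adjustment-factor computation can be sketched for a single feature as below: fit one line to the training data and one to the calibration data, evaluate both at the mean training glucose value, and take the difference. The function names and the numbers are hypothetical. The text describes the final mapping only as a subtraction involving the adjustment factor, so the sign convention used in `homogenize` (adding the factor so that the test subject's feature is shifted onto the training reference) is an assumption.

```python
import numpy as np


def linear_fit(glucose, feature):
    """Least-squares line: feature ~ slope * glucose + intercept."""
    slope, intercept = np.polyfit(glucose, feature, deg=1)
    return slope, intercept


def adjustment_factor(train_glucose, train_feature, cal_glucose, cal_feature):
    """Training-line value minus calibration-line value at the mean training glucose."""
    mean_glucose = np.mean(train_glucose)
    s_t, b_t = linear_fit(train_glucose, train_feature)
    s_c, b_c = linear_fit(cal_glucose, cal_feature)
    return (s_t * mean_glucose + b_t) - (s_c * mean_glucose + b_c)


def homogenize(validation_feature, factor):
    # Assumed sign convention: shift the test subject's feature toward
    # the training reference by adding the adjustment factor.
    return np.asarray(validation_feature) + factor


# Hypothetical single feature: same slope for both subjects, different offset.
train_g = np.array([90.0, 110.0, 130.0, 150.0])
train_f = 0.002 * train_g + 0.50
cal_g = np.array([100.0, 120.0, 140.0])
cal_f = 0.002 * cal_g + 0.80          # test subject sits 0.30 higher
factor = adjustment_factor(train_g, train_f, cal_g, cal_f)
val_f = 0.002 * np.array([105.0, 125.0]) + 0.80
val_h = homogenize(val_f, factor)
```

Repeating this for every feature in the dominant feature set, and for each training subject in turn, would yield the plurality of homogenized feature sets described above.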

Determining the blood glucose concentration of the test subject may include: training an ensemble of regression models, each trained separately on the dominant feature set in the training data of a distinct subject; calculating a plurality of regression outputs from the ensemble, where the input to each regression model is the corresponding homogenized feature set of the validation data; estimating a plurality of weights of a meta-classifier, one for each regression model, using the training data of the plurality of subjects; and determining the blood glucose concentration of the test subject as a weighted average of the plurality of regression outputs based on the weights of the meta-classifier.

Estimating the plurality of weights of the meta-classifier may include: calculating a plurality of model outputs from the ensemble, where the input to each regression model is the set of training data of the plurality of subjects; calculating a plurality of model correlations by computing the correlation between the model outputs and the glucose concentrations of the training data of the plurality of subjects; and scaling the model correlations to obtain the weights of the meta-classifier corresponding to each regression model.
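The weighting scheme can be sketched as follows, assuming the per-model predictions are already available as arrays. The text says only that the model correlations are "scaled"; clipping negative correlations to zero and normalizing the weights to sum to one are assumptions, as are the function names and the synthetic stand-in models.

```python
import numpy as np


def meta_weights(model_outputs, glucose):
    """Scale each model's correlation with the reference glucose into a weight.

    model_outputs: (n_models, n_samples) predictions on the pooled training data.
    """
    corrs = np.array([np.corrcoef(out, glucose)[0, 1] for out in model_outputs])
    corrs = np.clip(corrs, 0.0, None)   # assumption: ignore anti-correlated models
    return corrs / corrs.sum()          # assumption: scale so the weights sum to 1


def ensemble_predict(regression_outputs, weights):
    """Weighted average of the per-model predictions."""
    return np.average(regression_outputs, axis=0, weights=weights)


rng = np.random.default_rng(1)
glucose = rng.uniform(80.0, 200.0, size=50)
# Three stand-in "regression models" with increasing prediction noise.
outputs = np.stack([glucose + rng.normal(0.0, s, size=50) for s in (5.0, 20.0, 60.0)])
weights = meta_weights(outputs, glucose)
prediction = ensemble_predict(outputs, weights)
```

Models whose outputs track the reference glucose more closely receive larger weights, so a regression model trained on a subject that resembles the test subject dominates the final estimate.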

According to one aspect, an apparatus for determining blood glucose concentration may include: a preprocessor that obtains a preprocessed NIR spectrum by processing a pure spectrum and an NIR spectrum that includes at least one of training data of a plurality of subjects, calibration data of a test subject, and validation data of the test subject; a feature set extractor that extracts a dominant feature set from the preprocessed NIR spectrum; a feature homogenizer that obtains a plurality of homogenized feature sets of the validation data; and an ensemble learner that determines the blood glucose concentration of the test subject using the training data of the plurality of subjects and the homogenized feature sets of the validation data.

The preprocessor may include: a frequency-domain filtering unit that filters the NIR spectrum to remove unwanted components in the frequency domain; a data whitening unit that obtains an orthogonal pure spectrum by applying a transform to the pure spectrum; an EMSC unit that removes components other than glucose and water present in the NIR spectrum using the orthogonal pure spectrum; and a drift removal unit that removes drift present in the time domain of the NIR spectrum.
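One possible reading of the data whitening and EMSC units is an orthogonal-subspace projection: orthogonalize the pure spectra of the unwanted constituents, then subtract the part of the measured spectrum that lies in their span. The sketch below uses QR decomposition for the orthogonalization; this choice, the function names, and the random stand-in spectra are all assumptions, and a full EMSC model would also fit multiplicative scatter and baseline terms that are omitted here.

```python
import numpy as np


def orthogonalize(pure_spectra):
    """Return an orthonormal basis (columns) spanning the given pure spectra.

    pure_spectra: (n_components, n_wavelengths), one pure spectrum per row.
    """
    q, _ = np.linalg.qr(pure_spectra.T)   # reduced QR: q is (n_wavelengths, n_components)
    return q


def remove_interferents(spectrum, interferent_basis):
    """Subtract the part of `spectrum` lying in the interferent subspace."""
    coeffs = interferent_basis.T @ spectrum
    return spectrum - interferent_basis @ coeffs


rng = np.random.default_rng(2)
n_wavelengths = 129
# Hypothetical pure spectra for two interferents (say, fat and protein).
interferents = rng.normal(size=(2, n_wavelengths))
glucose_like = rng.normal(size=n_wavelengths)
measured = 0.01 * glucose_like + 3.0 * interferents[0] - 2.0 * interferents[1]

basis = orthogonalize(interferents)
cleaned = remove_interferents(measured, basis)
```

After the projection, `cleaned` has no component along either interferent spectrum. The trade-off is that any part of the glucose signal overlapping the interferent subspace is removed as well.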

The frequency-domain filtering unit may filter noise present in the NIR spectrum using a Savitzky-Golay (SG) filter, and may remove linear drift present in the wavelength domain of the NIR spectrum by differentiating the spectrum with respect to wavelength.

The feature set extractor may calculate the correlation between the glucose concentration of the training data and each of the preprocessed NIR spectra in the training data, and may select, based on the maximum of the correlation calculated from the training data, the dominant feature set, which includes at least one preprocessed NIR spectrum corresponding to each of the training data, the calibration data, and the validation data.

The feature homogenizer may obtain a plurality of homogenized feature sets for the validation data by repeating a process of obtaining a single homogenized feature set of the validation data, homogenized with respect to the dominant feature sets of the training data of a single subject and of the calibration data.

The feature homogenizer may obtain the single homogenized feature set for the validation data by: obtaining a linear approximation for each feature in the dominant feature set with respect to the corresponding glucose concentrations of the training data and the calibration data; calculating an adjustment factor for each feature in the dominant feature set for the calibration data, with the training data kept as the reference; and mapping each feature in the dominant feature set of the validation data by subtracting it from the corresponding adjustment factor.

The feature homogenizer may calculate a mean glucose value for the training data; calculate, using the linear approximation, the NIR feature values at the mean glucose value for each feature in the dominant feature sets of the training data and the calibration data; and subtract the NIR feature value of the calibration data from the NIR feature value of the training data to obtain the adjustment factor for each feature in the dominant feature set.

The ensemble learner may train an ensemble of regression models, each trained separately on the dominant feature set in the training data of a distinct subject; calculate a plurality of regression outputs from the ensemble, where the input to each regression model is the corresponding homogenized feature set of the validation data; estimate a plurality of weights of a meta-classifier, one for each regression model, using the training data of the plurality of subjects; and determine the blood glucose concentration of the test subject as a weighted average of the plurality of regression outputs based on the weights of the meta-classifier.

The ensemble learner may calculate a plurality of model outputs from the ensemble, where the input to each regression model is the set of training data of the plurality of subjects; calculate a plurality of model correlations by computing the correlation between the model outputs and the glucose concentrations of the training data of the plurality of subjects; and scale the model correlations to obtain the weights of the meta-classifier corresponding to each regression model.

The foregoing describes various aspects of the invention in general terms; the detailed description below is provided for a better understanding. In this regard, it should be clearly understood that the embodiments herein are not limited to the methods of use or applications described and illustrated here. Any other advantages and objects of the embodiments that are apparent or become apparent from the detailed description or examples contained herein fall within the scope of the embodiments disclosed herein.

Blood glucose concentration can be determined non-invasively by performing calibration.

Embodiments herein are illustrated in the accompanying drawings, in which like reference numerals designate corresponding parts throughout the various figures. The embodiments will be better understood from the following description with reference to the drawings.
FIG. 1 is a block diagram illustrating various units of an electronic device for calibrating and predicting blood glucose concentration, according to embodiments.
FIG. 2A is a block diagram illustrating the flow of training and calibration data for calibrating and predicting blood glucose concentration, according to embodiments.
FIG. 2B is a block diagram illustrating the flow of validation data for predicting blood glucose concentration, according to embodiments.
FIGS. 3A and 3B are plots illustrating feature homogenization of the validation data with the training data kept as the reference, according to embodiments.
FIG. 4 is a plot showing linear approximations to the training data and the calibration data for the 109th feature (part of the dominant feature set), according to embodiments.
FIG. 5 is a plot showing the adjustment factor for the 109th feature of the calibration data, according to embodiments; the adjustment factor is calculated for the calibration data with the training data kept as the reference.
FIG. 6 is a plot showing the homogenized 109th feature of the validation data and the calibration data, according to embodiments; the 109th feature is homogenized with respect to the training data.
FIG. 7 is a block diagram illustrating an ensemble learner for determining the blood glucose concentration of a test subject, according to embodiments.

The embodiments herein, and their various features and advantageous details, are explained more fully with reference to the non-limiting embodiments illustrated in the accompanying drawings and detailed in the following description. Descriptions of well-known components and processing techniques are omitted so as not to obscure the embodiments unnecessarily. The description is intended only to facilitate an understanding of how the exemplary embodiments may be practiced and to enable those skilled in the art to practice them. Accordingly, this disclosure should not be construed as limiting the scope of the exemplary embodiments.

The embodiments relate to a method and apparatus for determining blood glucose concentration non-invasively by performing calibration. The method includes processing a near-infrared (NIR) spectrum and a pure spectrum to obtain a preprocessed NIR spectrum. The NIR spectrum may include at least one of training data of a plurality of subjects, calibration data of a test subject, and validation data of the test subject. The method also includes extracting a dominant feature set from the preprocessed NIR spectrum, and may include obtaining a plurality of homogenized feature sets for the validation data. It may further include determining the blood glucose concentration of the test subject using the training data of the plurality of subjects and the homogenized feature sets of the validation data.

Referring now to FIGS. 1 through 7, like reference numerals denote corresponding features throughout the drawings.

As used herein, a "subject" may be, without limitation, any living organism, or part of one, that contains blood in its body, such as a human, animal, bird, or aquatic organism.

FIG. 1 is a block diagram illustrating various components of an electronic device for calibrating and predicting blood glucose concentration, according to embodiments. In one embodiment, the electronic device 100 may be, without limitation, at least one of a mobile phone, smartphone, tablet, phablet, personal digital assistant (PDA), wearable computing device, Internet of Things (IoT) device, glucose monitoring device, glucometer, or other electronic device.

The embodiments may provide a method and an electronic device for determining the blood glucose concentration of a test subject non-invasively by performing calibration.

The electronic device 100 may include a preprocessor 110, a feature set extractor 120, a calibration unit 130, and a memory 140. The calibration unit 130 may in turn include a feature homogenizer 132 and an ensemble learner 134.

The preprocessor 110 may process the NIR spectrum. The NIR spectrum may include at least one of training data of a plurality of subjects, calibration data of the test subject, and validation data of the test subject. In one embodiment, the NIR spectrum and the pure spectrum may be processed to obtain the preprocessed NIR spectrum by removing noise components, time-variant drift components, frequency-variant drift components, and other components present in the NIR spectrum.

The feature set extractor 120 may extract a dominant feature set from the preprocessed NIR spectrum. The dominant feature set may include at least one preprocessed NIR spectrum corresponding to each of the training data, the calibration data, and the validation data.

The preprocessor 110 does not remove any wavelength or any time sample present in the NIR spectrum. In one example, the preprocessor 110 filters the time samples so that the preprocessed NIR spectrum contains 129 time samples, and 129 wavelengths may be present in the NIR spectrum. Since each wavelength can be regarded as a feature, the NIR spectrum then has 129 features.

The feature homogenizer 132 may obtain a plurality of homogenized feature sets for the validation data. Each feature set may be homogenized with respect to the dominant feature set in the training data of a distinct subject and the dominant feature set in the calibration data of the test subject. In one embodiment, obtaining the plurality of homogenized feature sets includes obtaining a single homogenized feature set of the validation data, homogenized with respect to the dominant feature sets of the training data of a single subject and of the calibration data, and repeating this process to obtain the plurality of homogenized feature sets for the validation data.

The ensemble learner 134 may determine the blood glucose concentration of the test subject by training an ensemble of regression models, each trained separately on the dominant feature set of the training data of a distinct subject. The method may also include calculating a plurality of regression outputs from the ensemble, where the corresponding homogenized feature set of the validation data is the input to each regression model. The method may further include estimating a plurality of weights of a meta-classifier, one for each regression model, using the training data of the plurality of subjects, and determining the blood glucose concentration of the test subject as a weighted average of the plurality of regression outputs based on the weights of the meta-classifier.

The memory 140 may store the training data, the pure spectrum, the calibration data, the validation data, and the determined glucose concentration of the test subject. The memory 140 may include one or more computer-readable storage media. The memory 140 may include non-volatile storage elements. Examples of such non-volatile storage elements include magnetic hard disks, optical disks, floppy disks, flash memory, and forms of electrically programmable memory (EPROM) or electrically erasable and programmable memory (EEPROM). In addition, the memory 140 may, in some examples, be considered a non-transitory storage medium. The term "non-transitory" may indicate that the storage medium is not embodied in a carrier wave or a propagated signal. However, the term "non-transitory" should not be interpreted to mean that the memory 140 is non-movable. In some examples, the memory 140 may be configured to store a greater amount of information than the memory. In certain examples, a non-transitory storage medium may store data that can change over time (e.g., in random access memory (RAM) or cache).

Although FIG. 1 illustrates an exemplary configuration of the electronic device 100, it is to be understood that other embodiments are not limited thereto. In other embodiments, the electronic device 100 may include fewer or more units. The labels and names of the units are used for illustrative purposes only and do not limit the scope of the invention. One or more units may be combined to perform the same or a substantially similar function in the electronic device 100.

FIG. 2A is a block diagram illustrating the flow of training and calibration data for calibrating and predicting blood glucose concentration according to embodiments. The embodiments provide an electronic device 100 that performs universal calibration to determine blood glucose concentration non-invasively. The electronic device 100 includes a preprocessor 110, a feature set extractor 120, and a calibration unit 130.

The preprocessor 110 may include a frequency domain filtering unit 112, a data whitening unit 114, an Extended Multiplicative Scatter Correction (EMSC) unit 116, and a drift removing unit 118. The calibration unit 130 may include a feature homogenizer 132 and an ensemble learner 134.

The preprocessor 110 may receive near-infrared (NIR) spectra from a plurality of subjects. The near-infrared spectra may include at least one of training data of the plurality of subjects, calibration data of the test subject, and validation data of the test subject.

The frequency domain filtering unit 112 may remove unwanted components in the frequency domain, such as noise and frequency-varying drift components. The frequency domain filtering unit 112 may filter noise present in the near-infrared spectrum using a Savitzky-Golay (SG) filter. Differentiating the near-infrared spectrum with respect to wavelength may remove the linear drift present in the wavelength domain of the near-infrared spectrum.
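The filtering step can be illustrated with a small sketch. It is not the patented implementation: the window length, polynomial order, synthetic peak, and synthetic baseline are all assumptions chosen for illustration, and the Savitzky-Golay weights are derived here from a local polynomial fit with NumPy. A first derivative turns a linear baseline into a constant offset, and a second derivative removes it entirely, which the example checks.

```python
import math
import numpy as np

def savgol_coeffs(window, polyorder, deriv=0, delta=1.0):
    """Savitzky-Golay convolution weights evaluated at the window centre."""
    half = window // 2
    k = np.arange(-half, half + 1)
    # Vandermonde matrix of the local polynomial fit: A[i, j] = k_i ** j
    A = np.vander(k, polyorder + 1, increasing=True).astype(float)
    # Row `deriv` of pinv(A) yields the fitted coefficient c_deriv; the
    # deriv-th derivative at the centre is deriv! * c_deriv / delta**deriv.
    return math.factorial(deriv) * np.linalg.pinv(A)[deriv] / delta ** deriv

def savgol(y, window=11, polyorder=3, deriv=0, delta=1.0):
    """Smooth (deriv=0) or differentiate a spectrum along the wavelength axis."""
    w = savgol_coeffs(window, polyorder, deriv, delta)
    # mode="valid" drops half a window at each edge instead of extrapolating.
    return np.convolve(y, w[::-1], mode="valid")

# Synthetic data (assumed): 129 wavelengths between 1000 and 2400 nm.
wavelengths = np.linspace(1000.0, 2400.0, 129)
step = wavelengths[1] - wavelengths[0]
peak = np.exp(-((wavelengths - 1600.0) / 80.0) ** 2)   # absorption band
drift = 0.3 + 2e-4 * wavelengths                        # linear baseline

d1_drift = savgol(drift, deriv=1, delta=step)           # constant slope
d2_with_drift = savgol(peak + drift, deriv=2, delta=step)
d2_clean = savgol(peak, deriv=2, delta=step)
```

Here `d2_with_drift` and `d2_clean` agree, so the linear baseline no longer contributes after the second derivative.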

The data whitening unit 114 may obtain an orthogonal pure spectrum by applying a transform to the pure spectrum. The data whitening unit 114 may normalize the pure spectrum by its mean and standard deviation, compute the eigenvectors of the normalized pure spectrum, and apply a whitening transform to obtain the orthogonal pure spectrum. Given the pure spectrum and the normalized pure spectrum, the eigenvector matrix E and the eigenvalue matrix D may be computed from the normalized pure spectrum. The whitening transform is then applied to obtain the orthogonal pure spectrum.

Data whitening also makes the covariance matrix of the whitened near-infrared spectral data equal to the identity matrix. This ensures that the whitened near-infrared spectra are orthogonal, so that their projections onto one another are zero.
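As a hedged illustration of the whitening step, the sketch below applies a standard PCA whitening transform, W = D^(-1/2) E^T, to synthetic pure spectra. The exact transform used in the patent is not recoverable from this text, so the formula, the array shapes, and the random data are assumptions; only the names E and D follow the description.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical pure spectra: 5 components (rows) x 129 wavelengths (columns),
# deliberately correlated through a shared sloping term.
pure = rng.normal(size=(5, 129)) + np.linspace(0.0, 1.0, 129)

# Mean and standard-deviation normalisation of each pure spectrum.
normed = (pure - pure.mean(axis=1, keepdims=True)) / pure.std(axis=1, keepdims=True)

# Covariance between the component spectra and its eigendecomposition.
cov = normed @ normed.T / normed.shape[1]
D, E = np.linalg.eigh(cov)            # D: eigenvalues, E: eigenvectors (columns)

# Assumed PCA whitening transform: W = D^(-1/2) E^T.
whitened = np.diag(D ** -0.5) @ E.T @ normed

# Covariance of the whitened spectra; it should be the identity matrix.
cov_whitened = whitened @ whitened.T / whitened.shape[1]
```

An identity covariance means every pair of whitened spectra has zero inner product, which is the orthogonality property stated above.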

Once noise and drift have been removed from the near-infrared spectrum and the orthogonalized pure spectrum, both may be further provided to the EMSC unit 116.

The EMSC unit 116 may use the orthogonal pure spectrum to remove components other than glucose and water present in the near-infrared spectrum. The EMSC unit 116 may apply the extended multiplicative scatter correction (EMSC) method to the near-infrared spectrum and regress out the components in the near-infrared spectrum using the orthogonal pure spectrum. For example, an arbitrary near-infrared spectrum may be expressed in terms of the pure spectra of the various blood components it contains, and this relation may be solved by simple linear regression at any wavelength. The regression yields the strength of each blood component and a DC component.

The glucose spectrum may then be obtained by subtracting the other components from the given spectrum.

The other components include, but are not limited to, water absorption, fat absorption, collagen absorption, keratin absorption, and acid absorption.
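The regression described above can be sketched as an ordinary least-squares fit. This is a simplified stand-in, not the full EMSC procedure: the pure spectra are random placeholders, the number of interfering components is arbitrary, and the measurement is noise-free so that the recovered strengths can be checked exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n_wl = 129
# Hypothetical pure spectra: glucose plus three interfering components (rows).
glucose = rng.normal(size=n_wl)
others = rng.normal(size=(3, n_wl))

# Synthetic measurement: weighted sum of the components plus a DC offset.
measured = 0.7 * glucose + np.array([1.5, -0.4, 2.0]) @ others + 0.25

# Design matrix: one column per pure spectrum plus a column of ones (DC term).
design = np.column_stack([glucose, others.T, np.ones(n_wl)])
coef, *_ = np.linalg.lstsq(design, measured, rcond=None)
strengths, dc = coef[:-1], coef[-1]

# Subtract the fitted non-glucose components and the DC term.
glucose_part = measured - strengths[1:] @ others - dc
```

After the subtraction, `glucose_part` contains only the glucose contribution, here 0.7 times the glucose pure spectrum.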

At this point, the near-infrared spectrum may include only a glucose component and a drift component in the time domain. The drift removing unit 118 may remove the drift present in the time domain of the near-infrared spectrum. Because the near-infrared spectrum may then include only the glucose component, it may be provided to the feature set extractor 120.

The feature set extractor 120 may extract a key feature set from the preprocessed near-infrared spectrum. The key feature set may include at least one preprocessed near-infrared spectrum corresponding to each of the training data, the calibration data, and the validation data. The output of the preprocessor 110 is the preprocessed near-infrared spectrum, which may include the near-infrared spectrum at a predefined number of wavelengths (129 in the example herein). The preprocessor 110 may retain 129 wavelengths for every sample captured from the near-infrared spectrum. Because each wavelength may be regarded as a feature, the near-infrared spectrum processed as described above may have as many features as wavelengths.

For example, the wavelength range of the near-infrared spectrum may be 1000 to 2400 nm, and one of the features may be the spectrum captured at 1200 nm. The key feature set extracted by the feature set extractor 120 consists of a few features selected from the total number of wavelengths (129 in this example). These features are selected based on the correlation calculated between the glucose concentration of the training data and each feature of the training data. The selected features (wavelengths) form the key feature set and are kept the same for the training data, the calibration data, and the validation data.
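A minimal sketch of correlation-based feature selection follows. The function name `select_key_features`, the use of the absolute Pearson correlation for ranking, and the synthetic data are assumptions; the text only states that features are chosen from the correlation between training glucose and each training feature.

```python
import numpy as np

def select_key_features(spectra, glucose, n_keep):
    """Return indices of the n_keep wavelengths most correlated with glucose.

    spectra: (n_samples, n_wavelengths) preprocessed training spectra.
    glucose: (n_samples,) reference glucose concentrations.
    """
    xc = spectra - spectra.mean(axis=0)
    gc = glucose - glucose.mean()
    # Pearson correlation of every wavelength column with glucose.
    corr = (xc * gc[:, None]).sum(axis=0) / (
        np.sqrt((xc ** 2).sum(axis=0)) * np.sqrt((gc ** 2).sum())
    )
    # Rank by absolute correlation, strongest first (assumed criterion).
    return np.argsort(-np.abs(corr))[:n_keep], corr

# Synthetic training data (assumed): 60 samples x 129 wavelengths, with two
# columns made to depend on glucose.
rng = np.random.default_rng(2)
glucose = rng.uniform(80.0, 180.0, size=60)
spectra = rng.normal(size=(60, 129))
spectra[:, 108] += 0.1 * glucose      # index 108 is the 109th feature
spectra[:, 10] -= 0.1 * glucose

key_idx, corr = select_key_features(spectra, glucose, n_keep=2)
```

The same `key_idx` would then be applied unchanged to the calibration and validation spectra, so that all three data sets share one key feature set.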

In addition, the feature set extractor 120 may divide the near-infrared data into two sets: one containing the training data and the calibration data, and the other containing the validation data. The training data and the calibration data may be provided to the feature homogenizer 132.

The feature homogenizer 132 may calculate a correlation between the glucose concentration of the training data and each of the preprocessed near-infrared spectra in the training data. The feature homogenizer 132 may repeat the calculation of the correlation to obtain the plurality of homogenized feature sets for the validation data.

To obtain a single homogenized feature set of the validation data, the feature homogenizer 132 may obtain a linear approximation relationship for each feature of the key feature set with respect to the corresponding glucose concentrations of the training data and the calibration data. The feature homogenizer 132 may then calculate an adjustment factor for each feature of the key feature set for the calibration data, keeping the training data as a reference. Finally, the feature homogenizer 132 may map each feature of the key feature set of the validation data by subtraction using the corresponding adjustment factor, thereby obtaining the single homogenized feature set of the validation data.

To train the ensemble of regression models, the ensemble learner 134 may divide the key feature set of the training data into different subsets on the basis of each individual subject's training data. That is, the division is performed per training subject.

For example, the overall data may include the near-infrared spectra of Subject 1, Subject 2, Subject 3, and Subject 4 and the corresponding glucose concentrations. Subject 1, Subject 2, and Subject 3 may provide the training data, and Subject 4 may be the test subject. Thus, the near-infrared spectra of Subject 1, Subject 2, and Subject 3 and the corresponding glucose concentrations become the training data. The calibration data consist of the first few time samples of the near-infrared spectrum of the test subject (Subject 4 in this example; about 4 to 5 samples may be sufficient) and the corresponding glucose concentrations. The validation data consist of the remaining time samples of the near-infrared spectrum of the test subject (Subject 4 in this example) and the corresponding glucose concentrations.
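The four-subject example can be sketched as a simple data split. The dictionary keys, the helper name `split_test_subject`, the number of samples per subject, and the random data are hypothetical; only the roles of the subjects and the use of the first few samples for calibration come from the description.

```python
import numpy as np

def split_test_subject(spectra, glucose, n_calibration=5):
    """Split a test subject's time-ordered samples into calibration and validation."""
    calibration = (spectra[:n_calibration], glucose[:n_calibration])
    validation = (spectra[n_calibration:], glucose[n_calibration:])
    return calibration, validation

rng = np.random.default_rng(3)
n_wl = 129
names = ("subject1", "subject2", "subject3", "subject4")
# Placeholder data: 40 time samples per subject, each with 129 wavelengths.
data = {
    name: (rng.normal(size=(40, n_wl)), rng.uniform(80.0, 180.0, size=40))
    for name in names
}

# Subjects 1 to 3 supply training data, kept separate per subject.
training = {name: data[name] for name in names[:3]}
# Subject 4 is the test subject: first 5 samples calibrate, the rest validate.
calibration, validation = split_test_subject(*data["subject4"], n_calibration=5)
```

Keeping `training` keyed by subject mirrors the per-subject division used later to train one regression model per training subject.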

The separation of the training data needed to train the ensemble of regression models may be based on the distinct data of each subject contributing to the training data (Subject 1, Subject 2, and Subject 3 in this example), as shown in FIG. 2A. The homogenization of the validation data in the feature homogenizer 132 is performed multiple times (three times in this example), according to the division of the training data. This leads to a plurality of homogenized feature sets, each keeping one of the separated training data sets as its reference. In the example herein, the data of Subject 4 are homogenized separately with respect to the data of Subject 1, Subject 2, and Subject 3. The plurality of homogenized feature sets of the validation data may be used as inputs to the corresponding trained regression models to calculate the plurality of regression outputs, as shown in FIG. 2B. As shown in FIGS. 2A and 2B, the path up to the feature set extractor is the same for the training data, the calibration data, and the validation data.

FIGS. 3A and 3B are plots illustrating feature homogenization of the validation data with the training data kept as a reference, according to embodiments. In one embodiment, the feature homogenizer 132 may obtain a plurality of homogenized feature sets of the validation data. Each of the plurality of homogenized feature sets is homogenized with respect to the key feature set of a distinct subject's training data and the key feature set of the test subject's calibration data. For example, for any feature of the key feature set of the training data at wavelength λ and time t, together with the corresponding glucose concentration, the corresponding feature of the validation data may be shifted relative to the training data, as shown in FIG. 3. FIG. 3A shows the plotted feature of the validation data, which is then mapped with respect to the training data as shown in FIG. 3B. The mapped data shown in FIG. 3B provide better glucose prediction.

The notation used throughout the specification covers the following quantities: a feature of the key feature set for the reference vector (the training data); the wavelength index λ and the time index t; the glucose concentration at time t; the first few values of the corresponding feature of the test subject and the corresponding glucose concentrations at the corresponding times; and the remaining values of the corresponding feature of the test subject, excluding those first few values. A set of 20 values, each interpolated from five glucose measurements taken at 5-minute intervals, is used.

FIG. 4 is a plot showing the linear approximation relationship for the 109th feature with respect to the corresponding glucose concentrations of the training data and the calibration data, according to embodiments. The embodiments provide a method of obtaining a linear approximation of the features with respect to the corresponding glucose concentrations of the training data and the calibration data. Each preprocessed near-infrared feature in the key feature set may be approximated as a linear function of the glucose concentration, characterized by a slope and an intercept for the λ-th feature.

In one embodiment, the features of the key feature set for the training data and for the calibration data may each be approximated linearly with respect to their corresponding glucose concentrations. The slope should be nearly identical for the two linear approximations.
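The linear approximation can be sketched with a least-squares line fit. The slope and intercept values, the glucose ranges, and the helper name `fit_line` are illustrative assumptions; the data are noise-free so that the fitted parameters can be checked directly.

```python
import numpy as np

def fit_line(glucose, feature):
    """Least-squares fit of feature ~ slope * glucose + intercept."""
    # np.polyfit returns coefficients from the highest degree down.
    slope, intercept = np.polyfit(glucose, feature, deg=1)
    return slope, intercept

# Synthetic single feature (assumed values): same slope for the training
# subject and the test subject, but a different intercept.
train_g = np.linspace(80.0, 200.0, 50)
train_x = 0.002 * train_g + 0.10
calib_g = np.array([100.0, 105.0, 112.0, 120.0, 126.0])
calib_x = 0.002 * calib_g + 0.16

slope_tr, icpt_tr = fit_line(train_g, train_x)
slope_ca, icpt_ca = fit_line(calib_g, calib_x)
```

With nearly equal slopes, the two lines differ mainly by a vertical offset, which is the quantity that a per-feature adjustment factor can absorb.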

FIG. 5 is a plot showing the adjustment factor for the 109th feature of the key feature set of the calibration data, with the training data kept as a reference, according to embodiments. The embodiments calculate an adjustment factor for each feature of the calibration data relative to the training data. The adjustment factor is the distance between the near-infrared feature values of the training data and of the calibration data, for each feature of the key feature set, evaluated with the linear approximation relationships at the mean glucose concentration of the training data.

The adjustment factor may be calculated at the reference mean glucose for every relevant feature, and the resulting values may be used to map the validation data. FIG. 5 indicates the reference mean glucose and the adjustment factor of the calibration data for the 109th feature.

FIG. 6 is a plot showing the homogenized 109th feature of the validation data and the preprocessed 109th feature of the training data, according to embodiments. The embodiments herein homogenize each feature of the validation data by mapping it based on the calculated adjustment factor. The adjustment factor may be subtracted from the validation-data feature at each time.

The resulting features may be sent to the ensemble of regression models to train the ensemble, and may be used to determine the blood glucose concentration of the test subject.
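The homogenization of one feature can be sketched end to end as follows. The function name `homogenize` and all numeric values are assumptions. The sign convention is also an assumption: the adjustment factor is taken as the calibration line minus the training line at the mean training glucose, so that subtracting it moves the validation feature onto the training reference.

```python
import numpy as np

def homogenize(train_g, train_x, calib_g, calib_x, valid_x):
    """Shift one validation feature onto the training reference."""
    # Linear approximations of the feature versus glucose.
    a_tr, b_tr = np.polyfit(train_g, train_x, deg=1)
    a_ca, b_ca = np.polyfit(calib_g, calib_x, deg=1)
    g_ref = train_g.mean()                 # reference mean glucose
    # Assumed sign: calibration value minus training value at g_ref.
    delta = (a_ca * g_ref + b_ca) - (a_tr * g_ref + b_tr)
    # Subtract the same adjustment factor from every validation sample.
    return valid_x - delta, delta

# Synthetic single feature (assumed): the test subject is offset by 0.06.
train_g = np.linspace(80.0, 200.0, 50)
train_x = 0.002 * train_g + 0.10
calib_g = np.array([100.0, 105.0, 112.0, 120.0, 126.0])
calib_x = 0.002 * calib_g + 0.16
valid_g = np.linspace(90.0, 180.0, 30)
valid_x = 0.002 * valid_g + 0.16

valid_hom, delta = homogenize(train_g, train_x, calib_g, calib_x, valid_x)
```

Repeating this for every key feature, and once per training subject, gives the plurality of homogenized feature sets fed to the regression models.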

FIG. 7 is a block diagram of the ensemble learner 134, which generates an ensemble of regression models, calculates an ensemble of regression outputs using the ensemble of regression models, and determines the blood glucose concentration of the test subject as a weighted average of the plurality of regression outputs based on the weights of a meta-classifier.

For example, referring to FIG. 7, given validation data S4 and training data S1, S2, and S3, the validation data S4 are homogenized separately with respect to each of the training data S1, S2, and S3, and each of the resulting homogenized feature sets S4(Hom) of the validation data S4 is input to the corresponding trained regression model. The regression models produce the predictions Pred. 1, Pred. 2, and Pred. 3 for the validation data S4, and the plurality of weights of the meta-classifier, one for each regression model, may be estimated using the training data S1, S2, and S3. The blood glucose concentration (the S4 glucose result) may then be determined by combining the plurality of regression outputs based on the weights of the meta-classifier, for example by taking their weighted average.
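The final combination step can be sketched as a weighted average. The function name `ensemble_predict`, the example predictions, and the example correlations are hypothetical. Scaling the model correlations so that the weights sum to one is an assumption; the text says only that the correlations are scaled to obtain the weights.

```python
import numpy as np

def ensemble_predict(predictions, model_correlations):
    """Combine per-model predictions with correlation-derived weights.

    predictions: (n_models, n_samples) regression outputs for the test subject.
    model_correlations: (n_models,) correlation of each model's output with
        the reference glucose of the training subjects.
    """
    weights = np.asarray(model_correlations, dtype=float)
    weights = weights / weights.sum()      # assumed scaling: sum to one
    return weights @ np.asarray(predictions, dtype=float), weights

# Hypothetical outputs Pred. 1 to Pred. 3 for two validation samples.
predictions = [[100.0, 110.0], [104.0, 114.0], [98.0, 108.0]]
model_correlations = [0.9, 0.6, 0.5]

glucose_estimate, weights = ensemble_predict(predictions, model_correlations)
```

Models whose outputs track the training glucose more closely receive larger weights and therefore contribute more to the final estimate.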

The embodiments may be implemented through at least one software program that runs on at least one hardware device and performs network management functions to control the components. The components shown in FIG. 1 may be at least one of a hardware device or a combination of a hardware device and a software module.

Meanwhile, the embodiments may be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all kinds of recording devices that store data readable by a computer system. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementations in the form of carrier waves (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed fashion. Functional programs, code, and code segments for implementing the embodiments can be readily inferred by programmers skilled in the art to which the present invention pertains.

Those of ordinary skill in the art to which the present disclosure pertains will understand that the disclosure may be embodied in other specific forms without changing its technical spirit or essential features. Therefore, the embodiments described above should be understood to be illustrative in all respects and not restrictive.

100: electronic device 110: preprocessor
112: frequency domain filtering unit 114: data whitening unit
116: EMSC unit 118: drift removing unit
120: feature set extractor 130: calibration unit
132: feature homogenizer 134: ensemble learner
140: memory

Claims

Processing, by a preprocessor, a near-infrared spectrum, which includes at least one of training data of a plurality of subjects, calibration data of a test subject, and validation data of the test subject, and a pure spectrum to obtain a preprocessed near-infrared spectrum;
Extracting, by a feature set extractor, a key feature set from the preprocessed near-infrared spectrum;
Obtaining, by a feature homogenizer, a plurality of homogenized feature sets of the validation data; and
Determining, by an ensemble learner, the blood glucose concentration of the test subject using the training data of the plurality of subjects and the homogenized feature sets of the validation data.

The method of claim 1,
wherein obtaining the preprocessed near-infrared spectrum comprises:
Filtering, by a frequency domain filtering unit, the near-infrared spectrum to remove unwanted components in the frequency domain;
Obtaining, by a data whitening unit, an orthogonal pure spectrum by applying a transform to the pure spectrum;
Removing, by an Extended Multiplicative Scatter Correction (EMSC) unit, components other than glucose and water from the near-infrared spectrum using the orthogonal pure spectrum; and
Removing, by a drift removing unit, drift present in the time domain of the near-infrared spectrum.

The method of claim 2,
wherein filtering the near-infrared spectrum comprises:
Filtering noise present in the near-infrared spectrum using a Savitzky-Golay (SG) filter; and
Removing the linear drift present in the wavelength domain of the near-infrared spectrum by differentiating the near-infrared spectrum with respect to wavelength.

The method of claim 1,
wherein extracting the key feature set from the preprocessed near-infrared spectrum comprises:
Calculating a correlation between the glucose concentration of the training data and each of the preprocessed near-infrared spectra in the training data; and
Selecting, based on the maximum value of the correlation calculated from the training data, a key feature set comprising at least one preprocessed near-infrared spectrum corresponding to each of the training data, the calibration data, and the validation data.

The method of claim 1,
wherein obtaining the plurality of homogenized feature sets of the validation data comprises:
Obtaining a single homogenized feature set of the validation data, homogenized with respect to the key feature sets of the training data of a single subject and of the calibration data; and
Repeating the obtaining to obtain the plurality of homogenized feature sets for the validation data.

The method of claim 5,
wherein obtaining the single homogenized feature set of the validation data comprises:
Obtaining a linear approximation relationship for each feature in the key feature set with respect to the corresponding glucose concentrations in the training data and the calibration data;
Calculating an adjustment factor for each feature in the key feature set for the calibration data, keeping the training data as a reference; and
Mapping each feature in the key feature set of the validation data by subtraction using the corresponding adjustment factor to obtain the single homogenized feature set of the validation data.

The method of claim 6,
wherein calculating the adjustment factor for each feature in the key feature set comprises:
Calculating an average glucose value for the training data;
Calculating, using the linear approximation relationship, near-infrared feature values for each feature in the key feature set of the training data and of the calibration data at the average glucose value; and
Subtracting the near-infrared feature value of the calibration data from the near-infrared feature value of the training data to obtain the adjustment factor for each feature in the key feature set.

The method of claim 1,
wherein determining the blood glucose concentration of the test subject comprises:
Training an ensemble of regression models, each trained separately with the key feature set in the training data of a distinct subject;
Calculating a plurality of regression outputs from the ensemble of regression models, each regression model taking the corresponding homogenized feature set of the validation data as input;
Estimating a plurality of weights of a meta-classifier, one for each of the regression models, using the training data of the plurality of subjects; and
Determining the blood glucose concentration of the test subject by taking a weighted average of the plurality of regression outputs based on the weights of the meta-classifier.

The method of claim 8,
wherein estimating the plurality of weights of the meta-classifier for each of the regression models comprises:
Calculating a plurality of model outputs from the ensemble of regression models, each regression model taking the training data of the plurality of subjects as input;
Calculating a plurality of model correlations, each being a correlation between one of the model outputs and the glucose concentration of the training data of the plurality of subjects; and
Scaling the plurality of model correlations to obtain the plurality of weights of the meta-classifier corresponding to each of the regression models.

A preprocessor configured to process a near-infrared spectrum, which includes at least one of training data of a plurality of subjects, calibration data of a test subject, and validation data of the test subject, and a pure spectrum to obtain a preprocessed near-infrared spectrum;
A feature set extractor configured to extract a key feature set from the preprocessed near-infrared spectrum;
A feature homogenizer configured to obtain a plurality of homogenized feature sets of the validation data; and
An ensemble learner configured to determine the blood glucose concentration of the test subject using the training data of the plurality of subjects and the homogenized feature sets of the validation data.

The electronic device of claim 10,
wherein the preprocessor comprises:
A frequency domain filtering unit configured to filter the near-infrared spectrum to remove unwanted components in the frequency domain;
A data whitening unit configured to obtain an orthogonal pure spectrum by applying a transform to the pure spectrum;
An EMSC unit configured to remove components other than glucose and water present in the near-infrared spectrum using the orthogonal pure spectrum; and
A drift removing unit configured to remove drift present in the time domain of the near-infrared spectrum.

The electronic device of claim 11,
wherein the frequency domain filtering unit is configured to:
Filter noise present in the near-infrared spectrum using a Savitzky-Golay (SG) filter; and
Differentiate the near-infrared spectrum with respect to wavelength to remove the linear drift present in the wavelength domain of the near-infrared spectrum.

The electronic device of claim 10,
wherein the feature set extractor is configured to:
Calculate a correlation between the glucose concentration of the training data and each of the preprocessed near-infrared spectra in the training data; and
Select, based on the maximum value of the correlation calculated from the training data, the key feature set comprising at least one preprocessed near-infrared spectrum corresponding to each of the training data, the calibration data, and the validation data.

The electronic device of claim 10,
wherein the feature homogenization unit is configured to
obtain the plurality of homogenized feature sets of the verification data by repeating a process of obtaining a single homogenized feature set of the verification data that is homogenized with respect to the training data of a single subject and the key feature set of the calibration data.

The electronic device of claim 14,
wherein the feature homogenization unit is configured to:
obtain a linear approximation relationship for each feature in the key feature set with respect to the corresponding glucose concentrations in the training data and the calibration data;
calculate an adjustment factor for each feature in the key feature set of the calibration data, taking the training data as a reference; and
obtain the single homogenized feature set of the verification data by mapping each feature in the key feature set of the verification data by subtraction with the corresponding adjustment factor.

The electronic device of claim 15,
wherein the feature homogenization unit is configured to:
calculate a mean glucose value of the training data;
calculate, by using the linear approximation relationship, a near-infrared feature value at the mean glucose value for each feature in the key feature set of the training data and of the calibration data; and
subtract the near-infrared feature value of the calibration data from the near-infrared feature value of the training data to obtain the adjustment factor for each feature in the key feature set.
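
The homogenization steps can be sketched as below for one training subject. The linear approximation is assumed to be an ordinary least-squares line of feature value against glucose, and the adjustment factor is the training value minus the calibration value at the mean training glucose, as the text states. The translated wording of the mapping step is ambiguous about sign, so this sketch assumes the factor is applied in the direction that cancels a constant offset between the subject and the reference, which with this definition means adding it. Function names and numbers are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept of y against x (x must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def adjustment_factors(train_feats, train_glucose, calib_feats, calib_glucose):
    """Per-feature factor: training value minus calibration value at mean glucose."""
    mean_glucose = sum(train_glucose) / len(train_glucose)
    factors = []
    for j in range(len(train_feats[0])):
        s_t, b_t = fit_line(train_glucose, [row[j] for row in train_feats])
        s_c, b_c = fit_line(calib_glucose, [row[j] for row in calib_feats])
        train_value = s_t * mean_glucose + b_t
        calib_value = s_c * mean_glucose + b_c
        factors.append(train_value - calib_value)
    return factors

def homogenize(verif_feats, factors):
    """Shift verification features toward the training reference (assumed sign)."""
    return [[v + a for v, a in zip(row, factors)] for row in verif_feats]

# Hypothetical single-feature data: the new subject reads 0.5 higher than the reference.
train_glucose = [80.0, 100.0, 120.0]
train_feats = [[1.8], [2.0], [2.2]]
calib_glucose = [90.0, 110.0]
calib_feats = [[2.4], [2.6]]
factors = adjustment_factors(train_feats, train_glucose, calib_feats, calib_glucose)
homogenized = homogenize([[2.7]], factors)
```

Repeating this with each training subject as the reference would yield one homogenized feature set per subject, matching the plurality of homogenized feature sets described for the verification data.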

The electronic device of claim 10,
wherein the ensemble learning unit is configured to:
train an ensemble of regression models, each regression model being trained separately with the key feature set of the training data of a separate subject;
calculate a plurality of regression outputs from the ensemble of regression models, each regression model taking the corresponding homogenized feature set of the verification data as input;
estimate a plurality of weights of a meta-classifier, one for each of the regression models, by using the training data of the plurality of subjects; and
determine the blood glucose concentration of the subject by weighted averaging of the plurality of regression outputs based on the weights of the meta-classifier.
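
A compact sketch of the per-subject ensemble and the weighted average. The claim does not specify the regression model, so a single-feature least-squares line per training subject is assumed here purely for illustration, and the weights are passed in as given numbers. All names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept of y against x (x must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def train_models(subject_data):
    """One model per training subject: glucose = slope * feature + intercept."""
    return [fit_line(features, glucose) for features, glucose in subject_data]

def ensemble_predict(models, homogenized_inputs, weights):
    """Weighted average of the per-model outputs.

    homogenized_inputs[i] is the verification feature homogenized with respect
    to the training subject behind models[i].
    """
    outputs = [s * x + b for (s, b), x in zip(models, homogenized_inputs)]
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)

# Two hypothetical training subjects, each with (feature values, glucose values).
subject_data = [
    ([1.0, 2.0, 3.0], [80.0, 100.0, 120.0]),
    ([2.0, 4.0, 6.0], [90.0, 110.0, 130.0]),
]
models = train_models(subject_data)
estimate = ensemble_predict(models, [2.5, 5.0], [0.75, 0.25])
```

Dividing by the sum of the weights keeps the result a true weighted average even when the supplied weights are not normalized.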

The electronic device of claim 17,
wherein the ensemble learning unit is configured to:
calculate a plurality of model outputs from the ensemble of regression models, each regression model taking a set of the training data of the plurality of subjects as input;
calculate a plurality of model correlations, each being a correlation between one of the plurality of model outputs and the glucose concentration of the training data of the plurality of subjects; and
obtain the plurality of weights of the meta-classifier corresponding to each of the regression models by scaling the plurality of model correlations.
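
The weight estimation can be sketched as follows. The text says only that the model correlations are scaled, so the specific scaling used here (Pearson correlation, negative values clipped to zero, then normalized to sum to one) is an assumption, as are the function names and sample values.

```python
def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def meta_weights(model_outputs, glucose):
    """One weight per model from its output-to-reference correlation.

    Assumed scaling: clip negative correlations to zero, then normalize so the
    weights sum to one. Falls back to equal weights if every correlation is zero.
    """
    corrs = [max(pearson(out, glucose), 0.0) for out in model_outputs]
    total = sum(corrs)
    if total == 0:
        return [1.0 / len(corrs)] * len(corrs)
    return [c / total for c in corrs]

# Hypothetical outputs of three models on pooled training data.
glucose = [80.0, 100.0, 120.0, 140.0]
model_outputs = [
    [80.0, 100.0, 120.0, 140.0],   # tracks the reference exactly
    [90.0, 130.0, 110.0, 150.0],   # partially correlated
    [140.0, 120.0, 100.0, 80.0],   # anti-correlated, receives zero weight
]
weights = meta_weights(model_outputs, glucose)
```

Under this scaling, a model that generalizes well across subjects contributes more to the final estimate, while one that tracks glucose poorly is down-weighted or ignored.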