KR102366096B1

KR102366096B1 - Diabetes Prediction methods using clinical and genetic Risk score in Korean and method for suggesting diagnosis related to Diabetes

Info

Publication number: KR102366096B1
Application number: KR1020210057027A
Authority: KR
Inventors: 서종원; 이은솔
Original assignee: 주식회사 바이오에이지
Priority date: 2021-05-03
Filing date: 2021-05-03
Publication date: 2022-02-23

Abstract

The present invention relates to a method for predicting the probability of developing diabetes by integrating clinical information and genetic variation information of Koreans and, more specifically, to a method for predicting the probability that a Korean individual will develop diabetes at a specific time point, using a disease prediction model built by integrating SNP marker variation information and clinical information, and for recommending related tests. Because the method predicts the probability of developing diabetes by jointly analyzing Korean-specific genetic information and individual clinical information, its accuracy is superior to that of existing methods, making it useful for early diagnosis and predictive prevention of diabetes.

Description

Diabetes Prediction methods using clinical and genetic Risk score in Korean and method for suggesting diagnosis related to Diabetes

The present invention relates to a method for predicting the probability of developing diabetes by integrating clinical information and genetic variation information of Koreans and, more specifically, to a method for predicting the probability that a Korean individual will develop diabetes at a specific time point, using a disease prediction model built by integrating SNP marker variation information and clinical information, and for recommending related tests.

The etiology of common diseases and conditions generally involves both genetic and environmental factors. Recent advances in genotyping technology have considerably improved our understanding of the genetic contribution to these diseases. Numerous genome-wide association studies, aimed at discovering new associations between common genetic variants and common diseases across the genome, have recently been completed. Based on genetic makeup, these studies have shed light on an individual's risk of developing a disease during their lifetime and on the mechanisms of disease. Incorporating inherited genetic risk information into the clinical decision-making process early in life has an important effect on alleviating or further preventing disease symptoms or conditions.

The prevalence of common chronic non-communicable diseases largely overshadows the combined prevalence of monogenic and communicable diseases. Common SNP variants account for a significant part, if not all, of the germline genetic risk for common diseases and, when used in this way, allow better personalized and focused exposure reduction, early detection, and early intervention paradigms for the individual.

Genetic variations in the genome, such as single nucleotide polymorphisms (SNPs), mutations, deletions, insertions, repeats, and microsatellites, are correlated with various phenotypes, such as diseases or conditions. By identifying an individual's genetic variations and correlating them with phenotypes, the individual's predisposition to or risk of different phenotypes can be determined and a personalized phenotype profile can be generated.

Common SNP variants of low effect size, rare private variants, DNA copy number variants, and epigenetic modifications together account for most of the inherited risk. Accurately estimating an individual's risk of developing a condition is not an easy task. This risk is determined by numerous factors, including genetic risk factor load, environmental factors, sex, and age. For most conditions, therefore, even the most accurate risk assessment can provide only a probabilistic risk estimate. The factors may include different associated variants, their effect sizes, their frequencies in the population, environmental factors affecting the individual, such as diet, age, family history, and ethnic background, and the interactions among them.

Therefore, there is a need for a method of generating a personalized phenotype profile using risk estimates that take the effects of genetic variation into account but do not require the results of large-scale studies evaluating multiple risk factors simultaneously. There is also a need to generate risk estimates that are disease-specific, that can be combined with environmental data, and that provide an additional tool for clinical decision-making, for example by having predictive power as a clinical classifier.

Meanwhile, type 2 diabetes mellitus is known to account for 90 to 95% of all diabetes patients. Type 2 diabetes occurs in people whose bodies produce insulin but in an abnormal amount, or who have low sensitivity to insulin, and it is characterized by large fluctuations in blood glucose levels. Because the insulin abnormality prevents glucose in the blood from being transported into cells, obtaining energy from food becomes difficult. Genetic factors are known to contribute to the onset of type 2 diabetes; other risk factors include age of 45 or older, a family history of diabetes, overweight, high blood pressure, and cholesterol levels.

At present, diabetes is diagnosed mainly by measuring the phenotypic change caused by the disease, that is, the blood glucose level, through tests such as the fasting blood glucose (FBG) test and the oral glucose tolerance test (OGTT) (National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health, http://www.niddk.nih.gov, 2003). Once type 2 diabetes is diagnosed, it can be treated, or its progression slowed, through changes in exercise and dietary habits, weight control, and various drug treatments, so the need for early diagnosis is very high. Millennium Pharmaceuticals announced that type 2 diabetes can be diagnosed and its prognosis predicted by detecting genotype variations in the HNF1 gene (PR Newswire, Sept 1, 1998), and Sequenom announced that the FOXA2 (HNF3β) gene is highly associated with the onset of type 2 diabetes (PR Newswire, Oct 28, 2003).

Although several genes have thus been reported to be associated with the onset of type 2 diabetes, research has concentrated on a small number of specific genes on a few chromosomes, and the experiments were conducted on specific populations. The results may therefore differ depending on the ethnic group studied, not all causative genes of type 2 diabetes have been identified, and at the current level of diagnostic practice type 2 diabetes is rarely diagnosed using such molecular biological methods. Accordingly, there is a need to find new SNPs and related genes highly associated with type 2 diabetes across the entire human genome and to use them for early diagnosis.

Type 2 diabetes mellitus (T2D) is a chronic multi-factorial disease based on polygenic predisposition and environmental risk factors. The first wave of genome-wide association studies (GWAS) led to the discovery of T2D susceptibility loci. Most T2D loci identified so far have been found in populations of European ancestry, along with some East Asian groups. Recently, transethnic fine-mapping has identified additional population-specific signals that increase trait variation. However, identification of the genetic components of T2D remains limited by genetic heterogeneity across ethnically diverse ancestries.

As an integrated analysis approach to predicting type 2 diabetes risk, genetic risk scores (GRS) can be an effective and efficient way of measuring risk at the genome level from GWAS results. Recent genetic risk assessment studies for type 2 diabetes have evaluated the predictive value of cumulative genetic scores. However, the predictive value of GWAS-derived genetic scores is affected by genotype frequency, phenotypic effect size, and ethnicity-specific determinants of disease occurrence. Further development and refinement of multi-SNP genetic risk scores could lead to a better understanding of complex disease prediction and prevention across independent ethnic groups.

Accordingly, to assess the prognosis of type 2 diabetes in Koreans, the present inventors selected single nucleotide polymorphisms (SNPs) that showed an association in an association analysis using previously reported diabetes genetic variants that can be analyzed in Koreans and the Anseong/Ansan cohort. They confirmed that converting these SNPs into a genetic risk score and analyzing it together with clinical information gives statistically significant results for predicting the onset of type 2 diabetes, and thereby completed the present invention.

An object of the present invention is to provide a method for predicting the probability of developing diabetes that integrates clinical information and genetic variation information.

Another object of the present invention is to provide a method of providing information for additional examination of diabetic patients.

Still another object of the present invention is to provide a method of providing information for the management of diabetic patients.

Still another object of the present invention is to provide an insurance design and insurance product matching system using the above method.

To achieve the above objects, the present invention provides a method for predicting the probability of developing diabetes, comprising: a) extracting DNA from a sample; b) determining the genotypes of the rs10811661, rs947474, rs1048886, and rs9268645 SNPs and assigning a genotype score (GS); c) deriving an SNP value (SNP point) by applying a weight to each SNP; and d) calculating the probability of developing diabetes based on the SNP point, wherein the GS is a value of 0, 1, or 2 expressing the influence of each SNP genotype on the onset of diabetes, and the SNP point is the sum, over all SNPs, of the GS multiplied by its weight.
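The weighted sum in steps b) and c) can be sketched as follows. This is a minimal illustration only: the weight values and the example genotype scores are hypothetical placeholders, since this section does not disclose the actual weights, and the names `EXAMPLE_WEIGHTS` and `snp_point` are introduced here for illustration.

```python
# Hypothetical per-SNP weights (placeholders, not values from the patent).
EXAMPLE_WEIGHTS = {
    "rs10811661": 0.20,
    "rs947474": 0.10,
    "rs1048886": 0.15,
    "rs9268645": 0.05,
}


def snp_point(genotype_scores, weights):
    """Return the SNP point: the sum over SNPs of GS multiplied by its weight."""
    total = 0.0
    for rs_id, weight in weights.items():
        gs = genotype_scores[rs_id]
        if gs not in (0, 1, 2):
            raise ValueError(f"GS for {rs_id} must be 0, 1, or 2, got {gs!r}")
        total += gs * weight
    return total


# Hypothetical genotype scores for one individual.
example_gs = {"rs10811661": 2, "rs947474": 0, "rs1048886": 1, "rs9268645": 1}
example_point = snp_point(example_gs, EXAMPLE_WEIGHTS)
```

Keeping the weights in a mapping keyed by rs_id makes it easy to swap in a different SNP set without changing the scoring function.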

The present invention also provides a method for predicting the probability of developing diabetes, comprising: a) collecting a patient's clinical information and extracting DNA; b) deriving a clinical information value by applying weights to the clinical information; c) determining, from the DNA, the genotypes of the rs10811661, rs947474, rs1048886, and rs9268645 SNPs and assigning a genotype score (GS); d) deriving an SNP value (SNP point) by applying a weight to each SNP; and e) calculating the probability of developing diabetes within a specific time period based on the values from b) and d), wherein the GS is a value of 0, 1, or 2 expressing the influence of each SNP genotype on the onset of diabetes, and the SNP point is the sum, over all SNPs, of the GS multiplied by its weight.

The present invention also provides a method of providing information for additional examination of a diabetic patient, comprising: a) predicting the probability of developing diabetes within a specific period by the above method; b) distinguishing the respective influences of genetic factors and environmental factors on that probability; and c) providing the details of any additionally required examinations.

The present invention also provides a method of providing information for the management of a diabetic patient, comprising: a) predicting the probability of developing diabetes within a specific period by the above method; b) distinguishing the respective influences of genetic factors and environmental factors on that probability; and c) providing a diet, exercise, and lifestyle guide for lowering the probability of onset.

The present invention also provides an insurance design and insurance product matching system based on clinical and genetic variation information. The system comprises a device that calculates a user's probability of developing diabetes using the above method; a server that receives and analyzes the probability information from the calculation device and matches insurance products; and a terminal that displays the server's information on a screen. The calculation device calculates the user's probability of developing diabetes, the server receives this probability information and calculates the expected future cost of diabetes treatment, and insurance products are matched and recommended on the basis of the calculation result.

The server comprises: an insurance product storage unit that stores information on all insurance products to be offered to users; an average treatment cost storage unit that stores the average cost of diabetes treatment; a customer authentication unit that authenticates the user as a customer; a diabetes onset probability information receiving unit that, based on the information of the customer authenticated by the customer authentication unit, receives the probability of developing diabetes from the calculation device; an analysis unit that calculates the expected future treatment cost from the received probability information; an insurance coverage comparison unit that identifies the insurance held by the customer based on the authenticated customer's information, retrieves the customer's coverage details for the identified insurance from the insurance product storage unit, and compares them with the expected future treatment cost from the analysis unit; an insurance matching unit that, based on the expected future treatment cost and the result of the coverage comparison, determines whether the expected future treatment cost is covered by the customer's insurance and matches insurance according to that determination; and an insurance information recommendation unit that recommends the matched insurance to the user.
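The flow through the analysis, comparison, and matching units can be illustrated with a short sketch. The text does not state how the expected treatment cost is computed or how products are ranked, so the sketch assumes, purely for illustration, that the expected cost is the onset probability multiplied by the average treatment cost and that any product covering the uncovered gap qualifies; `InsuranceProduct`, `expected_treatment_cost`, and `match_products` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InsuranceProduct:
    name: str
    diabetes_coverage: float  # amount covered for diabetes treatment


def expected_treatment_cost(onset_probability, average_treatment_cost):
    # Assumption: expected cost = probability of onset x average treatment cost.
    return onset_probability * average_treatment_cost


def match_products(onset_probability, average_treatment_cost,
                   current_coverage, catalog):
    """Return products that close the gap between expected cost and coverage."""
    expected = expected_treatment_cost(onset_probability, average_treatment_cost)
    gap = expected - current_coverage
    if gap <= 0:
        return []  # existing insurance already covers the expected cost
    candidates = [p for p in catalog if p.diabetes_coverage >= gap]
    # Assumption: recommend the smallest sufficient coverage first.
    return sorted(candidates, key=lambda p: p.diabetes_coverage)


catalog = [
    InsuranceProduct("A", 1_000_000),
    InsuranceProduct("B", 2_000_000),
    InsuranceProduct("C", 1_500_000),
]
recommended = match_products(0.25, 8_000_000, 500_000, catalog)
```

Here the expected cost is 2,000,000 and the existing coverage is 500,000, so only products covering at least 1,500,000 are returned.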

Because the method for predicting the probability of developing diabetes according to the present invention jointly analyzes not only Korean-specific genetic information but also individual clinical information, its accuracy is far superior to that of existing methods, and it is therefore useful for early diagnosis and predictive prevention of diabetes.

FIG. 1 is a schematic diagram showing the specific process for building the integrated clinical/genetic variation information model according to the present invention.
FIG. 2 is an example of a report presenting the probability of developing diabetes predicted by the method for predicting the probability of developing diabetes and recommending tests according to the present invention.
FIG. 3 is an example of a report recommending additional tests according to the method for predicting the probability of developing diabetes and recommending tests according to the present invention.
FIG. 4 is a detailed view of a patient's onset probability report predicted by the model for predicting the probability of developing diabetes according to the present invention.
FIG. 5 is an example of a report providing a management guide according to the method for predicting the probability of developing diabetes and providing information for patient management according to the present invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the art to which the present invention belongs. In general, the nomenclature used herein and the experimental methods described below are well known and commonly used in the art.

In the present invention, the inventors sought to confirm that the probability of developing diabetes can be predicted with higher accuracy when Korean-specific genetic variation information and clinical information are integrated and analyzed together.

That is, in one embodiment of the present invention, study subjects were selected from KoGES (Korean Genome and Epidemiology Study): genetic data for 8,842 study participants and clinical information for 10,030 participants, aged 40 to 69, were obtained from a population-based prospective cohort study conducted in the Anseong and Ansan regions of the Republic of Korea.

First, to analyze the clinical information, the obtained clinical data were converted from continuous to categorical form. A base Cox proportional hazards (Cox PH) model with age and sex as predictors was then compared, by likelihood ratio (LR) test, with a Cox PH model to which each clinical factor was added, and the clinical factors with a p value below 0.05 were selected. Multiple regression was then applied with the selected clinical factors, age, sex, and family history of diabetes as predictors, and a primary environmental analysis model was built by backward elimination (FIG. 1).
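The likelihood ratio comparison between the base model and a model with one added clinical factor can be sketched as below. The sketch assumes the two log-likelihoods have already been obtained from fitted Cox PH models and that the added factor contributes exactly one coefficient, so the statistic is referred to a chi-square distribution with one degree of freedom; a categorical factor with several levels would need more degrees of freedom. The function name and the example log-likelihoods are hypothetical.

```python
import math


def lr_test_one_df(loglik_base, loglik_extended):
    """Likelihood ratio test for nested models differing by one coefficient.

    Returns (statistic, p_value), where statistic = 2 * (ll_ext - ll_base).
    """
    statistic = 2.0 * (loglik_extended - loglik_base)
    if statistic < 0:
        raise ValueError("extended model cannot fit worse than the nested base model")
    # Chi-square survival function with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods for illustration only.
stat, p = lr_test_one_df(-1000.0, -998.0)
factor_selected = p < 0.05  # selection threshold stated in the text
```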

To analyze the genetic variation information, 443,816 SNP genotypes from 8,842 individuals were analyzed using the Affymetrix Genome-Wide Human SNP Array 5.0. The following SNPs were first excluded: monomorphic SNPs, SNPs with a minor allele frequency (MAF) below 1%, SNPs with a p value below 0.001 in the Hardy-Weinberg equilibrium test, and SNPs with a missing rate above 20%.
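The four exclusion criteria can be expressed as a single per-SNP filter. The sketch below is illustrative: it assumes genotypes are coded 0, 1, or 2 (number of copies of one allele) with `None` for a missing call, and it uses a one-degree-of-freedom chi-square goodness-of-fit test for Hardy-Weinberg equilibrium, which is an assumption, since the text does not say which HWE test was used.

```python
import math


def snp_passes_qc(genotypes, maf_min=0.01, hwe_p_min=0.001, missing_max=0.20):
    """Return True if a SNP survives the missing-rate, MAF, and HWE filters."""
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n_total == 0 or not called:
        return False
    missing_rate = 1.0 - len(called) / n_total
    if missing_rate > missing_max:
        return False

    n = len(called)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the allele counted as 0
    q = 1.0 - p
    maf = min(p, q)
    if maf == 0.0:        # monomorphic SNP
        return False
    if maf < maf_min:
        return False

    # Chi-square test of observed vs. HWE-expected genotype counts (1 df).
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    return hwe_p >= hwe_p_min
```

For example, 100 calls split 25/50/25 across the three genotypes are exactly in Hardy-Weinberg proportions and pass, whereas 50/0/50 fails the HWE filter.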

Next, SNPs reported in previous studies to be associated with the relevant disease were selected as candidates through the GWAS Catalog. SNPs on the same chromosome whose pairwise correlation coefficient exceeded 0.9 were defined as tagging SNPs, and tagging SNPs were not used redundantly within a single model.
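One way to keep mutually correlated SNPs from entering the same model is a greedy pruning pass, sketched below. The greedy order, the use of the absolute Pearson correlation of 0/1/2 genotype codes, and the identifiers `snp_a` to `snp_d` are all assumptions made for illustration; the text states only the 0.9 threshold and the same-chromosome restriction.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, treat as 0
    return sxy / math.sqrt(sxx * syy)


def prune_tagging_snps(snps, threshold=0.9):
    """Keep a SNP only if it is not highly correlated with a kept SNP
    on the same chromosome. `snps` is a list of (id, chromosome, genotypes)."""
    kept = []
    for snp_id, chrom, genotypes in snps:
        redundant = any(
            chrom == kept_chrom
            and abs(pearson(genotypes, kept_genotypes)) > threshold
            for _, kept_chrom, kept_genotypes in kept
        )
        if not redundant:
            kept.append((snp_id, chrom, genotypes))
    return [snp_id for snp_id, _, _ in kept]


# Hypothetical genotype vectors for six individuals.
example_snps = [
    ("snp_a", 9, [0, 1, 2, 1, 0, 2]),
    ("snp_b", 9, [0, 1, 2, 1, 0, 2]),   # identical to snp_a, same chromosome
    ("snp_c", 9, [2, 0, 1, 0, 1, 0]),   # weakly correlated with snp_a
    ("snp_d", 10, [0, 1, 2, 1, 0, 2]),  # identical, but different chromosome
]
pruned = prune_tagging_snps(example_snps)
```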

For each candidate SNP, an additive genetic model was applied in a Cox PH model with age, sex, family history of diabetes, the finally selected clinical factors, and the SNP as explanatory variables. Of the homozygous categorical genotypes AA and BB, the allele of the genotype with the larger hazard ratio (HR) was taken as the risk allele, and the genotype was converted to a continuous value (0, 1, 2; GS).
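Under the additive model, the GS is simply the number of risk alleles carried. A minimal sketch, assuming genotypes are given as two-letter strings and that the homozygote hazard ratios have been estimated elsewhere; the alleles and HR values below are hypothetical and do not describe any of the named SNPs.

```python
def choose_risk_allele(allele_a, allele_b, hr_aa, hr_bb):
    """Pick the allele whose homozygous genotype has the larger hazard ratio."""
    return allele_a if hr_aa > hr_bb else allele_b


def genotype_score(genotype, risk_allele):
    """Additive coding: count copies (0, 1, or 2) of the risk allele."""
    if len(genotype) != 2:
        raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
    risk = risk_allele.upper()
    return sum(1 for allele in genotype.upper() if allele == risk)


# Hypothetical example: alleles C/T with HR(CC) = 0.8 and HR(TT) = 1.3.
risk = choose_risk_allele("C", "T", hr_aa=0.8, hr_bb=1.3)
scores = [genotype_score(g, risk) for g in ("CC", "CT", "TT")]
```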

The primarily selected genetic factors were combined, from 1 up to n SNPs at a time, into a single SNP point; that is, (nC1 + nC2 + ... + nCn) candidate SNP points were each applied as a single variable. For the Cox PH models in which each SNP point was added to the selected environmental factors, models satisfying conditions (1) to (3) below were constructed.
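The number of candidate SNP points grows quickly with n, since nC1 + nC2 + ... + nCn = 2^n - 1. The enumeration can be sketched with the standard library; the function name is hypothetical, and the four rs_ids are used only as sample input.

```python
from itertools import combinations


def candidate_snp_sets(rs_ids):
    """Yield every non-empty subset of the given SNPs, smallest subsets first."""
    for size in range(1, len(rs_ids) + 1):
        yield from combinations(rs_ids, size)


snps = ["rs10811661", "rs947474", "rs1048886", "rs9268645"]
all_sets = list(candidate_snp_sets(snps))
# For n = 4 this gives 4 + 6 + 4 + 1 = 15 = 2**4 - 1 candidate SNP points.
```

Each subset would then be turned into one SNP point (the weighted sum of its GS values) and tested as a single covariate.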

The final model for predicting the probability of developing diabetes was built by comparing the time-dependent AUC and survival NRI results of all the constructed models.

(1) The SNP point satisfies the PH assumption, and the LR test gives a p value < 0.05 (here, checking the PH assumption means verifying the proportional hazards assumption that underlies the Cox PH model).

(2) The time-dependent AUC of the model with the SNP point added is larger than that of the model without it.

(3) The Hosmer-Lemeshow test result of the model with the SNP point added is not worse than that of the model without it.
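Conditions (1) to (3) amount to a filter over the fitted candidate models. The sketch below assumes the relevant statistics have already been computed for each model and interprets "not worse" for the Hosmer-Lemeshow test as a p value that does not decrease, which is an assumption. The final ranking uses time-dependent AUC only, a simplification of the AUC and survival NRI comparison described in the text; all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelFit:
    name: str
    ph_assumption_ok: bool  # proportional hazards assumption holds for the SNP point
    lr_p_value: float       # LR test vs. the model without the SNP point
    td_auc: float           # time-dependent AUC
    hl_p_value: float       # Hosmer-Lemeshow p value (higher = better calibrated)


def satisfies_conditions(candidate, baseline, alpha=0.05):
    return (
        candidate.ph_assumption_ok and candidate.lr_p_value < alpha  # (1)
        and candidate.td_auc > baseline.td_auc                       # (2)
        and candidate.hl_p_value >= baseline.hl_p_value              # (3)
    )


def select_model(candidates, baseline):
    """Return the eligible candidate with the highest AUC, or None."""
    eligible = [m for m in candidates if satisfies_conditions(m, baseline)]
    return max(eligible, key=lambda m: m.td_auc) if eligible else None


baseline = ModelFit("clinical only", True, 1.0, 0.80, 0.40)
candidates = [
    ModelFit("m1", True, 0.01, 0.83, 0.45),
    ModelFit("m2", True, 0.20, 0.85, 0.50),    # fails the LR test
    ModelFit("m3", False, 0.001, 0.86, 0.50),  # violates the PH assumption
    ModelFit("m4", True, 0.03, 0.82, 0.40),
    ModelFit("m5", True, 0.01, 0.84, 0.10),    # calibration worsens
]
best = select_model(candidates, baseline)
```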

As a result, it was confirmed that the constructed model can predict the probability of developing diabetes with high accuracy using a small number of genetic factors.

Accordingly, in one aspect, the present invention relates to a method for predicting the probability of developing diabetes, comprising: a) extracting DNA from a sample; b) determining the genotypes of the rs10811661, rs947474, rs1048886, and rs9268645 SNPs and assigning a genotype score (GS); c) deriving an SNP value (SNP point) by applying a weight to each SNP; and d) calculating the probability of developing diabetes based on the SNP value.

In the present invention, the method may further comprise: e) collecting the patient's clinical information; f) deriving a clinical information value by applying a weight to each item of collected clinical information; and g) calculating the probability of developing diabetes within a specific time period by integrating the SNP value and the clinical information value.
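The integration in step g) can be sketched as follows. This section does not give the formula, so the sketch assumes the standard Cox proportional hazards form, in which the probability of onset within time t is 1 - S0(t)^exp(linear predictor), with the linear predictor taken as the clinical information value plus the SNP value. The baseline survival S0(t) and all example inputs are hypothetical.

```python
import math


def onset_probability(clinical_value, snp_value, baseline_survival, mean_lp=0.0):
    """Probability of onset within the period to which baseline_survival refers.

    baseline_survival: S0(t), the baseline probability of remaining
    disease-free up to time t (assumed to come from a fitted model).
    mean_lp: optional centering constant for the linear predictor.
    """
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be in (0, 1]")
    linear_predictor = clinical_value + snp_value
    return 1.0 - baseline_survival ** math.exp(linear_predictor - mean_lp)


# Hypothetical inputs: S0(t) = 0.9 for the chosen horizon.
p_reference = onset_probability(0.0, 0.0, 0.9)
p_clinical_only = onset_probability(0.5, 0.0, 0.9)
p_with_snps = onset_probability(0.5, 0.6, 0.9)
```

With these inputs a positive SNP value raises the predicted probability above the clinical-only estimate, which is the behavior expected of an added risk factor.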

The present invention also relates to a method for predicting the probability of developing diabetes, comprising: a) collecting a patient's clinical information and extracting DNA; b) deriving a clinical information value by applying weights to the clinical information; c) determining, from the DNA, the genotypes of the rs10811661, rs947474, rs1048886, and rs9268645 SNPs and assigning a genotype score (GS); d) deriving an SNP value (SNP point) by applying a weight to each SNP; and e) calculating the probability of developing diabetes within a specific time period based on the values from b) and d).

In the present invention, the genotype may be determined by any of various known methods, including, but not limited to, genotyping by nucleic acid amplification, genotyping by sequence analysis, and genotyping with probes.

In the present invention, the term "polymorphism" refers to the presence of two or more alleles at a single genetic locus; among polymorphic sites, one at which only a single base differs between individuals is called a single nucleotide polymorphism (SNP). Preferred polymorphic markers have two or more alleles that occur at a frequency of 1% or more, more preferably 5% or 10% or more, in a selected population.

In the present invention, "allele" refers to the alternative forms of a gene present at the same locus on homologous chromosomes. Alleles are also used to describe polymorphisms; for example, a SNP is biallelic, having two alleles.

In the present invention, the term "rs_id", "rs_Number", or "rs" refers to the rs-ID, an independent identifier that NCBI, which began accumulating SNP information in 1998, assigns to every SNP when it is first registered. In the present invention, it is written in a form such as rs831571.

본 발명의 표에 기재된 rs_id는 본 발명의 다형성 마커인 SNP 마커를 의미한다. 당업자라면 상기 rs_id를 이용하여 SNP의 위치 및 서열을 용이하게 확인할 수 있을 것이다. NCBI의 dbSNP (The Single Nucleotide Polymorphism Database) 번호인 rs_id에 해당하는 구체적인 서열은 시간이 지남에 따라 약간 변경될 수 있다.rs_id described in the table of the present invention means a SNP marker that is a polymorphic marker of the present invention. Those skilled in the art will be able to easily identify the position and sequence of the SNP using the rs_id. The specific sequence corresponding to rs_id, which is NCBI's dbSNP (The Single Nucleotide Polymorphism Database) number, may change slightly over time.

본 발명의 범위가 상기 변경된 서열에도 미치는 것은 당업자에게 자명할 것이다.It will be apparent to those skilled in the art that the scope of the present invention also extends to such altered sequences.

본 발명에서 용어 "뉴클레오시드"는 핵산 염기(핵염기)가 당 모이어티에 연결된 글리코실아민 화합물을 의미한다. "뉴클레오티드"는 뉴클레오시드 포스페이트를 의미한다. 뉴클레오티드는 표 3에 기재된 것과 같이, 그의 뉴클레오시드에 상응하는 알파벳 문자(문자 명칭)를 사용하여 표시될 수 있다. 예컨대, A는 아데노신(아데닌 핵염기를 함유하는 뉴클레오시드)을 지칭하고, C는 시티딘을 지칭하고, G는 구아노신을 지칭하고, U는 우리딘을 지칭하고, T는 티미딘(5-메틸 우리딘)을 지칭한다. W는 A 또는 T/U를 지칭하고, S는 G 또는 C를 지칭한다. N은 랜덤한 뉴클레오시드를 표시하고, dNTP는 데옥시리보뉴클레오시드 트리포스페이트를 의미한다. N은 A, C, G, 또는 T/U 중 어떤 것도 될 수 있다.As used herein, the term “nucleoside” refers to a glycosylamine compound in which a nucleic acid base (nucleobase) is linked to a sugar moiety. "Nucleotide" means a nucleoside phosphate. Nucleotides can be designated using the alphabetic letters (letter names) corresponding to their nucleosides, as described in Table 3. For example, A refers to adenosine (a nucleoside containing an adenine nucleobase), C refers to cytidine, G refers to guanosine, U refers to uridine, and T refers to thymidine (5- methyl uridine). W refers to A or T/U, and S refers to G or C. N represents a random nucleoside, and dNTP means deoxyribonucleoside triphosphate. N can be any of A, C, G, or T/U.

In the present invention, the term "oligonucleotide" refers to an oligomer of nucleotides. As used herein, the term "nucleic acid" refers to a polymer of nucleotides, and the term "sequence" refers to the nucleotide sequence of an oligonucleotide or nucleic acid. Throughout the specification, whenever an oligonucleotide or nucleic acid is represented by a sequence of letters, the nucleotides are in 5'→3' order from left to right. The oligonucleotide or nucleic acid may be DNA, RNA, or an analog thereof (e.g., a phosphorothioate analog). Oligonucleotides or nucleic acids may also include modified bases and/or backbones (e.g., modified phosphate linkages or modified sugar moieties). Non-limiting examples of synthetic backbones that confer stability and/or other advantages on nucleic acids include phosphorothioate linkages, peptide nucleic acids, locked nucleic acids, xylose nucleic acids, and analogs thereof.

본 발명에서 용어 “핵산”은 뉴클레오티드 폴리머를 지칭하며, 달리 한정되지 않는다면 자연적으로 발생한 뉴클레오티드와 유사한 방식(예컨대, 혼성화)으로 작용할 수 있는 천연 뉴클레오티드의 공지된 유사체(analog)를 포함한다.As used herein, the term “nucleic acid” refers to a polymer of nucleotides, and unless otherwise limited, includes known analogs of natural nucleotides that can function in a manner similar to naturally occurring nucleotides (eg, hybridization).

용어 핵산은, 예를 들어 유전체 DNA; 상보 DNA(cDNA)(이는 보통 전령 RNA(mRNA)의 역전사 또는 증폭으로 얻어지는 mRNA의 DNA 표현임); 합성으로 또는 증폭으로 생성된 DNA 분자; 및 mRNA를 포함한 임의의 형태의 DNA 또는RNA를 포함한다.The term nucleic acid includes, for example, genomic DNA; complementary DNA (cDNA), which is usually a DNA representation of an mRNA obtained by reverse transcription or amplification of messenger RNA (mRNA); DNA molecules produced synthetically or by amplification; and any form of DNA or RNA, including mRNA.

용어 핵산은 단일 가닥 분자뿐만 아니라 이중 또는 삼중 가닥 핵산을 포함한다. 이중 또는 삼중 가닥 핵산에서, 핵산 가닥은 동연(coextensive)일 필요는 없다(즉, 이중 가닥 핵산은 양 가닥의 전체 길이를 따라 이중 가닥일 필요는 없다).The term nucleic acid includes single-stranded molecules as well as double or triple-stranded nucleic acids. In double or triple stranded nucleic acids, the nucleic acid strands need not be coextensive (ie, double stranded nucleic acids need not be double stranded along the entire length of both strands).

The term nucleic acid also includes any chemical modification thereof, such as methylation and/or capping. Nucleic acid modifications may include the addition of chemical groups that confer additional charge, polarizability, hydrogen bonding, electrostatic interaction, and functionality on individual nucleic acid bases or on the nucleic acid as a whole. Such modifications may include base modifications such as sugar modifications at the 2' position, pyrimidine modifications at the 5 position, purine modifications at the 8 position, modifications at cytosine exocyclic amines, substitution of 5-bromo-uracil, backbone modifications, and unusual base-pairing combinations such as the isobases isocytidine and isoguanidine.

Nucleic acid(s) may be derived from a fully chemical synthesis process, such as solid phase-mediated chemical synthesis; from a biological source, such as by isolation from any species that produces nucleic acids; from processes that involve the manipulation of nucleic acids with molecular biology tools, such as DNA replication, PCR amplification, and reverse transcription; or from a combination of these processes.

본 발명에서 용어 “상보”는 2개의 뉴클레오티드 사이의 정확한 쌍형성에 대한 능력을 지칭한다. 즉, 핵산의 주어진 위치에서 뉴클레오티드가 다른 핵산의 뉴클레오티드와 수소 결합을 할 수 있다면, 2개의 핵산은 그 위치에서 서로 상보적인 것으로 여겨진다. 뉴클레오티드의 일부만이 결합하여 2개의 단일 가닥 핵산 분자 사이의 상보성은 “부분적”일 수 있거나, 또는 전체 상보성이 단일 가닥 분자 사이에 존재할 때 상보성은 완전할 수 있다. 핵산 가닥 사이의 상보성의 정도는 핵산 가닥 사이의 혼성화의 효율 및 강도에 상당한 영향을 미친다.As used herein, the term “complementary” refers to the ability for correct pairing between two nucleotides. That is, two nucleic acids are considered to be complementary to each other at a given position in a nucleic acid if a nucleotide at a given position is capable of hydrogen bonding with a nucleotide in another nucleic acid. Complementarity between two single-stranded nucleic acid molecules may be “partial” because only a portion of the nucleotides bind, or complementarity may be complete when total complementarity exists between single-stranded molecules. The degree of complementarity between nucleic acid strands significantly affects the efficiency and strength of hybridization between nucleic acid strands.

In the present invention, the term "primer" refers to a short linear oligonucleotide that hybridizes to a target nucleic acid sequence (e.g., a DNA template to be amplified) to prime a nucleic acid synthesis reaction. A primer may be an RNA oligonucleotide, a DNA oligonucleotide, or a chimeric sequence, and may contain natural, synthetic, or modified nucleotides. Both the upper and lower limits of primer length are determined empirically. The lower limit is the minimum length required to form a stable duplex after hybridization with the target nucleic acid under the conditions of the nucleic acid amplification reaction. Very short primers (often less than 3 nucleotides in length) do not form thermodynamically stable duplexes with the target nucleic acid under such hybridization conditions. The upper limit is usually determined by the likelihood of duplex formation in a region of the target nucleic acid other than the predetermined nucleic acid sequence. In general, suitable primer lengths range from about 3 nucleotides to about 50 nucleotides.

In the present invention, the term "probe" refers to a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, generally through complementary base pairing and usually through hydrogen bond formation, thereby forming a duplex structure. A probe binds or hybridizes to a "probe binding site". In particular, the probe may be labeled with a detectable label to facilitate its detection once it has hybridized to its complementary target. Alternatively, the probe may be unlabeled but detectable, directly or indirectly, by specific binding to a labeled ligand. Probes can vary considerably in size. Generally, probes are at least 7 to 18 nucleotides in length. Other probes are at least 20, 30, or 40 nucleotides in length. Still other probes are somewhat longer, at least 50, 60, 70, 80, or 90 nucleotides in length, and others are longer still, at least 100, 150, 200, or more nucleotides in length. Probes can also be of any length within any range bounded by any of the above values (e.g., 15 to 20 nucleotides in length).

본 발명에서 용어 “혼성화”는 상보적 염기서열을 가진 단일가닥 핵산들 간 수소결합에 의해 이중가닥 핵산이 형성되는 것을 의미하며, 어닐링(annealing)과 유사한 의미로 사용된다. 다만 조금 더 넓은 의미에서, 혼성화는 두 개의 단일가닥 간 염기서열이 완전히 상보적인 경우(perfect match)와 더불어 예외적으로 일부의 염기서열이 상보적이지 않은 경우(mismatch)까지 포함한다.In the present invention, the term “hybridization” refers to the formation of double-stranded nucleic acids by hydrogen bonding between single-stranded nucleic acids having complementary nucleotide sequences, and is used in a similar sense to annealing. However, in a slightly broader sense, hybridization includes a case in which two single-stranded sequences are completely complementary (perfect match), as well as a case in which some of the sequences are not complementary (mismatch).

In the present invention, any clinical information that can be used in the diabetes onset probability prediction model constructed in the present invention may be used without limitation. Preferably, the clinical information is age, sex, family history, fasting blood sugar (FBS), liver enzyme level (alanine aminotransferase, ALT), gamma GTP, high-density lipoprotein cholesterol concentration (HDL), triglycerides (TG), systolic blood pressure (SBP), and body mass index (BMI), but it is not limited thereto.

In the present invention, the SNP point integrates several genetic factors into a single score for use as a predictive marker of diabetes onset. The SNP point is defined as the sum of the genotype scores (GS), each weighted by the influence of the corresponding SNP on diabetes onset. When k of the selected genetic factors are used, the SNP point is calculated using as weights the semi-partial R² of each genetic factor, computed from the full model and the reduced model.

본 발명에 있어서, 상기 SNP point는 하기 수식 2로 계산하는 것을 특징으로 할 수 있다:In the present invention, the SNP point may be calculated by the following Equation 2:

Equation 2:

$$\text{SNP point} = 0.1595 \times \mathrm{GS}_{\mathrm{rs10811661}} + 0.2820 \times \mathrm{GS}_{\mathrm{rs947474}} + 0.2498 \times \mathrm{GS}_{\mathrm{rs1048886}} + 0.3088 \times \mathrm{GS}_{\mathrm{rs9268645}}$$

본 발명에 있어서, 상기 GS는 하기 표 2와 같은 값을 가지는 것을 특징으로 할 수 있다.In the present invention, the GS may be characterized as having the values shown in Table 2 below.

| GS | rs10811661 | rs947474 | rs1048886 | rs9268645 |
|---|---|---|---|---|
| 0 | CC | AA | AA | GG |
| 1 | CT | AG | AG | CG |
| 2 | TT | GG | GG | CC |
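As an illustrative sketch only, Equation 2 and the genotype scores of Table 2 can be expressed in Python as follows. The names `SNP_WEIGHTS`, `GENOTYPE_SCORES`, `normalize_genotype`, and `snp_point` are assumptions introduced for illustration, and the sorting of alleles (so that "TC" is read as "CT") is an added convenience that the source does not describe.

```python
# Weights from Equation 2 and genotype scores (GS) from Table 2.
SNP_WEIGHTS = {
    "rs10811661": 0.1595,
    "rs947474": 0.2820,
    "rs1048886": 0.2498,
    "rs9268645": 0.3088,
}

GENOTYPE_SCORES = {
    "rs10811661": {"CC": 0, "CT": 1, "TT": 2},
    "rs947474": {"AA": 0, "AG": 1, "GG": 2},
    "rs1048886": {"AA": 0, "AG": 1, "GG": 2},
    "rs9268645": {"GG": 0, "CG": 1, "CC": 2},
}


def normalize_genotype(genotype: str) -> str:
    """Upper-case and sort the two alleles so 'TC' and 'CT' share one key."""
    return "".join(sorted(genotype.upper()))


def snp_point(genotypes: dict) -> float:
    """Weighted sum of genotype scores over the four SNPs (Equation 2)."""
    total = 0.0
    for rs_id, weight in SNP_WEIGHTS.items():
        score = GENOTYPE_SCORES[rs_id][normalize_genotype(genotypes[rs_id])]
        total += weight * score
    return total


example = {
    "rs10811661": "CT",
    "rs947474": "AA",
    "rs1048886": "GG",
    "rs9268645": "CG",
}
print(round(snp_point(example), 4))
```

Because the four weights sum to approximately 1, the SNP point under this reading ranges from 0 (all GS equal to 0) to about 2 (all GS equal to 2).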

여기서 full model은 임상 정보와 k 개의 유전 요인을 모두 predictor로 한 cox-ph 모형을 의미하며, Reduced model은 임상 정보와 i 번째 유전 요인을 제외한 (k-1)개 유전요인을 predictor로 한 cox-ph모형을 의미한다.Here, the full model means a cox-ph model using clinical information and k genetic factors as predictors, and the reduced model is a cox-ph model using clinical information and (k-1) genetic factors excluding the i-th genetic factor as predictors. It means the ph model.

Here, the semi-partial R² is calculated by Equation 3 below.

Equation 3:

$$\text{semi-partial } R^2_i = R^2_{\text{full}} - R^2_{(-i)}$$

Here, $i$ denotes the i-th SNP; $R^2$ is the coefficient of determination, a measure of the explanatory power of a model that gives the portion of the total variability of the data that the model explains; $R^2_{\text{full}}$ is the $R^2$ of the full model; and $R^2_{(-i)}$ is the $R^2$ of the Cox PH model that excludes the i-th SNP among the selected SNPs.

The weighted GS (WGS) of the i-th SNP was calculated by Equation 4 below:

Equation 4:

$$\mathrm{WGS}_i = w_i \times \mathrm{GS}_i$$

Here, $\mathrm{WGS}_i$ is the weighted genotype score of the i-th SNP, $\mathrm{GS}_i$ is the genotype score of the i-th SNP, and $w_i$ is the weight of the i-th SNP.

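The weight $w_i$ of the i-th SNP was calculated by Equation 5 below:

Equation 5:

$$w_i = \frac{\text{semi-partial } R^2_i}{\sum_{j=1}^{k} \text{semi-partial } R^2_j}$$

where the sum runs over the k selected SNPs.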

Accordingly, Equation 2 above, which gives the sum of the risk scores of the genetic factors for diabetes (SNP point) in the present invention, follows from Equation 6 below:

Equation 6:

$$\text{SNP point} = \sum_{i=1}^{k} \mathrm{WGS}_i = \sum_{i=1}^{k} w_i \times \mathrm{GS}_i$$

In the present invention, the probability of developing diabetes within a specific time may be calculated by Equation 1 below.

Equation 1:

$$P(t) = 1 - \exp[-H_0(t)]^{\exp(\beta X)}$$

이하, 상기 수식에 대하여 상세히 설명한다.Hereinafter, the above formula will be described in detail.

본 발명에서는 최종 선별된 임상정보와 SNP point를 predictor로 사용하여 Cox proportional hazard model을 구축하였다.In the present invention, a Cox proportional hazard model was constructed using the final selected clinical information and SNP point as predictors.

Cox proportional hazard model의 구조는 아래와 같다.The structure of the Cox proportional hazard model is as follows.

Equation 7: hazard function at time t

$$h(t) = h_0(t)\,\exp(\beta X)$$

Here, $h_0(t)$ is the baseline hazard function, $X$ denotes the risk factors, and $\beta$ denotes the coefficient of each risk factor.

Equation 8: probability of not developing the disease until time t

$$S(t) = \exp[-H_0(t)]^{\exp(\beta X)}$$

Therefore, the probability of developing diabetes using the Cox proportional hazard model can be expressed by Equation 9 below:

Equation 9: probability of developing diabetes within time t

$$P(t) = 1 - S(t) = 1 - \exp[-H_0(t)]^{\exp(\beta X)}$$

Here, $H_0(t)$ is the cumulative baseline hazard function.

β는 위험요인 별 coefficient를 의미하는데, 본 발명에서 β의 추정은 하기 수식 10의 partial likelihood function을 최대화하도록 하는 값으로 결정한다:β means a coefficient for each risk factor, and in the present invention, the estimation of β is determined as a value that maximizes the partial likelihood function of Equation 10 below:

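Equation 10:

$$L(\beta) = \prod_{i=1}^{d} \frac{\exp(\beta X_{(i)})}{\sum_{j \in R(t_{(i)})} \exp(\beta X_j)}$$

Here, $0 < t_{(1)} < \dots < t_{(d)}$ denote the d distinct failure times, $X_{(i)}$ denotes the risk factors of the subject whose failure time is $t_{(i)}$, and $R(t_{(i)})$ denotes the set of subjects still at risk at time $t_{(i)}$.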
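The partial likelihood maximized to estimate β can be sketched in Python as below. This is a minimal illustration of the standard Cox log partial likelihood without tied failure times, not the implementation used in the source; the function name `log_partial_likelihood` and its argument layout are assumptions, and in practice β would be found by passing this function to a numerical optimizer.

```python
import math


def log_partial_likelihood(beta, times, events, covariates):
    """Cox log partial likelihood, assuming no tied failure times.

    beta:       list of coefficients, one per risk factor
    times:      follow-up time of each subject
    events:     1 if the subject failed at that time, 0 if censored
    covariates: one list of risk-factor values per subject
    """
    # Linear predictor (beta * X) for every subject.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    total = 0.0
    for i, (t_i, failed) in enumerate(zip(times, events)):
        if not failed:
            continue  # censored subjects contribute only through risk sets
        # Risk set: subjects still under observation at time t_i.
        risk_sum = sum(
            math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        total += scores[i] - math.log(risk_sum)
    return total


# Hypothetical toy data: three subjects, one covariate, all failures observed.
print(log_partial_likelihood([0.0], [1, 2, 3], [1, 1, 1], [[0.0], [1.0], [2.0]]))
```

With β = 0 every subject has the same relative hazard, so the toy call reduces to −log(3) − log(2) − log(1).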

When there are tied failure times, β is estimated as the value that maximizes the partial likelihood function of Equation 11 below:

수식 11:Formula 11:

Here, $D_i$ denotes the set of subjects whose failure time is $t_{(i)}$, and $X_k$ denotes the risk factors of the k-th subject whose failure time is $t_{(i)}$.

따라서, cumulative baseline hazard function은 하기 수식 12로 계산할 수 있다:Therefore, the cumulative baseline hazard function can be calculated by Equation 12 below:

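Equation 12:

$$\hat{H}_0(t) = \sum_{i:\, t_{(i)} \le t} \frac{d_i}{\sum_{j \in R(t_{(i)})} \exp(\hat{\beta} X_j)}$$

Here, $d_i$ denotes the number of subjects whose failure time is $t_{(i)}$, $\hat{\beta}$ denotes the estimated value of β, and $R(t_{(i)})$ denotes the set of subjects still at risk at time $t_{(i)}$.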
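A cumulative baseline hazard of this kind can be sketched in Python as follows. The sketch assumes the Breslow-type estimator (the number of failures at each distinct failure time divided by the sum of relative hazards over the risk set); the function name `breslow_cumulative_hazard` and the toy data are hypothetical and serve only to illustrate the calculation.

```python
import math


def breslow_cumulative_hazard(t, beta_hat, times, events, covariates):
    """Breslow-type estimate of the cumulative baseline hazard H0(t).

    t:          time up to which the hazard is accumulated
    beta_hat:   estimated coefficients, one per risk factor
    times:      follow-up time of each subject
    events:     1 if the subject failed at that time, 0 if censored
    covariates: one list of risk-factor values per subject
    """
    # Relative hazard exp(beta_hat * X) for every subject.
    rel_hazard = [
        math.exp(sum(b * x for b, x in zip(beta_hat, row))) for row in covariates
    ]
    # Distinct failure times that have occurred by time t.
    failure_times = sorted({t_j for t_j, e in zip(times, events) if e and t_j <= t})
    total = 0.0
    for t_i in failure_times:
        # d_i: number of subjects failing exactly at t_i.
        d_i = sum(1 for t_j, e in zip(times, events) if e and t_j == t_i)
        # Sum of relative hazards over subjects still at risk at t_i.
        risk_sum = sum(h for h, t_j in zip(rel_hazard, times) if t_j >= t_i)
        total += d_i / risk_sum
    return total


# Hypothetical toy data: four subjects, two of them tied at time 2.
print(breslow_cumulative_hazard(2, [0.0], [1, 2, 2, 3], [1, 1, 1, 0], [[0.0]] * 4))
```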

본 발명에 있어서, 상기 수식 1의 βX는 하기 표 1의 마커에 가중치를 곱하여 모두 합한 값으로 계산하는 것을 특징으로 할 수 있다.In the present invention, βX in Equation 1 may be calculated as a value obtained by multiplying the markers in Table 1 by a weight and adding them all together.

| Marker | Range | Weight |
|---|---|---|
| Age | | 0.0261 |
| Sex: male | | 0 |
| Sex: female | | -0.0385 |
| Family history | Present | 0.4278 |
| Fasting blood sugar (FBS) | More than 80, up to 85 | 0.4087 |
| | More than 85, up to 90 | 0.6537 |
| | More than 90 | 1.3791 |
| Liver enzyme level (ALT) | More than 30, up to 40 | 0.1723 |
| | More than 40 | 0.2697 |
| Gamma GTP | Men: more than 18, up to 29; women: more than 10, up to 13 | 0.4149 |
| | Men: more than 29, up to 51; women: more than 13, up to 18 | 0.5690 |
| | Men: more than 51; women: more than 18 | 0.7651 |
| High-density lipoprotein cholesterol concentration (HDL) | Men: more than 40; women: more than 50 | -0.0843 |
| Triglycerides (TG) | More than 90, up to 110 | 0.1530 |
| | More than 110, up to 160 | 0.4344 |
| | More than 160, up to 230 | 0.7288 |
| | More than 230 | 0.8647 |
| Systolic blood pressure (SBP) | More than 130, up to 150 | 0.1877 |
| | More than 150 | 0.2488 |
| Body mass index (BMI) | More than 23, up to 25 | 0.0304 |
| | More than 25, up to 30 | 0.1126 |
| | More than 30 | 0.3137 |
| SNP point | | 0.3769 |
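The calculation of βX from Table 1 can be sketched in Python as below. Several points are assumptions rather than statements of the source: age is taken to enter as a continuous value in years, each measurement is taken to be in the same (unstated) units as the thresholds of Table 1, and a value outside every listed range is taken to contribute 0. The names `band_weight` and `linear_predictor` are hypothetical.

```python
import math

INF = math.inf

# (exclusive lower bound, inclusive upper bound, weight), as listed in Table 1.
FBS_BANDS = [(80, 85, 0.4087), (85, 90, 0.6537), (90, INF, 1.3791)]
ALT_BANDS = [(30, 40, 0.1723), (40, INF, 0.2697)]
GGT_BANDS = {
    "male": [(18, 29, 0.4149), (29, 51, 0.5690), (51, INF, 0.7651)],
    "female": [(10, 13, 0.4149), (13, 18, 0.5690), (18, INF, 0.7651)],
}
HDL_THRESHOLD = {"male": 40, "female": 50}
TG_BANDS = [(90, 110, 0.1530), (110, 160, 0.4344), (160, 230, 0.7288), (230, INF, 0.8647)]
SBP_BANDS = [(130, 150, 0.1877), (150, INF, 0.2488)]
BMI_BANDS = [(23, 25, 0.0304), (25, 30, 0.1126), (30, INF, 0.3137)]


def band_weight(value, bands):
    """Weight of the band (lower, upper] containing value, or 0 if none does."""
    for lower, upper, weight in bands:
        if lower < value <= upper:
            return weight
    return 0.0


def linear_predictor(age, sex, family_history, fbs, alt, ggt, hdl, tg, sbp, bmi, snp_point):
    """Sum of marker values times weights (the beta*X term of Equation 1)."""
    bx = 0.0261 * age  # assumption: age in years, used as a continuous value
    bx += -0.0385 if sex == "female" else 0.0
    bx += 0.4278 if family_history else 0.0
    bx += band_weight(fbs, FBS_BANDS)
    bx += band_weight(alt, ALT_BANDS)
    bx += band_weight(ggt, GGT_BANDS[sex])
    bx += -0.0843 if hdl > HDL_THRESHOLD[sex] else 0.0
    bx += band_weight(tg, TG_BANDS)
    bx += band_weight(sbp, SBP_BANDS)
    bx += band_weight(bmi, BMI_BANDS)
    bx += 0.3769 * snp_point
    return bx


print(linear_predictor(50, "male", False, 80, 20, 10, 35, 80, 120, 22, 0.0))
```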

In the present invention, H₀(t) in Equation 1 may be 0.00337 for 5 years and 0.00500 for 7 years.
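Equation 1 with these two baseline values can be sketched in Python as follows. The function name `diabetes_probability` and the dictionary `CUMULATIVE_BASELINE_HAZARD` are assumptions for illustration, and the value of βX passed in is assumed to have been computed beforehand from Table 1.

```python
import math

# Cumulative baseline hazard H0(t) stated in the source for 5 and 7 years.
CUMULATIVE_BASELINE_HAZARD = {5: 0.00337, 7: 0.00500}


def diabetes_probability(beta_x: float, years: int) -> float:
    """Equation 1: 1 - exp[-H0(t)] ** exp(beta * X)."""
    h0 = CUMULATIVE_BASELINE_HAZARD[years]
    return 1.0 - math.exp(-h0) ** math.exp(beta_x)


# Hypothetical linear predictor value, for illustration only.
for horizon in (5, 7):
    print(horizon, diabetes_probability(1.305, horizon))
```

Because only the 5-year and 7-year values of H₀(t) are given, the sketch rejects any other horizon with a `KeyError` rather than interpolating.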

Meanwhile, the present inventors expected that, when the onset probability is predicted with the above prediction model, the respective effects of genetic factors and clinical information on the predicted probability could be distinguished, and that this could be used to provide additional examination items and management methods where necessary.

That is, in another embodiment of the present invention, it was confirmed that user convenience increases when genetic factors and clinical information are distinguished based on the diabetes onset probability predicted by the above method, and a report providing additional examination items and management methods is then produced and provided (FIGS. 2 to 5).

Therefore, in another aspect, the present invention relates to a method of providing information for additional examination of a diabetic patient, comprising the steps of: a) predicting the probability of developing diabetes within a specific period by the above method; b) distinguishing the influence of genetic factors and environmental factors on the onset probability; and c) providing the additionally required examination items.

본 발명에 있어서, 상기 당뇨병 환자의 추가 검진을 위한 정보의 제공방법은 d) 다른 질병의 발병 여부를 확인하기 위한 추가 검진 내용을 제공하는 단계를 추가로 포함하는 것을 특징으로 할 수 있다.In the present invention, the method of providing information for additional examination of a diabetic patient may further include the step of d) providing additional examination details for confirming whether another disease has occurred.

본 발명에 있어서, 상기 다른 질병은 상기 유전적 요인 및 환경적 요인으로 발병할 수 있는 질병이면 모두 가능하며, 바람직하게는 고혈압, 관상동맥 질환 및 뇌혈관 질환으로 구성된 군에서 선택될 수 있으나, 이에 한정되는 것은 아니다.In the present invention, the other diseases can be any disease that can be caused by the genetic factors and environmental factors, and preferably can be selected from the group consisting of hypertension, coronary artery disease and cerebrovascular disease, It is not limited.

The present invention also relates to a method of providing information for the management of a diabetic patient, comprising the steps of: a) predicting the probability of developing diabetes within a specific period by the above method; b) distinguishing the influence of genetic factors and environmental factors on the onset probability; and c) providing a diet, exercise, and lifestyle guide for lowering the onset probability.

본 발명에서 상기 추가 검진은 당뇨병과 관련된 증상 또는 다른 질병을 확인할 수 있는 검진이면 제한 없이 이용가능하나, 바람직하게는 당뇨 합병증 관련 검사일 수 있으며, 더욱 바람직하게는 동맥경화 검사, 호모시스테인, CRP, 경동맥 초음파, 심장 초음파 및 심장 CT로 구성된 군에서 선택되는 것을 특징으로 할 수 있으나, 이에 한정되는 것은 아니다.In the present invention, the additional checkup can be used without limitation as long as it is a checkup that can confirm symptoms or other diseases related to diabetes, preferably, it may be a test related to diabetic complications, more preferably arteriosclerosis test, homocysteine, CRP, carotid artery ultrasound , may be characterized as selected from the group consisting of echocardiography and cardiac CT, but is not limited thereto.

본 발명에 있어서, 상기 b) 단계는 snpRatio를 계산하여 구분하는 것을 특징으로 할 수 있다.In the present invention, the step b) may be characterized in that it is divided by calculating the snpRatio.

본 발명의 snpRatio는 유전요인에 의한 linear predictor가 전체 위험요인에 의한 linear predictor 값에서 차지하는 비율을 의미하며, 하기 수식 13으로 계산하는 것을 특징으로 할 수 있다:The snpRatio of the present invention means the ratio of the linear predictor by genetic factors to the linear predictor value by all risk factors, and it can be characterized by calculating with the following Equation 13:

수식 13:Equation 13:

Here, the symbols of Equation 13 denote, in order: the SNP point calculated for the i-th subject; the value of the j-th clinical variable measured in the i-th subject; the clinical variables that satisfy a specified condition; and the maximum value that can be measured for those clinical variables.

Based on the above equation, the probability of developing diabetes within time t attributable to genetic factors is calculated by Equation 14 below:

Equation 14:

$$P_{\text{genetic},i}(t) = \text{snpRatio}_i \times P_i(t)$$

The probability of developing diabetes within time t attributable to clinical information may be calculated by Equation 15 below:

Equation 15:

$$P_{\text{clinical},i}(t) = (1 - \text{snpRatio}_i) \times P_i(t)$$

Here, $P_i(t)$ is the probability that the i-th subject develops diabetes within time t.
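The division of the onset probability into a genetic part and a clinical part can be sketched in Python as below. The sketch assumes that the probability is split in proportion to snpRatio, which is a reading of the surrounding text and not a confirmed formula, and it takes snpRatio as an input without computing it. The function name `split_probability` is hypothetical.

```python
def split_probability(probability: float, snp_ratio: float):
    """Split an onset probability into (genetic, clinical) parts.

    Assumption: the genetic share is snp_ratio * probability and the
    clinical share is the remainder.
    """
    if not 0.0 <= snp_ratio <= 1.0:
        raise ValueError("snp_ratio must lie between 0 and 1")
    genetic = snp_ratio * probability
    clinical = (1.0 - snp_ratio) * probability
    return genetic, clinical


# Hypothetical inputs, for illustration only.
genetic_part, clinical_part = split_probability(0.2, 0.25)
print(genetic_part, clinical_part)
```

Under this assumption the two parts always add up to the overall probability, which is what allows a report to present them as shares of one total.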

In addition, the report of the present invention may present the likelihood of onset relative to the average for the same age group, broken down into the proportions attributable to genetic factors and to clinical information.

In another aspect, the present invention relates to an insurance design and insurance product matching system based on clinical and genetic variation information, the system comprising:

a device that calculates a user's probability of developing diabetes using the above method;

a server that receives and analyzes the diabetes onset probability information from the diabetes onset probability calculation device and matches insurance products; and

a terminal that displays the information from the server on a screen,

wherein the system calculates the user's probability of developing diabetes through the diabetes onset probability calculation device, receives the diabetes onset probability information at the server, calculates the expected future cost of diabetes treatment, and matches and recommends insurance products based on the result of the calculation, and

wherein the server includes:

an insurance product storage unit that stores information on all insurance products to be offered to users;

an average treatment cost storage unit that stores the average cost of diabetes treatment;

a customer authentication unit that authenticates a user as a customer;

a diabetes onset probability information receiving unit that receives the diabetes onset probability from the diabetes onset probability calculation device based on the information of the customer authenticated by the customer authentication unit;

an analysis unit that calculates the expected future treatment cost based on the diabetes onset probability information received through the diabetes onset probability information receiving unit;

an insurance coverage comparison unit that identifies the insurance held by the customer based on the information of the customer authenticated by the customer authentication unit, receives the coverage details of the identified insurance from the insurance product storage unit, and compares them with the expected future treatment cost from the analysis unit;

an insurance matching unit that determines, based on the expected future treatment cost from the analysis unit and the result of the insurance coverage comparison unit, whether the expected future treatment cost is covered by the insurance held by the customer, and matches insurance based on that determination; and

an insurance information recommendation unit that recommends the insurance matched by the insurance matching unit to the user.

본 발명에서 상기 시스템은 상기 방법을 이용하여 당뇨병 발병확률을 계산하는 당뇨병 발병확률 계산 장치와,In the present invention, the system includes a diabetes incidence probability calculation device for calculating the diabetes incidence probability using the method;

상기 계산 장치로부터 당뇨병 발병확률을 수신하여 분석하고, 보험상품을 매칭해 주는 서버와,a server that receives and analyzes the probability of developing diabetes from the calculation device and matches insurance products;

상기 서버의 정보를 화면상에 출력해주는 단말기를 포함하여 구성될 수 있다.It may be configured to include a terminal for outputting the information of the server on the screen.

본 발명에서, 상기 서버는, 데이터베이스, 고객인증부, 당뇨병 발병확률 정보수신부, 분석부, 조작인터페이스, 보험매칭부 및 보험정보추천부를 포함하여 구성될 수 있다.In the present invention, the server may be configured to include a database, a customer authentication unit, a diabetes onset probability information receiving unit, an analysis unit, an operation interface, an insurance matching unit, and an insurance information recommendation unit.

본 발명에서, 상기 데이터베이스는 사용자에게 제공될 모든 보험상품에 대한 정보를 저장하고 있는 보험상품저장부와, 당뇨병 평균 치료비용을 저장하고 있는 평균 치료비용저장부를 포함하여 구성될 수 있다.In the present invention, the database may be configured to include an insurance product storage unit that stores information on all insurance products to be provided to the user, and an average treatment cost storage unit that stores the average diabetes treatment cost.

본 발명에서, 상기 고객인증부는 단말기를 사용하는 사용자를 고객으로서 인증하는 기능을 하며, 이러한 인증은 로그인을 기반으로 이루어지고, 상기 데이터베이스는 사용자정보를 저장하는 저장부를 더 포함할 수 있다.In the present invention, the customer authentication unit serves to authenticate a user who uses the terminal as a customer, and such authentication is made based on login, and the database may further include a storage unit for storing user information.

이때, 사용자정보에는 당뇨병 발병확률 계산장치와의 연동을 위한 식별번호를 포함할 수 있고, 이를 통해 사용자는 로그인시 식별번호를 함께 입력하여 당뇨병 발병확률 계산장치의 발병확률 정보를 서버가 공유할 수 있다.At this time, the user information may include an identification number for interworking with the diabetes incidence probability calculation device, through which the user inputs the identification number together when logging in, so that the server can share the incidence probability information of the diabetes incidence probability calculation device there is.

본 발명에서, 상기 당뇨병 발병확률 정보수신부는 상기 고객인증부를 통해 인증된 고객의 정보를 기반으로 상기 당뇨병 발병 확률 계산 장치의 당뇨병 발병확률을 수신하는 것을 특징으로 할 수 있다.In the present invention, the diabetes onset probability information receiving unit may be characterized in that it receives the diabetes onset probability of the diabetes onset probability calculation device based on customer information authenticated through the customer authentication unit.

본 발명에서, 상기 분석부는 상기 당뇨병 발병확률 정보수신부를 통해 수신된 당뇨병 발병 확률 정보와 평균 치료비용저장부의 평균 치료비용을 기반으로 미래의 예상 치료비용을 계산하는 기능을 수행한다. In the present invention, the analysis unit performs a function of calculating the expected future treatment cost based on the diabetes onset probability information received through the diabetes onset probability information receiving unit and the average treatment cost of the average treatment cost storage unit.

For example, the expected cost may be calculated by multiplying the diabetes onset probability by the average treatment cost, or by further adding an additional risk buffer margin to that product.
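The calculation performed by the analysis unit can be sketched in Python as follows. The function name `expected_treatment_cost` and the parameter `risk_buffer` are assumptions for illustration; the source does not state how the buffer is sized, so it defaults to zero here, and the figures in the example are hypothetical.

```python
def expected_treatment_cost(probability: float, average_cost: float,
                            risk_buffer: float = 0.0) -> float:
    """Onset probability times average treatment cost, plus an optional buffer."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return probability * average_cost + risk_buffer


# Hypothetical figures in an arbitrary currency unit.
print(expected_treatment_cost(0.1, 1000.0))
print(expected_treatment_cost(0.1, 1000.0, risk_buffer=50.0))
```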

In the present invention, the insurance coverage comparison unit identifies the insurance held by the customer based on the information of the customer authenticated by the customer authentication unit, receives the coverage details of the identified insurance from the insurance product storage unit, and compares them with the expected future treatment cost from the analysis unit.

본 발명에서, 보험매칭부는 상기 분석부를 통해 분석된 미래의 예상 치료비용 및 상기 보험보장내역 비교부에 기반하여 미래의 예상 치료비용이 고객이 가입한 보험에 의해 보장되는 지 판단하고, 판단된 결과에 기반하여 보험을 매칭해주는 기능을 수행한다.In the present invention, the insurance matching unit determines whether the expected future treatment cost is covered by the insurance purchased by the customer based on the future expected treatment cost analyzed through the analysis unit and the insurance coverage detail comparison unit, and the result of the determination It performs the function of matching insurance based on

For example, if the expected future treatment cost is covered by the insurance held by the customer, a message indicating that the current insurance is sufficient can be output on the terminal. If it is not covered, the insurance product storage unit can be searched for insurance that covers an amount higher than the expected future treatment cost, and that insurance can be matched.
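The matching rule in this example can be sketched in Python as follows. The class, the function, and the status strings are hypothetical names. Treating "covered" as coverage greater than or equal to the expected cost, and picking the qualifying product with the smallest coverage, are also assumptions; the specification only requires a product that covers more than the expected cost.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class InsuranceProduct:
    name: str
    coverage: float  # maximum amount covered, same unit as the expected cost


def match_insurance(expected_cost: float,
                    current_coverage: float,
                    catalog: Sequence[InsuranceProduct]
                    ) -> Tuple[str, Optional[InsuranceProduct]]:
    """Return a status string and, when applicable, the matched product."""
    # Case 1: the customer's current insurance already covers the expected cost.
    if current_coverage >= expected_cost:
        return "sufficient", None
    # Case 2: look for products covering more than the expected cost.
    candidates = [p for p in catalog if p.coverage > expected_cost]
    if not candidates:
        return "no_match", None
    # Assumption: prefer the qualifying product with the smallest coverage.
    return "matched", min(candidates, key=lambda p: p.coverage)


catalog = [InsuranceProduct("A", 100.0),
           InsuranceProduct("B", 500.0),
           InsuranceProduct("C", 300.0)]
status, product = match_insurance(expected_cost=250.0,
                                  current_coverage=100.0,
                                  catalog=catalog)
```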

In the present invention, the insurance information recommendation unit recommends to the user the insurance matched by the insurance matching unit.

Examples

Hereinafter, the present invention will be described in more detail through examples. These examples are intended only to illustrate the present invention, and it will be apparent to those of ordinary skill in the art that the scope of the present invention is not to be construed as limited by them.

Example 1: Selection of analysis subjects

The study subjects of the present invention were selected from 10,030 participants aged 40 to 69 years, collected through a population-based prospective cohort study conducted in the Anseong and Ansan regions of Korea as part of the Korean Genome and Epidemiology Study (KoGES).

The clinical and epidemiological data collected are shown in Table 4 below.

Within the cohort, diabetes onset was defined as a questionnaire-reported history of diabetes diagnosis, current treatment, or ongoing medication. In addition, a fasting blood glucose of 126 mg/dL or higher, or a blood glucose of 200 mg/dL or higher 2 hours after an oral glucose tolerance test, was also defined as diabetes onset. Subjects who already met the definition of diabetes at the baseline survey were excluded from the analysis.
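The outcome definition above can be expressed as a simple predicate. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and treating a missing laboratory value as "criterion not met" is an assumption that the specification does not address.

```python
from typing import Optional

FASTING_GLUCOSE_CUTOFF_MG_DL = 126.0   # fasting blood glucose threshold
OGTT_2H_GLUCOSE_CUTOFF_MG_DL = 200.0   # 2-hour oral glucose tolerance test threshold


def is_diabetes_onset(reported_diagnosis: bool,
                      treated_or_on_medication: bool,
                      fasting_glucose_mg_dl: Optional[float] = None,
                      ogtt_2h_glucose_mg_dl: Optional[float] = None) -> bool:
    """Apply the questionnaire and blood test criteria for diabetes onset."""
    if reported_diagnosis or treated_or_on_medication:
        return True
    # Assumption: a missing measurement (None) does not satisfy a criterion.
    if (fasting_glucose_mg_dl is not None
            and fasting_glucose_mg_dl >= FASTING_GLUCOSE_CUTOFF_MG_DL):
        return True
    if (ogtt_2h_glucose_mg_dl is not None
            and ogtt_2h_glucose_mg_dl >= OGTT_2H_GLUCOSE_CUTOFF_MG_DL):
        return True
    return False
```

A subject for whom this predicate is true at the baseline survey would be excluded from the analysis, and a subject for whom it first becomes true at a later follow-up would be counted as an onset event.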

Example 2: Construction of an integrated model for detecting the probability of developing type 2 diabetes

2-1. Construction of the clinical information analysis model

To analyze the clinical information that affects diabetes onset, the items measured in a basic health checkup were taken as candidate factors, and risk factors were selected by applying a Cox proportional hazards model (hereinafter, Cox PH model). First, a Cox PH model was fitted with age, sex, and each clinical information factor as explanatory variables. Factors whose likelihood-ratio (LR) test p value was less than 0.05 were then converted to categorical form and re-tested, and the factors that remained significant were selected. For the multiple regression analysis of the selected factors, backward elimination was applied to a Cox PH model with age, sex, family history of diabetes, and the selected clinical information factors as explanatory variables, yielding the final clinical information factors. Variables were removed so as to minimize the AIC, except that a variable whose p value was less than 0.05 was not removed (FIG. 1).

The clinical information factors selected at each step are shown in Table 5 below. The finally selected factors are fasting blood glucose (FBS), alanine aminotransferase (ALT), gamma-GTP, high-density lipoprotein cholesterol (HDL), triglycerides (TG), systolic blood pressure (SBP), and body mass index (BMI).

Selection of clinical information factors

| Factor | Single analysis: continuous | Single analysis: categorical | Multiple analysis: final selected factor |
|---|---|---|---|
| Fasting blood glucose (FBS) | <0.001 | <0.001 | <0.001 |
| Creatinine | 0.239 | not performed | excluded |
| Liver enzyme (AST) | <0.001 | <0.001 | excluded |
| Liver enzyme (ALT) | <0.001 | <0.001 | <0.001 |
| Gamma-GTP | <0.001 | <0.001 | <0.001 |
| Total cholesterol | <0.001 | <0.001 | excluded |
| HDL cholesterol (HDL) | <0.001 | <0.001 | 0.081 |
| Triglycerides (TG) | <0.001 | <0.001 | <0.001 |
| Hemoglobin | <0.001 | <0.001 | excluded |
| Systolic blood pressure (SBP) | <0.001 | <0.001 | 0.006 |
| Diastolic blood pressure (DBP) | <0.001 | <0.001 | excluded |
| Waist circumference | <0.001 | <0.001 | excluded |
| Body mass index (BMI) | <0.001 | <0.001 | 0.119 |

* The single analysis was adjusted for age and sex; the multiple analysis was adjusted for age, sex, and family history.

2-2. Construction of the genetic information analysis model

For the candidate SNPs (rs10842994, rs10811661, rs947474, rs3024505, rs4402960, rs1048886, rs9268645), an additive genetic model was applied in a Cox PH model with age, sex, family history of diabetes, the finally selected clinical information factors, and each SNP as explanatory variables. Of the categorical genotypes AA and BB, the allele of the genotype with the larger hazard ratio (HR) was taken as the risk allele, and the genotype was converted to a continuous score (0, 1, 2; genotype score, GS).

Several of the primarily selected genetic factors were combined into a single score for use as a predictive marker of diabetes onset. For every combination of one to seven of the seven primarily selected SNPs, the genotype scores (GS) of the SNPs in the combination were combined into one SNP point. That is, $7 + \binom{7}{2} + \cdots + \binom{7}{7}$ SNP points were each applied as a single candidate variable. For the Cox PH models in which each SNP point was added to the environmental factors finally selected in 2-1, the candidates were narrowed to the SNP points satisfying conditions (1) to (3) below.

(1) The SNP point satisfies the PH assumption, and the LR test p value is < 0.05.

(2) The time-dependent AUC of the model with the SNP point added is larger than that of the model without it.

(3) The Hosmer-Lemeshow test result of the model with the SNP point added is not worse than that of the model without it.

Of the 127 SNP points, 59 satisfied conditions (1) to (3). The results for the seven models with the highest time-dependent AUC and survival NRI values are presented in Table 6. In all seven models, the risk of diabetes onset increased significantly as the SNP point increased. Regardless of the onset horizon, the model using SNP point 117 had the highest time-dependent AUC, but its difference from the model using SNP point 89, which uses one fewer SNP, was small and not statistically significant. Likewise, the model using SNP point 117 had the largest survival NRI, but the survival NRI of the model using SNP point 89 was not much lower, and its Event NRI was larger. It was therefore concluded that predicting diabetes onset with SNP point 89, which uses fewer SNPs, is the most efficient.
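The count of 127 candidate SNP points follows from taking every non-empty subset of the seven candidate SNPs. The Python sketch below enumerates these subsets with the standard library. The variable names are illustrative, and the enumeration order produced here is not claimed to match the SNP point numbering (47, 53, 89, and so on) used in the specification.

```python
from itertools import combinations
from math import comb

# The seven candidate SNPs listed in section 2-2.
CANDIDATE_SNPS = [
    "rs10842994", "rs10811661", "rs947474", "rs3024505",
    "rs4402960", "rs1048886", "rs9268645",
]

# Every combination of 1 to 7 SNPs; each one corresponds to a candidate SNP point.
snp_sets = [
    subset
    for size in range(1, len(CANDIDATE_SNPS) + 1)
    for subset in combinations(CANDIDATE_SNPS, size)
]

# 7 + C(7,2) + ... + C(7,7) = 2**7 - 1 = 127
expected_count = sum(comb(len(CANDIDATE_SNPS), k)
                     for k in range(1, len(CANDIDATE_SNPS) + 1))

# The four-SNP set reported for SNP point 89 is one of the enumerated subsets.
snp_point_89 = ("rs10842994", "rs10811661", "rs4402960", "rs1048886")
```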

Results with different SNP points

| SNP point^*1 | HR of SNP point (p value) | AUC (95% CI), onset within 5 years | AUC (95% CI), onset within 7 years | NRI^*2 (Event NRI, non-Event NRI), onset within 5 years | NRI^*2 (Event NRI, non-Event NRI), onset within 7 years |
|---|---|---|---|---|---|
| No. 47 | 1.3049 (0.0009) | 0.7570 (0.7365-0.7774) | 0.7494 (0.7312-0.7676) | 0.1305 (0.1090, 0.0215) | 0.1238 (0.0992, 0.0244) |
| No. 53 | 1.2880 (0.0013) | 0.7571 (0.7366-0.7776) | 0.7492 (0.7310-0.7674) | 0.1353 (0.0759, 0.0594) | 0.1033 (0.0473, 0.0558) |
| No. 86 | 1.3570 (0.0007) | 0.7568 (0.7363-0.7772) | 0.7490 (0.7308-0.7672) | 0.1423 (0.0976, 0.0446) | 0.1273 (0.0807, 0.0464) |
| No. 89 | 1.4578 (0.0001) | 0.7573 (0.7369-0.7778) | 0.7495 (0.7312-0.7677) | 0.1505 (0.1018, 0.0487) | 0.1316 (0.0807, 0.0507) |
| No. 92 | 1.3451 (0.0010) | 0.7568 (0.7363-0.7773) | 0.7488 (0.7306-0.7671) | 0.1454 (0.0643, 0.0811) | 0.1130 (0.0337, 0.0791) |
| No. 116 | 1.5142 (0.0001) | 0.7571 (0.7366-0.7776) | 0.7491 (0.7309-0.7674) | 0.1455 (0.0972, 0.0483) | 0.1231 (0.0750, 0.0479) |
| No. 117 | 1.4976 (0.0001) | 0.7575 (0.7370-0.7780) | 0.7496 (0.7313-0.7678) | 0.1666 (0.0877, 0.0789) | 0.1362 (0.0558, 0.0802) |

*1. SNP point 47 (rs10842994, rs10811661, rs1048886); SNP point 53 (rs10842994, rs4402960, rs1048886); SNP point 86 (rs10842994, rs10811661, rs947474, rs1048886); SNP point 89 (rs10842994, rs10811661, rs4402960, rs1048886); SNP point 92 (rs10842994, rs947474, rs4402960, rs1048886); SNP point 116 (rs10842994, rs10811661, rs947474, rs4402960, rs1048886); SNP point 117 (rs10842994, rs10811661, rs3024505, rs4402960, rs1048886).

*2. Event NRI: among cases that developed diabetes within the specified period, the proportion whose predicted onset probability was higher in the model with the SNP point added than in the model without it, minus the proportion whose predicted probability was lower. Non-Event NRI: among cases that did not develop diabetes within the specified period, the proportion whose predicted onset probability was lower in the model with the SNP point added than in the model without it, minus the proportion whose predicted probability was higher.

NRI = Event NRI + non-Event NRI
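The NRI definitions above can be illustrated with a short Python sketch. This is a simplified, category-free version that assumes every subject's event status at the horizon is known; the survival NRI reported in the specification also has to account for censored follow-up, which this sketch does not attempt. All function names and the toy probabilities are hypothetical.

```python
from typing import Sequence


def event_nri(p_with: Sequence[float], p_without: Sequence[float]) -> float:
    """Among events: share with increased probability minus share with decreased."""
    up = sum(1 for a, b in zip(p_with, p_without) if a > b)
    down = sum(1 for a, b in zip(p_with, p_without) if a < b)
    return (up - down) / len(p_with)


def non_event_nri(p_with: Sequence[float], p_without: Sequence[float]) -> float:
    """Among non-events: share with decreased probability minus share with increased."""
    up = sum(1 for a, b in zip(p_with, p_without) if a > b)
    down = sum(1 for a, b in zip(p_with, p_without) if a < b)
    return (down - up) / len(p_with)


def total_nri(event_part: float, non_event_part: float) -> float:
    """NRI = Event NRI + non-Event NRI."""
    return event_part + non_event_part


# Toy predicted probabilities (hypothetical), with and without the SNP point.
e = event_nri([0.6, 0.7, 0.2], [0.5, 0.5, 0.3])                  # (2 - 1) / 3
ne = non_event_nri([0.1, 0.2, 0.5, 0.3], [0.2, 0.3, 0.4, 0.3])   # (2 - 1) / 4

# Components reported for SNP point 89 at 5 years: 0.1018 and 0.0487.
reported_sum = total_nri(0.1018, 0.0487)
```

The last line reproduces the 5-year total of 0.1505 reported for SNP point 89 from its two components.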

2-3. Construction of the integrated model

A Cox PH model was constructed with the clinical information and genetic factors selected in Examples 2-1 and 2-2 as explanatory variables.

In the integrated analysis model, the variables listed in Table 1 are each multiplied by a weight and summed, and the result is then entered into Equation 1.

Equation 1: $1 - \exp[-H_0(t)]^{\exp(\beta X)}$

As a result, it was confirmed that the constructed model can predict the probability of developing diabetes with high accuracy (FIG. 2).
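A minimal Python sketch of the Equation 1 calculation is given below for illustration. It uses the baseline cumulative hazard values recited in the claims (0.00337 at 5 years and 0.00500 at 7 years). The function name and the example values of the weighted sum βX are hypothetical, because the Table 1 weights are not reproduced in this section.

```python
import math

# Baseline cumulative hazard H0(t), as recited in the claims.
BASELINE_CUMULATIVE_HAZARD = {5: 0.00337, 7: 0.00500}


def onset_probability(beta_x: float, years: int) -> float:
    """Probability of onset within `years`: 1 - exp(-H0(t)) ** exp(beta_x).

    beta_x is the weighted sum of the model variables (the linear predictor).
    The expression is evaluated as 1 - exp(-H0(t) * exp(beta_x)), which is
    algebraically identical.
    """
    h0 = BASELINE_CUMULATIVE_HAZARD[years]
    return 1.0 - math.exp(-h0 * math.exp(beta_x))


# Hypothetical linear predictor values, for illustration only.
p5 = onset_probability(beta_x=3.0, years=5)
p7 = onset_probability(beta_x=3.0, years=7)
```

Because $H_0(t)$ is larger at 7 years than at 5 years, the 7-year probability is always at least the 5-year probability for the same βX, and a larger βX raises the probability at either horizon.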

Example 3: Performance verification of the constructed model

Diabetes onset predicted with the model constructed in Example 2-3 was compared with diabetes onset predicted from clinical information alone. When onset was predicted with the integrated model, the time-dependent AUC, the Event NRI, and the non-Event NRI all increased regardless of the onset horizon, confirming that onset can be predicted with higher accuracy (Table 7).

Performance evaluation of the diabetes onset prediction models

| Model | AUC (95% CI), onset within 5 years | AUC (95% CI), onset within 7 years | NRI (Event NRI, non-Event NRI), onset within 5 years | NRI (Event NRI, non-Event NRI), onset within 7 years |
|---|---|---|---|---|
| ENV model | 0.7552 (0.7348-0.7757) | 0.7483 (0.7301-0.7665) | - | - |
| Integrated model | 0.7573 (0.7369-0.7778) | 0.7495 (0.7312-0.7677) | 0.1505 (0.1018, 0.0487) | 0.1316 (0.0807, 0.0507) |

Specific parts of the present invention have been described in detail above. It will be apparent to those of ordinary skill in the art that this specific description is merely a preferred embodiment and that the scope of the present invention is not limited by it. Accordingly, the substantial scope of the present invention is defined by the appended claims and their equivalents.

Claims

delete

A method for predicting the onset of diabetes, performed by a computing device including a processor, the method comprising:
a) collecting clinical information of a patient and receiving DNA information;
b) deriving clinical information values by assigning weights to the clinical information;
c) determining the genotypes of the SNPs rs10811661, rs947474, rs1048886, and rs9268645 from the DNA information and assigning each a score (Genotype Score, GS);
d) deriving an SNP value (SNP point) by assigning a weight to each SNP; and
e) calculating the probability of developing diabetes within a specific period based on the values of b) and d),
wherein the GS is a value of 0, 1, or 2 indicating the influence of each SNP genotype on the onset of diabetes,
the SNP point is the sum of the GS values each multiplied by its weight,
the probability of developing diabetes within the specific period is calculated by Equation 1 below:
Equation 1: $1 - \exp[-H_0(t)]^{\exp(\beta X)}$
and βX is calculated as the sum of all the markers in Table 1, each multiplied by its weight.

The method of claim 3, wherein the clinical information is age, sex, family history, fasting blood glucose (FBS), alanine aminotransferase (ALT), gamma-GTP, high-density lipoprotein cholesterol (HDL), triglycerides (TG), systolic blood pressure (SBP), and body mass index (BMI).

delete

The method of claim 3, wherein the SNP point is calculated by the following Equation 3:
Equation 3: SNP point=
0.1595Xrs10811661'GS+0.2820Xrs947474'GS+0.2498Xrs1048886'GS+0.3088Xrs9268645'GS

The method of claim 6, wherein the GS has the same value as in Table 2.

The method of claim 3, wherein $H_0(t)$ is 0.00337 at 5 years and 0.00500 at 7 years.

A method of providing information for further screening of diabetic patients, comprising the steps of:
a) estimating the probability of developing diabetes within a specific period by the method of any one of claims 3, 4, 6 to 8;
b) distinguishing the influence of genetic factors and environmental factors on the incidence probability; and
c) providing additional necessary examination contents.

10. The method of claim 9, further comprising:
d) providing additional examination details for confirming the onset of other diseases.

The method of claim 10, wherein the other disease is one or more selected from the group consisting of hypertension, coronary artery disease, and cerebrovascular disease.

A method of providing information for the management of a diabetic patient comprising the steps of:
a) estimating the probability of developing diabetes within a specific period by the method of any one of claims 3, 4, 6 to 8;
b) distinguishing the influence of genetic factors and environmental factors on the incidence probability; and
c) providing dietary, exercise, and lifestyle guidance for lowering the onset probability.

An insurance design and insurance product matching system based on clinical and genetic variation information, the system comprising:
a diabetes onset probability calculation device that calculates a user's probability of developing diabetes using the method of any one of claims 3, 4, and 6 to 8;
a server that receives and analyzes diabetes onset probability information from the diabetes onset probability calculation device and matches insurance products; and
a terminal that outputs the information of the server on a screen,
wherein the system calculates the user's diabetes onset probability through the diabetes onset probability calculation device, and the server receives the diabetes onset probability information, calculates future diabetes treatment costs, and matches and recommends insurance products based on the calculation result, and
wherein the server comprises:
an insurance product storage unit that stores information on all insurance products to be provided to the user;
an average treatment cost storage unit that stores the average treatment cost of diabetes;
a diabetes onset probability information storage unit that stores the diabetes onset probability information calculated by the diabetes onset probability calculation device;
a customer authentication unit that authenticates the user as a customer;
a diabetes onset probability information receiving unit that receives the diabetes onset probability from the diabetes onset probability calculation device based on the information of the customer authenticated by the customer authentication unit;
an analysis unit that calculates an expected future treatment cost based on the diabetes onset probability information received through the diabetes onset probability information receiving unit;
an insurance coverage comparison unit that identifies the insurance held by the customer based on the information of the customer authenticated by the customer authentication unit, receives the customer's coverage details for the identified insurance from the insurance product storage unit, and compares them with the expected future treatment cost calculated by the analysis unit;
an insurance matching unit that determines, based on the expected future treatment cost calculated by the analysis unit and the comparison made by the insurance coverage comparison unit, whether the expected future treatment cost is covered by the insurance held by the customer, and matches insurance based on the result of that determination; and
an insurance information recommendation unit that recommends to the user the insurance matched by the insurance matching unit.