KR20230019347A - Nose shape-associated SNP markers and uses thereof

Info

Publication number: KR20230019347A
Application number: KR1020210100725A
Authority: KR
Inventors: 차성원; 반효정; 백수진; 박상민; 안일구; 이시우
Original assignee: 한국 한의학 연구원
Priority date: 2021-07-30
Filing date: 2021-07-30
Publication date: 2023-02-08
Also published as: KR102592866B1

Abstract

The present invention relates to SNP markers associated with the nose phenotype, from among 41 phenotypes defined as the lengths of segments connecting 25 points important for face recognition, and to uses thereof. Specifically, the present invention relates to a marker composition for determining the nose phenotype, comprising SNP markers in seven genes associated with four phenotypes related to the lengths of the nose region; a method of providing information necessary for determining the nose phenotype, comprising a step of identifying the SNP markers; and a kit for determining the nose phenotype, comprising an agent for identifying the genotypes of the SNP markers. The SNP set of the present invention was identified for phenotypes related to the lengths of the nose region, and can be used not only in finding missing persons and in criminal investigations, but also in determining physical appearance types and in effective health management.

Description

Nose shape-associated SNP markers and uses thereof

The present invention relates to SNP markers associated with nose shape, that is, the nose phenotype, from among 41 phenotypes defined as the lengths of segments connecting 25 points important for face recognition, and to uses thereof. Specifically, it relates to a marker composition for determining the nose phenotype, comprising SNP markers in seven genes associated with four phenotypes related to the lengths of the nose region; a method for providing information necessary for determining the nose phenotype, comprising a step of identifying the SNP markers; and a kit for determining the nose phenotype, comprising an agent for identifying the genotypes of the SNP markers.

Biometric recognition technology that uses facial shape information is being actively researched, and various face recognition methods are known. Face recognition technology is based on methods such as measuring similarity for individual regions of a face image (for example, the eyes, mouth, nose, and forehead) and comparing two face images. Research is under way on methods that improve the recognition rate by accounting for noise or error factors caused by light, such as ambient lighting, and on methods that process face recognition efficiently. For example, Korean Patent Registration No. 10-1853006 discloses a method of generating a database from three-dimensional images and obtaining a face shape through face recognition that uses nose detection in a depth image.

In particular, among the parts of the face, the nose anatomically consists of the external nose, which protrudes from the face, and the nasal cavity, which forms its interior. The shape of the nose differs among animal species, and in humans it also shows marked ethnic and individual differences; its shape strongly affects facial appearance.

However, although phenotypes such as facial shape are in practice regulated by genetic characteristics, the specific genes that determine facial phenotypes and their mechanisms of action remain poorly understood. Genetic variants associated with facial phenotypes are identified using genome-wide association studies (GWAS), and results have been published for several populations, including Koreans, Chinese, and Americans. Because the published studies use different region-specific phenotypes, a variety of SNPs and associated genes have been reported, but gene function and its interpretation remain difficult.

Against this background, the present inventors made intensive research efforts to develop a method for determining facial shape more effectively using genes. As a result, they confirmed that the phenotype of the nose region can be inferred more easily by using SNP markers derived from seven genes associated with phenotypes related to the lengths of the nose region, a facial region that strongly influences facial shape, and thereby completed the present invention.

One object of the present invention is to provide a marker composition for determining the nose phenotype, which includes, as markers, SNPs contained in seven genes associated with the nose phenotype.

Another object of the present invention is to provide a method for providing information necessary for determining the nose phenotype, comprising a step of identifying the genotypes of the SNPs from a sample of a subject of interest.

Still another object of the present invention is to provide a kit for determining the nose phenotype, comprising an agent for identifying the genotypes of the SNPs.

One embodiment of the present invention for achieving the above objects provides a marker composition for determining the nose phenotype, which includes, as markers, SNPs contained in seven genes associated with the nose phenotype.

Specifically, the marker composition for determining the nose phenotype provided by the present invention includes, as markers, SNPs contained in one or more genes selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.

In addition, the SNP may be one or more SNPs selected from the group consisting of rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs16942475, rs56292972, rs7406355, rs2270424, rs12937642, rs2292749, rs11217253, rs7937208, rs11217255, rs59969944, rs11112705, rs4964157, rs7134222, rs3782695, rs1215807, and combinations thereof.

The term "nose phenotype" as used in the present invention comprehensively covers not only the overall or partial shape of the nose as determined by genetic characteristics, but also specific distances or ratios of the nose shape. "Nose phenotype" may be used interchangeably with "nose shape".

In the present invention, the nose phenotype may include any part of the nose whose shape is innately determined by genetic characteristics. Specific examples of the nose phenotype include the length from between the eyes to the nasal ala, the nose length, the length from the nose tip to the nasal ala, and the nasal ala width, each individually or in combination. More specifically, the nose phenotype may be the four phenotypes corresponding to the length from between the eyes to the nasal ala, the nose length, the length from the nose tip to the nasal ala, and the nasal ala width, but is not limited thereto.

In the present invention, the gene associated with the nose phenotype may be one or more selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.

The term "HOXB2 (Homeobox protein Hox-B2) gene" as used in the present invention refers to a gene that is a member of the Antp homeobox family and encodes a nuclear protein with a homeobox DNA-binding domain; it is included in the cluster of homeobox B genes located on chromosome 17. The protein it encodes is known to function as a sequence-specific transcription factor involved in development. Specific nucleotide sequence information for the HOXB2 gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_002145.4.

In the present invention, the HOXB2 gene can be interpreted as the source gene of rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs7406910, rs16942475, rs56292972, and rs7406355, which are SNPs that affect the nose phenotype.

The term "YLPM1 (YLP motif-containing protein 1) gene" as used in the present invention refers to a gene encoding a YLP motif-containing protein that binds to the core promoter of TERT (telomerase reverse transcriptase) and regulates its down-regulation, thereby reducing telomerase activity during the differentiation of embryonic stem cells. Specific nucleotide sequence information for the YLPM1 gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_019589.3.

In the present invention, the YLPM1 gene can be interpreted as the source gene of rs2270424, a SNP that affects the nose phenotype.

The term "FAM134C (family with sequence similarity 134, member C) gene" as used in the present invention refers to a gene encoding a protein that mediates NRF1 (nuclear respiratory factor 1)-enhanced neurite outgrowth; it is also named the RETREG3 (reticulophagy regulator 3) gene. Specific nucleotide sequence information for the FAM134C gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_178126.4.

In the present invention, the FAM134C gene can be interpreted as the source gene of rs12937642, a SNP that affects the nose phenotype.

The term "TUBG1 (Tubulin gamma-1) gene" as used in the present invention refers to a gene encoding a member of the tubulin superfamily. The TUBG1 protein expressed from it is part of a complex called the gamma-tubulin ring complex, and is known to mediate microtubule nucleation and to be involved in microtubule formation and cell cycle progression. Specific nucleotide sequence information for the TUBG1 gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_001070.5.

In the present invention, the TUBG1 gene can be interpreted as the source gene of rs12937642 and rs2292749, which are SNPs that affect the nose phenotype.

The term "THY1 (Thy-1 membrane glycoprotein, or Thy-1 cell surface antigen) gene" as used in the present invention refers to a gene, also named CD90 (Cluster of Differentiation 90), that encodes an N-glycosylated, glycophosphatidylinositol (GPI)-anchored, conserved cell surface protein with a single V-like immunoglobulin domain. The THY1 protein expressed from it is used as a marker for various stem cells and for the axonal processes of mature neurons, and is known to be involved in cell-cell or cell-ligand interactions during synaptogenesis and other events in the brain. Specific nucleotide sequence information for the THY1 gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_006288.5.

In the present invention, the THY1 gene can be interpreted as the source gene of rs11217253, rs7937208, rs11217255, and rs59969944, which are SNPs that affect the nose phenotype.

The term "IGFBP4 (Insulin-like growth factor-binding protein 4) gene" as used in the present invention refers to a gene that is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The IGFBP4 protein expressed from it binds to insulin-like growth factors (IGF) I and II and circulates in plasma in glycosylated and non-glycosylated forms; binding of the protein is known to prolong the half-life of the IGFs and to alter their interaction with cell surface receptors. Specific nucleotide sequence information for the IGFBP4 gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_001552.3.

In the present invention, the IGFBP4 gene can be interpreted as the source gene of rs2023906, a SNP that affects the nose phenotype.

The term "NUAK1 (NUAK family SNF1-like kinase 1) gene" as used in the present invention refers to a gene, also named ARK5 (AMPK-related protein kinase 5), that encodes a kinase protein. The NUAK1 protein expressed from it is known to promote tumor cell survival under the control of Akt and to increase the production of MT1-MMP, which activates MMP-2 and MMP-9, both involved in tumor metastasis. Specific nucleotide sequence information for the NUAK1 gene and amino acid sequence information for the protein are reported in databases such as NCBI, for example under GenBank Accession No. NM_014840.3.

In the present invention, the NUAK1 gene can be interpreted as the source gene of rs11112705, rs4964157, rs7134222, rs3782695, and rs1215807, which are SNPs that affect the nose phenotype.

The SNPs of the present invention may be rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs7406910, rs16942475, rs56292972, and rs7406355, which are SNPs of the HOXB2 gene; rs2270424, a SNP of the YLPM1 gene; rs12937642, a SNP of the FAM134C gene; rs12937642 and rs2292749, which are SNPs of the TUBG1 gene; rs11217253, rs7937208, rs11217255, and rs59969944, which are SNPs of the THY1 gene; rs2023906, a SNP of the IGFBP4 gene; and rs11112705, rs4964157, rs7134222, rs3782695, and rs1215807, which are SNPs of the NUAK1 gene.

The combination of the SNP markers is likewise not particularly limited; as an example, it may be a combination including 1 or more, 2 or more, 4 or more, 6 or more, 8 or more, 10 or more, 12 or more, 14 or more, 16 or more, 18 or more, or all 20 of the 20 SNP markers.

The term "determination" as used in the present invention refers to the process of deriving and specifying the characteristics of the nose phenotype through genotyping of the SNP markers.

According to one example of the present invention, facial phenotypes were derived from the facial images and genome-wide genotype data of the study participants, and a set-based analysis of the derived data was performed to identify a first round of candidate gene and SNP sets. Hi-C and Annotation Gene Functional Enrichment analyses of these candidates then yielded a second round of 439 genes and 621 associated SNPs. Finally, a replication analysis narrowed the 439 genes and 621 associated SNPs down to 7 genes and 20 associated SNPs (FIG. 1).

Another embodiment of the present invention provides a method for providing information necessary for determining the nose phenotype, comprising a step of identifying the genotypes of the SNPs of the present invention from a sample of a subject of interest.

Specifically, the method for providing information necessary for determining the nose phenotype may comprise a step of identifying, from a sample of a subject of interest, the genotypes of SNPs contained in a gene selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.

The genes, SNPs, and the like are the same as described above.

The genotypes may be identified using methods known in the art, for example by PCR.

The method for providing information necessary for determining the nose phenotype may further comprise, after identifying the genotypes of the SNPs, a step of associating the genotyped SNPs with the nose phenotype.

The step of associating the SNP genotypes with the nose phenotype may be performed by combining the genotypes of the identified SNPs, and the combination may be carried out by methods known in the art. For example, the SNP genotypes may be combined using linear or non-linear regression analysis; linear or non-linear classification analysis; ANOVA; neural network analysis; genetic analysis methods; support vector machine analysis; hierarchical or clustering analysis; hierarchical algorithms using decision trees, or kernel principal components analysis; Markov blanket analysis; recursive feature elimination or entropy-based recursive feature elimination; or forward floating search or backward floating search, each individually or in combination.

In addition, the SNP genotypes may be combined by an automated method using a computer algorithm.
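As an illustration of the simplest approach listed above, linear regression on combined genotypes, the following sketch counts effect alleles per SNP (additive coding), sums them into a score, and fits a one-variable least-squares line of a nose length against that score. The SNP names, effect alleles, and any data passed in are hypothetical placeholders; the source does not specify effect alleles or a particular model.

```python
from statistics import mean


def allele_count(genotype: str, effect_allele: str) -> int:
    """Additive coding: number of copies (0, 1, or 2) of the effect allele."""
    return genotype.count(effect_allele)


def additive_score(genotypes: dict, effect_alleles: dict) -> int:
    """Sum effect-allele counts over all SNPs listed in effect_alleles.

    Both dictionaries are keyed by SNP name; the names and alleles used
    in any call are hypothetical and not taken from the source.
    """
    return sum(allele_count(genotypes[snp], allele)
               for snp, allele in effect_alleles.items())


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all scores are identical; slope is undefined")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx
```

A per-subject score computed with `additive_score` can be passed, together with measured lengths in millimeters, to `fit_line`; the non-linear and classification methods named in the text would replace `fit_line` with a different model.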

Specific examples of the nose phenotype of the present invention include, but are not limited to, the length from between the eyes to the nasal ala, the nose length, the length from the nose tip to the nasal ala, and the nasal ala width, each individually or in combination, as described above.

The association between the nose phenotype and the SNPs is likewise not particularly limited. As an example, rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs7406910, rs16942475, rs56292972, and rs7406355, which are SNPs of the HOXB2 gene, may be associated with the length from between the eyes to the nasal ala; rs2270424, a SNP of the YLPM1 gene, may be associated with the length from between the eyes to the nasal ala; rs12937642, a SNP of the FAM134C gene, may be associated with the length from between the eyes to the nasal ala; rs12937642 and rs2292749, which are SNPs of the TUBG1 gene, may be associated with the length from between the eyes to the nasal ala; rs11217253, rs7937208, rs11217255, and rs59969944, which are SNPs of the THY1 gene, may be associated with the nose length; rs2023906, a SNP of the IGFBP4 gene, may be associated with the length from the nose tip to the nasal ala; and rs11112705, rs4964157, rs7134222, rs3782695, and rs1215807, which are SNPs of the NUAK1 gene, may be associated with the nasal ala width.
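The gene, SNP, and phenotype associations listed above can be held in a small lookup table. The sketch below transcribes them into a Python dictionary; the variable and function names and the short phenotype labels are illustrative choices, not part of the source. Note that rs2023906 and rs12937642 each appear under two genes.

```python
# Gene -> (associated nose phenotype, SNP rsIDs), transcribed from the text.
EYES_TO_ALA = "length from between the eyes to the nasal ala"
NOSE_LENGTH = "nose length"
TIP_TO_ALA = "length from the nose tip to the nasal ala"
ALA_WIDTH = "nasal ala width"

GENE_SNP_PHENOTYPE = {
    "HOXB2": (EYES_TO_ALA, ["rs2023906", "rs11652148", "rs7225635",
                            "rs2740757", "rs2555111", "rs7406910",
                            "rs16942475", "rs56292972", "rs7406355"]),
    "YLPM1": (EYES_TO_ALA, ["rs2270424"]),
    "FAM134C": (EYES_TO_ALA, ["rs12937642"]),
    "TUBG1": (EYES_TO_ALA, ["rs12937642", "rs2292749"]),
    "THY1": (NOSE_LENGTH, ["rs11217253", "rs7937208", "rs11217255",
                           "rs59969944"]),
    "IGFBP4": (TIP_TO_ALA, ["rs2023906"]),
    "NUAK1": (ALA_WIDTH, ["rs11112705", "rs4964157", "rs7134222",
                          "rs3782695", "rs1215807"]),
}


def phenotypes_for_snp(rsid: str) -> list:
    """Return sorted (gene, phenotype) pairs under which an rsID is listed."""
    return sorted((gene, phenotype)
                  for gene, (phenotype, snps) in GENE_SNP_PHENOTYPE.items()
                  if rsid in snps)
```

For example, `phenotypes_for_snp("rs2023906")` returns one entry for HOXB2 and one for IGFBP4, since that SNP is listed under both genes with different phenotypes.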

Still another embodiment of the present invention provides a kit for determining the nose phenotype, comprising an agent for identifying the genotypes of the SNP markers of the present invention.

Specifically, the kit of the present invention may comprise an agent for identifying the genotypes of SNPs contained in a gene selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.

The kit of the present invention may be used to amplify the SNP marker genes from a DNA sample isolated from a subject whose nose phenotype is to be determined, identify the genotypes of the amplified SNP markers, and use them to determine the nose phenotype of the subject.

As a specific example, the kit for determining the nose phenotype provided by the present invention may be a kit containing the components needed to perform multiplex PCR. In addition to primer pairs specific for each of the SNP marker genes, the multiplex PCR kit may include test tubes or other suitable containers, reaction buffers (varying in pH and magnesium concentration), deoxynucleotides (dNTPs), enzymes such as Taq polymerase, DNase and RNase inhibitors, DEPC-water, sterile water, and the like.

As another example, the kit for determining the nose phenotype provided by the present invention may be a kit containing the essential components needed to run a DNA chip. A DNA chip has nucleic acid species attached in a gridded array to a generally flat solid support, typically a glass surface no larger than a microscope slide. Because the nucleic acids are arranged regularly on the chip surface, multiple hybridization reactions occur between the nucleic acids on the chip and complementary nucleic acids in the solution applied to the chip surface, which enables massively parallel analysis. Such a chip can be used to identify multiple nose phenotypes, that is, nose shapes, simultaneously.

The SNP set of the present invention was identified for phenotypes related to the lengths of the nose region, and can be used not only in finding missing persons and in criminal investigations, but also in determining physical appearance types and in effective health management.

FIG. 1 is a schematic diagram outlining the research process used in the present invention to derive the genes and SNPs for determining the nose phenotype.
FIG. 2 is a schematic diagram showing the face recognition points set in order to extract the phenotypes of the whole face.
FIG. 3 is a schematic diagram showing the procedure for performing the set-based analysis.
FIG. 4 is a schematic diagram showing how the 7 genes and 20 SNPs associated with the nose phenotype provided by the present invention relate to each length of the nose region.

Hereinafter, the present invention is described in more detail through examples. These examples are intended only to illustrate the present invention, and the scope of the present invention is not limited to them.

Example 1: Study participants

Study participants were recruited from 2009 to 2012 in two regions of Korea, Ansan and Anseong, in cooperation with the Korean Genome and Epidemiology Study (KoGES) project (Group 1). Facial images and genome-wide genotype data (Affymetrix SNP chip 5.0) of the participants were used. After excluding samples with a history of cancer, sex inconsistency, cryptic relatedness, a low genotyping rate (<95%), or contamination, a total of 5,207 participants were included in the analysis.
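The exclusion criteria above amount to a per-sample quality-control filter. A minimal sketch follows, assuming each sample carries a list of genotype calls (with `None` for a missing call) and boolean flags for the other criteria. The `Sample` class and its field names are hypothetical; only the 95% genotyping-rate threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One participant's QC inputs (hypothetical structure)."""
    sample_id: str
    calls: list                      # genotype calls; None marks a missing call
    cancer_history: bool = False
    sex_mismatch: bool = False
    cryptic_related: bool = False
    contaminated: bool = False


def call_rate(calls: list) -> float:
    """Fraction of SNPs with a non-missing genotype call."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def passes_qc(sample: Sample, min_call_rate: float = 0.95) -> bool:
    """Apply the exclusion criteria: flags first, then the call-rate cutoff."""
    if (sample.cancer_history or sample.sex_mismatch
            or sample.cryptic_related or sample.contaminated):
        return False
    return call_rate(sample.calls) >= min_call_rate
```

Filtering a cohort is then `[s for s in samples if passes_qc(s)]`.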

In addition, the replication analysis was conducted on 2,000 people (Group 2) recruited through 19 oriental medicine hospitals from 2007 to 2012, for whom facial images and genome-wide genotype data (PMRA Asian SNP chip) were available.

Facial phenotypes, including those of the nose, were derived from the facial images and genome-wide genotype data of the study participants described above, and a set-based analysis of the derived data was performed to identify a first round of candidate gene and SNP sets. Hi-C and Annotation Gene Functional Enrichment analyses of these candidates then yielded a second round of 439 genes and 621 associated SNPs. Finally, the replication analysis narrowed the 439 genes and 621 associated SNPs down to 7 genes and 20 associated SNPs (FIG. 1).

FIG. 1 is a schematic diagram outlining the research process carried out in the present invention to derive genes and SNPs for determining the nose phenotype, that is, the nose shape.

Example 2: Facial phenotype extraction

Participants were photographed without makeup, from both the front and the side, with a digital camera (DSLR Nikon D90; Nikon AF 50-mm F1.8D lens; 3216 × 2136 pixels) under standard conditions (FIG. 2). Under the standard conditions, the hair was pulled back with a headband; the center point of the cheek and the two points connecting the facial contour to the upper auricular perimeter (for example, points 1 and 17 in FIG. 2) were positioned on the same horizontal line for each individual; and a ruler was placed approximately 10 mm below the chin so that pixels could be converted into millimeters.
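The pixel-to-millimeter conversion that the ruler makes possible can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure used in the study: the function names `mm_per_pixel` and `to_mm`, the ruler endpoint coordinates, and the 100 mm reference length are all hypothetical.

```python
import math


def mm_per_pixel(ruler_start, ruler_end, ruler_length_mm):
    """Return the scale factor (mm per pixel) from two pixel coordinates
    that mark a known physical length on the ruler."""
    pixel_length = math.dist(ruler_start, ruler_end)
    if pixel_length == 0:
        raise ValueError("ruler endpoints must differ")
    return ruler_length_mm / pixel_length


def to_mm(pixel_distance, scale):
    """Convert a distance measured in pixels into millimeters."""
    return pixel_distance * scale


# Hypothetical example: a 100 mm ruler segment spans 500 pixels in the image.
scale = mm_per_pixel((1000.0, 2000.0), (1500.0, 2000.0), 100.0)
width_mm = to_mm(180.0, scale)  # a 180-pixel segment measured on the face
```

Because the scale is derived per photograph, distances from images taken at slightly different camera distances become comparable once expressed in millimeters.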

FIG. 2 is a schematic diagram showing the face recognition points set for extracting whole-face phenotypes.

As shown in FIG. 2, 25 face recognition points were set for the whole face, and the distances between these points were measured region by region to produce a total of 41 phenotypes. For the nose region, 12 phenotypes were defined from the segments between 6 of the 25 face recognition points (points 2, 11, 12, 13, 14 and 22), as follows:

1. point2-point22 (length from the glabella to between the eyes, pheno23)

2. point12-point22 (length from between the eyes to the nose tip, pheno24)

3. point2-point11 (length from the glabella to the nose tip, pheno25)

4. point2-point12 (length from the glabella to the nasal septum, pheno26)

5. point2-point13 (length from between the eyes to the left nasal ala, pheno27)

6. point2-point14 (length from between the eyes to the right nasal ala, pheno28)

7. point13-point14 (alar width, pheno29)

8. point11-point12 (length from the nose tip to the nasal septum, pheno30)

9. point11-point13 (length from the nose tip to the left nasal ala, pheno31)

10. point11-point14 (length from the nose tip to the right nasal ala, pheno32)

11. point12-point13 (length from the nasal septum to the left nasal ala, pheno33)

12. point12-point14 (length from the nasal septum to the right nasal ala, pheno34)
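The 12 nose phenotypes listed above are point-to-point distances, so they could be computed from landmark coordinates roughly as follows. This is a sketch only: the point pairs are copied from the list, while the function name `nose_distances`, the use of two-dimensional coordinates already converted to millimeters, and the example coordinates are hypothetical.

```python
import math

# Point pairs for the 12 nose phenotypes, copied from the list above.
NOSE_PHENOTYPES = {
    "pheno23": (2, 22),
    "pheno24": (12, 22),
    "pheno25": (2, 11),
    "pheno26": (2, 12),
    "pheno27": (2, 13),
    "pheno28": (2, 14),
    "pheno29": (13, 14),
    "pheno30": (11, 12),
    "pheno31": (11, 13),
    "pheno32": (11, 14),
    "pheno33": (12, 13),
    "pheno34": (12, 14),
}


def nose_distances(landmarks_mm):
    """Map each phenotype name to the Euclidean distance between its two
    points; `landmarks_mm` maps a point number to (x, y) in millimeters."""
    return {
        name: math.dist(landmarks_mm[a], landmarks_mm[b])
        for name, (a, b) in NOSE_PHENOTYPES.items()
    }


# Hypothetical coordinates for the six nose-related points.
example_landmarks = {
    2: (0.0, 0.0),
    22: (0.0, -10.0),
    11: (0.0, -50.0),
    12: (0.0, -55.0),
    13: (-18.0, -48.0),
    14: (18.0, -48.0),
}
distances = nose_distances(example_landmarks)
```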

Example 3: Set-based analysis

To select the genes and SNP sets associated with the nose phenotypes derived in Example 2, set-based analysis was performed using the PLINK program (FIG. 3).

FIG. 3 is a schematic diagram showing the procedure for the set-based analysis.

As shown in FIG. 3, information on approximately 20,000 genes in total, covering all publicly available mRNAs and the like, was collected together with information on the SNP sets located in those gene regions. Linear regression analysis (covariates: age and sex) was then performed on the 12 nose phenotypes using the collected SNPs, and representative SNPs were selected by excluding redundant SNPs in linkage disequilibrium (LD; r² > 0.5).
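The exclusion of redundant LD SNPs can be illustrated with a greedy pruning sketch. It rests on several assumptions that are not taken from the source: LD is approximated by the squared Pearson correlation of genotype dosages (0, 1, 2), SNPs are visited in order of ascending single-SNP p-value, and the names `r_squared` and `prune_by_ld` as well as the example data are hypothetical. It is not a reimplementation of PLINK.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no correlation information
    return sxy * sxy / (sxx * syy)


def prune_by_ld(genotypes, p_values, r2_threshold=0.5):
    """Keep a SNP only if its r² with every already-kept SNP does not
    exceed the threshold; SNPs with smaller p-values are considered first."""
    kept = []
    for snp in sorted(genotypes, key=lambda s: p_values[s]):
        if all(r_squared(genotypes[snp], genotypes[k]) <= r2_threshold
               for k in kept):
            kept.append(snp)
    return kept


# Hypothetical genotype dosages for eight individuals.
genotypes = {
    "snpA": [0, 1, 2, 0, 1, 2, 0, 1],
    "snpB": [0, 1, 2, 0, 1, 2, 0, 1],  # identical to snpA, so r² = 1
    "snpC": [2, 0, 1, 1, 0, 2, 0, 1],  # nearly uncorrelated with snpA
}
p_values = {"snpA": 0.001, "snpB": 0.002, "snpC": 0.03}
representatives = prune_by_ld(genotypes, p_values)
```

In this example snpB is dropped as redundant with snpA, while snpC is retained as an independent representative.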

The association statistics between the selected SNPs and the phenotype were computed repeatedly over 10,000 permutations, the mean of the single-SNP statistics within each set was calculated, and the significant SNP sets were finally determined (P_permutation < 0.05).
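A permutation-based set test of this kind can be sketched as follows. The sketch makes simplifying assumptions that are not from the source: the single-SNP statistic is the squared Pearson correlation between genotype and phenotype (the study used linear regression adjusted for age and sex), the empirical p-value uses the (hits + 1) / (permutations + 1) convention, and all function names and data are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def set_statistic(snp_genotypes, phenotype):
    """Mean of the single-SNP statistics over the SNPs in the set."""
    return sum(r_squared(g, phenotype) for g in snp_genotypes) / len(snp_genotypes)


def permutation_p(snp_genotypes, phenotype, n_perm=10_000, seed=0):
    """Empirical p-value: share of phenotype shufflings whose set statistic
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = set_statistic(snp_genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if set_statistic(snp_genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical set of two SNPs genotyped in 30 individuals.
snp1 = [0, 1, 2] * 10
snp2 = [2, 1, 0, 0, 1, 2] * 5
phenotype = [g + 0.1 * ((i % 5) - 2) for i, g in enumerate(snp1)]
p_set = permutation_p([snp1, snp2], phenotype, n_perm=500)
```

Shuffling the phenotype breaks any genotype-phenotype link while preserving the LD structure among the SNPs in the set, which is why the set statistic is compared against its permuted distribution rather than a theoretical one.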

Example 4: Hi-C and functional enrichment analysis

To identify the target genes of the SNP sets determined in Example 3, Hi-C data (Nat Genet 2019 Oct;51(10):1442-1449) were used.

Many SNPs are typically located outside promoter regions, and Hi-C data were used to identify the target genes of SNPs in such regions. To assess the biological function of the genes associated with the nose phenotype, functional enrichment analysis was then performed with the Enrichr analysis tool (https://amp.pharm.mssm.edu/Enrichr/). The analysis drew on the KEGG, GO, GWAS catalog, BioPlanet, and dbGAP databases, which contain biological pathways, disease association results, and gene ontologies, and P_adjusted < 0.05 was set as the significance level.
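The source does not state which multiple-testing adjustment underlies P_adjusted. As one common possibility, the sketch below applies the Benjamini-Hochberg procedure to a list of enrichment p-values and keeps the terms with an adjusted value below 0.05. The choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the term names and p-values are assumptions for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment results (term name -> raw p-value).
raw = {"term A": 0.01, "term B": 0.04, "term C": 0.03, "term D": 0.20}
adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
significant_terms = [term for term, p in adjusted.items() if p < 0.05]
```

In this example only "term A" passes the threshold: "term B" and "term C" have raw p-values below 0.05 but adjusted values of about 0.053.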

As a result, a set of 621 SNPs associated with 439 genes was selected in the first round.

Example 5: Replication analysis

Using the set of 621 SNPs associated with the 439 genes selected in the first round in Example 4, set-based analysis (covariates: age and sex) was performed on Group 2 defined in Example 1. The significance level was set at P < 0.05.

As a result, 20 SNPs associated with 7 genes were finally selected (Table 1 and FIG. 4).

Table 1

Gene | Phenotype | Facial region | Gene location | SNP IDs
HOXB2 | phe27, phe28 | Length from between the eyes to the nasal ala | 17q21.32 (chr17:48,540,894-48,545,109) | rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs7406910, rs16942475, rs56292972, rs7406355
YLPM1 | phe27, phe28 | Length from between the eyes to the nasal ala | 14q24.3 (chr14:75,230,019-75,304,021) | rs2270424
FAM134C | phe27, phe28 | Length from between the eyes to the nasal ala | 17q21.2 (chr17:40,731,526-40,761,445) | rs12937642
TUBG1 | phe27, phe28 | Length from between the eyes to the nasal ala | 17q21.31 (chr17:40,761,701-40,767,256) | rs12937642, rs2292749
THY1 | phe27, phe28, phe29 | Length from between the eyes to the nasal ala, and alar width | 11q23.3 (chr11:119,286,186-119,295,695) | rs11217253, rs7937208, rs11217255, rs59969944
IGFBP4 | phe31 | Length from the nose tip to the right nasal ala | 17q21.2 (chr17:38,599,702-38,613,977) | rs2023906
NUAK1 | pheno29 | Alar width | 12q23.3 (chr12:106,457,123-106,532,732) | rs11112705, rs4964157, rs7134222, rs3782695, rs1215807

FIG. 4 is a schematic diagram showing how the 7 genes and 20 SNPs associated with the nose phenotype provided by the present invention relate to each length of the nose region.

As shown in FIG. 4, the following associations were identified: the SNPs rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs7406910, rs16942475, rs56292972 and rs7406355 of the HOXB2 gene are associated with the length from between the eyes to the nasal ala; the SNP rs2270424 of the YLPM1 gene is associated with the length from between the eyes to the nasal ala; the SNP rs12937642 of the FAM134C gene is associated with the length from between the eyes to the nasal ala; the SNPs rs12937642 and rs2292749 of the TUBG1 gene are associated with the length from between the eyes to the nasal ala; the SNPs rs11217253, rs7937208, rs11217255 and rs59969944 of the THY1 gene are associated with the length from between the eyes to the nasal ala and with the alar width; the SNP rs2023906 of the IGFBP4 gene is associated with the length from the nose tip to the right nasal ala; and the SNPs rs11112705, rs4964157, rs7134222, rs3782695 and rs1215807 of the NUAK1 gene are associated with the alar width.
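For use in software, the gene-to-SNP associations listed above could be held in a simple lookup structure, for example as follows. The SNP identifiers are copied from the listing; the names `GENE_SNPS`, `SNP_GENES`, and `genes_for` are hypothetical.

```python
# Gene -> SNP identifiers, copied from the associations listed above.
GENE_SNPS = {
    "HOXB2": ["rs2023906", "rs11652148", "rs7225635", "rs2740757", "rs2555111",
              "rs7406910", "rs16942475", "rs56292972", "rs7406355"],
    "YLPM1": ["rs2270424"],
    "FAM134C": ["rs12937642"],
    "TUBG1": ["rs12937642", "rs2292749"],
    "THY1": ["rs11217253", "rs7937208", "rs11217255", "rs59969944"],
    "IGFBP4": ["rs2023906"],
    "NUAK1": ["rs11112705", "rs4964157", "rs7134222", "rs3782695", "rs1215807"],
}

# Reverse index: SNP identifier -> genes it is listed under.
SNP_GENES = {}
for gene, snps in GENE_SNPS.items():
    for snp in snps:
        SNP_GENES.setdefault(snp, []).append(gene)


def genes_for(snp_id):
    """Return the genes a SNP is listed under, or an empty list if unknown."""
    return SNP_GENES.get(snp_id, [])
```

The reverse index makes it visible that some identifiers appear under more than one gene: `genes_for("rs12937642")` returns both FAM134C and TUBG1, and `genes_for("rs2023906")` returns both HOXB2 and IGFBP4.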

Meanwhile, it was examined whether the finally selected genes were significant in Groups 1 and 2 defined in Example 1 (Table 2).

Table 2

Set | No. of SNPs | Group 1 pheno27 | Group 1 pheno28 | Group 1 pheno29 | Group 2 pheno27 | Group 2 pheno28 | Group 2 pheno29
HOXB2 | 9 | 0.019 | 0.046 | 0.991 | 0.002 | 0.007 | 0.146
YLPM1 | 1 | 0.012 | 0.026 | 0.209 | 0.036 | 0.048 | 0.101
FAM134C | 1 | 0.001 | 0.004 | 0.001 | 0.017 | 0.016 | 0.191
TUBG1 | 2 | 0.001 | 0.005 | 0.003 | 0.017 | 0.016 | 0.191
THY1 | 4 | 0.024 | 0.035 | 0.003 | 0.017 | 0.019 | 0.008
IGFBP4 | 1 | 0.687 | 0.854 | 0.623 | 0.256 | 0.226 | 0.363
NUAK1 | 6 | 0.187 | 0.228 | 0.002 | 0.959 | 0.769 | 0.030
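The P values of Table 2 can be screened programmatically for phenotypes that reach P < 0.05 in both groups, as in the following sketch. The values are copied from Table 2; the names `TABLE2` and `replicated` are hypothetical. IGFBP4 is listed in Table 1 against phe31, which is not among the columns of Table 2, so this check returns no phenotype for it.

```python
ALPHA = 0.05
PHENOTYPES = ("pheno27", "pheno28", "pheno29")

# Gene -> (Group 1 P values, Group 2 P values), copied from Table 2.
TABLE2 = {
    "HOXB2": ((0.019, 0.046, 0.991), (0.002, 0.007, 0.146)),
    "YLPM1": ((0.012, 0.026, 0.209), (0.036, 0.048, 0.101)),
    "FAM134C": ((0.001, 0.004, 0.001), (0.017, 0.016, 0.191)),
    "TUBG1": ((0.001, 0.005, 0.003), (0.017, 0.016, 0.191)),
    "THY1": ((0.024, 0.035, 0.003), (0.017, 0.019, 0.008)),
    "IGFBP4": ((0.687, 0.854, 0.623), (0.256, 0.226, 0.363)),
    "NUAK1": ((0.187, 0.228, 0.002), (0.959, 0.769, 0.030)),
}


def replicated(gene):
    """Phenotypes with P < ALPHA in both Group 1 and Group 2 for a gene."""
    group1, group2 = TABLE2[gene]
    return [
        pheno
        for pheno, p1, p2 in zip(PHENOTYPES, group1, group2)
        if p1 < ALPHA and p2 < ALPHA
    ]


summary = {gene: replicated(gene) for gene in TABLE2}
```

For the three tabulated phenotypes, the result agrees with the gene-phenotype pairs of Table 1: pheno27 and pheno28 for HOXB2, YLPM1, FAM134C and TUBG1; all three for THY1; and pheno29 for NUAK1.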

As shown in Table 2, the seven finally selected genes were confirmed to show significant P values in both Group 1 and Group 2.

Therefore, the seven finally selected genes and the set of 20 SNPs associated with them were found to be indicators associated with the nose phenotype of the face.

The two-step verification method of this example yielded the SNP sets, the individual SNPs, and their target genes. In total, an SNP set of 20 SNPs was constructed, representing factors associated with the lengths of the nose region among the facial shape phenotypes. The SNP set is shown in Table 1 and consists of genes involved in the lengths of the nose region, from which the size of the nose can be recognized.

In this example, verification based on Hi-C data was performed to find the target genes whose expression is regulated by the SNPs, which had been difficult to identify in practice. The selected SNP set provides a biological rationale because it was obtained not by simply picking the nearest genes but by identifying experimentally supported long-range targets of expression regulation. In addition, the SNP set showed significant P values in both Group 1 and Group 2, confirming reproducibility, which indicates that more significant genetic markers associated with facial phenotypes were selected.

From the above description, those skilled in the art to which the present invention pertains will understand that the present invention may be embodied in other specific forms without changing its technical spirit or essential features. In this regard, the embodiments described above should be understood as illustrative in all respects and not limiting. The scope of the present invention should be construed as including all changes or modifications derived from the meaning and scope of the claims set out below and their equivalents, rather than from the detailed description above.

Claims

A marker composition for determining a nose phenotype, comprising an SNP contained in any one or more genes selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.

The composition of claim 1, wherein the SNP is any one or more selected from the group consisting of rs2023906, rs11652148, rs7225635, rs2740757, rs2555111, rs16942475, rs56292972, rs7406355, rs2270424, rs12937642, rs2292749, rs11217253, rs7937208, rs11217255, rs59969944, rs11112705, rs4964157, rs7134222, rs3782695, rs1215807, and combinations thereof.

The composition of claim 1, wherein the nose phenotype is at least one selected from the group consisting of a length from between the eyes to the nasal ala, a length from the nose tip to the nasal ala, an alar width, and combinations thereof.

A method of providing information necessary for determining a nose phenotype, comprising confirming, from a sample of a subject of interest, the genotype of an SNP contained in a gene selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.

The method of claim 4, wherein the confirmation of the genotype is performed through PCR.

The method of claim 4, further comprising associating the genotyped SNP with the nose phenotype.

The method of claim 6, wherein the associating step is performed by combining the genotypes of the respective SNPs.

The method of claim 7, wherein the combining of the SNP genotypes is carried out using any one or more selected from the group consisting of: a linear or non-linear regression analysis method; an a priori or non-linear classification analysis method; ANOVA; a neural network analysis method; genetic analysis; a support vector machine analysis method; a hierarchical analysis or clustering analysis method; a hierarchical algorithm using a decision tree, or a kernel principal components analysis method; a Markov Blanket analysis method; recursive feature elimination or entropy-based recursive feature elimination analysis methods; a forward floating search or backward floating search analysis method; and combinations thereof.

The method of claim 7, wherein the combination of the SNP genotypes is performed using a computer algorithm.

A kit for determining a nose phenotype, comprising an agent for confirming the genotype of an SNP contained in a gene selected from the group consisting of HOXB2, YLPM1, FAM134C, TUBG1, THY1, IGFBP4, NUAK1, and combinations thereof.