KR20190139480A

KR20190139480A - Full siblings analysis using Genotyping

Info

Publication number: KR20190139480A
Application number: KR1020180066025A
Authority: KR
Inventors: 한지성; 이우재
Original assignee: 주식회사 불루젠코리아
Priority date: 2018-06-08
Filing date: 2018-06-08
Publication date: 2019-12-18
Also published as: KR102163657B1

Abstract

Provided is a kinship analysis method using genotyping, comprising: step (A) of genotyping, based on markers, the individuals whose kinship is to be determined; step (B) of inputting the genotyping result data obtained in step (A); step (C) of inputting allele frequencies from the genotyping result data of step (B); step (D) of comparing whether the allele types of the individuals match, setting an identity by descent (IBD) formula according to the matching case, and applying to that formula the IBD coefficients for the relationship in question, thereby extracting a kinship index (likelihood ratio); and step (E) of determining the kinship by identifying the highest probability result from step (D). Because the analysis process is simplified, analysis can be completed in a short time, which addresses problems of accuracy, efficiency, simplicity and required expertise, and accurate results can be provided for pedigree confirmation, selective breeding and stock-release effect surveys (sibling analysis of released seed stock).

Description

Full siblings analysis using genotyping

The present invention relates to a full-sib relationship analysis method using genotyping, which analyzes kinship with higher reliability by using the accumulating allele frequency data obtained from analyses of microsatellite markers in simple sequence repeat (SSR) regions.

Since it was established that all biological traits of living organisms are determined by genetic information encoded in the chemical substance DNA, many new research methods have been developed for understanding genetic information and for its industrial and medical use.

A DNA marker is a short segment of DNA on a chromosome, also called a molecular marker. Experimentally it serves two main purposes. The first is identification of individuals by genotype differences, which makes it possible to infer how a specific chromosomal region is transmitted to offspring, for example by paternal or maternal inheritance. The second is to indicate the position of a specific DNA segment on the chromosome, which makes it possible to join the DNA fragments contained in a DNA library into one larger unit.

One of the most important uses of DNA markers is genotyping. Genotyping is the analysis of the form of DNA polymorphism at specific loci, from which the pattern of transmission of those loci to offspring can be determined. Most genotyping is performed by electrophoresis on agarose or acrylamide gels, where genotypes are distinguished by differences in DNA band mobility that arise from DNA size and sequence variation.

DNA markers are thus a basic tool for research on the use of genetic information. In animal molecular breeding, various DNA markers have been developed so that genotype rather than phenotype is the target of selection, and they are used for genetic mapping, discovery of superior genes or disease genes, understanding of gene function, and marker-assisted selection.

For genetic mapping and for linkage analysis between phenotype and genotype, the most important requirement is a set of DNA markers that differ strongly between individuals. Various DNA markers have been developed for this purpose; the main types, classified by experimental method and form of DNA polymorphism, include SSLP, SNP, RFLP, RAPD and AFLP.

The usefulness of a DNA marker increases with its power to discriminate between paternally and maternally inherited alleles, and that discriminating power increases with the polymorphism of the marker, so the usefulness of a marker is ultimately determined by its degree of polymorphism. Two measures are used to express the polymorphism of a DNA marker: the heterozygosity (H) of the marker and its polymorphism information content (PIC).
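The two polymorphism measures named above can be sketched as follows. The source does not give their formulas, so this block assumes the commonly used definitions: expected heterozygosity H = 1 - Σp_i² and PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j². The function names and the example frequencies are illustrative and do not come from the patent.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """H = 1 - sum(p_i^2) over the allele frequencies of one marker."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Hypothetical allele frequencies for a four-allele marker (illustrative only).
example_freqs = [0.4, 0.3, 0.2, 0.1]
h = expected_heterozygosity(example_freqs)
pic = polymorphism_information_content(example_freqs)
```

Under these definitions PIC is never larger than H, and both approach 1 as the number of evenly distributed alleles grows.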

Systems that confirm paternity and kinship with simple sequence repeat markers, an internationally validated DNA analysis method, are now widely used not only for humans but also for livestock and plants. Parentage analysis has recently come into wide use in the fisheries sector, and the volume of genetic analysis data continues to grow.

However, there is no analysis program or corresponding method for accurately confirming full-sib relationships among kin, so somewhat inaccurate and error-prone results are being used in pedigree confirmation, selective breeding and stock-release effect surveys (sibling analysis of released seed stock).

In addition, the accumulating allele frequency data from analyses of microsatellite markers in simple sequence repeat (SSR) regions cannot be applied, which makes full-sib and kinship analysis difficult. Although an analysis program is currently used in Korea to confirm full-sib relationships, its output is unsuitable because it is neither a comparison between individuals nor a method for confirming full-sib relationships.

Korean Laid-Open Patent Publication No. 10-2017-0068347 discloses a method for predicting maternal kinship and athletic ability in horses, and Korean Laid-Open Patent Publication No. 2002-0081704 discloses a method of determining kinship and a DNA typing kit using the same. These prior documents differ from the present invention, however, in that they do not disclose a configuration consisting of: step (A) of genotyping, based on markers, the individuals whose kinship is to be determined; step (B) of inputting the genotyping result data obtained in step (A); step (C) of inputting allele frequencies from the genotyping result data of step (B); step (D) of comparing whether the allele types of the individuals match, setting an IBD formula according to the matching case, and applying to that formula the IBD coefficients for the relationship in question to extract a kinship index (likelihood ratio); and step (E) of determining the kinship by identifying the highest probability result of step (D).

Korean Laid-Open Patent Publication No. 10-2017-0068347 discloses a method for predicting maternal kinship and athletic ability in horses using single-nucleotide polymorphisms (SNPs) associated with maternal lineage and athletic ability. The method comprises a first step of extracting DNA from a foal and a dam; a second step of performing PCR on the extracted DNA with primers that cover the range needed for maternal lineage analysis; a third step of sequencing the control region of mitochondrial DNA from the PCR product to identify the eight haplogroups corresponding to the DNA sequence; and analyzing the peaks of the foal and dam sequences to determine which haplogroup they belong to, so that the maternal lineage and athletic ability of the horse can be predicted and displayed.

Korean Laid-Open Patent Publication No. 2002-0081704 relates to a method of confirming kinship in the absence of the father's DNA by comparing short tandem repeat (STR) DNA on the X chromosome, and to a DNA typing kit for kinship confirmation using the method. By comparing the X-chromosome STRs of a grandmother and a granddaughter, a biological relationship can be confirmed even without the father's DNA. For half-sisters, the alleles of each subject's mother are excluded and the subjects are checked for sharing of the remaining X-chromosome STR allele, which determines whether they have the same father. The method is therefore useful for confirming father-daughter kinship when the father's DNA is unavailable.

Korean Patent No. 10-1533792 relates to a next generation sequencing (NGS)-based autosomal analysis method for a human subject, comprising: (a) performing multiplex amplification of a DNA sample of the human subject using primers that bind complementarily to the D3S1358, TH01, D21S11, D18S51, PentaE, D5S818, D13S317, D7S820, D16S539, CSF1PO (human c-fms proto-oncogene for CSF-1 receptor gene), PentaD, vWA (von Willebrand factor A), D8S1179, TPOX (human thyroid peroxidase gene), FGA (human fibrinogen alpha chain) and amelogenin loci; and (b) determining the STR allele types of those loci from the NGS data of the multiplex amplification product of step (a), thereby identifying the human subject genetically.

Korean Patent No. 10-1147691 discloses a kinship determination device based on comparison of genetic information. It compares the genetic information of two or more specific individuals to determine parentage, such as a parent-child relationship (first degree) across generations, as well as kin and non-kin relationships, and provides the degree of kinship between the individuals so that accurate pedigree information can be derived.

Neither full-sib analysis software based on genotyping results nor a kinship analysis method using such software has been developed for the fisheries sector. As a result, accurate kinship analysis is not possible and between-population comparison methods developed abroad are used instead. To solve these problems, the present invention aims to provide a full-sib relationship analysis method applicable to the fisheries sector.

As a means of solving the above problems, the present invention provides a full-sib relationship analysis method consisting of: step (A) of genotyping, based on markers, the individuals whose kinship is to be determined; step (B) of inputting the genotyping result data obtained in step (A); step (C) of inputting allele frequencies from the genotyping result data of step (B); step (D) of comparing whether the allele types of the individuals whose full-sib relationship is to be determined match, setting an IBD formula according to the matching case, and applying to that formula the IBD coefficients for the full-sib relationship to extract a full-sib index (likelihood ratio); and step (E) of calculating in Microsoft Excel the mean and standard deviation of the log median value [Log10 (median LR)] from the full-sib index results extracted in step (D), and then determining the full-sib relationship by examining the false-positive rate and false-negative rate against full-sib index (likelihood ratio) values in the range 1 to 100,000.

The kinship formulas and identity by descent (IBD) coefficients used worldwide in forensic science are currently applied in software for confirming full-sib relationships and pedigrees in aquatic species. Fish, however, carry a greater variety of alleles than humans, so the probability of observing the same allele is lower; in other words, the discriminating power of the markers is higher. Moreover, even for pairs that really are siblings, the half-sibling index or the index of another relationship often comes out with high probability, and no statistical criterion has been set for the possibility of misjudgment.

The present invention therefore designates the specific relationship to be analyzed and produces a statistical normal distribution of the false-positive and false-negative rates, so that an accurate determination can be made against that reference.

Genotyping with the markers developed in the present invention and calculating the accumulating allele frequency data makes it possible to avoid markers that carry an indiscriminately large number of alleles or whose frequency is concentrated in one particular allele.

The index calculation for confirming a full-sib relationship according to an embodiment of the present invention follows the general paternity index calculation: the alleles of the two individuals being compared are compared, and the allele frequencies, the kinship formula and the identity by descent (IBD) coefficients are applied according to whether the alleles match.

In addition, according to an embodiment of the present invention, the calculated kinship index (likelihood ratio) values may be used to compute, in Microsoft Excel, the mean and standard deviation of the log median likelihood ratio [Log10 (median LR)], and the false-positive rate and false-negative rate may be examined for each relationship against likelihood ratio values in the range 1 to 100,000.

The present invention can produce up to one million genotype analysis results and can analyze many individuals at once (1,000 × 1,000), which saves time, reduces cost and reduces labor.

In addition, conventional analysis required specialist skills and a great deal of time, whereas the simplified analysis process of the present invention allows analysis in a short time, resolving problems of accuracy, efficiency, simplicity and required expertise.

Furthermore, statistical validation of full-sib kinship analysis, which is difficult in the fisheries sector, yields reliable and accurate results, and the continuously accumulating database of genotyping results can be managed and used systematically. Accurate full-sib confirmation therefore makes it possible to provide accurate results in pedigree confirmation, selective breeding and stock-release effect surveys (sibling analysis of released seed stock).

FIG. 1 shows a schematic diagram of the full-sib relationship analysis method using genotyping according to the present invention.
FIG. 2 shows the formula for calculating the full-sib index.
FIG. 3 shows the software for inputting the genotyping results for population F₀ and population F₁ after genotyping is complete.
FIG. 4 shows the allele frequencies extracted for each marker.
FIG. 5 shows software according to an embodiment into which allele frequency values can be input.
FIG. 6 shows the full-sib index (likelihood ratio) calculated for the markers according to Example 1 of the present invention.
FIG. 7 shows the full-sib index values stored in the system according to step (D).
FIG. 8 shows the result of analyzing the distribution of full-sib index (likelihood ratio) values, as kinship probability, for actual sibling relationships according to Experimental Example 1.
FIG. 9 shows the evaluation of the full-sib index (likelihood ratio) distribution using the false-positive rate and false-negative rate according to Experimental Example 1.
FIG. 10 shows the kinship probability formula.

The present invention is described in detail below with reference to the accompanying drawings. FIG. 1 shows a schematic diagram of the full-sib relationship analysis method using genotyping according to the present invention. The method consists of: step (A) of genotyping, based on markers, the individuals whose kinship is to be determined; step (B) of inputting the genotyping result data obtained in step (A); step (C) of inputting allele frequencies from the genotyping result data of step (B); step (D) of comparing whether the allele types of the individuals whose full-sib relationship is to be determined match, setting an IBD formula according to the matching case, and applying to that formula the IBD coefficients for the full-sib relationship to extract a full-sib index (likelihood ratio); and step (E) of calculating in Microsoft Excel the mean and standard deviation of the log median value [Log10 (median LR)] from the full-sib index results extracted in step (D), and then determining the full-sib relationship by examining the false-positive rate and false-negative rate against full-sib index (likelihood ratio) values in the range 1 to 100,000.

The genotyping step (A) of the present invention includes extracting DNA from samples of population F₀ and population F₁, performing PCR with the markers, separating the PCR products by size by electrophoresis on an automated DNA sequencer, preparing a standard allele ladder for each marker, and then scoring all individuals with an analysis program and classifying them by size and marker type.
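The scoring part of step (A) can be illustrated with a minimal sketch that assigns a measured fragment size to the nearest allele of a marker's ladder. The patent does not describe how its analysis program performs scoring; the function name `call_allele`, the 0.5 bp tolerance and the ladder values below are all assumptions made for illustration.

```python
def call_allele(fragment_size, ladder, tolerance=0.5):
    """Return the ladder allele nearest to fragment_size (in bp).

    Returns None when no ladder allele lies within `tolerance` bp,
    so that off-ladder peaks are left unscored rather than forced
    into a bin. The tolerance is a hypothetical value.
    """
    nearest = min(ladder, key=lambda allele: abs(allele - fragment_size))
    if abs(nearest - fragment_size) <= tolerance:
        return nearest
    return None


# Hypothetical ladder for one tetranucleotide-repeat marker.
ladder = [120, 124, 128, 132]
calls = [call_allele(size, ladder) for size in (120.2, 127.8, 130.0)]
```

Leaving ambiguous peaks as `None` keeps missing calls explicit, which matters later when the number of comparable markers per pair is counted.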

Population F₀ and population F₁ in the present invention refer to the individuals whose full-sib relationship is to be verified. The markers are preferably microsatellite markers, and the PCR may be multiplex PCR.

As used herein, the term "marker" means a nucleotide sequence used as a reference point when identifying genetically linked loci that are otherwise unspecified.

As used herein, the term "multiplex PCR" means a PCR technique that amplifies several genes simultaneously, in contrast to single PCR, which amplifies one gene from one template DNA. In multiplex PCR, several primer pairs are included in a single PCR mixture. It is preferable that each target DNA sequence yields amplification products in its own specific size range so that the sizes do not overlap.

Because multiplex PCR reacts several different primer pairs in one tube, inhibition can occur between primers, so the choice of primers for the target genes is important. The optimal annealing temperature may also differ between primers, so multiplex PCR requires the annealing temperature to be optimized so that every primer pair in the single reaction anneals effectively.

As used herein, "primer" generally means an oligonucleotide; the primers defined here are DNA strands that can prime the synthesis of DNA. A DNA polymerase cannot synthesize DNA from scratch without a primer. It can only extend an existing DNA strand in a reaction in which the complementary strand serves as the template that dictates the order in which nucleotides are assembled. Under suitable temperature and pH conditions, the primer acts as the initiation point of synthesis. The primer is preferably a single-stranded deoxyribonucleotide. Primers used in the present invention may include natural dNMPs (that is, dAMP, dGMP, dCMP and dTMP), modified nucleotides or synthetic nucleotides, and may also include ribonucleotides.

The genotyping result data input step (B) is the step of inputting the genotyping result data (allele scoring data) determined from the marker values of step (A).

The allele frequency input step (C) may include inputting the allele frequencies obtained from the genotyping result data of step (B).
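How allele frequencies might be derived from the scoring data of step (B) is sketched below. The patent only states that frequencies are obtained from the genotyping results; simple allele counting over diploid genotypes is assumed here, and the function name and sample calls are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one marker.

    `genotypes` holds one (allele1, allele2) tuple per individual.
    Individuals with a missing call (None) are skipped.
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype is None or None in genotype:
            continue
        counts.update(genotype)  # each diploid genotype contributes two alleles
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}


# Hypothetical scoring data for one marker (allele sizes in bp).
calls = [(120, 124), (120, 120), (124, 128), None, (120, 128)]
freqs = allele_frequencies(calls)
```

Because the frequencies are plain counts, they can be recomputed whenever new genotyping results are added to the accumulated data.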

The full-sib index extraction step (D) of the present invention may consist of: comparing whether the allele types of population F₀ and population F₁ match at each marker; setting the IBD formula according to the matching case; and applying to that formula the IBD coefficients for the full-sib relationship between population F₀ and population F₁ to extract the full-sib index (likelihood ratio).

Table 1 below shows the cases defined by whether the allele types of population F₀ and population F₁ match at each locus, together with the corresponding kinship formulas. P denotes allele frequency. Table 2 below shows the identity by descent (IBD) coefficients for each relationship between the parent group and the offspring group.

Table 1. Kinship formulas

| Case | Parent group (F₀) allele type | Offspring group (F₁) allele type | Frequency |
|---|---|---|---|
| case 1 | AB | AB | Φ₂ + 0.5Φ₁(P_A + P_B) + 2Φ₀P_AP_B |
| case 2 | AA | AA | Φ₂ + Φ₁P_A + Φ₀P_A² |
| case 3 | AA | AB | Φ₁P_B + 2Φ₀P_AP_B |
| case 4 | AB | AA | 0.5Φ₁P_A + Φ₀P_A² |
| case 5 | AB | AC | 0.5Φ₁P_C + 2Φ₀P_AP_C |
| case 6 | AB | CD | 2Φ₀P_CP_D |
| case 7 | AA | BB | Φ₀P_B² |
| case 8 | AA | BC | 2Φ₀P_BP_C |
| case 9 | BC | AA | Φ₀P_A² |

Table 2. Identity by descent (IBD) coefficients

| Relationship | Φ₂ | Φ₁ | Φ₀ |
|---|---|---|---|
| Parent-child | 0 | 1 | 0 |
| Full siblings | 1/4 | 1/2 | 1/4 |
| Half siblings | 0 | 1/2 | 1/2 |
| Grandparent-grandchild | 0 | 1/2 | 1/2 |
| Uncle-nephew | 0 | 1/2 | 1/2 |
| First cousins | 0 | 1/4 | 3/4 |
| Second cousins | 0 | 1/16 | 15/16 |
| Unrelated | 0 | 0 | 1 |

The step of comparing allele types compares, at each marker, the allele types of population F₀ and population F₁ at each locus and sets the frequency formula for the matching case. The value is calculated separately for three situations: both alleles match, only one allele matches, or neither allele matches. Following the cases in Table 1, each allele is assigned a letter, the genotypes are compared, and the value applied differs from case to case.
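The case comparison and the Table 1 formulas can be sketched as below. This is an illustrative reading of Table 1 and not the patent's software: `classify_case`, `pair_probability`, the coefficient tuples ordered (Φ₂, Φ₁, Φ₀) and the example frequencies are assumptions introduced here.

```python
FULL_SIB = (0.25, 0.5, 0.25)   # (phi2, phi1, phi0) for full siblings, Table 2
UNRELATED = (0.0, 0.0, 1.0)    # (phi2, phi1, phi0) for unrelated individuals


def classify_case(parent, child):
    """Return the Table 1 case number (1-9) for two unordered genotypes."""
    p, c = set(parent), set(child)
    p_hom, c_hom = len(p) == 1, len(c) == 1
    if p == c:
        return 2 if p_hom else 1          # AA/AA or AB/AB
    if len(p & c) == 1:
        if p_hom:
            return 3                      # AA/AB
        return 4 if c_hom else 5          # AB/AA or AB/AC
    if p_hom:
        return 7 if c_hom else 8          # AA/BB or AA/BC
    return 9 if c_hom else 6              # BC/AA or AB/CD


def pair_probability(parent, child, freq, phi):
    """Evaluate the Table 1 frequency formula for one marker."""
    phi2, phi1, phi0 = phi
    case = classify_case(parent, child)
    alleles = list(set(child))
    # Term multiplied by phi0: population frequency of the child genotype.
    if len(alleles) == 1:
        p0 = freq[alleles[0]] ** 2
    else:
        p0 = 2 * freq[alleles[0]] * freq[alleles[1]]
    if case == 1:
        return phi2 + 0.5 * phi1 * (freq[alleles[0]] + freq[alleles[1]]) + phi0 * p0
    if case == 2:
        return phi2 + phi1 * freq[alleles[0]] + phi0 * p0
    if case == 4:
        return 0.5 * phi1 * freq[alleles[0]] + phi0 * p0
    if case in (3, 5):
        (unshared,) = set(child) - set(parent)
        weight = 1.0 if case == 3 else 0.5
        return weight * phi1 * freq[unshared] + phi0 * p0
    return phi0 * p0                      # cases 6-9: no shared allele


# Hypothetical allele frequencies for one marker.
freq = {"A": 0.5, "B": 0.3, "C": 0.2}
p_sib = pair_probability(("A", "B"), ("A", "B"), freq, FULL_SIB)
```

Keeping the coefficients as a parameter means the same function evaluates any row of Table 2, which is what a likelihood ratio needs: one evaluation per hypothesis.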

FIG. 2 shows the formula for calculating the full-sib index. In the full-sib index (likelihood ratio) formula, H_p denotes the case in which the two individuals are related to each other, and its value is obtained by applying the frequency formula with the coefficients for the relationship shown in Table 2. The relationship tested between population F₀ and population F₁ includes confirmation of full siblings. H_d denotes the case in which the two individuals are unrelated.
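The formula of FIG. 2 is not reproduced in the text, so the following is a hedged reconstruction from the definitions of H_p and H_d above and the formulas of Table 1. The symbols G₀ and G₁ for the two genotypes are introduced here for illustration. For case 1 (both individuals AB) under the full-sibling coefficients Φ₂ = 1/4, Φ₁ = 1/2, Φ₀ = 1/4:

```latex
\[
\mathrm{LR}
  = \frac{P(G_1 \mid G_0, H_p)}{P(G_1 \mid G_0, H_d)}
  = \frac{\Phi_2 + 0.5\,\Phi_1 (P_A + P_B) + 2\,\Phi_0 P_A P_B}{2 P_A P_B}
  = \frac{\tfrac14 + \tfrac14 (P_A + P_B) + \tfrac12 P_A P_B}{2 P_A P_B}
\]
% Worked example with assumed frequencies P_A = 0.5, P_B = 0.3:
\[
\mathrm{LR} = \frac{0.25 + 0.25 \times 0.8 + 0.5 \times 0.15}{2 \times 0.15}
            = \frac{0.525}{0.30} = 1.75
\]
```

The denominator is the case 1 formula with the unrelated coefficients (Φ₂ = Φ₁ = 0, Φ₀ = 1), so the ratio compares the same observation under the two hypotheses.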

The kinship verification algorithm analyzes genotypes from the marker values (allele scores) used for kinship confirmation, establishes the identity by descent (IBD) coefficients, extracts the allele frequencies, compares the alleles at each marker as listed in Table 1, and sets the corresponding frequency formula.

Then, by applying the Φ values of the full siblings row of the IBD coefficients in Table 2 to Table 1, the kinship index for each marker can be calculated as shown in FIG. 6. If, in the results produced by the software, fewer than 50% of the markers can be compared for the kinship index calculation, the pair is excluded from comparison and automatically judged to be unrelated; if 50% or more of the markers can be compared, the analysis proceeds to the next step.
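The 50% rule can be illustrated with a short sketch. The function name `combined_log10_lr`, the use of `None` for a marker that could not be compared, and the combination of markers by summing log10 values (that is, multiplying per-marker ratios, which presumes independent markers) are all assumptions; the text states only the 50% threshold.

```python
import math


def combined_log10_lr(marker_lrs, min_fraction=0.5):
    """Combine per-marker likelihood ratios into one log10 value.

    marker_lrs: one entry per marker; None marks a marker that could not
    be compared (for example, a missing genotype in either individual).
    Returns None when fewer than min_fraction of the markers are usable,
    in which case the pair is excluded and treated as unrelated.
    """
    usable = [lr for lr in marker_lrs if lr is not None]
    if not marker_lrs or len(usable) / len(marker_lrs) < min_fraction:
        return None
    # Assumption: markers are independent, so per-marker ratios multiply.
    return sum(math.log10(lr) for lr in usable)
```

With three markers of which only one is usable the function returns `None`; with two usable markers the pair passes the threshold and a combined value is returned.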

The full-sib relationship determination step (E) may consist of the following: using a normal distribution plot prepared by examining the false-positive rate and the false-negative rate, together with the kinship index results extracted in step (D), the mean and standard deviation of the log median [Log10 (median LR)] are calculated in Microsoft Excel; likelihood ratio values in the range of 1 to 100,000 for the full-sib relationship are then applied to the normal distribution plot to determine the full-sib relationship.
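The mean and standard deviation step, which the text performs in a spreadsheet, can be sketched with the Python standard library. The function name `log10_summary` is hypothetical, and the choice of the sample standard deviation is an assumption, since the text does not say which spreadsheet function was used.

```python
import math
import statistics


def log10_summary(likelihood_ratios):
    """Return (mean, sample standard deviation) of log10 likelihood ratios."""
    logs = [math.log10(lr) for lr in likelihood_ratios]
    return statistics.mean(logs), statistics.stdev(logs)
```

Likelihood ratios between 1 and 100,000 correspond to log10 values between 0 and 5, which is the range over which the normal distribution plot is read.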

The normal distribution plot is prepared as follows. A population of individuals that are actually full siblings and a population of individuals that are not are selected, and steps (A) to (D) are performed on these populations to calculate the kinship index.

Normal distribution probabilities are calculated for the full-sib relationship indices obtained from the full-sib population and from the non-full-sib population, and a normal distribution graph of the Log₁₀ values is drawn. From this graph, as shown in FIG. 8, the false-positive rate and false-negative rate for a given log reference value can be obtained.

In one embodiment, if the determination is made with Log₁₀ = 0 as the reference value, there is a 3.671% chance of a false negative and a 2.747% chance of a false positive. These results can therefore be used to minimize and optimize false-positive and false-negative errors when determining a full-sib relationship.
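Reading error rates off two fitted normal curves at a chosen reference value can be sketched as follows. The function name `error_rates` and the example distribution parameters in the comment are hypothetical; the text does not give the standard deviations used, so this sketch is not expected to reproduce the 3.671% and 2.747% figures.

```python
from statistics import NormalDist


def error_rates(threshold, full_sib_dist, unrelated_dist):
    """Return (false_negative_rate, false_positive_rate) at a log10 threshold.

    full_sib_dist:  normal fit to log10 indices of known full-sib pairs.
    unrelated_dist: normal fit to log10 indices of known non-sib pairs.
    A pair is called full-sib when its log10 index is at or above threshold.
    """
    false_negative = full_sib_dist.cdf(threshold)         # true sibs below the threshold
    false_positive = 1.0 - unrelated_dist.cdf(threshold)  # non-sibs at or above it
    return false_negative, false_positive


# Illustrative parameters only; these are not taken from the patent:
# error_rates(0.0, NormalDist(3.0, 1.5), NormalDist(-2.0, 1.0))
```

Raising the threshold lowers the false-positive rate and raises the false-negative rate, which is the trade-off the reference value is chosen to balance.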

In one embodiment, for group F₀ and group F₁, the mean and standard deviation of the log median [Log10 (median LR)] are calculated in Microsoft Excel from the full-sib relationship index results extracted in steps (A) to (D), and likelihood ratio values in the range of 1 to 100,000 for the full-sib relationship are then applied to the normal distribution plot. The kinship index applied to the plot is compared with the central axes of the false-positive and false-negative normal distributions: if it lies close to the false-positive axis, the pair is judged not to be full siblings, and if it lies close to the false-negative axis, the pair is judged to be full siblings.

Specific examples of the full-sib relationship analysis method using genotyping according to the present invention are described below.

<Example 1> Full-sib relationship analysis using olive flounder genotyping

(A) Genotyping step

The test fish were olive flounder individuals confirmed to be full siblings produced from the same parents; they were assigned to group F₀ and group F₁ for the genotyping step. This experimental example checks whether a full-sib relationship is established between the two groups. As in general breeding science, a full-sib relationship here means the relationship among siblings (full-sibs) produced from the same parents within a family.

(B) Genotyping result input step

FIG. 3 shows the software used to input the genotyping results for group F₀ and group F₁ once genotyping is complete. In one embodiment, the results were saved as a CSV file and loaded into the software through the Open, file selection, and confirm sequence in the Sample 1 and Sample 2 sections.
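Loading such a CSV file can be sketched as below. The text does not describe the file layout, so the column names `sample`, `marker`, `allele1`, and `allele2`, and the function name `read_genotypes`, are purely hypothetical.

```python
import csv
import io


def read_genotypes(csv_text):
    """Parse genotyping results into {sample: {marker: (allele1, allele2)}}.

    Assumed (hypothetical) layout: one row per sample and marker, with
    the columns sample, marker, allele1, allele2.
    """
    genotypes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        per_sample = genotypes.setdefault(row["sample"], {})
        per_sample[row["marker"]] = (row["allele1"], row["allele2"])
    return genotypes
```

To read from disk, the file contents can be passed in as text, for example `read_genotypes(open(path, newline="").read())`, where `path` is whatever file the user selected.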

(C) Allele frequency input step

FIG. 4 shows the allele frequencies extracted for each marker, and FIG. 5 shows the software according to an embodiment of the present invention into which the allele frequency values are entered. The number of copies of each allele is counted from the genotyping results of all individuals and divided by the total number of alleles to give the allele frequency; the frequencies of all alleles at a marker sum to 1.0.
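The counting procedure for one marker can be sketched in a few lines. The function name `allele_frequencies` and the list-of-pairs input are assumptions made for the example.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies at one marker.

    genotypes: list of (allele1, allele2) pairs, one per individual.
    Each allele count is divided by the total number of alleles observed.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}
```

For two individuals genotyped `("A", "A")` and `("A", "B")`, allele A is counted three times out of four and allele B once, and the two frequencies add up to 1.0.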

The allele frequency values extracted in this way can be entered into the software according to the embodiment of the present invention. Among the software's functions, Save as saves the selected allele frequency file under another name (Excel); Data displays the selected allele frequency file; Import selects a stored allele frequency file; and Create calculates allele frequencies from the genotype data of all individuals and generates a file that can then be selected.

(D) Full-sib relationship index extraction step

Individual 1 of group F₀ and individual 1 of group F₁ from step (B) are compared one-to-one. The alleles at each marker are compared as in Table 1 and the matching frequency formula is used; applying the Φ values of the full siblings row of the IBD coefficients in Table 2 to Table 1 then gives the full-sib relationship index for each marker, as shown in FIG. 6.

FIG. 6 shows the full-sib relationship index (likelihood ratio) calculated for each marker according to Example 1 of the present invention. The index for each marker was calculated by applying the identity by descent (IBD) coefficients and the allele frequencies.

The full-sib relationship index was calculated according to the formula shown in FIG. 2. The final index is the ratio of the probabilities of observing the given pair of genotypes in the population under two opposing hypotheses: that the individuals have the specified kinship (related, H_p) and that they have no kinship (unrelated, H_d). FIG. 7 shows the full-sib relationship index values stored in the system after step (D).

(E) Full-sib relationship determination step

Using the full-sib relationship index (likelihood ratio) results, the mean and standard deviation of the log median [Log10 (median LR)] are calculated in Microsoft Excel as shown in FIG. 8, and the false-positive rate and false-negative rate are examined based on likelihood ratio values in the range of 1 to 100,000 for the full-sib relationship. The kinship index applied to the normal distribution plot is compared with the central axes of the false-positive and false-negative normal distributions: if it lies close to the false-positive axis, the pair is judged not to be full siblings, and if it lies close to the false-negative axis, the pair is judged to be full siblings.

FIG. 8 shows the result of analyzing the distribution of kinship probability against full-sib relationship index (likelihood ratio) values for actual siblings in Experimental Example 1. For example, if the kinship index value applied to the normal distribution lies closer to the central axis of the false-negative distribution, -1.98, than to the central axis of the false-positive distribution, 3.15, the pair can be judged to be full siblings.

FIG. 9 shows the evaluation of the full-sib relationship index (likelihood ratio) distribution using the false-positive rate and false-negative rate in Experimental Example 1. Finally, the full-sib determination between group F₀ and group F₁ can be made by using the kinship index reference value together with the kinship probability of FIG. 10.

The relationship derived according to the embodiment of the present invention could be judged to be a sibling (full-sib) relationship with a probability of 99.999%. Since the test groups F₀ and F₁ are in fact full siblings, the method according to the present invention is judged to be validated and highly reliable.

In conventional fish farming, unscientific management of broodstock, reduced genetic diversity, increased inbreeding, and genetic bottlenecks have led to reduced growth, frequent disease outbreaks, deformities, and shrinking population sizes, and these problems are difficult to solve simply by developing rearing, feed, facility, or distribution technologies. The kinship analysis using genotyping according to the present invention can provide faster and more reliable kinship results, and can therefore be used in many areas, such as bycatch rate surveys, genetic diversity analysis, pedigree confirmation, and selective breeding, so it has industrial applicability.

Claims

A kinship analysis method using genotyping, comprising: step (a) of analyzing, based on markers, the genotypes of the individuals whose kinship is to be checked; step (b) of inputting the genotyping result data obtained in step (a); step (c) of inputting allele frequencies from the genotyping result data of step (b);
step (d) of comparing whether the allele types match between the individuals whose kinship is to be checked, setting the IBD formula according to the case in which the allele types match, and applying to the set IBD formula the IBD coefficients for the kinship case in question, thereby extracting a likelihood ratio; and
step (e) of determining the kinship by checking the highest-value probability result of step (d).

The method of claim 1, wherein the determination of kinship in step (e) comprises calculating, from the calculated likelihood ratio values, the mean and standard deviation of the log median of the likelihood ratio [Log10 (median LR)], and examining the false-positive rate and false-negative rate based on likelihood ratio values ranging from 1 to 100,000 for each relationship.