KR101530535B1

KR101530535B1 - Method, Composition, and Kit for Predicting the Risk of Developing Graft Versus Host Disease

Info

Publication number: KR101530535B1
Application number: KR1020130067909A
Authority: KR
Inventors: 양송주; 정정희; 백규흠; 김정은; 양갑석; 서정선; 전은형; 박선양
Original assignee: 주식회사 마크로젠; 서울대학교산학협력단
Priority date: 2013-06-13
Filing date: 2013-06-13
Publication date: 2015-06-23
Also published as: KR20140145669A

Abstract

Techniques are provided for selecting specific single nucleotide polymorphisms (SNPs) that have a significant correlation with the onset of graft-versus-host disease and for using them as markers to predict the risk of graft-versus-host disease.

Description

Technical Field [0001] The present invention relates to a method for predicting the risk of graft-versus-host disease and to a composition and kit for predicting the risk of graft-versus-host disease.

Provided are: specific single nucleotide polymorphisms (SNPs) that have a significant correlation with the onset of graft-versus-host disease; a method of providing information for predicting the risk of graft-versus-host disease by identifying the bases of these SNPs; a composition for screening for the risk of graft-versus-host disease comprising a probe for detecting the SNP and/or a primer for amplifying the chromosomal region containing it; and a kit for screening for the risk of graft-versus-host disease comprising the composition.

Hematopoietic stem cell transplantation is a treatment for patients with leukemia, aplastic anemia, or immunodeficiency in which damaged or diseased bone marrow is replaced after anticancer/immunosuppressive therapy or radiation therapy; its types are autologous and allogeneic transplantation. Allogeneic hematopoietic stem cell transplantation is an excellent treatment that can cure many intractable hematologic diseases. However, many complications can occur after transplantation, and the most common of them is graft-versus-host disease (GVHD).

Graft-versus-host disease is a disease in which immune cells (lymphocytes) contained in tissue (or bone marrow) transplanted from a donor attack the cells of a patient (host) whose immune system has been weakened. It is generally divided into acute and chronic forms, with 100 days after transplantation as the dividing point. The acute form mainly damages target organs such as the skin, liver, and gastrointestinal tract, whereas the chronic form develops in a manner similar to an autoimmune disease.

According to transplantation status data from the hospital hematopoietic stem cell transplantation nursing society (http://bmtnurse.org/), the number of transplant patients is increasing year by year, and the 2011 transplant status data also report the number of hematopoietic stem cell transplants performed. Although severity differs between individuals, these patients may develop graft-versus-host disease as a complication. If the onset of graft-versus-host disease could be predicted before allogeneic hematopoietic stem cell transplantation, early diagnosis and treatment could be tailored accordingly. However, a molecular diagnostic method that predicts the risk of graft-versus-host disease in advance by judging individual susceptibility has not yet been commercialized, so research and technology development are required.

One example of the present invention provides a method for predicting the risk of graft-versus-host disease by identifying, in a gene sample collected from a human subject, the base of a predetermined single nucleotide polymorphism (SNP) having a significant correlation with the onset of graft-versus-host disease.

Another example provides a composition for screening for the risk of graft-versus-host disease, comprising a probe having a sequence complementary to a chromosomal region containing a predetermined SNP having a significant correlation with the onset of graft-versus-host disease, and a primer for amplifying that chromosomal region.

Another example provides a kit for screening a risk of graft-versus-host disease comprising a composition for screening for the risk of graft-versus-host disease.

Yet another example provides a method for identifying a SNP in a sample from a human subject, comprising the step of identifying the base of a predetermined SNP having a significant correlation with the onset of graft-versus-host disease.

Accordingly, the present inventors examined the nucleotide sequences of individual graft-versus-host disease patients and, using single nucleotide polymorphisms (SNPs) found in those patients, developed a screening technique that predicts the risk of graft-versus-host disease and enables proactive prevention, thereby completing the present invention.

The present inventors obtained genomic DNA from the blood of patients before allogeneic hematopoietic stem cell transplantation, divided the patients into a group that developed graft-versus-host disease and a group that did not, and identified the bases of SNPs in genes reported or expected to be associated with graft-versus-host disease. In this way, SNPs having a significant correlation with graft-versus-host disease were selected, and a screening technique that measures an individual's risk of graft-versus-host disease from the pattern of the selected SNPs in that individual's genes was completed.

The screening technique according to the present invention is useful for preventing graft-versus-host disease or detecting it early, because it identifies groups at risk of graft-versus-host disease simply and with high sensitivity.

In the present invention, graft-versus-host disease (GVHD) includes both the chronic and the acute form, and may specifically be acute graft-versus-host disease.

In the present invention, seven SNPs were selected as having a specific and significant correlation with graft-versus-host disease, and they are shown in Table 1 below.

Marker SNP | Chromosome number | Chromosomal location (ref) | Gene name | Ancestral allele | Wild | Hetero | Homo | Significant genetic model | Interpretation
rs882559 | 6 | 39869066 | DAAM2 / intron | G | GG | GC | CC | recessive | CC protective
rs7178935 | 15 | 59368167 | RNF111 / cds-synon | A | GG | AG | AA | dominant | AG + AA risky
rs3744805 | 17 | 38248354 | THRA / intron | G | AA | AG | GG | recessive | GG protective
rs7564005 | 2 | 137512698 | - | T | TT | TG | GG | recessive | GG protective
rs1876522 | 17 | 53610289 | - | C | TT | TC | CC | co-dominant | TC or CC protective
rs2228048 | 3 | 30713842 | TGFBR2 / cds-synon | C | CC | TC | TT | co-dominant | TC or TT risky
rs1946518 | 11 | 112035458 | IL18 / 5 gene | T | TT | TG | GG | recessive | GG risky

(Genome Build GRCh37 / hg19)

Bold in the above table marks a genotype with a high risk of graft-versus-host disease, and italics mark a genotype whose risk lies between those of the wild type and the mutant type. In addition, the bases given for SNPs rs882559, rs3744805, and rs1876522 in the above table are the two allele bases on the reverse strand of the double strand, while the bases for the other SNPs are the two allele bases on the forward strand. This notation is used throughout this specification.

As shown in Table 1, SNP rs882559 is located at base 39869066 of chromosome 6. The wild type has base G at this position; when it is mutated to C, the mutant C acts as a protective factor under a recessive model in the onset of graft-versus-host disease, so that CC, in which both alleles are mutated to C, significantly lowers the risk of developing graft-versus-host disease. That is, the risk of developing graft-versus-host disease is increased in the GG wild type or GC (shown in bold).

SNP rs7178935 is located at base 59368167 of chromosome 15. The wild type has base G at this position; when it is mutated to A, the mutant A acts as a risk factor under a dominant model in the onset of graft-versus-host disease, so that the risk of graft-versus-host disease increases not only in AA, in which both alleles are mutated to A, but also in AG, in which only one allele is mutated to A (indicated in bold).

SNP rs3744805 is located at base 38248354 of chromosome 17. The wild type has base A at this position; when it is mutated to G, the mutant G acts as a protective factor under a recessive model in the onset of graft-versus-host disease, so that GG, in which both alleles are mutated to G, significantly lowers the risk of developing graft-versus-host disease. In other words, the risk of developing acute graft-versus-host disease is increased in the AA wild type or AG (shown in bold).

SNP rs7564005 is located at base 137512698 of chromosome 2. The wild type has base T at this position; when it is mutated to G, the mutant G acts as a protective factor under a recessive model in the onset of graft-versus-host disease, so that GG, in which both alleles are mutated to G, significantly lowers the risk of developing graft-versus-host disease. That is, the risk of developing acute graft-versus-host disease is increased in the TT wild type or TG (indicated in bold).

SNP rs1876522 is located at base 53610289 of chromosome 17. The wild type has base T at this position; when it is mutated to C, the mutant C acts as a protective factor under a co-dominant model in the onset of graft-versus-host disease. CC, in which both alleles are mutated to C, carries a lower risk of graft-versus-host disease, and TC, in which one allele is mutated to C, carries a risk that is somewhat higher than that of CC but lower than that of the wild type. That is, the TT wild type increases the risk of developing acute graft-versus-host disease (indicated in bold).

SNP rs2228048 is located at base 30713842 of chromosome 3. The wild type has base C at this position; when it is mutated to T, the mutant T acts as a risk factor under a co-dominant model in the onset of graft-versus-host disease. TT, in which both alleles are mutated to T, increases the risk of graft-versus-host disease (indicated in bold), and TC, in which one allele is mutated to T, carries a risk that is slightly lower than that of TT but higher than that of the wild type (in italics).

SNP rs1946518 is located at base 112035458 of chromosome 11. The wild type has base T at this position; when it is mutated to G, the mutant G acts as a risk factor under a recessive model in the onset of graft-versus-host disease, so that GG, in which both alleles are mutated to G, increases the risk of graft-versus-host disease (shown in bold).

Based on the information on the seven SNPs described above, the bases of the seven SNPs in the genes of a subject (for example, a subject who has not developed graft-versus-host disease) can be identified and used to determine the subject's risk of developing graft-versus-host disease.

In addition, the present invention selects SNPs associated with the risk of developing graft-versus-host disease as described above and reveals whether the mutant type of each SNP acts in a dominant, recessive, or co-dominant manner, and can thereby provide a technique for quantifying how each genotype raises the risk of graft-versus-host disease. Therefore, by using the information on the selected SNPs and their risk bases, subjects at risk of graft-versus-host disease can be identified with higher sensitivity. Indeed, when the selected SNPs were tested again in the graft-versus-host disease patients who were the source of the gene samples used for their selection, the onset of graft-versus-host disease was predicted with an accuracy of more than 72%.
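The dominant, recessive, and co-dominant models mentioned above can be made concrete with a small sketch. The function names are hypothetical, and the coding rules are the usual textbook definitions of the three models rather than code taken from the specification; the example genotypes are rows of Table 1.

```python
def mutant_allele_count(genotype, mutant):
    """Count copies (0, 1 or 2) of the mutant allele in a genotype such as 'AG'."""
    genotype = genotype.upper()
    if len(genotype) != 2:
        raise ValueError(f"expected a two-letter genotype, got {genotype!r}")
    return genotype.count(mutant.upper())


def effect_dose(genotype, mutant, model):
    """Return how strongly the mutant allele's effect applies under a model.

    dominant:    1 if at least one mutant copy is present, else 0
    recessive:   1 only if both copies are mutant, else 0
    co-dominant: the number of mutant copies (0, 1 or 2)
    """
    n = mutant_allele_count(genotype, mutant)
    if model == "dominant":
        return 1 if n >= 1 else 0
    if model == "recessive":
        return 1 if n == 2 else 0
    if model == "co-dominant":
        return n
    raise ValueError(f"unknown genetic model: {model!r}")


# Examples using rows of Table 1 (whether the effect is protective or risky
# is a separate column and is not modelled here).
print(effect_dose("AG", "A", "dominant"))      # rs7178935 -> 1
print(effect_dose("GC", "C", "recessive"))     # rs882559  -> 0
print(effect_dose("CC", "C", "recessive"))     # rs882559  -> 1
print(effect_dose("TC", "T", "co-dominant"))   # rs2228048 -> 1
print(effect_dose("TT", "T", "co-dominant"))   # rs2228048 -> 2
```

Keeping the model separate from the direction of the effect (protective or risky) mirrors the two corresponding columns of Table 1.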

On the other hand, linkage disequilibrium (LD) arises in short sections of the genome in which crossing-over rarely occurs. The genetic information present in such a section is therefore almost identical and is largely preserved across generations. The seven SNPs selected in the present invention as having a significant correlation with the risk of graft-versus-host disease are mutations present at specific positions on the genome. Therefore, when an LD region is formed around each of the seven SNPs, other SNPs located in that LD region carry the same genetic information as the marker SNP, and the risk of graft-versus-host disease can be predicted using these other SNPs in addition to the seven SNPs.

Thus, SNPs that are present in the LD region of the seven SNPs and have an R-square value of 1 with them can also be useful for predicting the risk of graft-versus-host disease. An R-square value of 1 for a specific SNP pair means that the two SNPs carry the same information. Based on the data of the International HapMap project ( http://www.hapmap.org/downloads/index.html.en ), the distances and R-square values of all SNP pairs found within 1 Mbp of the chromosomal location of each of the selected SNPs were compared, and the results are shown in FIGS. 3 to 9. In the present invention, the distance over which the R-square value of a specific SNP pair can be 1 was calculated from the distance distribution of all SNP pairs found within 1 Mbp of the chromosomal position of each SNP, and extreme distances at which the R-square value is 1 were regarded as outliers. The limit of the linkage disequilibrium length was therefore taken as the 90th percentile of the distribution of distances with an R-square value of 1, and the blue vertical line in FIGS. 3 to 9 marks this limit. In other words, the X-axis in FIGS. 3 to 9 represents, in base pairs (bp), the distance of all SNP pairs within 1 Mbp of each of the seven selected SNPs on the chromosome where it is located, and the limit of the distance is shown by a blue vertical line. Table 2 below shows, as the possible linkage disequilibrium block (LD block) length, the bp length at the 90th percentile of the distribution of distances of SNP pairs whose R-square value can be 1.

In addition, since the linkage disequilibrium block exists both upstream and downstream of each of the seven SNPs, the linkage disequilibrium region can be expressed as extending upstream and downstream of each SNP by the linkage disequilibrium block length; this is also shown in Table 2 below as base positions on the chromosome.
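The percentile rule described above can be sketched as follows. This is only an illustration under stated assumptions: the input format (position, position, R-square triples), the requirement that both SNPs of a pair lie within the window, the nearest-rank percentile definition, and the function name are all choices made here, and the toy data are invented.

```python
def ld_block_length(pairs, marker_pos, window=1_000_000, percentile=90):
    """Nearest-rank percentile of the distances of SNP pairs with R-square == 1.

    pairs: iterable of (pos_a, pos_b, r2) tuples on one chromosome (assumed format).
    Only pairs whose two SNPs both lie within `window` bp of the marker are used.
    """
    distances = sorted(
        abs(a - b)
        for a, b, r2 in pairs
        if r2 == 1.0
        and abs(a - marker_pos) <= window
        and abs(b - marker_pos) <= window
    )
    if not distances:
        raise ValueError("no SNP pairs with R-square == 1 inside the window")
    # Nearest-rank method: rank = ceil(percentile / 100 * n), computed with integers.
    rank = -(-percentile * len(distances) // 100)
    return distances[rank - 1]


# Invented toy data: ten perfectly linked pairs 10 kbp to 100 kbp apart,
# one imperfectly linked pair, and one pair outside the 1 Mbp window.
marker = 5_000_000
toy_pairs = [(marker, marker + d, 1.0) for d in range(10_000, 110_000, 10_000)]
toy_pairs.append((marker, marker + 500_000, 0.8))
toy_pairs.append((marker + 2_000_000, marker + 2_050_000, 1.0))

print(ld_block_length(toy_pairs, marker))  # 90000
```

Taking a percentile rather than the maximum keeps a few unusually distant perfectly linked pairs from stretching the block length.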

Marker SNP | Chromosome number | Chromosomal location (ref) | Possible LD block length (bp) | Possible LD region centered on marker SNP
rs882559 | 6 | 39869066 | 56822 | 39812244 to 39925888
rs7178935 | 15 | 59368167 | 30582 | 59337585 to 59398749
rs3744805 | 17 | 38248354 | 33992 | 38214362 to 38282346
rs7564005 | 2 | 137512698 | 93625 | 137419073 to 137606323
rs1876522 | 17 | 53610289 | 48000 | 53562289 to 53658289
rs2228048 | 3 | 30713842 | 17329 | 30696513 to 30731171
rs1946518 | 11 | 112035458 | 95872 | 111939586 to 112131330
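The regions in Table 2 follow from simple arithmetic: each region runs from the marker position minus the block length to the marker position plus the block length. A minimal sketch, with hypothetical variable and function names and with the positions and lengths copied from Table 2:

```python
# marker: (chromosome, position on GRCh37/hg19, possible LD block length in bp)
TABLE_2 = {
    "rs882559": ("6", 39869066, 56822),
    "rs7178935": ("15", 59368167, 30582),
    "rs3744805": ("17", 38248354, 33992),
    "rs7564005": ("2", 137512698, 93625),
    "rs1876522": ("17", 53610289, 48000),
    "rs2228048": ("3", 30713842, 17329),
    "rs1946518": ("11", 112035458, 95872),
}


def ld_region(marker):
    """Return (chromosome, start, end) of the possible LD region of a marker."""
    chrom, pos, length = TABLE_2[marker]
    return chrom, pos - length, pos + length


def markers_covering(chrom, pos):
    """List the markers whose possible LD region contains the given position."""
    hits = []
    for marker in TABLE_2:
        m_chrom, start, end = ld_region(marker)
        if m_chrom == str(chrom) and start <= pos <= end:
            hits.append(marker)
    return hits


print(ld_region("rs882559"))             # ('6', 39812244, 39925888)
print(markers_covering(17, 53600000))    # ['rs1876522']
```

The two markers on chromosome 17 have non-overlapping regions, so a position on that chromosome maps to at most one marker.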

Because the crossing-over tendency of homologous chromosomes differs slightly from chromosome to chromosome in each generation, the lengths of the linkage disequilibrium regions differ, as can be seen from Table 2. The linkage disequilibrium region (LD region) in which the R-square value is 1 covers about 17 kbp to 96 kbp upstream and downstream of each SNP, and one or more SNPs selected from the group consisting of all SNPs present in that region may be used to predict the risk of graft-versus-host disease. Since the number and types of SNPs present in these regions are already known in the technical field of the present invention, those skilled in the art can easily select and use SNPs present in a linkage disequilibrium region whose start and end positions are specified. For example, such SNPs can be easily retrieved from http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp (entrezSNP), a site provided by the National Center for Biotechnology Information, and examples are shown in Table 3 below:

Marker SNP | Chromosome number | Neighbor SNP (upstream) | Neighbor SNP (downstream)
rs882559 | 6 | rs7765077, rs186480179, rs182233667, rs77149857, rs191290624 | rs111433318, rs148294935, rs141483786, rs199589116, rs112473099
rs7178935 | 15 | rs148964812, rs145696454, rs149777069, rs184331808, rs140572228 | rs142867281, rs200747794, rs149481646, rs139318473, rs77443443
rs3744805 | 17 | rs138640083, rs145404278, rs113211153, rs139979507, rs75248071 | rs148709476, rs3834609, rs59275034, rs113962350, rs3760531
rs7564005 | 2 | rs7563777, rs2558090, rs34800310, rs35134764, rs2558089 | rs35545627, rs34367058, rs13416345, rs13416453, rs1427591
rs1876522 | 17 | rs1876523, rs1876524, rs1508230, rs8079251, rs6504973 | rs8080030, rs9896610, rs8080297, rs9902863, rs8082684
rs2228048 | 3 | rs35766612, rs35719192, rs17854016, rs2229102, rs3209742 | rs11466513, rs11466514, rs17026171, rs11466515, rs11466516
rs1946518 | 11 | rs1946519, rs5744224, rs182302223, rs187031408, rs192374798 | rs5744225, rs5744226, rs185641181, rs34360641, rs5744227

That is, the SNP located at base 39869066 of human chromosome 6 (for example, rs882559), the SNP located at base 59368167 of human chromosome 15 (for example, rs7178935), the SNP located at base 38248354 of human chromosome 17 (for example, rs3744805), the SNP located at base 137512698 of human chromosome 2 (for example, rs7564005), the SNP located at base 53610289 of human chromosome 17 (for example, rs1876522), the SNP located at base 30713842 of human chromosome 3 (for example, rs2228048), the SNP located at base 112035458 of human chromosome 11 (for example, rs1946518), and SNPs present in the linkage disequilibrium regions extending about 17 kbp to 96 kbp upstream and downstream of each of these SNPs (see Table 3) are very useful as markers for the risk of graft-versus-host disease. Therefore, the present invention basically provides, as markers for predicting the risk of graft-versus-host disease, one or more SNPs selected from the group consisting of the SNP located at base 39869066 of human chromosome 6, the SNP located at base 59368167 of human chromosome 15, the SNP located at base 38248354 of human chromosome 17, the SNP located at base 137512698 of human chromosome 2, the SNP located at base 53610289 of human chromosome 17, the SNP located at base 30713842 of human chromosome 3, the SNP located at base 112035458 of human chromosome 11, and SNPs present in the linkage disequilibrium regions extending about 17 kbp to 96 kbp upstream and downstream of each of these SNPs.

Thus, one example of the present invention provides a marker for predicting the risk of graft-versus-host disease, comprising one or more SNPs selected from the group consisting of the SNP located at base 39869066 of human chromosome 6, the SNP located at base 59368167 of human chromosome 15, the SNP located at base 38248354 of human chromosome 17, the SNP located at base 137512698 of human chromosome 2, the SNP located at base 53610289 of human chromosome 17, the SNP located at base 30713842 of human chromosome 3, the SNP located at base 112035458 of human chromosome 11, and SNPs present in the linkage disequilibrium regions extending about 17 kbp to 96 kbp upstream and downstream of each of these SNPs.

Another example of the present invention provides a method of providing information on the risk of developing graft-versus-host disease by analyzing the genes of an individual to determine whether the individual is at risk of developing graft-versus-host disease. The gene analysis comprises identifying, in a gene sample obtained from a human subject, the base of one or more SNPs selected from the group consisting of the SNP located at base 39869066 of human chromosome 6, the SNP located at base 59368167 of human chromosome 15, the SNP located at base 38248354 of human chromosome 17, the SNP located at base 137512698 of human chromosome 2, the SNP located at base 53610289 of human chromosome 17, the SNP located at base 30713842 of human chromosome 3, the SNP located at base 112035458 of human chromosome 11, and SNPs present in the linkage disequilibrium regions extending about 17 kbp to 96 kbp upstream and downstream of each of these SNPs. If necessary, the method may use a probe and/or a primer suitable for detecting the SNP. The specific method of gene analysis is not particularly limited and may be any gene detection method known in the art. The gene sample includes any biological sample separated from the subject and may be, for example, at least one selected from the group consisting of hair, blood, various body fluids, isolated tissues, and cells, but is not limited thereto.

More particularly, the present invention provides a method of providing information on the risk of graft-versus-host disease, comprising identifying the base of one or more SNPs selected from the group consisting of:

a SNP located from base 39812244 to base 39925888 of human chromosome 6;

a SNP located from base 59337585 to base 59398749 of human chromosome 15;

a SNP located from base 38214362 to base 38282346 of human chromosome 17;

a SNP located from base 137419073 to base 137606323 of human chromosome 2;

a SNP located from base 53562289 to base 53658289 of human chromosome 17;

a SNP located from base 30696513 to base 30731171 of human chromosome 3; and

a SNP located from base 111939586 to base 112131330 of human chromosome 11.

More specifically,

the SNP located from base 39812244 to base 39925888 of human chromosome 6 may be one or more SNPs, including the SNP located at base 39869066 of human chromosome 6, selected from the group consisting of rs882559, rs7765077, rs186480179, rs182233667, rs77149857, rs191290624, rs111433318, rs148294935, rs141483786, rs199589116, rs112473099, and the like;

the SNP located from base 59337585 to base 59398749 of human chromosome 15 may be one or more SNPs, including the SNP located at base 59368167 of human chromosome 15, selected from the group consisting of rs7178935, rs148964812, rs145696454, rs149777069, rs184331808, rs140572228, rs142867281, rs200747794, rs149481646, rs139318473, rs77443443, and the like;

the SNP located from base 38214362 to base 38282346 of human chromosome 17 may be one or more SNPs, including the SNP located at base 38248354 of human chromosome 17, selected from the group consisting of rs3744805, rs138640083, rs145404278, rs113211153, rs139979507, rs75248071, rs148709476, rs3834609, rs59275034, rs113962350, rs3760531, and the like;

the SNP located from base 137419073 to base 137606323 of human chromosome 2 may be one or more SNPs, including the SNP located at base 137512698 of human chromosome 2, selected from the group consisting of rs7564005, rs7563777, rs2558090, rs34800310, rs35134764, rs2558089, rs35545627, rs34367058, rs13416345, rs13416453, rs1427591, and the like;

the SNP located from base 53562289 to base 53658289 of human chromosome 17 may be one or more SNPs, including the SNP located at base 53610289 of human chromosome 17, selected from the group consisting of rs1876522, rs1876523, rs1876524, rs1508230, rs8079251, rs6504973, rs8080030, rs9896610, rs8080297, rs9902863, rs8082684, and the like;

the SNP located from base 30696513 to base 30731171 of human chromosome 3 may be one or more SNPs, including the SNP located at base 30713842 of human chromosome 3, selected from the group consisting of rs2228048, rs35766612, rs35719192, rs17854016, rs2229102, rs3209742, rs11466513, rs11466514, rs17026171, rs11466515, rs11466516, and the like; and

the SNP located from base 111939586 to base 112131330 of human chromosome 11 may be one or more SNPs, including the SNP located at base 112035458 of human chromosome 11, selected from the group consisting of rs1946518, rs1946519, rs5744224, rs182302223, rs187031408, rs192374798, rs5744225, rs5744226, rs185641181, rs34360641, rs5744227, and the like.

As described above, the step of identifying the base of the SNP may be performed by any conventional method known in the art to which the present invention belongs, for example by directly sequencing a gene sample obtained from a human subject, or by contacting the sample with a probe capable of hybridizing with a predetermined site containing the SNP or with a primer pair capable of amplifying that site, and then checking whether the probe has reacted or examining the amplified gene fragment.

In one embodiment, identifying the base of the SNP may include determining whether the genetic sample from the human subject is one or more of the following seven cases:

When the base at position 39869066 of chromosome 6 is GG or GC (e.g., rs882559);

When the base at position 59368167 of chromosome 15 is AG or AA (e.g., rs7178935);

When the base at position 38248354 of chromosome 17 is AA or AG (e.g., rs3744805);

When the base at position 137512698 of chromosome 2 is TT or TG (e.g., rs7564005);

When the base at position 53610289 of chromosome 17 is TT (e.g., rs1876522);

When the base at position 30713842 of chromosome 3 is TC or TT (e.g., rs2228048); and

When the base at position 112035458 of chromosome 11 is GG (e.g., rs1946518).

If at least one of the seven cases applies, the risk of graft-versus-host disease is judged to be high.

Whether the gene sample corresponds to one or more of the above seven cases can be determined by analyzing the nucleotide sequence of the gene sample by a conventional method.
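The seven-case check described above can be sketched in a few lines. The function and variable names are hypothetical, the genotype strings are assumed to be two-letter calls already obtained by sequencing, and treating the allele order as irrelevant (so that "CG" equals "GC") is an assumption made here; the risk genotypes themselves are copied from the seven cases listed above.

```python
# Risk genotypes per marker SNP, as listed in the seven cases above.
RISK_GENOTYPES = {
    "rs882559": {"GG", "GC"},
    "rs7178935": {"AG", "AA"},
    "rs3744805": {"AA", "AG"},
    "rs7564005": {"TT", "TG"},
    "rs1876522": {"TT"},
    "rs2228048": {"TC", "TT"},
    "rs1946518": {"GG"},
}


def _normalise(genotype):
    """Upper-case and sort the two alleles so that 'CG' and 'GC' compare equal."""
    return "".join(sorted(genotype.upper()))


_NORMALISED_RISK = {
    snp: {_normalise(g) for g in genotypes}
    for snp, genotypes in RISK_GENOTYPES.items()
}


def risk_markers(genotypes):
    """Return the marker SNPs whose observed genotype is one of the risk cases."""
    return [
        snp
        for snp in RISK_GENOTYPES
        if snp in genotypes and _normalise(genotypes[snp]) in _NORMALISED_RISK[snp]
    ]


def is_high_risk(genotypes):
    """High risk if at least one of the seven cases applies."""
    return len(risk_markers(genotypes)) > 0


sample = {"rs882559": "CC", "rs7178935": "GG", "rs1946518": "GG"}
print(risk_markers(sample))   # ['rs1946518']
print(is_high_risk(sample))   # True
```

Markers missing from the input are simply skipped, so the sketch also works when only some of the seven SNPs have been genotyped.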

In addition, in one embodiment of the present invention, several clinical factors known to be associated with the onset of graft-versus-host disease may be used in addition to the genetic information to further increase the sensitivity of predicting the onset of graft-versus-host disease. The clinical factor may be the disease type of the recipient, the relationship between the donor and the recipient, the transplant type, and the like.

In one embodiment, the method of providing information for predicting the risk of graft-versus-host disease further comprises, in addition to identifying the bases of the SNPs, the step of examining one or more clinical factors selected from the group consisting of the disease type of the recipient, the relationship between the donor and the recipient, and the transplant type. The step of examining the clinical factors is not limited in its order relative to the step of identifying the base of the SNP and may be performed before or after it. When two or more clinical factors are examined, the steps of examining each clinical factor may proceed simultaneously or sequentially, and there is no particular limitation on their order. For example, in addition to the above-described genetic information (SNP) analysis, the method of providing information on the risk of graft-versus-host disease may include examining the disease type [1: acute myeloid leukemia (AML), 2: acute lymphocytic leukemia (ALL), 3: chronic myeloid leukemia (CML), 4: myelodysplastic syndrome (MDS), 5: aplastic anemia, 6: lymphoma, 7: acute biphenotypic leukemia (ABL), 8: myelofibrosis, 9: multiple myeloma, 10: other blood-related diseases], the relationship between the donor and the recipient (0: related (sibling), 1: unrelated, 2: haploidentical), the transplant type (1: bone marrow (BM), 2: peripheral blood (PB), 3: BM + PB), and the like. By reflecting the results obtained in the step of examining the clinical factors in the judgment of the risk of graft-versus-host disease, the accuracy of predicting that risk may be further improved (see Table 6 and Table 7).
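The coded clinical factors listed above could be held in a small validated record alongside the SNP genotypes. The class and field names below are hypothetical, and code 2 of the disease type (acute lymphocytic leukemia) is inferred from the numbering of the list; how these factors are weighted in the prediction is not specified in this passage and is not modelled.

```python
from dataclasses import dataclass

DISEASE_TYPE = {
    1: "acute myeloid leukemia (AML)",
    2: "acute lymphocytic leukemia (ALL)",  # code inferred from the list order
    3: "chronic myeloid leukemia (CML)",
    4: "myelodysplastic syndrome (MDS)",
    5: "aplastic anemia",
    6: "lymphoma",
    7: "acute biphenotypic leukemia (ABL)",
    8: "myelofibrosis",
    9: "multiple myeloma",
    10: "other blood-related diseases",
}
DONOR_RELATION = {0: "related (sibling)", 1: "unrelated", 2: "haploidentical"}
TRANSPLANT_TYPE = {1: "bone marrow (BM)", 2: "peripheral blood (PB)", 3: "BM + PB"}


@dataclass(frozen=True)
class ClinicalFactors:
    """Coded clinical factors of one transplant case (hypothetical record type)."""
    disease_type: int
    donor_relation: int
    transplant_type: int

    def __post_init__(self):
        # Reject codes that are not part of the coding scheme.
        if self.disease_type not in DISEASE_TYPE:
            raise ValueError(f"unknown disease type code: {self.disease_type}")
        if self.donor_relation not in DONOR_RELATION:
            raise ValueError(f"unknown donor relation code: {self.donor_relation}")
        if self.transplant_type not in TRANSPLANT_TYPE:
            raise ValueError(f"unknown transplant type code: {self.transplant_type}")

    def describe(self):
        """Return the human-readable labels for the stored codes."""
        return {
            "disease_type": DISEASE_TYPE[self.disease_type],
            "donor_relation": DONOR_RELATION[self.donor_relation],
            "transplant_type": TRANSPLANT_TYPE[self.transplant_type],
        }


case = ClinicalFactors(disease_type=1, donor_relation=0, transplant_type=2)
print(case.describe())
```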

In another aspect, the present invention provides a composition for screening for the risk of graft-versus-host disease, comprising at least one selected from the group consisting of one or more probes capable of detecting, and a primer pair capable of amplifying, one or more SNPs selected from the group consisting of the SNP located at base 39869066 of human chromosome 6, the SNP located at base 59368167 of human chromosome 15, the SNP located at base 38248354 of human chromosome 17, the SNP located at base 137512698 of human chromosome 2, the SNP located at base 53610289 of human chromosome 17, the SNP located at base 30713842 of human chromosome 3, the SNP located at base 112035458 of human chromosome 11, and SNPs present in the linkage disequilibrium regions extending about 17 kbp to 96 kbp upstream and downstream of each of these SNPs. As described above, the region extending about 17 kbp to 96 kbp upstream and downstream corresponds to the possible linkage disequilibrium region, that is, to the 90th percentile of the distances of all known SNP pairs with an R-square value of 1 on the chromosome where each SNP is located. Since the genetic information within these linkage disequilibrium regions is identical, as shown by the R-square values, the likelihood of developing graft-versus-host disease can be predicted by identifying the bases not only of the seven SNPs of Table 1 but also of SNPs present in the linkage disequilibrium regions extending about 17 kbp to 96 kbp upstream and downstream of each of the seven SNPs (see Table 3).

More specifically, the composition for screening for the risk of graft-versus-host disease according to the present invention comprises a probe capable of detecting, or a primer pair capable of amplifying, or both, one or more SNPs selected from the group consisting of:

a SNP located from base 39812244 to base 39925888 of human chromosome 6;

a SNP located from base 59337585 to base 59398749 of human chromosome 15;

a SNP located from base 38214362 to base 38282346 of human chromosome 17;

a SNP located from base 137419073 to base 137606323 of human chromosome 2;

a SNP located from base 30696513 to base 30731171 of human chromosome 3; and

a SNP located from base 111939586 to base 112131330 of human chromosome 11.

More specifically, in one embodiment, the composition for screening for the risk of graft-versus-host disease may comprise a probe for detecting a SNP having a significant correlation with the onset of graft-versus-host disease, or a SNP within the linkage disequilibrium region that carries almost the same genetic information as that SNP, and the probe may be one or more oligonucleotides selected from the group consisting of:

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 39812244 to base 39925888 of human chromosome 6, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs882559, rs7765077, rs186480179, rs182233667, rs77149857, rs191290624, rs141493135, rs148294935, rs141483786, rs199589116, rs112473099, and the like;

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 59337585 to base 59398749 of human chromosome 15, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs7178935, rs148964812, rs145696454, rs149777069, rs184331808, rs140572228, rs142867281, rs200747794, rs149481646, rs139318473, rs77443443, and the like;

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 38214362 to base 38282346 of human chromosome 17, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs3744805, rs138640083, rs145404278, rs113211153, rs139979507, rs75248071, rs148709476, rs3834609, rs59275034, rs113962350, rs3760531, and the like;

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 137419073 to base 137606323 of human chromosome 2, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs7564005, rs7563777, rs2558090, rs34800310, rs35134764, rs2558089, rs35545627, rs34367058, rs13416345, rs13416453, rs1427591, and the like;

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 53562289 to base 53658289 of human chromosome 17, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs1876522, rs1876523, rs1876524, rs1508230, rs8079251, rs6504973, rs8080030, rs9896610, rs8080297, rs9902863, rs8082684, and the like;

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 30696513 to base 30731171 of human chromosome 3, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs2228048, rs35766612, rs35719192, rs17854016, rs2229102, rs3209742, rs11466513, rs11466514, rs17026171, rs11466515, rs11466516, and the like; and

an oligonucleotide having a sequence complementary to a consecutive 5 to 100 bp base sequence within the region from base 111939586 to base 112131330 of human chromosome 11, the base sequence comprising at least one SNP selected from the group consisting of, for example, rs1946518, rs1946519, rs5744224, rs182302223, rs187031408, rs192374798, rs5744225, rs5744226, rs185641181, rs34360641, rs5744227, and the like.

In an embodiment, the probe is capable of detecting one or more of the seven SNPs (see Table 1 or Table 2) and may be one or more selected from the group consisting of the following, but is not limited thereto:

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 39869066th base of human chromosome 6, wherein the 39869066th base is GC or CC (rs882559);

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 59368167th base of human chromosome 15, wherein the 59368167th base is AG or AA (rs7178935);

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 38248354th base of human chromosome 17, wherein the 38248354th base is AG or AA (rs3744805);

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 137512698th base of human chromosome 2, wherein the 137512698th base is TT or TG (rs7564005);

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 53610289th base of human chromosome 17, wherein the 53610289th base is TT (rs1876522);

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 30713842nd base of human chromosome 3, wherein the 30713842nd base is TC or TT (rs2228048); and

An oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 112035458th base of human chromosome 11, wherein the 112035458th base is GG (rs1946518).

In another embodiment of the present invention, the composition for diagnosing the risk of graft-versus-host disease may comprise a specific primer pair for amplifying a gene fragment containing the SNP, in order to identify the SNP associated with graft-versus-host disease. As used herein, a "primer pair" includes a forward primer and a reverse primer. In one embodiment, the primer pair may be a pair of forward and reverse primers, each about 5 to 50 bp in length, capable of amplifying a gene fragment of about 50 bp to about 20 kbp, about 50 to about 1000 bp, about 50 to about 500 bp, or about 100 to about 300 bp.

The primer pair usable in the present invention may be one or more selected from the group consisting of:

A pair of primers consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences not overlapping each other in a region from 39812244 base to 39925888 base of human chromosome 6;

A pair of primers consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences that do not overlap each other in a region from 59337585th base to 59398749th base of human chromosome 15;

A pair of primers consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences not overlapping each other in a region from 38214362 base to 38282346 base of human chromosome 17;

A primer pair consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences not overlapping each other in a region from 137419073 th base to 137606323 th base of human chromosome 2;

A pair of primers consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences that do not overlap each other in the region from 53562289 base to 53658289 base of human chromosome 17;

A pair of primers consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences that do not overlap each other in the region from 30696513 base to 30731171 base of human chromosome 3; And

A primer pair consisting of two oligonucleotides having a sequence complementary to two consecutive 5 to 50 bp nucleotide sequences that do not overlap each other in a region from 111939586 base to 112131330 base of human chromosome 11.

In one embodiment, the primer pair may be one or more selected from the group consisting of:

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 39812244th base to the 39925888th base of human chromosome 6 and comprises at least one SNP selected from the group consisting of rs882559, rs7765077, rs186480179, rs182233667, rs77149857, rs191290624, rs111433318, rs148294935, rs141483786, rs199589116, rs112473099 and the like;

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 59337585th base to the 59398749th base of human chromosome 15 and comprises at least one SNP selected from the group consisting of rs7178935, rs148964812, rs145696454, rs149777069, rs184331808, rs140572228, rs142867281, rs200747794, rs149481646, rs139318473, rs77443443 and the like;

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 38214362nd base to the 38282346th base of human chromosome 17 and comprises at least one SNP selected from the group consisting of rs3744805, rs138640083, rs145404278, rs113211153, rs139979507, rs75248071, rs148709476, rs3834609, rs59275034, rs113962350, rs3760531 and the like;

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 137419073rd base to the 137606323rd base of human chromosome 2 and comprises at least one SNP selected from the group consisting of rs7564005, rs7563777, rs2558090, rs34800310, rs35134764, rs2558089, rs35545627, rs34367058, rs13416345, rs13416453, rs1427591 and the like;

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 53562289th base to the 53658289th base of human chromosome 17 and comprises at least one SNP selected from the group consisting of rs1876522, rs1876523, rs1876524, rs1508230, rs8079251, rs6504973, rs8080030, rs9896610, rs8080297, rs9902863, rs8082684 and the like;

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 30696513th base to the 30731171st base of human chromosome 3 and comprises at least one SNP selected from the group consisting of rs2228048, rs35766612, rs35719192, rs17854016, rs2229102, rs3209742, rs11466513, rs11466514, rs17026171, rs11466515, rs11466516 and the like; and

A primer pair consisting of two oligonucleotides having sequences complementary, respectively, to the consecutive 5 to 50 bp nucleotide sequences at the 5' end and the 3' end of a gene fragment of about 50 bp to about 20 kbp that lies in the region from the 111939586th base to the 112131330th base of human chromosome 11 and comprises at least one SNP selected from the group consisting of rs1946518, rs1946519, rs5744224, rs182302223, rs187031408, rs192374798, rs5744225, rs5744226, rs185641181, rs34360641, rs5744227 and the like.

More specifically, the primer pair may be one or more selected from the group consisting of:

A primer pair for amplifying a chromosome fragment comprising the 39869066th base of human chromosome 6, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 56822 bp region upstream of the 39869066th base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 56822 bp region downstream of that base;

A primer pair for amplifying a chromosome fragment comprising the 59368167th base of human chromosome 15, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 30582 bp region upstream of the 59368167th base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 30582 bp region downstream of that base;

A primer pair for amplifying a chromosome fragment comprising the 38248354th base of human chromosome 17, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 33992 bp region upstream of the 38248354th base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 33992 bp region downstream of that base;

A primer pair for amplifying a chromosome fragment comprising the 137512698th base of human chromosome 2, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 93625 bp region upstream of the 137512698th base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 93625 bp region downstream of that base;

A primer pair for amplifying a chromosome fragment comprising the 53610289th base of human chromosome 17, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 48000 bp region upstream of the 53610289th base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 48000 bp region downstream of that base;

A primer pair for amplifying a chromosome fragment comprising the 30713842nd base of human chromosome 3, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 17329 bp region upstream of the 30713842nd base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 17329 bp region downstream of that base; and

A primer pair for amplifying a chromosome fragment comprising the 112035458th base of human chromosome 11, consisting of an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 95872 bp region upstream of the 112035458th base and an oligonucleotide having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence in the 95872 bp region downstream of that base.
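The upstream and downstream spans given for each index SNP are symmetric, so the chromosomal regions quoted elsewhere in this description follow from simple arithmetic. The following is a minimal sketch of that calculation; the positions and spans are taken from the text, while the dictionary layout and the function name `flanking_region` are illustrative assumptions.

```python
# Index SNP -> (chromosome, position, flanking span in bp), as stated in the text.
INDEX_SNPS = {
    "rs882559": ("6", 39869066, 56822),
    "rs7178935": ("15", 59368167, 30582),
    "rs3744805": ("17", 38248354, 33992),
    "rs7564005": ("2", 137512698, 93625),
    "rs1876522": ("17", 53610289, 48000),
    "rs2228048": ("3", 30713842, 17329),
    "rs1946518": ("11", 112035458, 95872),
}


def flanking_region(position, span):
    """Return (start, end) of the region extending `span` bp on each side of `position`."""
    return position - span, position + span


# Region in which primer binding sites may be chosen for each index SNP.
REGIONS = {
    rs_id: (chrom, *flanking_region(pos, span))
    for rs_id, (chrom, pos, span) in INDEX_SNPS.items()
}

for rs_id, (chrom, start, end) in REGIONS.items():
    print(f"{rs_id}: chr{chrom}:{start}-{end}")
```

The computed start and end coordinates agree with the region boundaries listed for the probes and primer pairs above.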

The nucleotide sequence of the chromosomal fragment amplified by the primer pair can be analyzed to identify the SNP base that is significantly correlated with the onset of graft-versus-host disease, thereby predicting the risk of graft-versus-host disease. The gene amplification and nucleotide sequence analysis methods are not particularly limited, and any method commonly known in the art can be used.

In another aspect, the present invention provides a kit for assessing the risk of graft-versus-host disease comprising the composition for diagnosing the risk of graft-versus-host disease. The kit may further include a gene detecting means. More specifically, as described above, the composition contains probes and/or primer pairs capable of detecting one or more SNPs selected from the group consisting of the SNPs of Table 1 and the SNPs located within about 17 kbp to 96 kbp of each of those SNPs (see Tables 1 to 3). There is no limitation on the form in which these probes and/or primer pairs are used; they may be used in the form of a solution or in any other form. The gene detecting means may depend on the form in which the probe and/or the primer pair included in the composition is used: for example, a means for detecting fluorescence when a fluorescence-labeled probe is used, and a means for sequence analysis through PCR when a primer pair is used. In addition, any gene analysis means commonly known in the technical field of the present invention may be included.

The method for providing information on the risk of developing graft-versus-host disease, the composition for screening graft-versus-host disease risk, and the graft-versus-host disease risk screening kit use, as a graft-versus-host disease marker, at least one selected from the group consisting of the following and the like:

One or more SNPs located in the region from the 59337585th base to the 59398749th base of human chromosome 15, including the SNP located at the 59368167th base of human chromosome 15, for example at least one selected from the group consisting of rs7178935, rs148964812, rs145696454, rs149777069, rs184331808, rs140572228, rs142867281, rs200747794, rs149481646, rs139318473, rs77443443 and the like;

One or more SNPs located in the region from the 38214362nd base to the 38282346th base of human chromosome 17, including the SNP located at the 38248354th base of human chromosome 17, for example at least one selected from the group consisting of rs3744805, rs138640083, rs145404278, rs113211153, rs139979507, rs75248071, rs148709476, rs3834609, rs59275034, rs113962350, rs3760531 and the like;

One or more SNPs located in the region from the 137419073rd base to the 137606323rd base of human chromosome 2, including the SNP located at the 137512698th base of human chromosome 2, for example at least one selected from the group consisting of rs7564005, rs7563777, rs2558090, rs34800310, rs35134764, rs2558089, rs35545627, rs34367058, rs13416345, rs13416453, rs1427591 and the like; and

One or more SNPs located in the region from the 53562289th base to the 53658289th base of human chromosome 17, including the SNP located at the 53610289th base of human chromosome 17, for example at least one selected from the group consisting of rs1876522, rs1876523, rs1876524, rs1508230, rs8079251, rs6504973, rs8080030, rs9896610, rs8080297, rs9902863, rs8082684 and the like.

In addition, optionally, the method for providing information to predict the risk of graft-versus-host disease, the composition for screening for risk of graft-versus-host disease, and the graft-versus-host disease risk screening kit may further use, as a graft-versus-host disease marker, at least one of the following or a combination thereof:

One or more SNPs located in the region from the 39812244th base to the 39925888th base of human chromosome 6, including the SNP located at the 39869066th base of human chromosome 6, for example at least one selected from the group consisting of rs882559, rs7765077, rs186480179, rs182233667, rs77149857, rs191290624, rs111433318, rs148294935, rs141483786, rs199589116, rs112473099 and the like;

One or more SNPs located in the region from the 30696513th base to the 30731171st base of human chromosome 3, including the SNP located at the 30713842nd base of human chromosome 3, for example at least one selected from the group consisting of rs2228048, rs35766612, rs35719192, rs17854016, rs2229102, rs3209742, rs11466513, rs11466514, rs17026171, rs11466515, rs11466516 and the like; and

One or more SNPs located in the region from the 111939586th base to the 112131330th base of human chromosome 11, including the SNP located at the 112035458th base of human chromosome 11, for example at least one selected from the group consisting of rs1946518, rs1946519, rs5744224, rs182302223, rs187031408, rs192374798, rs5744225, rs5744226, rs185641181, rs34360641, rs5744227 and the like.

The present invention provides a technique for predicting the risk of developing graft-versus-host disease. It allows the onset of the disease to be delayed or prevented in individuals at high risk of graft-versus-host disease through special and appropriate management, and allows early diagnosis through continuous, proactive monitoring even if the disease does develop.

FIG. 1 is a graph showing the minor allele frequencies of the seven SNPs selected for constructing the disease prediction model in the original data set.
FIG. 2 is a graph showing receiver operating characteristic (ROC) curves of the six algorithm models.
FIG. 3 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 39869066 of human chromosome 6.
FIG. 4 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 59368167 of human chromosome 15.
FIG. 5 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 38248354 of human chromosome 17.
FIG. 6 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 137512698 of human chromosome 2.
FIG. 7 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 53610289 of human chromosome 17.
FIG. 8 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 30713842 of human chromosome 3.
FIG. 9 shows the results of comparing the r² value with the distance for all SNP pairs found within 1 Mbp of position 112035458 of human chromosome 11.

The present invention will be described in more detail with reference to the following examples. However, these examples are only for illustrating the present invention, and the present invention is not limited by these examples.

Example 1: Search for SNPs specific to acute graft-versus-host disease

The following tests were performed to identify single nucleotide polymorphisms (SNPs) characteristic of acute graft-versus-host disease by analyzing the genes of patients with acute graft-versus-host disease.

1.1. Selection of genes and SNPs

To determine the genetic tendencies of patients with acute graft-versus-host disease, 87 SNPs in the 53 genes shown in Table 4 below and 7 SNPs in intergenic regions, 94 SNPs in total, were selected. These SNPs have a minor allele frequency of 0.05 or more in Asian populations. In the following examples, the 94 SNPs were screened in patients with acute graft-versus-host disease (74 patients) and patients without acute graft-versus-host disease (114 patients) to identify SNPs significantly correlated with acute graft-versus-host disease.

Table 4. Number of SNPs by functional region for each gene
Columns: No. / Gene / Promoter or 5'UTR / Exon / Intron / 3'UTR or flanking / Intergenic region / Total
(each row lists its nonzero region counts in column order, followed by the row total)
1 ABCB1 2 2
2 ABCC2 1 1 2
3 ACTA2 2 2
4 ATIC 1 1
5 BCL2 1 2 3
6 C18orf56 1 1
7 CD44 1 1
8 CTGF 1 1
9 CTLA4 2 1 2 5
10 CYP2B6 1 1
11 CYP2C19 2 2
12 CYP3A5 1 1
13 DAAM2 2 2
14 DPY30 1 1
15 ENOSF1 1 1 2
16 ESR1 1 1
17 FAS 1 1
18 GGH 1 1 2
19 GSTA1 1 1
20 GSTP1 1 1
21 HMHA1 1 1
22 HSPA1L 2 2
23 ICOS 1 1
24 IFNGR2 1 1
25 IL10 3 3
26 IL10RB 1 1
27 IL13 1 1
28 IL18 2 2
29 IL1A 1 1
30 IL1B 1 1 2
31 IL2 1 1
32 IL4 1 1
33 IL4R 2 2
34 IL6 1 1
35 IRF4 1 1 2
36 LTA 1 2 3
37 MTHFR 2 1 3
38 NIPSNAP3B 1 1
39 NLRC4 1 1
40 NOD2 1 1
41 RNF111 1 1
42 SERPINE2 2 2
43 SLC30A6 2 2
44 SPAST 4 4
45 TGFB1 1 1
46 TGFBR2 1 1
47 THRA 1 1
48 TNF 1 1
49 TNFRSF1B 1 1
50 TRAF3IP1 1 1
51 VDR 1 3 4
52 VEGFA 2 1 3
53 ZNF597 1 1
54 Intergenic 7 7
Total 21 30 23 13 7 94

1.2. Preparation of samples and analysis instruments

Genomic DNA samples were extracted from blood collected before transplantation from patients scheduled to undergo allogeneic hematopoietic stem cell transplantation. A total of 188 genomic samples were used: an experimental group (74 patients) who developed acute graft-versus-host disease within 100 days and a control group (114 patients) who did not. The diseases for which allogeneic stem cell transplantation was performed were classified into 10 groups: acute myeloid leukemia (AML), acute lymphocytic leukemia (ALL), chronic myeloid leukemia (CML), myelodysplastic syndrome (MDS), aplastic anemia, lymphoma, acute biphenotypic leukemia (ABL), myelofibrosis (MF), multiple myeloma (MM), and other blood-related diseases. The patients underwent allogeneic stem cell transplantation at Seoul National University Hospital between January 2002 and February 2013, and were divided into two groups: those who developed acute GVHD and those who did not.

As the SNP genotyping platform, SNP chips manufactured with Illumina's BeadArray technology to contain the 94 SNPs were used. Genotyping was performed with the GoldenGate Genotyping Assay according to the protocol provided by Illumina. Seven SNPs highly associated with acute GVHD were selected from the SNP chip experiment. To increase the power of prediction model construction, an additional 206 individuals (50 in the experimental group and 156 in the control group) were analyzed. To correct for technical bias, genotyping of these individuals was performed with the TaqMan Genotyping Assay from Applied Biosystems, Inc. instead of the Illumina chip. Of the 394 patients for whom genotyping results for the 7 SNPs were obtained, 298 patients with clinical information (disease type, recipient-donor relationship, and transplant type) were selected to construct the disease prediction model.

1.3. Analysis method

1.3.1. Feature selection

① Selection of characteristic SNPs

To assess independently the degree of association between each of the 94 SNPs and acute graft-versus-host disease, logistic regression was performed under five genetic models (dominant, recessive, co-dominant, additive, and allelic). The most significant genetic model was selected for each SNP, and the SNPs were sorted by the corresponding p-value. As a rule of thumb in discriminant analysis, the number of features should be less than 1/10 of the number of samples used in model building. Because 298 samples were used in this experiment, about 20 to 30 features are suitable. The top 30 SNPs were therefore extracted from the list of SNPs sorted by p-value.
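The genetic models named above differ only in how a genotype is turned into a predictor before regression. The following is a minimal sketch of one common encoding, assuming genotypes are stored as the number of minor alleles (0, 1 or 2); the function names and the example p-values are hypothetical and are not taken from the patent.

```python
def encode_genotype(minor_allele_count, model):
    """Encode one genotype (0, 1 or 2 minor alleles) under a genetic model."""
    g = minor_allele_count
    if g not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2 minor alleles")
    if model == "dominant":      # at least one minor allele
        return (int(g >= 1),)
    if model == "recessive":     # two minor alleles
        return (int(g == 2),)
    if model == "additive":      # allele dosage
        return (g,)
    if model == "codominant":    # separate indicators for heterozygote and homozygote
        return (int(g == 1), int(g == 2))
    if model == "allelic":       # (major allele count, minor allele count)
        return (2 - g, g)
    raise ValueError(f"unknown model: {model}")


def max_features(n_samples):
    """Rule of thumb from the text: at most about 1/10 of the sample count."""
    return n_samples // 10


def top_snps(p_values, n):
    """Return the n SNP names with the smallest p-values."""
    return sorted(p_values, key=p_values.get)[:n]


# Hypothetical p-values, for illustration only.
example = {"snpA": 0.20, "snpB": 0.01, "snpC": 0.05}
print(max_features(298), top_snps(example, 2))
```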

SNPs with a minor allele frequency of 0.05 or less in the Korean population do not meet the definition of a polymorphism and were filtered out. In addition, when a pair of SNPs had a pairwise r² > 0.8, one of them was filtered out, because such information redundancy causes problems during model construction.
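The two filters can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: genotypes are coded 0/1/2, r² is approximated by the squared Pearson correlation of genotype dosages (the text does not say how r² was computed), and the greedy rule of keeping the better-ranked SNP of a correlated pair is assumed.

```python
from statistics import mean


def minor_allele_frequency(genotypes):
    """Minor allele frequency from genotypes coded as 0, 1 or 2 allele copies."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def genotype_r2(a, b):
    """Squared Pearson correlation of two genotype vectors (an approximation of LD r²)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    return cov * cov / (var_a * var_b)


def filter_snps(ranked, maf_min=0.05, r2_max=0.8):
    """Keep SNPs in rank order, dropping low-MAF SNPs and SNPs redundant with a kept one."""
    kept = []
    for name, genotypes in ranked:
        if minor_allele_frequency(genotypes) <= maf_min:
            continue
        if any(genotype_r2(genotypes, other) > r2_max for _, other in kept):
            continue
        kept.append((name, genotypes))
    return [name for name, _ in kept]


# Hypothetical genotypes for six individuals, ordered by ascending p-value.
ranked = [
    ("s1", [0, 1, 2, 1, 0, 2]),
    ("s2", [0, 1, 2, 1, 0, 2]),  # identical to s1, so r² = 1
    ("s3", [0, 0, 0, 0, 0, 0]),  # monomorphic, MAF = 0
    ("s4", [2, 0, 1, 0, 1, 1]),
]
print(filter_snps(ranked))
```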

Seven significant SNPs were selected through the above steps and the selected SNPs were summarized in Table 5 below:

SNP          Chromosome number   Chromosomal location (ref)
rs882559     6                   39869066
rs7178935    15                  59368167
rs3744805    17                  38248354
rs7564005    2                   137512698
rs1876522    17                  53610289
rs2228048    3                   30713842
rs1946518    11                  112035458

② Selection of clinical variables

Using 10-fold cross-validation with the chi-square evaluation function and the ranker function of the Weka program, a commonly known data mining tool, the clinical variables of the samples obtained above were ranked by their influence on acute graft-versus-host disease, and the results are shown in Table 6 below. Influential variables were selected separately for the occurrence of acute graft-versus-host disease (0 or 1) and for the acute graft-versus-host disease grade (0, 1, 2, 3, 4). The coded clinical variables included the disease type of the recipient (1: acute myeloid leukemia (AML), 2: acute lymphocytic leukemia (ALL), 3: chronic myeloid leukemia (CML), 4: myelodysplastic syndrome (MDS), 5: aplastic anemia, 6: lymphoma, 7: acute biphenotypic leukemia, 8: myelofibrosis (MF), 9: multiple myeloma (MM), 10: other blood-related diseases), the relationship between donor and recipient (0: related (sibling), 1: unrelated, 2: haploidentical), the transplant type (1: bone marrow (BM), 2: peripheral blood (PB), 3: BM + PB), and the pretreatment type (0: non-myeloablative, 1: myeloablative).

In Table 6, a higher average merit value and a lower average rank value indicate a more influential variable. Significant variables are shown in red. Specifically, among the six clinical parameters tested, three clinical variables were found to be highly influential in acute graft-versus-host disease: the relationship between donor and recipient (Transplant_donor; 0: related (sibling), 1: unrelated, 2: haploidentical), the disease type of the recipient (Disease; 1: AML, 2: ALL, 3: CML, 4: MDS, 5: aplastic anemia, 6: lymphoma, 7: ABL, 8: MF, 9: MM, 10: others), and the transplant type (Transplant_type; 1: bone marrow (BM), 2: peripheral blood (PB), 3: BM + PB).
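Chi-square ranking scores each categorical variable by how far its joint distribution with the class label departs from independence. The following is a minimal pure-Python sketch of that idea; it is not the Weka implementation, it omits the per-fold averaging that produces the average merit and average rank, and the toy data are hypothetical.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic of a categorical feature against a class label."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    statistic = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n
            observed = joint.get((f_value, l_value), 0)
            statistic += (observed - expected) ** 2 / expected
    return statistic


def rank_features(features, labels):
    """Return feature names sorted from most to least associated with the label."""
    scores = {name: chi_square(values, labels) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 60 samples, binary label, two candidate variables.
labels = [0] * 10 + [1] * 20 + [0] * 20 + [1] * 10
features = {
    "donor": [0] * 30 + [1] * 30,  # associated with the label
    "noise": [0, 1] * 30,          # independent of the label
}
print(rank_features(features, labels))
```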

In addition, logistic regression analysis of the 298 samples obtained above verified the association with acute graft versus host disease among various clinical parameters, and the results are shown in Table 7 below.

Variable                                            Odds ratio   z value   Pr(>|z|)
(Intercept)                                         0.248        -1.652    0.0986
Sex (sex)                                           1.072        0.247     0.8051
Age (age)                                           1.003        0.240     0.8104
Disease2_ (disease form of the recipient)           0.991        -0.024    0.9809
Disease3_ (disease form of the recipient)           1.636        0.942     0.3463
Disease4_ (disease form of the recipient)           1.086        0.161     0.8721
Disease5_ (disease form of the recipient)           0.380        -1.738    0.0822 *
Disease6_ (disease form of the recipient)           0.608        -0.692    0.489
Disease7_ (disease form of the recipient)           2.824        1.382     0.1669
Disease8_ (disease form of the recipient)           1.558        0.526     0.5988
Disease9_ (disease form of the recipient)           2.702        0.954     0.3402
Disease10_ (disease form of the recipient)          0.775        -0.280    0.7791
transplant_donor1_ (relationship with recipient)    3.298        3.968     0.0000724 ***
transplant_donor2_ (relationship with recipient)    0.000        -0.015    0.988
transplant_type2_ (transplant type)                 1.169        0.467     0.6406
transplant_type3_ (transplant type)                 1.180        0.191     0.8484
prehandle1 (pretreatment)                           0.571        -1.612    0.1069

Significant variables are indicated in bold (*** p < 0.0001, * p < 0.1).

As shown in Table 7, among the various clinical variables, when the donor-recipient relationship "transplant_donor 0: sibling" (blood-related) is compared with "transplant_donor 1: unrelated" (not blood-related), the odds of developing graft-versus-host disease were more than 3.2 times higher for a graft from an unrelated donor than for a graft from a related donor, which is consistent with general medical knowledge. In addition, among the parameters satisfying p < 0.1, Disease5 (aplastic anemia) had an odds ratio of 0.38 relative to Disease1 (AML), indicating a lower risk of graft-versus-host disease. Although not statistically significant, the risk of graft-versus-host disease was higher for Disease3 (CML), Disease7 (ABL) and Disease9 (MM) than for Disease1 (AML), with odds ratios of about 1.6, 2.8 and 2.7, respectively.
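The significance column of Table 7 can be related to the z values with a short calculation. The sketch below assumes the p-values are two-sided p-values from a standard normal (Wald) test, which the layout of the table suggests but the text does not state; the function names are illustrative.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def log_odds(odds_ratio):
    """Regression coefficient (log-odds scale) corresponding to an odds ratio."""
    return math.log(odds_ratio)


# Values taken from Table 7.
print(two_sided_p(3.968))   # transplant_donor1_: reported as 0.0000724
print(two_sided_p(-1.738))  # Disease5_: reported as 0.0822
print(log_odds(3.298))      # positive coefficient, i.e. higher odds for unrelated donors
```

An odds ratio above 1 corresponds to a positive coefficient and one below 1 to a negative coefficient, which is why Disease5 (0.380) has a negative z value.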

It was considered that adding the significant clinical variables identified above, namely the donor-recipient relationship and the disease type of the recipient, to the model in addition to the 7 SNPs would further improve the accuracy and clarity of disease prediction. For the transplant type, the logistic regression results were not significant. However, in Table 6 the transplant type was also ranked as a highly influential variable for distinguishing the aGVHD_grade variable, and because it is a clinically significant factor, it was also included among the input variables of the acute graft-versus-host disease prediction model.

1.3.2. Model building

① The algorithm used for model building

i) Artificial Neural Network (ANN): The selected features are set as input neurons, and the output neuron indicates whether a given individual has acute graft-versus-host disease. It is a typical machine learning method that finds the most discriminating model by updating the connection weights to the hidden layer.

ii) Logistic Regression: Estimates the coefficients of a regression model with the selected features as independent variables and acute graft-versus-host disease as the dependent variable. When the feature values of a specific sample are entered into the estimated function, the probability that the sample develops acute graft-versus-host disease is obtained.

iii) Support Vector Machine: Along with ANN, it is a classification algorithm that has been used in many fields in recent years. Data points are used as support vectors to identify the most discriminating classifier. Non-linear classifiers can be implemented through several kernel methods.

iv) Naive Bayes: One of the simplest classification methods based on supervised learning. It uses Bayes' rule for classification, mainly because Bayes' rule makes the conditional probabilities easier to calculate. To classify a sample into the group that has acute graft-versus-host disease or the group that does not, the features of the sample are used as input values and the conditional probability of each group is calculated. The sample is assigned to the group with the higher probability.

v) RBFNetWork (Radial basis function network): One of the neural networks, RBFN belongs to the multi-layer feed-forward neural network and has two layers. Each neuron in the hidden layer has a radial basis function, such as Gaussian, as an activation function. The center of the radial basis function of each neuron is determined by the connection strength of the neuron, and the position and the width of the function are obtained through learning. The output is determined by the linear combination of the outputs of all radial basis functions. From the viewpoint of function approximation, it can be said that the hidden layer forms a basis for expressing the input pattern.

vi) Decision Tree: A universal and powerful tool for prediction and classification. Unlike neural network structure analysis, it is easy to understand because it expresses rule by tree structure. One form of tree structure is a binary tree structure, where each node creates two child nodes and proceeds to the terminal node by answering yes-no questions. There is not only a simple binary tree shape, but also mixed models.
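As one concrete case from the list above, the logistic regression model turns a weighted sum of features into a probability through the logistic function. The sketch below is illustrative only: the function name is hypothetical, and the two coefficients reuse the intercept and unrelated-donor odds ratios of Table 7 purely as example numbers, not as the fitted prediction model.

```python
import math


def predict_probability(intercept, coefficients, features):
    """Probability of the positive class under a logistic regression model."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Example numbers only: log of the intercept and unrelated-donor odds ratios in Table 7.
intercept = math.log(0.248)
donor_coefficient = math.log(3.298)

baseline = predict_probability(intercept, [donor_coefficient], [0])   # related donor
unrelated = predict_probability(intercept, [donor_coefficient], [1])  # unrelated donor
print(round(baseline, 3), round(unrelated, 3))
```

With all other variables at their reference level, the odds are multiplied by the odds ratio when the indicator switches from 0 to 1, and the logistic function maps those odds back to a probability.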

② Data coding for model building

For the 7 SNP features, the genotypes of all samples were coded (0 = wild-type homozygote, 1 = heterozygote, 2 = mutant homozygote), and the clinical variables (the relationship between donor and recipient, the disease type of the recipient, and the transplant type) were coded as categorical values. The machine learning algorithms described above were then applied to the selected SNPs together with the clinical information to determine the risk of acute graft-versus-host disease.
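The coding scheme can be sketched as a function that turns one patient into a numeric feature vector. The category codes come from the text; the one-hot expansion of the categorical variables and the function names are assumptions made for illustration, since the text does not say how the software represented the categories internally.

```python
DONOR_LEVELS = (0, 1, 2)               # 0: related (sibling), 1: unrelated, 2: haploidentical
DISEASE_LEVELS = tuple(range(1, 11))   # 1: AML ... 10: other blood-related diseases
TRANSPLANT_LEVELS = (1, 2, 3)          # 1: BM, 2: PB, 3: BM + PB


def one_hot(value, levels):
    """Expand a categorical code into indicator variables, one per level."""
    if value not in levels:
        raise ValueError(f"unexpected category code: {value}")
    return [int(value == level) for level in levels]


def encode_sample(snp_genotypes, donor, disease, transplant_type):
    """Build the feature vector: 7 genotype codes followed by the clinical indicators."""
    if len(snp_genotypes) != 7 or any(g not in (0, 1, 2) for g in snp_genotypes):
        raise ValueError("expected 7 genotypes coded 0, 1 or 2")
    return (
        list(snp_genotypes)
        + one_hot(donor, DONOR_LEVELS)
        + one_hot(disease, DISEASE_LEVELS)
        + one_hot(transplant_type, TRANSPLANT_LEVELS)
    )


# Hypothetical patient: unrelated donor, aplastic anemia, peripheral blood graft.
vector = encode_sample([0, 1, 2, 0, 0, 1, 2], donor=1, disease=5, transplant_type=2)
print(len(vector), vector)
```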

1.4. Significance test of the selected SNP pool

1.4.1. The optimal parameters of each algorithm

i) Artificial Neural Network (ANN): learning rate: 0.3, momentum: 0.2, training time: 500, number of hidden nodes = 21, activation function: sigmoid function,

ii) Logistic Regression: no parameter,

iii) Support Vector Machine (SVM): Linear kernel,

iv) Naive Bayes: no parameter,

v) Radial basis function network (RBFNetWork): clusteringSeed: 1, numClusters: 2,

vi) Decision Tree: Confidence Factor: 0.25, The minimum number of instances per leaf: 2.

1.4.2. Sample In the providing group Acute graft versus host disease Classification experiment

Using the seven SNPs selected in Example 1.3.1 and clinical information (the relationship between the sharer and the recipient, the disease type of the recipient, and the transplantation form), the final clinical information mentioned in Example 1.2 is summarized 298 patients were classified into acute graft versus host disease and normal group.

That is, the six algorithms described above were applied to the model Y ~ 7 SNPs + Transplant_donor + Disease + Transplant_type, and the results of 10-fold cross-validation are shown in Table 8 below.
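As an illustration of how a 10-fold split of 298 samples can be produced, consider the sketch below. Whether the original analysis shuffled or stratified the samples is not stated, so the shuffling and the seed here are assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffling is an assumption
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 298 is the number of patients in the original data set.
splits = list(k_fold_indices(298, k=10))
```

Each model would be trained on the `train` indices and evaluated on the `test` indices of every split, and the fold results pooled or averaged.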

As shown in Table 8, among the six models, the ANN model predicted the risk of acute graft versus host disease with an accuracy of 82.89% (specificity 88.89%, sensitivity 67.07%). The receiver operating characteristic (ROC) curves of the six models are shown in FIG. 2. As shown in FIG. 2, the ANN model had an area under the ROC curve (AUC) of 84.8%, the highest explanatory power among the models.
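The area under the ROC curve can be computed directly from predicted scores as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below uses that rank-based definition; the labels and scores are hypothetical toy values, not outputs of the patent's models.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy data (1 = case, 0 = control).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```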

Example 2: Prediction of the risk of onset of acute graft versus host disease

To establish objectively how accurately the seven SNPs selected in Example 1.3.1 predict the risk of acute graft versus host disease, the prediction accuracy was checked in a separate validation set, distinct from the population that provided the gene samples for SNP selection. Obtaining accuracy at a level comparable to that of the original data set would demonstrate satisfactory utility for predicting the risk of acute graft versus host disease in actual patient screening. A validation set of 76 patients (control: 50 patients, case: 26 patients) with detailed clinical information was obtained from Seoul National University Hospital. Genotyping was performed with the Applied Biosystems TaqMan Genotyping Assay.

The prediction of the risk of acute graft versus host disease for this separate population (validation set) was carried out in the same manner as in Example 1.4.2, and the results are shown in Table 9.

As shown in Table 9, in another population (validation set), the accuracy of predicting the risk of acute graft versus host disease in the ANN model was 72.4% (specificity 88%, sensitivity 42.3%).
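The relationship between these three figures can be checked with a short calculation. The confusion-matrix counts below are not stated in the text; they are back-calculated from the reported percentages and the group sizes of the validation set (50 controls, 26 cases) and should be read as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity as percentages."""
    total = tp + fn + tn + fp
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)   # cases correctly predicted
    specificity = 100.0 * tn / (tn + fp)   # controls correctly predicted
    return accuracy, sensitivity, specificity

# Assumed counts consistent with 26 cases and 50 controls (back-calculated).
accuracy, sensitivity, specificity = classification_metrics(tp=11, fn=15, tn=44, fp=6)
rounded = (round(accuracy, 1), round(sensitivity, 1), round(specificity, 1))
```

Under these assumed counts, 55 of the 76 patients are classified correctly, which reproduces the reported accuracy.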

As can be seen from Tables 8 and 9, when the seven selected SNPs and the clinical information (the relationship between the donor and the recipient, the disease type of the recipient, and the transplantation form) are used, the accuracy was close to 83% in the original data set used to select the SNPs and clinical information, and 72% in the randomly selected validation set. These SNPs and clinical information were thus confirmed to be effective in predicting the risk of occurrence of acute graft versus host disease.

The 188 SNP genotypes obtained in Example 1.2 were first analyzed by logistic regression. The results are shown in Table 5 below. Table 10 shows the results.

¹⁾ : p-value of Z statistic of logistic regression model, ²⁾ : p-value of chi-square statistic of chi-square test

** p <0.05: red mark, p <0.1: underline mark

As shown in Table 10 above, rs882559, rs7178935, rs3744805, rs7564005, and rs1876522 were significant at p < 0.05 in at least one of the five genetic models, and rs2228048 and rs1946518 were significant at p < 0.1 in at least one of the five genetic models. Thus, these SNPs may be useful in explaining the association with graft-versus-host disease and in predicting the risk of its onset.
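For reference, a chi-square test of association between genotype and case/control status can be sketched as follows. The 2 x 3 layout (three genotype classes) is only one possible table; how each of the five genetic models groups the genotypes is not described here, and the counts are hypothetical. The sketch assumes all row and column totals are nonzero.

```python
import math

def chi_square_2x3(table):
    """Pearson chi-square for a 2 x 3 table (rows: case/control, columns: genotype 0/1/2)."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square upper-tail probability is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical genotype counts (not from the source): cases, then controls.
stat, p_value = chi_square_2x3([[10, 20, 20], [30, 15, 5]])
```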

Claims

A method comprising, for a gene sample obtained from a human subject,
the step of determining whether the 39869066th base of chromosome 6 is GG or GC.

delete

4. The method of claim 1, further comprising the step of determining whether the gene sample obtained from the human subject corresponds to one or more of the following six cases:
The 59368167th base of chromosome 15 is AG or AA;
The 38248354th base of chromosome 17 is AA or AG;
The 137512698th base of chromosome 2 is TT or TG;
The 53610289th base of chromosome 17 is TT;
The 30713842nd base of chromosome 3 is TC or TT; and
The 112035458th base of chromosome 11 is GG.

4. The method of claim 1 or 3 for providing information for predicting the risk of disease onset, further comprising the step of examining one or more clinical variables selected from the group consisting of the type of disease of the recipient, the relationship between the donor and the recipient, and the form of transplantation.

A composition for screening for risk of graft versus host disease, comprising a probe capable of detecting the SNP located at the 39869066th base of human chromosome 6, a primer pair capable of amplifying the SNP, or both,
wherein the probe is an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 39869066th base of human chromosome 6, wherein the 39869066th base is GC or CC, and
wherein the primer pair consists of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs882559 of human chromosome 6.

delete

6. The composition of claim 5, wherein the composition further comprises one or more probes selected from the group consisting of:
an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 59368167th base of human chromosome 15, wherein the 59368167th base is AG or AA;
an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 38248354th base of human chromosome 17, wherein the 38248354th base is AG or AA;
an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 137512698th base of human chromosome 2, wherein the 137512698th base is TT or TG;
an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 53610289th base of human chromosome 17, wherein the 53610289th base is TT;
an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 30713842nd base of human chromosome 3, wherein the 30713842nd base is TC or TT; and
an oligonucleotide comprising a sequence complementary to a consecutive 5 to 100 bp base sequence comprising the 112035458th base of human chromosome 11.

6. The composition of claim 5, wherein the composition further comprises at least one primer pair selected from the group consisting of:
a primer pair consisting of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs7178935 of human chromosome 15;
a primer pair consisting of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs3744805 of human chromosome 17;
a primer pair consisting of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs7564005 of human chromosome 2;
a primer pair consisting of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs1876522 of human chromosome 17;
a primer pair consisting of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs2228048 of human chromosome 3; and
a primer pair consisting of two oligonucleotides each having a sequence complementary to a consecutive 5 to 50 bp nucleotide sequence at the 5' end and the 3' end, respectively, of a consecutive 50 to 20 kbp gene fragment comprising rs1946518 of human chromosome 11.

8. A kit for screening for the risk of graft-versus-host disease, comprising the composition for screening for the risk of graft-versus-host disease according to any one of claims 5, 8 and 9.