WO2021107567A1

WO2021107567A1 - Method and device for identifying genetic variation causative of recessive genetic disease by using ngs

Info

Publication number: WO2021107567A1
Application number: PCT/KR2020/016706
Authority: WO
Inventors: 이정설; 한주현; 박중영; 금창원
Original assignee: 주식회사 쓰리빌리언
Priority date: 2019-11-28
Filing date: 2020-11-24
Publication date: 2021-06-03
Also published as: KR102319447B1; KR20210066276A

Abstract

The present invention provides a device for identifying a genetic variation causative of a recessive genetic disease, the device comprising: a gene extraction part in which a nucleotide sequence of a subject sample is compared with a reference nucleotide sequence of a reference genome by next generation sequencing (NGS) to extract a gene having two or more genetic variations generated thereon; a read detection part in which a read matched with the extracted gene is detected from the subject sample; and a genetic variation identification part in which a genetic variation causative of the recessive genetic disease is identified using the read.

Description

Method and Apparatus for Determining Genetic Variation Causing Recessive Genetic Disease Using NGS

The present invention relates to a method and apparatus for determining a genetic mutation that causes a recessive genetic disease using next-generation sequencing (NGS).

Since the start of the Human Genome Project, many studies have been conducted on disease-causing genes, and sequencing technology for revealing the nucleotide sequences of genes continues to be developed. Recently, next-generation sequencing (NGS), which produces large amounts of short sequences quickly and at low cost, has been rapidly replacing the traditional Sanger sequencing method. With the development of NGS technology, the cost of generating a read (fragment sequence) has fallen to less than half of what it used to be, and its applicability in the field of disease diagnosis is also increasing.

Recently, next-generation sequencing (NGS) technology has been used successfully to find the causative genes of diseases such as Mendelian genetic diseases, rare diseases, and cancer.

However, although next-generation sequencing (NGS) has the advantage of obtaining a large amount of sequencing information at a relatively low cost, it is not easy to find the causative gene of a disease from the large amount of information.

The present invention relates to a method and apparatus for determining a genetic mutation causing a recessive genetic disease using reads in next-generation sequencing (NGS) technology.

Korean Patent Publication No. 10-1614471 discloses a method for diagnosing genetic abnormalities using reads, which is different from a method for determining genetic mutations that cause recessive genetic diseases.

The technical problem to be solved by the present invention is to provide a method and apparatus that determine the causative genetic mutation of a recessive genetic disease by using reads in a next-generation sequencing analysis. In addition, the present invention provides a method and apparatus that determine whether two genetic mutations detected in a read are cis-related mutations or trans-related mutations, in order to determine the causative genetic mutation of a recessive genetic disease.

In order to solve this problem, the method for determining a genetic variation causing a recessive genetic disease according to an embodiment of the present invention includes: a gene extraction step of extracting, through a gene extraction unit, a gene in which two or more genetic mutations have occurred by comparing the reference nucleotide sequence of a reference genome with the nucleotide sequence of a target sample in next-generation sequencing (NGS); a read detection step of detecting, through a read detection unit, a read of the target sample matching the extracted gene; and a genetic variation determination step of determining, through a genetic variation determining unit, a genetic variation causing a recessive genetic disease by using the read.

In the genetic variation determination step, the two genetic mutations detected in the read are classified as cis-related variants or trans-related variants, and when they are determined to be trans-related variants, they are determined to be the genetic mutation causing the recessive genetic disease. Cis-related variants are mutations in which the two genetic mutations are found on only one of the homologous chromosomes, and trans-related variants are mutations in which the two genetic mutations are found on both homologous chromosomes.

The read detection step may include: identifying two genetic mutations (v1, v2) and the positions (p1, p2) at which the genetic mutations exist in the extracted gene; and detecting the number of reads containing both of the two genetic mutation positions (p1, p2) (N), the number of reads containing v1 among the reads containing both positions (n1), the number of reads containing v2 among the reads containing both positions (n2), the number of reads containing both v1 and v2 among the reads containing both positions (c1), and the number of reads containing v1 but not v2 among the reads containing both positions (c2).

In the step of determining the genetic variation, when the score (v1, v2) calculated by the following [Equation 1] is equal to or greater than the reference value, the two genetic variations (v1, v2) may be determined as cis-related variants.

[Equation 1]

(where N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c1 is the number of reads containing both v1 and v2 among the reads containing both genetic mutation positions (p1, p2).)

In the genetic variation determination step, when the score (v1, v2) calculated by the following [Equation 2] is equal to or greater than the reference value, the two genetic variations (v1, v2) may be determined as trans-related variants.

[Equation 2]

(where N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c2 is the number of reads containing v1 but not v2 among the reads containing both genetic mutation positions (p1, p2).)

Among the reads, for a first read having two genetic variations (v1, v2) and a second read having two genetic variations (v2, v3): if v1 and v2 are cis-related variants and v2 and v3 are cis-related variants, v1 and v3 may be determined as cis-related variants; if v1 and v2 are cis-related variants and v2 and v3 are trans-related variants, v1 and v3 may be determined as trans-related variants; and if v1 and v2 are trans-related variants and v2 and v3 are trans-related variants, v1 and v3 may be determined as cis-related variants.

The extracted gene may be a gene causing a recessive genetic disease.

In order to solve this problem, the apparatus for determining a genetic mutation causing a recessive genetic disease according to an embodiment of the present invention includes: a gene extraction unit for extracting a gene in which two or more genetic mutations have occurred by comparing the reference nucleotide sequence of a reference genome with the nucleotide sequence of a target sample in next-generation sequencing (NGS); a read detection unit for detecting a read of the target sample matching the extracted gene; and a genetic variation determining unit for determining a genetic variation causing a recessive genetic disease by using the read.

The genetic variation determining unit is configured to distinguish whether the two genetic mutations detected in the read are cis-related variants or trans-related variants and, when they are determined to be trans-related variants, to determine them as the genetic mutation causing a recessive genetic disease. Cis-related variants are mutations in which the two genetic mutations are found on only one of the homologous chromosomes, and trans-related variants are mutations in which the two genetic mutations are found on both homologous chromosomes.

The read detection unit identifies two genetic mutations (v1, v2) and the positions (p1, p2) at which the genetic mutations exist in the extracted gene, and can detect the number of reads containing both of the two genetic mutation positions (p1, p2) (N), the number of reads containing v1 among the reads containing both positions (n1), the number of reads containing v2 among the reads containing both positions (n2), the number of reads containing both v1 and v2 among the reads containing both positions (c1), and the number of reads containing v1 but not v2 among the reads containing both positions (c2).

The genetic variation determining unit may determine the two genetic variations (v1, v2) as cis-related variants when the score (v1, v2) calculated by the following [Equation 1] is equal to or greater than a reference value.

[Equation 1]

If the score (v1, v2) calculated by the following [Equation 2] is equal to or greater than the reference value, the genetic variation determining unit may determine the two genetic variations (v1, v2) as trans-related variants.

[Equation 2]

The extracted gene may be a gene causing a recessive genetic disease.

In addition to the technical problems of the present invention mentioned above, other features and advantages of the present invention will be described below, or will be clearly understood by those skilled in the art from that description.

According to the present invention as described above, there are the following effects.

The present invention can determine the causative genetic mutation of a recessive genetic disease by using reads in next-generation sequencing (NGS).

The present invention can significantly reduce the time and effort for determining the genetic mutation that causes recessive genetic disease by discriminating whether two genetic mutations detected in the read of a target sample are cis-related mutations or trans-related mutations.

The present invention can significantly reduce the time and effort for determining the genetic mutation that causes a recessive genetic disease by determining whether two genetic mutations separated by more than a read length, for which statistical significance cannot be obtained directly, are trans-related or cis-related mutations.

In addition, other features and advantages of the present invention may be newly recognized through embodiments of the present invention.

FIG. 1 is a block diagram illustrating a schematic configuration of an apparatus for determining a genetic variation causing a recessive genetic disease according to an embodiment of the present invention.

FIG. 2 is a diagram for explaining a case in which two genetic mutations are trans-related mutations, according to an embodiment of the present invention.

FIG. 3 is a diagram for explaining a case in which two genetic mutations are cis-related mutations, according to an embodiment of the present invention.

FIG. 4 is a diagram for explaining a case in which two genetic mutations separated by a read length or more are trans-related mutations, according to an embodiment of the present invention.

FIG. 5 is a diagram for explaining a case in which two genetic mutations separated by a read length or more are cis-related mutations, according to an embodiment of the present invention.

FIG. 6 is a schematic flowchart for explaining a method for determining a genetic mutation causing a recessive genetic disease according to an embodiment of the present invention.

FIG. 7 is a flowchart for explaining a method for determining a genetic mutation causing a recessive genetic disease according to an embodiment of the present invention.

FIG. 8 is a view showing the analysis result of directly confirming reads containing a genetic mutation, according to an embodiment of the present invention.

It should be noted that in the present specification, in adding reference numbers to the components of each drawing, the same numbers are used for the same components, even if they are indicated on different drawings, as much as possible.

On the other hand, the meaning of the terms described in this specification should be understood as follows.

The singular expression is to be understood as including the plural expression unless the context clearly defines otherwise. Terms such as "first" and "second" are used to distinguish one element from another, and the scope of rights should not be limited by these terms.

It should be understood that terms such as “comprise” or “have” do not preclude the possibility of addition or existence of one or more other features or numbers, steps, operations, components, parts, or combinations thereof.

In addition, for clarity of interpretation of the present specification, terms used in the present specification will be defined below.

As used herein, the term “next-generation sequencing” refers to a genome sequencing technique that analyzes a nucleotide sequence at high speed by processing DNA fragments in parallel. Due to these characteristics, next-generation sequencing may be referred to as high-throughput sequencing, massively parallel sequencing, or second-generation sequencing. In addition, next-generation sequencing can be performed on a variety of analysis platforms depending on the purpose. For example, analysis platforms for next-generation sequencing include Roche 454, GS FLX Titanium, Illumina MiSeq, Illumina HiSeq, Illumina Genome Analyzer IIX, Life Technologies SOLiD4, Life Technologies Ion Proton, Complete Genomics, Helicos Biosciences Heliscope, Pacific Biosciences SMRT, and the like. Furthermore, next-generation sequencing technology can be used to detect mutations in nucleotide sequences (genetic mutations). Preferred analysis platforms for detecting sequence mutations may be Illumina hybridcapture, Illumina Amplicon, and IonTorrent Amplicon, but are not limited thereto.

As used herein, the term "genetic variation" may refer to a mutation in a nucleotide sequence occurring in a chromosome due to various factors. For example, the genetic mutation may be a somatic mutation, a mutation in a nucleotide sequence due to contamination of a sample, or a mutation in a nucleotide sequence due to a genetic disease. The genetic mutation may further include a mutation that appears because of fetal DNA present in a small amount together with maternal DNA in the mother's blood, a mutation present in small amounts in brain cells, and a mutation in the nucleotide sequence due to alleles. However, the genetic variation is not limited to the above.

As used herein, the term "target sample" may be a biological sample obtained from a patient to confirm a genetic variation, and the term "reference genome" as used herein may be, in contrast to the target sample, a normal biological sample that does not show any genetic mutations. A preferred target sample may be a tumor cell associated with a somatic mutation, and a preferred reference genome may be reference data sequenced in advance from normal cells, but they are not limited thereto. For example, a reference genome may be variously selected according to the target sample, and its nucleotide sequence may be analyzed together with the nucleotide sequence of the target sample.

As used herein, the term “reads” refers to short nucleotide sequence data output from a genome sequencer. A read is generally about 35 to 500 bp (base pairs) long depending on the type of genome sequencer, and DNA bases are generally expressed by the letters A, C, G, and T.

FIG. 1 is a block diagram illustrating a schematic configuration of an apparatus for determining a genetic variation causing a recessive genetic disease according to an embodiment of the present invention, FIG. 2 is a diagram for explaining a case in which two genetic mutations are trans-related mutations according to an embodiment of the present invention, and FIG. 3 is a diagram for explaining a case in which two genetic mutations are cis-related mutations according to an embodiment of the present invention.

Referring to FIGS. 1 to 3, the apparatus 1000 for determining a genetic mutation causing a recessive genetic disease according to an embodiment of the present invention includes a gene extraction unit 100, a read detection unit 300, and a genetic variation determining unit 500.

The gene extraction unit 100 may extract the gene (G) in which two or more genetic mutations have occurred by comparing the reference nucleotide sequence of the reference genome with the nucleotide sequence of the target sample in next-generation sequencing (NGS). In this case, the gene (G) may be a gene that causes a recessive genetic disease to be described later.
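As an illustration only, the selection performed by the gene extraction unit can be sketched as grouping variant calls by gene and keeping the genes that carry two or more of them. The tuple layout, the function name `genes_with_multiple_variants`, and the gene labels are assumptions made for this sketch and do not come from the specification; the comparison against the reference genome is assumed to have already produced the variant calls.

```python
from collections import defaultdict


def genes_with_multiple_variants(variants, min_variants=2):
    """Group variant calls by gene and keep genes with at least `min_variants`.

    `variants` is an iterable of (gene, position, ref, alt) tuples. This
    layout is hypothetical and chosen only for the example.
    """
    by_gene = defaultdict(list)
    for gene, position, ref, alt in variants:
        by_gene[gene].append((position, ref, alt))
    return {
        gene: sorted(calls)
        for gene, calls in by_gene.items()
        if len(calls) >= min_variants
    }


# Hypothetical variant calls; gene names are placeholders.
calls = [
    ("GENE_A", 979690, "G", "A"),
    ("GENE_A", 979835, "G", "A"),
    ("GENE_B", 1200345, "C", "T"),
]
selected = genes_with_multiple_variants(calls)
```

Only `GENE_A` is kept in this example, because it is the only gene with two variant calls.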

The read detection unit 300 may prepare a read of a target sample and detect a read (R) of the target sample that matches the gene (G) in which two or more genetic mutations have occurred.

The read detection unit 300 confirms the two genetic mutations (v1, v2) and the positions (p1, p2) at which the genetic mutations exist in the extracted gene (G), and can then detect reads that cover at least one of the two genetic mutation positions (p1, p2).

In addition, the read detection unit 300 can detect the number of reads (N) containing both of the two genetic mutation positions (p1, p2), the number of reads (n1) containing the genetic mutation v1 among the reads containing both positions, the number of reads (n2) containing the genetic mutation v2 among the reads containing both positions, the number of reads (c1) containing both genetic mutations v1 and v2 among the reads containing both positions, and the number of reads (c2) containing the genetic mutation v1 but not the genetic mutation v2 among the reads containing both positions.
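The five counts can be sketched as a single pass over the reads. This is a minimal sketch that assumes each read has already been reduced to a mapping from genomic position to the observed base; that representation, the `PairCounts` class, and the `count_reads` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class PairCounts:
    N: int   # reads covering both p1 and p2
    n1: int  # of those, reads carrying v1
    n2: int  # of those, reads carrying v2
    c1: int  # of those, reads carrying both v1 and v2
    c2: int  # of those, reads carrying v1 but not v2


def count_reads(reads, p1, alt1, p2, alt2):
    """Count N, n1, n2, c1, c2 over reads given as {position: base} dicts."""
    N = n1 = n2 = c1 = c2 = 0
    for bases in reads:
        if p1 not in bases or p2 not in bases:
            continue  # only reads spanning both positions are counted
        N += 1
        has_v1 = bases[p1] == alt1
        has_v2 = bases[p2] == alt2
        n1 += int(has_v1)
        n2 += int(has_v2)
        c1 += int(has_v1 and has_v2)
        c2 += int(has_v1 and not has_v2)
    return PairCounts(N, n1, n2, c1, c2)


# Synthetic reads: two carry only v1, two carry only v2, one misses p2.
reads = [
    {979690: "A", 979835: "G"},
    {979690: "A", 979835: "G"},
    {979690: "G", 979835: "A"},
    {979690: "G", 979835: "A"},
    {979690: "A"},
]
counts = count_reads(reads, 979690, "A", 979835, "A")
```

In this synthetic example no read carries both mutations (c1 = 0) and every read carrying v1 lacks v2 (c2 = n1), which is the pattern expected when the two mutations lie on different homologous chromosomes.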

In humans, when a disease occurs only if a gene known to cause a hereditary disease is mutated on both homologous chromosomes, that is, on both the chromosome from the father and the chromosome from the mother, the disease is called a recessive genetic disease. When a disease occurs if the gene is mutated on either one of them, the disease is called a dominant genetic disease.

Referring to FIG. 2, when one of the two genetic mutations (v1, v2) occurring at different positions (p1, p2) of the same gene (G) comes from the father (a) and the other comes from the mother (b), and a disease is caused by the gene (G), the gene (G) is said to be recessive, and the disease is a recessive disease.

On the other hand, referring to FIG. 3, when both genetic mutations (v1, v2) occurring at different positions (p1, p2) of the same gene (G) come from only one of the father (a) and the mother (b), no disease occurs if the gene (G) is recessive.

Here, mutations in which the two genetic mutations are found on only one of the homologous chromosomes are called cis-related variants, and mutations in which the two genetic mutations are found on both homologous chromosomes, one on each, are called trans-related variants.
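The two definitions can be illustrated with a toy model in which the content of each homologous chromosome is already known. This is only a sketch of the definitions: in practice the parental origin is not known from the reads, and the function name `classify_pair` and the set-based representation are assumptions made for the example.

```python
def classify_pair(paternal, maternal, v1, v2):
    """Classify two mutations given the set of mutations on each chromosome.

    Returns "cis" when both mutations sit on the same chromosome only,
    "trans" when they sit on different chromosomes only, and
    "undetermined" otherwise (for example when a mutation is missing or
    present on both chromosomes).
    """
    same_copy = {v1, v2} <= paternal or {v1, v2} <= maternal
    split = (v1 in paternal and v2 in maternal) or (
        v1 in maternal and v2 in paternal
    )
    if same_copy and not split:
        return "cis"
    if split and not same_copy:
        return "trans"
    return "undetermined"


# v1 inherited from the father, v2 from the mother: both gene copies are hit.
trans_case = classify_pair({"v1"}, {"v2"}, "v1", "v2")
# Both mutations on the paternal copy: the maternal copy is intact.
cis_case = classify_pair({"v1", "v2"}, set(), "v1", "v2")
```

For a recessive gene, only the trans case leaves no intact copy of the gene, which is why trans-related variants are the ones of interest.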

In other words, when the gene carrying the genetic mutations found in the patient causes disease in a recessive manner, and the genetic mutations found in the patient were inherited from both parents (trans-related mutations), these genetic mutations can be candidates for the causative genetic variation of the patient's recessive genetic disease.

On the other hand, when the gene carrying the genetic mutations found in the patient causes disease in a recessive manner, and the genetic mutations found in the patient come from only one of the father and the mother (cis-related mutations), these genetic mutations are excluded from the candidates for the causative genetic mutation.

As such, finding out whether two or more mutations occurring at different positions (p1, p2) of the same gene are cis-related variants or trans-related variants can be important information for determining whether they are the causative genetic mutation of a recessive genetic disease.

On the other hand, for each read (R) detected in next-generation sequencing (NGS), it is not known whether the read came from the chromosome (a) received from the father or the chromosome (b) received from the mother, so it is not known from a single read (R) alone whether the genetic mutations present in it are cis-related mutations or trans-related mutations.

The genetic variation determining unit 500 may determine the genetic variation causing the recessive genetic disease by using the read (R).

The genetic variation determining unit 500 distinguishes whether the two genetic mutations detected in the read (R) are cis-related mutations or trans-related mutations and, when they are determined to be trans-related mutations, can determine the two genetic mutations as genetic mutations causing a recessive genetic disease.

Here, the two genetic variants (v1, v2) detected in the read (R) can be determined as cis-related variants if the score (v1, v2) calculated by the following [Equation 1] is greater than or equal to the reference value.

[Equation 1]

Here, N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c1 is the number of reads containing both v1 and v2 among the reads containing both genetic mutation positions (p1, p2).

Theoretically, in the case of a cis-related mutation, if a read includes both p1 and p2, v1 and v2 should appear together on that read, so n1 and n2 should be the same. However, because n1 and n2 are not completely the same for experimental or biological reasons, how similar n1 and n2 are can be assessed with statistical significance (Fisher's exact p-value).
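The role of Fisher's exact test can be illustrated with a small self-contained sketch. It does not reproduce [Equation 1] or [Equation 2]; it assumes, purely for illustration, a 2x2 table over the N reads spanning both positions (v1 present or absent against v2 present or absent) and one-sided tests, where co-occurrence above chance suggests cis and co-occurrence below chance suggests trans. The function names and the table layout are assumptions of this sketch.

```python
from math import comb


def fisher_one_sided(a, b, c, d, alternative):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    "greater" tests for more co-occurrence (cell a) than expected by chance,
    "less" tests for fewer. Uses the hypergeometric distribution directly.
    """
    row1, col1, total = a + b, a + c, a + b + c + d

    def pmf(k):
        return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)

    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    if alternative == "greater":
        ks = range(a, highest + 1)
    elif alternative == "less":
        ks = range(lowest, a + 1)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return min(1.0, sum(pmf(k) for k in ks))


def cis_trans_p_values(N, n1, n2, c1):
    """Build the assumed 2x2 table from the read counts and test both ways."""
    a = c1                  # reads with v1 and v2
    b = n1 - c1             # reads with v1 only (this equals c2)
    c = n2 - c1             # reads with v2 only
    d = N - n1 - n2 + c1    # reads with neither mutation
    if min(a, b, c, d) < 0:
        raise ValueError("inconsistent read counts")
    return {
        "cis": fisher_one_sided(a, b, c, d, "greater"),
        "trans": fisher_one_sided(a, b, c, d, "less"),
    }


# Synthetic counts: every v1 read also carries v2 (cis-like pattern).
cis_like = cis_trans_p_values(N=20, n1=10, n2=10, c1=10)
# Synthetic counts: no read carries both mutations (trans-like pattern).
trans_like = cis_trans_p_values(N=20, n1=10, n2=10, c1=0)
```

A small p-value in one direction could then be converted into a score and compared with a reference value; how the specification maps the counts to its score is defined by its own equations, not by this sketch.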

In addition, the two genetic variants (v1, v2) detected in the read (R) can be determined as trans-related variants if the score (v1, v2) calculated by the following [Equation 2] is greater than or equal to the reference value.

[Equation 2]

Here, N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c2 is the number of reads containing v1 but not v2 among the reads containing both genetic mutation positions (p1, p2).

As described above, the apparatus 1000 for determining the genetic mutation causing recessive genetic disease according to an embodiment of the present invention may determine the causative genetic mutation causing the recessive genetic disease by using the read in next-generation sequencing (NGS).

In addition, although genetic testing of both the patient's father and mother can be performed to distinguish whether two genetic mutations are cis-related mutations or trans-related mutations, the apparatus 1000 for determining genetic mutations causing recessive genetic diseases according to an embodiment of the present invention can significantly reduce the time and effort for determining the causative genetic mutation of a recessive genetic disease by distinguishing whether the two genetic mutations detected in the reads of the target sample are cis-related mutations or trans-related mutations.

FIG. 4 is a diagram for explaining a case in which two genetic mutations separated by more than a read length are trans-related mutations, according to an embodiment of the present invention, and FIG. 5 is a diagram for explaining a case in which two genetic mutations separated by more than a read length are cis-related mutations, according to an embodiment of the present invention.

The apparatus for determining a genetic mutation causing a recessive genetic disease according to another embodiment of the present invention may determine whether two genetic mutations separated by a read length or more are a trans-related mutation or a cis-related mutation.

Since the length of a DNA sequence that can be read at a time in next-generation sequencing (NGS) is limited, the read length is inevitably limited to about 150 bp.

That is, for two genetic mutations separated by more than a read length, it is impossible to determine with statistical significance whether they are trans-related or cis-related mutations, because no single read covers both positions.

The present invention proposes a method for determining whether two genetic mutations separated by more than a read length are trans-related or cis-related mutations by using a mutation located between them.

Referring to FIG. 4, if v1 and v2 are trans-related mutations and v2 and v3 are trans-related mutations, v1 and v3 are cis-related mutations.

Since the two chromosomes are a pair of homologous chromosomes, if v1 and v2 are on different chromosomes and v2 and v3 are on different chromosomes, v1 and v3 must be on the same chromosome, so they are cis-related mutations.

Referring to FIG. 5, when v1 and v2 are cis-related mutations and v2 and v3 are trans-related mutations, v1 and v3 are trans-related mutations.

Since the two chromosomes are a pair of homologous chromosomes, if v1 and v2 are on the same chromosome and v2 and v3 are on different chromosomes, v1 and v3 must be on different chromosomes, so they are trans-related mutations.

Although not shown, if v1 and v2 are cis-related mutations and v2 and v3 are cis-related mutations, v1 and v3 are cis-related mutations.
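The three cases above follow one rule: the relation between v1 and v3 is cis when the two intermediate relations agree and trans when they differ. A minimal sketch of that rule follows; the function names are hypothetical, and the extension to a chain of more than one intermediate mutation is an assumption of this sketch that follows from the same two-chromosome argument rather than something stated in the text.

```python
from functools import reduce

VALID = ("cis", "trans")


def combine(rel_12, rel_23):
    """Relation of v1 and v3 given the relations (v1, v2) and (v2, v3)."""
    if rel_12 not in VALID or rel_23 not in VALID:
        raise ValueError("relations must be 'cis' or 'trans'")
    # Same relation twice keeps v1 and v3 on the same chromosome; a single
    # trans step moves to the other homologous chromosome.
    return "cis" if rel_12 == rel_23 else "trans"


def combine_chain(relations):
    """Fold a chain of adjacent pairwise relations into one relation.

    Assumed generalisation: applies `combine` repeatedly along the chain.
    """
    return reduce(combine, relations)


fig4_case = combine("trans", "trans")  # v1 and v3 on the same chromosome
fig5_case = combine("cis", "trans")    # v1 and v3 on different chromosomes
unshown_case = combine("cis", "cis")
```

Each trans step switches chromosome, so an even number of trans steps along a chain yields cis and an odd number yields trans.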

As such, the apparatus for determining a genetic mutation causing a recessive genetic disease according to another embodiment of the present invention determines whether two genetic mutations separated by more than a read length, for which statistical significance cannot be obtained directly, are trans-related or cis-related mutations, so that the time and effort for identifying the genetic mutation causing a recessive genetic disease can be greatly reduced.

Hereinafter, a method for determining a genetic mutation causing a recessive genetic disease according to an embodiment of the present invention will be described with reference to FIGS. 6 to 8 .

FIG. 6 is a schematic flowchart for explaining a method for determining a genetic variation causing a recessive genetic disease according to an embodiment of the present invention, FIG. 7 is a flowchart for explaining a method for determining a genetic variation causing a recessive genetic disease according to an embodiment of the present invention, and FIG. 8 is a view showing an analysis result of directly confirming reads containing a genetic mutation, according to an embodiment of the present invention.

The method for determining a genetic variation causing a recessive genetic disease according to an embodiment of the present invention includes a gene extraction step (S100), a read detection step (S300), and a genetic variation determination step (S500).

In the gene extraction step (S100), the reference nucleotide sequence of the reference genome is compared with the nucleotide sequence of the target sample in next-generation sequencing (NGS) through the gene extraction unit (S110), and a gene in which two or more genetic mutations have occurred can then be extracted (S130).

Next, in the read detection step (S300), two genetic mutations (v1, v2) and the positions (p1, p2) at which the genetic mutations exist in the extracted gene are confirmed through the read detection unit (S310), and reads covering at least one of the two genetic mutation positions (p1, p2) are detected (S330).

Here, the step of detecting the number of reads (S330) detects the number of reads (N) containing both of the two genetic mutation positions (p1, p2), the number of reads (n1) containing v1 among the reads containing both positions, the number of reads (n2) containing v2 among the reads containing both positions, the number of reads (c1) containing both v1 and v2 among the reads containing both positions, and the number of reads (c2) containing v1 but not v2 among the reads containing both positions.

Referring to FIG. 8, the data show, read by read, a genetic mutation in which the guanine (G) base at position 979690 of chromosome 1 of the target sample is changed to an adenine (A) base, and a genetic mutation in which the guanine (G) base at position 979835 of chromosome 1 of the target sample is changed to an adenine (A) base.

Based on the two genetic mutations included in the gene extracted in the read detection step S300, the number of reads can be detected by displaying normal (O) and genetic mutation (X) in each read.
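The O/X marking can be sketched as follows, using the two positions named in the text. This is a minimal sketch under simplifying assumptions: each read is given as a 1-based alignment start plus its sequence, alignments contain no insertions or deletions, and any base other than the mutant base is marked as normal. The function name `mark_read`, the "-" symbol for an uncovered position, and the synthetic read sequences are all hypothetical.

```python
def mark_read(start, sequence, sites):
    """Mark each site as 'X' (mutation), 'O' (normal) or '-' (not covered).

    `start` is the 1-based reference position of the first base of the
    read; `sites` is a list of (position, mutant_base) pairs.
    """
    marks = []
    for position, mutant_base in sites:
        offset = position - start
        if 0 <= offset < len(sequence):
            marks.append("X" if sequence[offset] == mutant_base else "O")
        else:
            marks.append("-")
    return "".join(marks)


sites = [(979690, "A"), (979835, "A")]

# Synthetic read covering both positions: A at 979690, G at 979835.
read_1 = (979680, "C" * 10 + "A" + "C" * 144 + "G" + "C" * 5)
# Synthetic read starting after 979690: covers only 979835, where it has A.
read_2 = (979700, "C" * 135 + "A" + "C" * 10)

marks_1 = mark_read(*read_1, sites)
marks_2 = mark_read(*read_2, sites)
```

Reads whose marks contain no "-" are the ones that contribute to N; the combination of marks on each such read determines whether it is counted in n1, n2, c1, or c2.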

Next, in the genetic variation determination step (S500), the genetic variation determining unit distinguishes whether the two genetic variations (v1, v2) occurring at different positions (p1, p2) detected in the read are cis-related variations or trans-related variations (S510), and determines trans-related variations as the genetic mutation causing a recessive genetic disease (S530).

Two genetic mutations (v1, v2) occurring at different positions (p1, p2) detected in the read can be determined as cis-related mutations by [Equation 1] described above, and trans-related mutations by [Equation 2] can be decided.

Meanwhile, the above-described embodiments of the present invention can be written as a program that can be executed on a computer, and can be implemented in a general-purpose digital computer that runs the program using a computer-readable recording medium. The computer-readable recording medium includes storage media such as magnetic storage media (e.g., a ROM, a floppy disk, a hard disk, etc.) and optically readable media (e.g., a CD-ROM, a DVD, etc.).

The present invention described above is not limited to the above-described embodiments and the accompanying drawings, and it will be clear to those of ordinary skill in the technical field to which the present invention pertains that various substitutions, modifications, and changes are possible within the scope of the present invention.

Claims

1. A method for determining a genetic variation causing a recessive genetic disease, the method comprising:

a gene extraction step of extracting, through a gene extraction unit, a gene in which two or more genetic mutations have occurred, by comparing the nucleotide sequence of a target sample with the reference nucleotide sequence of a reference genome using next-generation sequencing (NGS);

a read detection step of detecting, through a read detection unit, reads of the target sample that match the extracted gene; and

a genetic variation determination step of determining, through a genetic variation determination unit, the genetic variation causing the recessive genetic disease by using the reads.
2. The method of claim 1,

wherein, in the genetic variation determination step,

it is determined whether the two genetic mutations detected in the reads are cis-related variants or trans-related variants, and when they are determined to be trans-related variants, they are determined to be genetic mutations causing the recessive genetic disease, and

wherein the cis-related variants are variants in which the two genetic mutations are found on only one of the homologous chromosomes, and the trans-related variants are variants in which the two genetic mutations are found on both homologous chromosomes.
3. The method of claim 2,

wherein the read detection step comprises:

identifying the two genetic mutations (v1, v2) in the extracted gene and the positions (p1, p2) at which the genetic mutations exist; and

detecting the number of reads containing both of the two genetic mutation positions (p1, p2) (N), the number of reads containing v1 among the reads containing both positions (p1, p2) (n1), the number of reads containing v2 among the reads containing both positions (p1, p2) (n2), the number of reads containing both v1 and v2 among the reads containing both positions (p1, p2) (c1), and the number of reads containing v1 but not v2 among the reads containing both positions (p1, p2) (c2).
4. The method of claim 3,

wherein, in the genetic variation determination step,

the two genetic variants (v1, v2) are determined to be cis-related variants when the score (v1, v2) calculated by the following [Equation 1] is greater than or equal to a reference value.

[Equation 1]

(where N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c1 is the number of reads containing both v1 and v2 among the reads containing both genetic mutation positions (p1, p2).)
5. The method of claim 3,

wherein, in the genetic variation determination step,

the two genetic variants (v1, v2) are determined to be trans-related variants when the score (v1, v2) calculated by the following [Equation 2] is greater than or equal to a reference value.

[Equation 2]

(where N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c2 is the number of reads containing v1 but not v2 among the reads containing both genetic mutation positions (p1, p2).)
6. The method of claim 2,

wherein, among the reads, for a first read having two genetic variations (v1, v2) and a second read having two genetic variations (v2, v3),

when the genetic variation (v1) and the genetic variation (v2) are cis-related variants, and the genetic variation (v2) and the genetic variation (v3) are cis-related variants, the genetic variation (v1) and the genetic variation (v3) are determined to be cis-related variants,

when the genetic variation (v1) and the genetic variation (v2) are cis-related variants, and the genetic variation (v2) and the genetic variation (v3) are trans-related variants, the genetic variation (v1) and the genetic variation (v3) are determined to be trans-related variants, and

when the genetic variation (v1) and the genetic variation (v2) are trans-related variants, and the genetic variation (v2) and the genetic variation (v3) are trans-related variants, the genetic variation (v1) and the genetic variation (v3) are determined to be cis-related variants.
7. The method of claim 1,

wherein the extracted gene is a gene that causes the recessive genetic disease.
8. A device for determining a genetic variation causing a recessive genetic disease, the device comprising:

a gene extraction unit that compares the nucleotide sequence of a target sample with the reference nucleotide sequence of a reference genome using next-generation sequencing (NGS), and extracts a gene in which two or more genetic mutations have occurred;

a read detection unit that detects reads of the target sample that match the extracted gene; and

a genetic variation determination unit that determines the genetic variation causing the recessive genetic disease by using the reads.
9. The device of claim 8,

wherein the genetic variation determination unit

determines whether the two genetic mutations detected in the reads are cis-related variants or trans-related variants, and when they are determined to be trans-related variants, determines them to be genetic mutations causing the recessive genetic disease, and

wherein the cis-related variants are variants in which the two genetic mutations are found on only one of the homologous chromosomes, and the trans-related variants are variants in which the two genetic mutations are found on both homologous chromosomes.
10. The device of claim 9,

wherein the read detection unit

identifies the two genetic mutations (v1, v2) in the extracted gene and the positions (p1, p2) at which the genetic mutations exist, and

detects the number of reads containing both of the two genetic mutation positions (p1, p2) (N), the number of reads containing v1 among the reads containing both positions (p1, p2) (n1), the number of reads containing v2 among the reads containing both positions (p1, p2) (n2), the number of reads containing both v1 and v2 among the reads containing both positions (p1, p2) (c1), and the number of reads containing v1 but not v2 among the reads containing both positions (p1, p2) (c2).
11. The device of claim 10,

wherein the genetic variation determination unit

determines the two genetic variants (v1, v2) to be cis-related variants when the score (v1, v2) calculated by the following [Equation 1] is greater than or equal to a reference value.

[Equation 1]

(where N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c1 is the number of reads containing both v1 and v2 among the reads containing both genetic mutation positions (p1, p2).)
12. The device of claim 10,

wherein the genetic variation determination unit

determines the two genetic variants (v1, v2) to be trans-related variants when the score (v1, v2) calculated by the following [Equation 2] is greater than or equal to a reference value.

[Equation 2]

(where N is the number of reads containing both genetic mutation positions (p1, p2), n1 is the number of reads containing v1 among the reads containing both genetic mutation positions (p1, p2), n2 is the number of reads containing v2 among the reads containing both genetic mutation positions (p1, p2), and c2 is the number of reads containing v1 but not v2 among the reads containing both genetic mutation positions (p1, p2).)
13. The device of claim 9,

wherein, among the reads, for a first read having two genetic variations (v1, v2) and a second read having two genetic variations (v2, v3),

when the genetic variation (v1) and the genetic variation (v2) are cis-related variants, and the genetic variation (v2) and the genetic variation (v3) are cis-related variants, the genetic variation (v1) and the genetic variation (v3) are determined to be cis-related variants,

when the genetic variation (v1) and the genetic variation (v2) are cis-related variants, and the genetic variation (v2) and the genetic variation (v3) are trans-related variants, the genetic variation (v1) and the genetic variation (v3) are determined to be trans-related variants, and

when the genetic variation (v1) and the genetic variation (v2) are trans-related variants, and the genetic variation (v2) and the genetic variation (v3) are trans-related variants, the genetic variation (v1) and the genetic variation (v3) are determined to be cis-related variants.
14. The device of claim 8,

wherein the extracted gene is a gene that causes the recessive genetic disease.
15. A computer-readable recording medium on which a program for executing the method of any one of claims 1 to 7 on a computer is recorded.