JP7141039B2

JP7141039B2 - DNA mutation screening device, DNA mutation screening system, DNA mutation screening method, and program

Info

Publication number: JP7141039B2
Application number: JP2018164050A
Authority: JP
Inventors: 淑希恒元; 隆史山内; 大介越智; 聡檜山; 正朗長▲崎▼; 準一菅原; 直子峯岸
Original assignee: Tohoku University NUC; NTT Docomo Inc
Current assignee: Tohoku University NUC; NTT Docomo Inc
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2022-09-22
Anticipated expiration: 2038-08-31
Also published as: JP2020038413A

Description

The present invention relates to a DNA mutation narrowing-down device, a DNA mutation narrowing-down system, a DNA mutation narrowing-down method, and a program.

Techniques are known that use DNA (deoxyribonucleic acid) mutation data from a plurality of subjects to investigate the association between DNA mutations and specific diseases or quantitative traits (see, for example, Patent Document 1).

As a specific research example, a genome-wide association analysis was carried out using the DNA information of about 160,000 Japanese individuals and was combined in a meta-analysis with a study of 320,000 Westerners, identifying 193 genomic regions thought to affect body weight (see, for example, Non-Patent Document 1).

In another example, exome analysis was performed using the genetic information of 50 people with hypertensive disorders of pregnancy (hereinafter HDP) and 50 people without the condition. A difference in gene expression levels was seen between the two groups, but it was not statistically significant (see, for example, Non-Patent Document 2).

Patent Document 1: JP 2018-49472 A

Non-Patent Document 1: Akiyama M, et al., "Genome-wide association study identifies 112 loci for body mass index in the Japanese population.", Nature Genetics, doi: 10.1038/ng.3951

Non-Patent Document 2: Hansen, Anette Tarp, et al., "The Genetic Component of Preeclampsia: A Whole-Exome Sequencing Study.", PLoS One, vol. 13, no. 5, 2018, doi:10.1371

For example, when a genome-wide association analysis is performed as in Non-Patent Document 1, DNA mutations associated with a given disease must be identified from among about 3 billion base pairs, so a very large number of DNA samples (for example, thousands to hundreds of thousands) is required.

However, a sufficient number of samples often cannot be obtained, for reasons such as a research institution's lack of funding or the low prevalence of the disease in question. In such cases, as Non-Patent Document 2 shows, it may not be possible to sufficiently narrow down the biomarkers associated with the disease.

In addition, for some diseases the definitions of the affected group and the non-affected group are ambiguous. For example, diagnostic guidelines exist for HDP, but in practice blood pressure values resembling HDP may appear even in subjects who are not classified as having HDP. In such cases the genetic backgrounds of the affected and non-affected groups may not differ sufficiently, and DNA mutations associated with the disease may not be sufficiently narrowed down.

One conceivable way to address these problems, in addition to association analysis of the differences in DNA sequence between the affected and non-affected groups, is association analysis using quantitative traits: quantitative traits related to the disease, such as body weight, BMI, blood pressure, and body temperature, are used to evaluate the effect of the DNA sequence on the magnitude of the trait. However, when association analysis uses quantitative traits measured at, for example, a hospital visit, the timing of the visit differs from patient to patient. When the quantitative traits of the affected and non-affected groups are compared, the effect of visit timing on the trait cannot be removed, and DNA mutations associated with the disease may not be sufficiently narrowed down.

Thus, with conventional techniques it has been difficult to narrow down DNA mutations associated with a given disease when, for example, few DNA samples are available or the definition of the disease is ambiguous. Even when association analysis is performed using quantitative traits, narrowing down DNA mutations associated with the disease can be expected to remain difficult.

Embodiments of the present invention have been made in view of the above problems, and provide a DNA mutation narrowing-down device that can easily narrow down DNA mutations associated with a given disease when association analysis is performed using quantitative traits related to that disease.

To solve the above problems, a DNA mutation narrowing-down device according to one embodiment of the present invention includes: a feature amount extraction unit that fits continuous measured values of a quantitative trait related to a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, to a predetermined polynomial, and extracts a feature amount for each subject using the coefficients of the polynomial, the intercept of the polynomial, or both; an acquisition unit that acquires information on DNA mutations in the plurality of subjects; an analysis unit that analyzes the association between the feature amount and the DNA mutations; and a DNA mutation extraction unit that extracts DNA mutations associated with the predetermined disease based on the analysis results of the analysis unit. The analysis unit performs association analysis with the feature amount as the objective variable and the DNA mutations as explanatory variables, and calculates the value of a statistical index indicating the degree of association between the feature amount and each DNA mutation. The DNA mutation extraction unit extracts the DNA mutations for which the value of the statistical index calculated by the analysis unit is at or above, or at or below, a predetermined reference value.

Here, quantitative traits related to a given disease include traits that change continuously and quantitatively, such as the subject's blood pressure, body weight, BMI (Body Mass Index), pulse, heart rate, body fat percentage, amount of activity, calorie consumption, sleep time, and body temperature. Continuous measured values of a quantitative trait are values obtained by measuring and recording the trait continuously over a predetermined period.

Using continuous quantitative traits related to a disease in this way provides more information about the disease than the binary information of affected versus non-affected.

According to one embodiment of the present invention, it is possible to provide a DNA mutation narrowing-down device that can easily narrow down DNA mutations associated with a given disease when association analysis is performed using quantitative traits related to that disease.

FIG. 1 is a diagram (1) showing a configuration example of a DNA mutation narrowing-down device according to one embodiment.
FIG. 2 is a diagram (2) showing a configuration example of the DNA mutation narrowing-down device according to one embodiment.
FIG. 3 is a diagram showing an example of the hardware configuration of the DNA mutation narrowing-down device according to one embodiment.
FIG. 4 is a flowchart showing an overview of the DNA mutation narrowing-down process according to one embodiment.
FIG. 5 is a flowchart showing an example of the feature amount extraction process according to one embodiment.
FIG. 6 is a diagram for explaining an example of a feature amount according to one embodiment.
FIG. 7 is a flowchart showing an example of the DNA mutation narrowing-down process of a first specific example according to one embodiment.
FIG. 8 is a flowchart showing an application example of the DNA mutation narrowing-down process of the first specific example according to one embodiment.
FIG. 9 is a flowchart showing an example of the DNA mutation narrowing-down process of a second specific example according to one embodiment.
FIG. 10 is a flowchart showing an example of the process of labeling the affected and non-affected groups in the second specific example according to one embodiment.

Embodiments of the present invention are described below with reference to the drawings. The embodiments described below are examples, and the embodiments to which the present invention applies are not limited to them.

<Configuration of the DNA mutation narrowing-down device>
The configuration of a DNA (deoxyribonucleic acid) mutation narrowing-down device according to one embodiment is described with reference to FIGS. 1 to 3. The DNA mutation narrowing-down device 110 is an information processing device having the configuration of a computer, or a system including a plurality of information processing devices (DNA mutation narrowing-down system 100). The DNA mutation narrowing-down device 110 has, for example, the hardware configuration shown in FIG. 3.

The DNA mutation narrowing-down device 110, for example, performs an association analysis (test) based on input DNA mutation data of a plurality of subjects and continuous quantitative trait data, calculates the effect of the DNA mutation data on changes in the continuous quantitative trait, narrows down the DNA mutations associated with a given disease, and outputs the result (for example, a candidate list).

Linear regression, logistic regression, Fisher's exact test, the chi-square test, the Cochran-Armitage test, the t-test, and the like can be used for the association analysis (test), but the analysis is not limited to these.

The DNA mutation narrowing-down device 110 only needs to narrow down candidate DNA mutations, such as single nucleotide polymorphisms, that are associated with a given disease. It does not necessarily have to identify DNA mutations that directly indicate whether a subject is affected by the disease or how far the disease has progressed.

FIG. 1 is a diagram (1) showing a configuration example of the DNA mutation narrowing-down device 110 according to one embodiment. By executing a predetermined program on the processor 301 of FIG. 3, the DNA mutation narrowing-down device 110 implements the input reception unit 111, feature amount extraction unit 112, DNA mutation information acquisition unit 113, association analysis unit 114, DNA mutation extraction unit 115, storage unit 116, result output unit 117, and so on shown in FIG. 1. At least some of the input reception unit 111, feature amount extraction unit 112, DNA mutation information acquisition unit 113, association analysis unit 114, DNA mutation extraction unit 115, storage unit 116, and result output unit 117 may instead be implemented in hardware.

The input reception unit 111 is implemented by, for example, a program executed by the processor 301 of FIG. 3, and receives input data, input operations, and the like from the input device 304, the communication device 306, and so on of FIG. 3. For example, the input reception unit 111 receives continuous measured values of a quantitative trait related to a given disease, collected from a plurality of subjects including subjects suffering from the disease.

The input reception unit 111 can also receive DNA sequence information (hereinafter referred to as DNA information) collected from a plurality of subjects including subjects suffering from the disease, DNA mutation information, and the like.

DNA (deoxyribonucleic acid) is the substance that records genetic information and consists of about 3 billion base pairs. DNA contains genes, which are specific regions (base sequences) carrying genetic information. In this embodiment, DNA mutations located in regions other than genes can also be targets of the narrowing-down.

A DNA mutation is a change in the structure (base sequence) of DNA, and can include, for example, single nucleotide polymorphisms (SNPs), copy number variations (CNVs), and DNA deletions and insertions.

The feature amount extraction unit 112 is implemented by, for example, a program executed by the processor 301 of FIG. 3, and extracts (determines) a feature amount from the continuous measured values of the quantitative trait related to the given disease that were received by the input reception unit 111.

Here, the feature amount is, for example, the coefficients of a polynomial, the intercept of the polynomial, or both, obtained by fitting the continuous measured values of the quantitative trait to a predetermined polynomial. A specific example of how the feature amount extraction unit 112 extracts the feature amount is described later.

The DNA mutation information acquisition unit (acquisition unit) 113 is implemented by, for example, a program executed by the processor 301 of FIG. 3, and acquires information on DNA mutations in a plurality of subjects including subjects suffering from the given disease.

For example, the DNA mutation information acquisition unit 113 analyzes the DNA information received by the input reception unit 111 and extracts the DNA mutation information.

Alternatively, as shown in FIG. 2, the DNA mutation information acquisition unit 113 may acquire the DNA mutation information of the plurality of subjects from a DNA mutation information acquisition device 210, an external device that obtains DNA mutation information from DNA information. As also shown in FIG. 2, the DNA mutation information acquisition unit 113 may acquire the DNA mutation information corresponding to the plurality of subjects from a DNA mutation information DB (database) 220 that stores DNA mutation information obtained in advance.

Preferably, the DNA mutation information acquired by the DNA mutation information acquisition unit 113 includes information on all single nucleotide polymorphisms (SNPs) in the DNA sequence, extracted from the DNA information (DNA sequence information) of the plurality of subjects.

The association analysis unit (analysis unit) 114 is implemented by, for example, a program executed by the processor 301 of FIG. 3, and analyzes the association between the feature amount extracted by the feature amount extraction unit 112 and the DNA mutations included in the DNA mutation information acquired by the DNA mutation information acquisition unit 113. For example, the association analysis unit 114 performs an association analysis (test) with the feature amount as the objective variable and the DNA mutations as explanatory variables, and calculates the value of a statistical index indicating the degree of association between the feature amount and each DNA mutation, that is, how strongly the DNA mutation affects the feature amount.

The values of the statistical index indicating the degree of association include numerical values that express the significance of the relationship between the feature amount extracted by the feature amount extraction unit 112 and each DNA mutation included in the acquired DNA mutation information (for example, the p-value, F-value, or odds ratio).

Linear regression, logistic regression, Fisher's exact test, the chi-square test, the Cochran-Armitage test, the t-test, and the like can be used for the association analysis (test), but the analysis is not limited to these.

The association analysis unit 114 may also use a predetermined reference value to label the feature amounts extracted by the feature amount extraction unit 112 as belonging to the affected group or the non-affected group for the given disease, and then calculate the value of a statistical index indicating the significance of each DNA mutation.
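The labeling step described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `label_by_reference`, the subject identifiers, the reference value, and the choice that a feature at or above the reference value counts as "affected" are all hypothetical and not specified in the text.

```python
from typing import Dict


def label_by_reference(features: Dict[str, float], reference: float) -> Dict[str, str]:
    """Label each subject by comparing its feature amount with a reference value.

    Assumption: a feature amount at or above the reference value is labeled
    "affected"; the text does not state the direction of the comparison.
    """
    return {
        subject: ("affected" if value >= reference else "non-affected")
        for subject, value in features.items()
    }


# Hypothetical per-subject feature amounts (e.g., fitted blood pressure slopes).
features = {"subject_1": 0.8, "subject_2": 0.1, "subject_3": 0.5}
labels = label_by_reference(features, reference=0.5)
```

With labels of this kind, a case/control style test such as Fisher's exact test or the chi-square test, both listed in the text, could then be applied to each DNA mutation.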

The DNA mutation extraction unit 115 is implemented by, for example, a program executed by the processor 301 of FIG. 3, and extracts DNA mutations associated with the given disease based on the analysis results of the association analysis unit 114. For example, the DNA mutation extraction unit 115 extracts DNA mutations related to the disease using the statistical index values calculated by the association analysis unit 114 as the criterion.

The storage unit 116 is implemented by, for example, a program executed by the processor 301 of FIG. 3 together with the storage 303, the memory 302, and so on, and stores information received by the input reception unit 111, information acquired by the DNA mutation information acquisition unit 113, and the like.

The result output unit 117 is implemented by, for example, a program executed by the processor 301 of FIG. 3. The result output unit 117 outputs the information on the disease-associated DNA mutations extracted by the DNA mutation extraction unit 115 (for example, a candidate list of DNA mutations) using the output device 305 of FIG. 3 or the like.

(Example of a specific configuration)
As one specific example of the configuration of the DNA mutation narrowing-down device 110, the input reception unit 111 shown in FIG. 1 receives continuous blood pressure measurements taken at predetermined time intervals (for example, every day in the same time slot) from a plurality of subjects including subjects suffering from the given disease.

The feature amount extraction unit 112 fits the continuous blood pressure measurements received by the input reception unit 111 to a polynomial (for example, a first-degree expression) and calculates the slope, intercept, and so on of the polynomial. As one example, the feature amount extraction unit 112 can use the slope of the polynomial as the feature amount.
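As a minimal sketch of this step, assuming a first-degree polynomial and ordinary least squares (the text does not name a fitting method), the slope and intercept for one subject could be computed as below. The function name `fit_line` and the sample readings are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(days: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values = slope * days + intercept.

    Returns (slope, intercept). Ordinary least squares is an assumption here.
    """
    n = len(days)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("measurement times must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily systolic blood pressure readings (mmHg) for one subject.
days = [0, 1, 2, 3, 4]
systolic = [110.0, 112.0, 114.0, 116.0, 118.0]
slope, intercept = fit_line(days, systolic)  # slope is the per-subject feature amount
```

Running the fit once per subject yields one slope per subject, which is the form the association analysis expects as its objective variable.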

The DNA mutation information acquisition unit 113 acquires, for example, information on the single nucleotide polymorphisms (an example of DNA mutations) of every subject, extracted from the DNA information of the plurality of subjects including subjects suffering from the given disease.

The association analysis unit 114 performs an association analysis (test) with the feature amount extracted by the feature amount extraction unit 112 (the slope of the polynomial) as the objective variable and all the single nucleotide polymorphisms included in the information acquired by the DNA mutation information acquisition unit 113 as explanatory variables. For example, the association analysis unit 114 performs a test by linear regression and calculates, for every single nucleotide polymorphism in the information, a p-value (an example of a statistical index value) expressing the significance probability of its association with the extracted feature amount.
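A single-SNP linear regression test of this kind might look like the sketch below. Several points are assumptions rather than details from the text: genotypes are coded as allele counts (0, 1, 2), each SNP is tested separately with no covariates, and the two-sided p-value uses a large-sample normal approximation of the test statistic (an exact test would use a t-distribution with n - 2 degrees of freedom). The function name `association_test` and the data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def association_test(genotypes: Sequence[int], features: Sequence[float]) -> Tuple[float, float]:
    """Regress the feature amount on genotype dosage for one SNP.

    Returns (beta, p) where beta is the regression slope and p is an
    approximate two-sided p-value (normal approximation, an assumption).
    """
    n = len(genotypes)
    if n != len(features) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_g = sum(genotypes) / n
    mean_f = sum(features) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype does not vary across subjects")
    sxy = sum((g - mean_g) * (f - mean_f) for g, f in zip(genotypes, features))
    beta = sxy / sxx
    alpha = mean_f - beta * mean_g
    rss = sum((f - (alpha + beta * g)) ** 2 for g, f in zip(genotypes, features))
    if rss == 0:
        # Perfect fit: significant unless the slope itself is zero.
        return beta, (0.0 if beta != 0 else 1.0)
    std_err = math.sqrt(rss / (n - 2) / sxx)
    z = abs(beta / std_err)
    # 2 * (1 - Phi(z)) written with erfc to stay accurate in the far tail.
    p = math.erfc(z / math.sqrt(2.0))
    return beta, p


# Hypothetical data: genotype dosage and fitted slope for nine subjects.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
slopes = [0.1, 0.2, 0.0, 0.6, 0.5, 0.4, 1.0, 0.9, 1.1]
beta, p_value = association_test(genotypes, slopes)
```

In a genome-wide setting the same function would be called once per SNP, producing one p-value per SNP for the extraction step.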

The DNA mutation extraction unit 115 extracts, from all the single nucleotide polymorphisms included in the single nucleotide polymorphism information, those whose p-value is at or below (or below) a predetermined significance level (for example, 5 × 10^-8).
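The extraction step amounts to a threshold filter over the per-SNP p-values. The sketch below assumes the p-values are held in a dictionary keyed by SNP identifier and uses the "at or below" variant of the comparison; the function name `select_snps`, the identifiers, and the p-values are hypothetical.

```python
from typing import Dict, List

GENOME_WIDE_THRESHOLD = 5e-8  # example significance level given in the text


def select_snps(p_values: Dict[str, float], threshold: float = GENOME_WIDE_THRESHOLD) -> List[str]:
    """Return SNP identifiers with p-value at or below the threshold, most significant first."""
    hits = [(p, snp) for snp, p in p_values.items() if p <= threshold]
    return [snp for _, snp in sorted(hits)]


# Hypothetical SNP identifiers and p-values, for illustration only.
p_values = {"snp_A": 3.2e-9, "snp_B": 0.04, "snp_C": 5e-8, "snp_D": 1.1e-12}
candidates = select_snps(p_values)
```

Sorting by p-value is a design choice for producing a ranked candidate list; the text only requires that the SNPs passing the threshold be extracted.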

Here, the significance level is the threshold at which a p-value is regarded as statistically significant; 0.05 (5%) is used in general analyses. In genome analysis, however, thousands to tens of thousands of parameters are tested at once, so the number of tests grows and the chance of a significant difference arising by chance increases. The significance level is therefore usually corrected. A corrected level of 5 × 10^-8 (0.000005%) is commonly used, but other values (for example, 5 × 10^-10 or 5 × 10^-12) may be used as needed.
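The arithmetic behind such a correction can be illustrated with a Bonferroni adjustment, in which the nominal level is divided by the number of tests. The text does not name the correction method, so Bonferroni is an assumption here, as is the figure of one million tests, which is a commonly cited approximation that reproduces 5 × 10^-8.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate near alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def chance_of_false_positive(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Assumed number of tests: 1,000,000 (illustrative, not from the text).
corrected = bonferroni_threshold(0.05, 1_000_000)
uncorrected_risk = chance_of_false_positive(0.05, 100)
```

The second function shows why correction is needed: with 100 independent tests at the uncorrected 5% level, at least one spurious hit is more likely than not.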

The storage unit 116 acquires the continuous quantitative trait measurements of the plurality of subjects, including subjects suffering from the given disease, and stores them.

The result output unit 117 outputs the information on the single nucleotide polymorphisms extracted by the DNA mutation extraction unit 115 as single nucleotide polymorphisms associated with the given disease, for example in the form of a candidate list.

FIG. 2 is a diagram (2) showing a configuration example of the DNA mutation narrowing-down device according to one embodiment. As shown in FIG. 2, the DNA mutation narrowing-down device 110 may implement its functions in cooperation with external devices such as the DNA mutation information acquisition device 210 and the DNA mutation information DB 220. As described above, the DNA mutation narrowing-down device 110 may also be a DNA mutation narrowing-down system 100 composed of a plurality of information processing devices.

(Hardware configuration)
FIG. 3 is a diagram showing an example of the hardware configuration of the DNA mutation narrowing-down device 110 according to the embodiment shown in FIGS. 1 and 2. Physically, the DNA mutation narrowing-down device 110 may be configured as a computer device including a processor 301, a memory 302, a storage 303, an input device 304, an output device 305, a communication device 306, a bus 307, and so on. In the following description, the term "device" can be read as a circuit, device, unit, or the like.

The processor 301, for example, runs an operating system to control the computer as a whole. The processor 301 may be configured as a central processing unit (CPU) including an interface to peripheral devices, a control unit, an arithmetic unit, registers, and the like.

The processor 301 also reads programs (program code), software modules, and data from the storage 303 and/or the communication device 306 into the memory 302, and executes various processes in accordance with them. The program used is one that causes a computer to execute at least part of the operation of the DNA mutation narrowing-down device 110. The various processes executed in the DNA mutation narrowing-down device 110 may be executed by a single processor 301, or simultaneously or sequentially by two or more processors 301. The processor 301 may be implemented on one or more chips. The program may be transmitted from a network via a telecommunication line.

The memory 302 is a computer-readable storage medium and may consist of at least one of, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory). The memory 302 may also be called a register, a cache, main memory (main storage device), or the like. The memory 302 can store executable programs (program code), software modules, and the like for carrying out the DNA mutation narrowing-down method according to one embodiment of the present invention.

The storage 303 is a computer-readable storage medium and may consist of at least one of, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disc (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, flash memory (for example, a card, stick, or key drive), a floppy (registered trademark) disk, and a magnetic strip. The storage 303 may also be called an auxiliary storage device. The storage medium described above may be, for example, a database, a server, or another suitable medium that includes the memory 302 and/or the storage 303.

The input device 304 is an input device that accepts input from outside (for example, a keyboard, mouse, microphone, switch, button, or sensor). The output device 305 is an output device that produces output to the outside (for example, a display, speaker, or LED lamp). The input device 304 and the output device 305 may be integrated (for example, as a touch panel display).

The communication device 306 is hardware (a transmitting and receiving device) for communication between computers over a wired and/or wireless network, and is also called, for example, a network device, network controller, network card, or communication module. The communication device 306 may also have a function for communicating directly with an external device by short-range wireless communication.

The devices described above, such as the processor 301 and the memory 302, are connected by the bus 307 for communicating information. The bus 307 may be a single bus or may consist of different buses between different devices.

＜処理の流れ＞ <Process flow>
続いて、本実施形態に係るＤＮＡ変異絞込方法の処理の流れについて図４～６を用いて説明する。 Next, the processing flow of the DNA mutation narrowing-down method according to this embodiment will be described with reference to FIGS. 4 to 6.

図４は、一実施形態に係るＤＮＡ変異絞込処理の概要を示すフローチャートである。ここでは、後述する第１の具体例、及び第２の具体例に共通する処理を中心に説明を行う。 FIG. 4 is a flowchart showing an overview of DNA mutation screening processing according to one embodiment. Here, the explanation will focus on the processing common to the first specific example and the second specific example, which will be described later.

ステップＳ４０１において、ＤＮＡ変異絞込装置１１０の入力受付部１１１は、所定の疾患に罹患している被験者を含む複数の被験者の連続的な量的形質の計測値を取得し、取得した連続的な量的形質の計測値を記憶部１１６等に記憶する。 In step S401, the input reception unit 111 of the DNA mutation narrowing-down device 110 acquires continuous measurement values of a quantitative trait for a plurality of subjects, including subjects suffering from a predetermined disease, and stores the acquired measurement values in the storage unit 116 or the like.

ステップＳ４０２において、ＤＮＡ変異絞込装置１１０の特徴量抽出部１１２は、記憶部１１６に記憶された、量的形質の経時的な変化から、所定の疾患に関連する特徴量を抽出する。例えば、特徴量抽出部１１２は、図５に示すような特徴量の抽出処理を実行する。 In step S402, the feature quantity extraction unit 112 of the DNA mutation narrowing-down device 110 extracts a feature quantity related to the predetermined disease from the temporal changes in the quantitative trait stored in the storage unit 116. For example, the feature quantity extraction unit 112 executes the feature quantity extraction process shown in FIG. 5.

図５は、一実施形態に係る特徴量の抽出処理の例を示すフローチャートである。この処理は、例えば、図４のステップＳ４０２で実行される特徴量の抽出処理の例を示している。 FIG. 5 is a flowchart illustrating an example of feature amount extraction processing according to an embodiment. This process shows an example of the feature amount extraction process executed in step S402 of FIG. 4, for example.

ステップＳ５０１において、特徴量抽出部１１２は、入力受付部１１１が取得した連続的な量的形質の計測値を、例えば、図６に示すように、多項式にフィッティングする。 In step S501, the feature quantity extraction unit 112 fits the continuous quantitative trait measurement values acquired by the input reception unit 111 to a polynomial, for example, as shown in FIG. 6.

図６は、一実施形態に係る特徴量の一例について説明するための図である。ここでは、具体的な一例として、連続的な量的形質が、所定の時間間隔（例えば、毎日、同じ時間帯等）に測定した妊婦（被験者の一例）の血圧の計測値であり、多項式が１次式であるものとする。 FIG. 6 is a diagram for explaining an example of a feature quantity according to one embodiment. Here, as a specific example, the continuous quantitative trait is the blood pressure of a pregnant woman (an example of a subject) measured at predetermined time intervals (for example, every day in the same time slot), and the polynomial is a linear expression.

図６において、特徴量抽出部１１２は、妊婦Ａの血圧の計測値６０１を、例えば、線形回帰により、１次式「ｙ＝ａｘ＋ｂ」にフィッティングする。なお、１次式「ｙ＝ａｘ＋ｂ」は所定の多項式の一例である。 In FIG. 6, the feature quantity extraction unit 112 fits the blood pressure measurement values 601 of pregnant woman A to the linear expression "y=ax+b" by, for example, linear regression. The linear expression "y=ax+b" is an example of the predetermined polynomial.

ステップＳ５０２において、特徴量抽出部１１２は、フィッティングした多項式の係数、及び切片を算出する。例えば、特徴量抽出部１１２は、１次式で表される直線の傾きａ１、及び切片ｂ１を算出する。 In step S502, the feature quantity extraction unit 112 calculates coefficients and intercepts of the fitted polynomial. For example, the feature quantity extraction unit 112 calculates the slope a1 and the intercept b1 of the straight line represented by the linear expression.

ステップＳ５０３において、特徴量抽出部１１２は、算出した係数、切片、又は係数と切片を特徴量として抽出する。 In step S503, the feature quantity extraction unit 112 extracts the calculated coefficient, intercept, or coefficient and intercept as feature quantities.

例えば、他の妊婦Ｂの計測値を用いて、１次式「ｙ＝ａｘ＋ｂ」にフィッティングを行い、傾きａ２、及び切片ｂ２を、さらに算出したものとする。この場合、一例として、１次式の傾きａ１、ａ２を、特徴量として利用することができる。 For example, it is assumed that the slope a2 and the intercept b2 are further calculated by performing fitting to the linear expression "y=ax+b" using the measured values of another pregnant woman B. In this case, as an example, the slopes a1 and a2 of the linear expressions can be used as feature amounts.
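Steps S501 to S503 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation described in the patent: the function name `fit_line`, the subject data, and the choice of ordinary least squares written with the standard library are all assumptions made for the example.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of y = a*x + b; returns (slope a, intercept b)."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all measurement times are identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    a = sxy / sxx
    b = mean_v - a * mean_t
    return a, b


# Hypothetical data: (days since the first day of pregnancy, blood pressure in mmHg).
subject_a = ([70, 77, 84, 91], [110.0, 112.0, 114.0, 116.0])
subject_b = ([72, 80, 88, 96], [108.0, 109.0, 110.0, 111.0])

a1, b1 = fit_line(*subject_a)  # slope and intercept for pregnant woman A
a2, b2 = fit_line(*subject_b)  # slope and intercept for pregnant woman B

# Step S503: here the slopes are chosen as the feature quantities.
features = {"A": a1, "B": a2}
```

The two subjects are measured on different days, yet each is reduced to the same kind of number (a slope), which is what makes them comparable in the later association analysis.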

例えば、図６に示すように、時間（妊娠経過）とともに血圧が増加するものとする。各妊婦が各々のタイミングで血圧を測定している場合、例えば、図６の通り、ｔ１時点での妊婦Ａの血圧の計測値６０１は存在するが、妊婦Ｂの血圧の計測値６０３は存在しない。異なる時点の血圧を妊婦ＡとＢで比較した場合、その差には、測定時点差の影響が含まれることから、妊婦ＡとＢの血圧値の差を真に比較することは困難である。 For example, as shown in FIG. 6, assume that blood pressure increases with time (as pregnancy progresses). When each pregnant woman measures her blood pressure at her own timing, it can happen that, as in FIG. 6, a blood pressure measurement value 601 of pregnant woman A exists at time t1 but no blood pressure measurement value 603 of pregnant woman B exists at that time. If blood pressures measured at different time points are compared between pregnant women A and B, the difference includes the effect of the difference in measurement time, so it is difficult to make a true comparison of the blood pressure values of pregnant women A and B.

一方、図６に示すように、例えば、妊婦Ａと妊婦Ｂの血圧の計測値を1次式にフィッティングし、妊娠Ａの血圧値に対する回帰線６０２と妊婦Ｂの血圧の計測値に対する回帰線６０４を算出したとする。回帰線６０２や６０４も用いることで、両者の傾きから妊娠ＡとＢの血圧推移の差を比較することができる。また、所定の時間（ｔ）において血圧の予測値を用いることで、測定時点差の影響を排除した比較が可能となる。 On the other hand, suppose that, as shown in FIG. 6, the blood pressure measurement values of pregnant woman A and pregnant woman B are each fitted to a linear expression, yielding a regression line 602 for the measurement values of pregnant woman A and a regression line 604 for those of pregnant woman B. Using the regression lines 602 and 604, the difference in blood pressure trends between pregnant women A and B can be compared from their slopes. In addition, using the predicted blood pressure values at a predetermined time (t) enables a comparison that excludes the effect of differing measurement times.

また、別の一例として、図６において、時間ｔ０が妊婦の妊娠の初日であるものとする。また、ある疾患の発症の有無が、妊娠の初日の血圧値に関連があるものとする。この場合、妊婦Ａ、妊婦Ｂ、妊婦Ｃのように妊娠初日の血圧値の記録がない場合でも、各妊婦の血圧値を１次式にフィッティングすることで、１次式の切片ｂ１と、切片ｂ２、切片３を得ることができ、特徴量として用いることができる。 As another example, assume that time t0 in FIG. 6 is the first day of pregnancy, and that the presence or absence of onset of a certain disease is related to the blood pressure value on the first day of pregnancy. In this case, even when no blood pressure value was recorded on the first day of pregnancy, as with pregnant women A, B, and C, fitting each woman's blood pressure values to a linear expression yields the intercepts b1, b2, and b3 of the linear expressions, which can be used as feature quantities.

同様にして、例えば、妊娠の初日から所定の日数を経過した時点における各妊婦の血圧を、フィッティングした１次式を用いて算出し、特徴量として利用することができる。 Similarly, for example, the blood pressure of each pregnant woman at the time when a predetermined number of days have passed since the first day of pregnancy can be calculated using a fitted linear expression and used as a feature amount.
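The idea of comparing subjects at a time point where no measurement exists can be sketched as follows. This is a hedged illustration only: `fit_line`, `predict`, `COMMON_DAY`, and all data values are hypothetical names and numbers chosen for the example, and time is assumed to be counted in days from the first day of pregnancy (so t0 = 0 and the intercept is the estimated value at t0).

```python
def fit_line(times, values):
    """Least-squares fit of y = a*x + b; returns (slope a, intercept b)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    a = sxy / sxx
    return a, mean_v - a * mean_t


def predict(line, t):
    """Evaluate the fitted line y = a*t + b at time t."""
    a, b = line
    return a * t + b


# Hypothetical measurements taken on different days for each subject.
line_a = fit_line([70, 84, 98], [110.0, 113.0, 116.0])
line_b = fit_line([77, 91, 105], [104.0, 108.0, 112.0])

COMMON_DAY = 140  # assumed common evaluation day, outside both measurement periods

# Estimated value on the first day of pregnancy (the intercept).
at_t0 = {"A": predict(line_a, 0), "B": predict(line_b, 0)}
# Estimated value a fixed number of days after the first day of pregnancy.
at_common = {"A": predict(line_a, COMMON_DAY), "B": predict(line_b, COMMON_DAY)}
```

Extrapolating a straight line far outside the measured range is a modeling assumption in its own right; the text notes that a polynomial of degree 2 or higher may be used instead.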

このように、特徴量抽出部１１２は、複数の被験者から収集した、量的形質の経時的な変化から、所定の疾患に関連する特徴量を抽出する。 In this way, the feature amount extraction unit 112 extracts a feature amount related to a predetermined disease from temporal changes in quantitative traits collected from a plurality of subjects.

なお、上記の傾きａ１、ａ２、a３及び切片ｂ１、ｂ２、ｂ３は、特徴量抽出部１１２が抽出する特徴量の一例である。また、図６に示す１次式は、所定の多項式の一例であり、所定の多項式は、２次以上の多項式であっても良い。 Note that the slopes a1, a2, and a3 and the intercepts b1, b2, and b3 described above are examples of feature quantities extracted by the feature quantity extraction unit 112. The linear expression shown in FIG. 6 is an example of the predetermined polynomial, which may also be a polynomial of degree 2 or higher.

ここで、図４に戻り、ＤＮＡ変異の絞込処理の例を示すフローチャートの説明を続ける。 Here, returning to FIG. 4, the description of the flowchart showing an example of DNA mutation narrowing down processing is continued.

ステップＳ４０３において、ＤＮＡ変異絞込装置１１０のＤＮＡ変異情報取得部１１３は、例えば、ステップＳ４０１、Ｓ４０２の処理と並行して、所定の疾患に罹患している被験者を含む複数の被験者におけるＤＮＡ変異の情報を取得する。 In step S403, the DNA mutation information acquisition unit 113 of the DNA mutation narrowing-down device 110 acquires information on DNA mutations in the plurality of subjects, including subjects suffering from the predetermined disease, for example in parallel with the processing of steps S401 and S402.

一例として、ＤＮＡ変異情報取得部１１３は、入力受付部１１１が受け付けた、所定の疾患に罹患している被験者を含む複数の被験者のＤＮＡ情報を用いて、ＤＮＡ配列中の所定のＤＮＡ変異情報（例えば、全ての一塩基多型等）を抽出（取得）する。 As an example, the DNA mutation information acquisition unit 113 extracts (acquires) predetermined DNA mutation information in the DNA sequence (for example, all single nucleotide polymorphisms) from the DNA information, received by the input reception unit 111, of the plurality of subjects including subjects suffering from the predetermined disease.

また、別の一例として、ＤＮＡ変異情報取得部１１３は、図２に示すＤＮＡ変異情報取得装置２１０から、ＤＮＡ変異情報取得装置２１０が抽出した、所定の疾患に罹患している被験者を含む複数の被験者のＤＮＡ変異情報を取得するものであっても良い。さらに、ＤＮＡ変異情報取得部１１３は、所定の疾患に罹患している被験者を含む複数の被験者のＤＮＡ変異情報が、図２に示す予め登録されたＤＮＡ変異情報ＤＢ２２０から、所定の疾患に罹患している被験者を含む複数の被験者のＤＮＡ変異情報を取得するものであっても良い。 As another example, the DNA mutation information acquisition unit 113 may acquire, from the DNA mutation information acquisition device 210 shown in FIG. 2, the DNA mutation information that the device 210 has extracted for the plurality of subjects including subjects suffering from the predetermined disease. Alternatively, the DNA mutation information acquisition unit 113 may acquire the DNA mutation information of the plurality of subjects, including subjects suffering from the predetermined disease, from the DNA mutation information DB 220 shown in FIG. 2, in which that information has been registered in advance.

なお、ステップＳ４０１において、連続的な量的形質の計測値を取得する複数の被験者と、ステップＳ４０３において、一塩基多型の情報を取得する複数の被験者は、同じ被験者である。 It should be noted that the plurality of subjects for whom continuous quantitative trait measurement values are obtained in step S401 and the plurality of subjects for which single nucleotide polymorphism information is obtained in step S403 are the same subjects.

ステップＳ４０４において、ＤＮＡ変異絞込装置１１０の関連解析部１１４は、ステップＳ４０２で抽出した特徴量と、ステップＳ４０３で取得したＤＮＡ変異の情報に含まれるＤＮＡ変異との関連を解析する。 In step S404, the association analysis unit 114 of the DNA mutation narrowing down apparatus 110 analyzes the association between the feature amount extracted in step S402 and the DNA mutation included in the DNA mutation information acquired in step S403.

一例として、関連解析部１１４は、特徴量を目的変数、ＤＮＡ変異を説明変数として関連解析（検定）を行い、特徴量と各ＤＮＡ変異との関連度を示す統計的指標の値を算出する（第１の具体例）。例えば、関連解析部１１４は、特徴量を目的変数、ＤＮＡ変異を説明変数とした回帰式において、各ＤＮＡ変異の係数のＷａｌｄ統計量から算出されるｐ値を、統計的指標の値とすることができる。 As an example, the association analysis unit 114 performs an association analysis (test) with the feature quantity as the objective variable and the DNA mutation as the explanatory variable, and calculates the value of a statistical index indicating the degree of association between the feature quantity and each DNA mutation (first specific example). For example, in a regression equation with the feature quantity as the objective variable and the DNA mutations as explanatory variables, the association analysis unit 114 can use the p-value calculated from the Wald statistic of the coefficient of each DNA mutation as the value of the statistical index.
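A Wald-type test of this kind can be sketched for a single variant as below. This is a minimal sketch under stated assumptions, not the patent's implementation: genotypes are assumed to be coded additively as 0, 1, or 2 copies of the variant allele, the regression has one explanatory variable and no covariates, and the p-value uses a standard normal approximation for the Wald statistic (with few subjects a t distribution would be more appropriate). The function name `wald_p_value` and all data are hypothetical.

```python
import math


def wald_p_value(genotypes, feature):
    """Regress feature on genotype; return (coefficient, two-sided Wald p-value)."""
    n = len(genotypes)
    if n < 3 or n != len(feature):
        raise ValueError("need at least three paired observations")
    mean_g = sum(genotypes) / n
    mean_y = sum(feature) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("variant does not vary in this sample")
    beta = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, feature)) / sxx
    alpha = mean_y - beta * mean_g
    rss = sum((y - (alpha + beta * g)) ** 2 for g, y in zip(genotypes, feature))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the coefficient
    if se == 0:
        return beta, 0.0  # perfect fit: treat as maximally significant
    z = beta / se  # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p, normal approximation
    return beta, p


# Hypothetical data: genotype of one SNP and the blood pressure slope per subject.
genotypes = [0, 0, 1, 1, 2, 2]
slopes = [0.10, 0.12, 0.20, 0.22, 0.30, 0.32]
beta, p = wald_p_value(genotypes, slopes)

# A second, hypothetical SNP with no relationship to the slope.
beta_null, p_null = wald_p_value([0, 1, 2, 0, 1, 2], [0.1, 0.2, 0.1, 0.2, 0.1, 0.2])
```

In a genome-wide setting the same function would be called once per variant, producing one p-value per DNA mutation.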

別の一例として、関連解析部１１４は、特徴量を、予め定めた基準値を用いて、罹患群と非罹患群とラベル付けし、各ＤＮＡ変異に対して、罹患群、及び非罹患群における保有数に差があるかを示す統計的指標の値を算出する（第２の具体例）。例えば、関連解析部１１４は、フィッシャーの正確検定や、カイ二乗検定を行い、有意確率を表すｐ値を、統計的指標の値とすることができる。 As another example, the association analysis unit 114 uses a predetermined reference value to label the feature quantities as belonging to an affected group or an unaffected group, and calculates, for each DNA mutation, the value of a statistical index indicating whether the number of carriers differs between the affected group and the unaffected group (second specific example). For example, the association analysis unit 114 can perform Fisher's exact test or a chi-square test and use the resulting p-value, which represents the significance probability, as the value of the statistical index.

なお、関連解析部１１４が、統計的指標の値を算出する際の検定（関連解析）には、例えば、線形回帰、ロジスティック回帰、フィッシャーの正確検定、カイ二乗検定、コクラン・アミテージ検定、ｔ検定等が用いられるが、これに限定されない。また、関連解析部１１４が算出する統計的指標の値には、例えば、ｐ値、ｆ値、又はオッズ比等が用いられるが、これに限定されない。 The test (association analysis) that the association analysis unit 114 uses to calculate the value of the statistical index may be, for example, linear regression, logistic regression, Fisher's exact test, a chi-square test, the Cochran-Armitage test, or a t-test, but is not limited to these. The value of the statistical index calculated by the association analysis unit 114 may be, for example, a p-value, an F value, or an odds ratio, but is not limited to these.

また、関連解析部１１４は、優性遺伝子作用、劣性遺伝子作用、遺伝子型等をさらに考慮して統計的指標の値を算出しても良いし、共編量として年齢、体重、ＢＭＩ等の情報を用いるものであっても良い。 The association analysis unit 114 may also calculate the value of the statistical index while additionally taking into account dominant gene action, recessive gene action, genotype, and the like, and may use information such as age, weight, and BMI as covariates.

ステップＳ４０５において、ＤＮＡ変異絞込装置１１０のＤＮＡ変異抽出部１１５は、関連解析部１１４の解析結果に基づいて、所定の疾患に関連するＤＮＡ変異を抽出する。例えば、ＤＮＡ変異抽出部１１５は、ステップＳ４０３で取得したＤＮＡ変異の情報に含まれる各ＤＮＡ変異の中から、ステップＳ４０４で算出した統計的指標の値を基準値として、所定の疾患に関連するＤＮＡ変異を抽出する。例えば、ＤＮＡ変異抽出部１１５は、統計的指標の値が、基準値以上、又は基準値以下のＤＮＡ変異を抽出する。 In step S405, the DNA mutation extraction unit 115 of the DNA mutation narrowing-down device 110 extracts DNA mutations related to the predetermined disease based on the analysis results of the association analysis unit 114. For example, from among the DNA mutations included in the DNA mutation information acquired in step S403, the DNA mutation extraction unit 115 extracts those related to the predetermined disease, using the value of the statistical index calculated in step S404 as the criterion. For example, the DNA mutation extraction unit 115 extracts DNA mutations whose statistical index value is greater than or equal to a reference value, or less than or equal to a reference value.

なお、上記の処理で抽出されたＤＮＡ変異の情報は、ＤＮＡ変異絞込装置１１０の結果出力部１１７によって、例えば、所定の疾患に関連するＤＮＡ変異の候補として、出力装置３０５、又はストレージ３０３等に出力される。 The information on the DNA mutations extracted by the above processing is output by the result output unit 117 of the DNA mutation narrowing-down device 110 to the output device 305, the storage 303, or the like, for example as candidates for DNA mutations related to the predetermined disease.

続いて、第１の具体例、及び第２の具体例におけるＤＮＡ変異絞込方法の処理の流れについて説明する。 Next, the processing flow of the DNA mutation narrowing-down method in the first specific example and the second specific example will be described.

［第１の具体例］ [First specific example]
第１の具体例では、ＤＮＡ変異絞込装置１１０の関連解析部１１４が、特徴量を目的変数、ＤＮＡ変異を説明変数として関連解析を行い、特徴量と各ＤＮＡ変異との関連度を示す統計的指標の値（例えば、ｐ値等）を算出する場合の処理の例について説明する。 The first specific example describes processing in which the association analysis unit 114 of the DNA mutation narrowing-down device 110 performs an association analysis with the feature quantity as the objective variable and the DNA mutation as the explanatory variable, and calculates the value of a statistical index (for example, a p-value) indicating the degree of association between the feature quantity and each DNA mutation.

図７は、一実施形態に係る第１の具体例に係るＤＮＡ変異の絞込処理の例を示すフローチャートである。 FIG. 7 is a flowchart showing an example of narrowing down processing for DNA mutations according to the first specific example according to the embodiment.

ステップＳ７０１において、ＤＮＡ変異絞込装置１１０の入力受付部１１１は、複数の被験者の連続的な量的形質の計測値を取得し、取得した連続的な量的形質の計測値を記憶部１１６等に保存（記憶）する。 In step S701, the input reception unit 111 of the DNA mutation narrowing-down device 110 acquires continuous quantitative trait measurement values of a plurality of subjects and saves (stores) them in the storage unit 116 or the like.

ステップＳ７０２において、ＤＮＡ変異絞込装置１１０の特徴量抽出部１１２は、記憶部１１６に記憶された、量的形質の経時的な変化から、所定の表現型に関連する特徴量を抽出する。例えば、特徴量抽出部１１２は、図５に示すような特徴量の抽出処理を実行する。 In step S702, the feature quantity extraction unit 112 of the DNA mutation narrowing-down device 110 extracts a feature quantity related to a predetermined phenotype from the temporal changes in the quantitative trait stored in the storage unit 116. For example, the feature quantity extraction unit 112 executes the feature quantity extraction process shown in FIG. 5.

なお、表現型とは、例えば、ヒトの遺伝子型が形質として表現されたものであり、例えば身長の高さや目や肌、髪の色、肥満になりやすい体質であるか否か、等の個人の体質や、高血圧、糖尿病、妊娠高血圧症候群、冠動脈疾患等の所定の疾患を含む。本実施形態は、所定の疾患に限られず、所定の表現型に関連する遺伝子変異を絞り込む用途にも適用することができる。 A phenotype is, for example, the expression of a person's genotype as a trait. It includes individual constitution, such as height, the color of the eyes, skin, and hair, and whether a person is predisposed to obesity, as well as predetermined diseases such as hypertension, diabetes, hypertensive disorders of pregnancy, and coronary artery disease. This embodiment is not limited to predetermined diseases and can also be applied to narrowing down gene mutations related to a predetermined phenotype.

ステップＳ７０３、Ｓ７０４において、ＤＮＡ変異絞込装置１１０は、例えば、ステップＳ７０１、Ｓ７０２の処理と並行して、複数の被験者におけるＤＮＡ変異の情報を取得する。例えば、ステップＳ７０３において、ＤＮＡ変異絞込装置１１０の入力受付部１１１は、複数の被験者のＤＮＡ情報を取得し、記憶部１１６に記憶する。また、ステップ７０４において、ＤＮＡ変異絞込装置１１０のＤＮＡ変異情報取得部１１３は、記憶部１１６に記憶した複数の被験者のＤＮＡ情報を解析して、ＤＮＡ変異の情報を抽出する。 In steps S703 and S704, the DNA mutation narrowing-down device 110 acquires information on DNA mutations in the plurality of subjects, for example in parallel with the processing of steps S701 and S702. For example, in step S703, the input reception unit 111 of the DNA mutation narrowing-down device 110 acquires the DNA information of the plurality of subjects and stores it in the storage unit 116. In step S704, the DNA mutation information acquisition unit 113 of the DNA mutation narrowing-down device 110 analyzes the DNA information of the plurality of subjects stored in the storage unit 116 and extracts the DNA mutation information.

なお、ステップＳ７０３、Ｓ７０４において、ＤＮＡ変異情報取得部１１３は、図２に示すＤＮＡ変異情報取得装置２１０や、ＤＮＡ変異情報ＤＢ２２０等から、複数の被験者におけるＤＮＡ変異の情報を取得するものであっても良い。 In steps S703 and S704, the DNA mutation information acquisition unit 113 acquires information on DNA mutations in a plurality of subjects from the DNA mutation information acquisition device 210 shown in FIG. 2, the DNA mutation information DB 220, or the like. Also good.

ステップＳ７０５において、ＤＮＡ変異絞込装置１１０の関連解析部１１４は、特徴量を目的変数、ＤＮＡ変異を説明変数として関連解析（検定）を行い、特徴量と各ＤＮＡ変異との関連度を示す統計的指標の値を算出する。例えば、関連解析部１１４は、特徴量と各ＤＮＡ変異との関連度を示す統計的指標の値として、ｐ値、ｆ値、オッズ比等を算出する。 In step S705, the association analysis unit 114 of the DNA mutation narrowing-down device 110 performs an association analysis (test) with the feature quantity as the objective variable and the DNA mutation as the explanatory variable, and calculates the value of a statistical index indicating the degree of association between the feature quantity and each DNA mutation. For example, the association analysis unit 114 calculates a p-value, an F value, an odds ratio, or the like as the value of the statistical index.

ステップＳ７０６において、ＤＮＡ変異絞込装置１１０のＤＮＡ変異抽出部１１５は、ステップＳ７０５で算出した統計的指標の値が、基準値以上、又は基準値以下となるＤＮＡ変異を抽出する。 In step S706, the DNA mutation extraction unit 115 of the DNA mutation narrowing down device 110 extracts DNA mutations for which the value of the statistical index calculated in step S705 is greater than or equal to the reference value or less than or equal to the reference value.

上記の処理により、ＤＮＡ変異絞込装置１１０は、ステップＳ７０３、Ｓ７０４で取得したＤＮＡ変異の中から、所定の表現型に関連するＤＮＡ変異を抽出することができる。 By the above processing, the DNA mutation narrowing down device 110 can extract DNA mutations associated with a predetermined phenotype from among the DNA mutations obtained in steps S703 and S704.

（応用例） (Application example)
図８は、一実施形態に係る第１の具体例のＤＮＡ変異の絞込処理の応用例を示すフローチャートである。 FIG. 8 is a flowchart showing an application example of the DNA mutation narrowing-down process of the first specific example according to one embodiment.

ここでは、上記の処理を用いて、ＤＮＡ情報のサンプルが少ない場合や、所定の疾患の定義が曖昧である場合であっても、所定の疾患に関連するＤＮＡ変異を容易に絞り込むことができるＤＮＡ変異絞込方法について説明する。 Here, using the above-described processing, even when there are few DNA information samples or when the definition of a given disease is ambiguous, DNA mutations associated with a given disease can be easily narrowed down. A method for narrowing down mutations will be described.

ここでは、具体的な一例として、所定の疾患が、妊娠高血圧症候群（以下HDP: Hypertensive Disorders of Pregnancy）であるものとして、以下の説明を行う。 Here, as a specific example, the following description will be given assuming that the prescribed disease is hypertensive disorders of pregnancy (hereinafter referred to as HDP: Hypertensive Disorders of Pregnancy).

ＨＤＰには、診断のガイドラインが設定されているが、実際には、ＨＤＰに分類されていなくてもＨＤＰのような血圧値が現れる場合がある。このような場合には、罹患群、非罹患群の遺伝的背景に十分な差がなく、所定の疾患に関連するＤＮＡ変異を十分に絞り込めない場合がある。なお、ＨＤＰは、罹患群と非罹患群の定義が曖昧な疾患の一例である。 Although diagnostic guidelines are set for HDP, in practice, blood pressure values similar to HDP may appear even if they are not classified as HDP. In such cases, there may not be a sufficient difference in genetic background between the diseased group and the non-affected group, and DNA mutations associated with a given disease may not be sufficiently narrowed down. HDP is an example of a disease in which the definition of an affected group and a non-affected group is ambiguous.

また、ここでは、図６に示すように、血圧の計測値の変化率を示す傾きが、ＨＤＰの罹患と関連しており、例えば、ＨＤＰに罹患している妊婦における血圧の計測値の傾きは、ＨＤＰに罹患していない妊婦における血圧の計測値の傾きより大きい傾向があるものとする。なお、妊婦の血圧の計測値は、所定の疾患に関連する量的形質の連続的な計測値の一例である。また、血圧の計測値の傾きは、正規化された特徴量の一例である。 Here, as shown in FIG. 6, it is assumed that the slope indicating the rate of change of the blood pressure measurement values is related to HDP; for example, the slope of the blood pressure measurement values in pregnant women with HDP tends to be greater than that in pregnant women without HDP. The blood pressure measurement values of a pregnant woman are an example of continuous measurement values of a quantitative trait related to the predetermined disease, and the slope of the blood pressure measurement values is an example of a normalized feature quantity.

ステップＳ８０１において、ＤＮＡ変異絞込装置１１０の入力受付部１１１は、ＨＤＰに罹患している被験者を含む、複数の被験者における血圧の計測値を含む連続的な量的形質の計測値を取得し、記憶部１１６に記憶する。 In step S801, the input reception unit 111 of the DNA mutation narrowing-down device 110 acquires continuous quantitative trait measurement values, including blood pressure measurement values, for a plurality of subjects including subjects suffering from HDP, and stores them in the storage unit 116.

ステップＳ８０２において、ＤＮＡ変異絞込装置１１０の特徴量抽出部１１２は、量的形質の経時的な変化から、ＨＤＰに関連する特徴量を抽出する。例えば、特徴量抽出部１１２は、図６に示すように、妊婦の血圧の計測値６０１を１次式「ｙ＝ａｘ＋ｂ」にフィッティングし、一次式の傾きａを特徴量として抽出する。 In step S802, the feature quantity extraction unit 112 of the DNA mutation narrowing down device 110 extracts the feature quantity related to HDP from the temporal change of the quantitative trait. For example, as shown in FIG. 6, the feature quantity extraction unit 112 fits a measured blood pressure value 601 of a pregnant woman to a linear expression "y=ax+b" and extracts the slope a of the linear expression as a feature quantity.

ステップＳ８０３において、ＤＮＡ変異絞込装置１１０のＤＮＡ変異情報取得部１１３は、ＨＤＰに罹患している被験者を含む複数の被験者のＤＮＡ情報から抽出された一塩基多型の情報を取得する。例えば、図２のＤＮＡ変異情報取得装置２１０は、ＨＤＰに罹患している被験者を含む複数の被験者のＤＮＡ情報を用いて、ＤＮＡ配列中における全ての一塩基多型の情報を抽出する。また、ＤＮＡ変異情報取得部１１３は、ＤＮＡ変異情報取得装置２１０から、抽出された一塩基多型の情報を取得する。なお、一塩基多型の情報は、ＤＮＡ変異の情報の一例である。 In step S803, the DNA mutation information acquisition unit 113 of the DNA mutation narrowing down device 110 acquires single nucleotide polymorphism information extracted from the DNA information of a plurality of subjects including subjects suffering from HDP. For example, the DNA mutation information acquisition apparatus 210 in FIG. 2 extracts information on all single nucleotide polymorphisms in a DNA sequence using DNA information of a plurality of subjects including subjects suffering from HDP. Further, the DNA mutation information acquisition unit 113 acquires information on the extracted single nucleotide polymorphism from the DNA mutation information acquisition device 210 . Information on single nucleotide polymorphisms is an example of information on DNA mutations.

ステップＳ８０４において、ＤＮＡ変異絞込装置１１０の関連解析部１１４は、ステップＳ８０２で抽出した特徴量を目的変数、ステップＳ８０３で取得した一塩基多型の情報に含まれる一塩基多型を説明変数として関連解析（検定）を行う。例えば、関連解析部１１４は、特徴量を目的変数、一塩基多型を説明変数とした回帰式において、各一塩基多型の係数のＷａｌｄ統計量から、有意確率を表すｐ値を算出する。 In step S804, the association analysis unit 114 of the DNA mutation narrowing down device 110 uses the feature amount extracted in step S802 as the objective variable and the single nucleotide polymorphism included in the single nucleotide polymorphism information acquired in step S803 as the explanatory variable. Perform an association analysis (test). For example, the association analysis unit 114 calculates the p-value representing the significance probability from the Wald statistic of the coefficient of each single nucleotide polymorphism in a regression equation in which the feature amount is the objective variable and the single nucleotide polymorphism is the explanatory variable.

ステップＳ８０５において、ＤＮＡ変異絞込装置１１０のＤＮＡ変異抽出部１１５は、ステップＳ８０３で取得した一塩基多型の情報に含まれる一塩基多型の中から、ｐ値が、予め定められた有意水準（例えば、５×１０^-８）より小さい一塩基多型を抽出する。 In step S805, the DNA mutation extraction unit 115 of the DNA mutation narrowing-down device 110 extracts, from among the single nucleotide polymorphisms included in the single nucleotide polymorphism information acquired in step S803, those whose p-value is smaller than a predetermined significance level (for example, 5×10⁻⁸).
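The extraction in step S805 amounts to a threshold filter over per-variant p-values. The following sketch assumes the p-values are already available as a mapping from a variant identifier to its p-value; the identifiers, the values, and the name `select_variants` are hypothetical, and only the threshold 5×10⁻⁸ comes from the text.

```python
SIGNIFICANCE_LEVEL = 5e-8  # example significance level given in the text


def select_variants(p_values, alpha=SIGNIFICANCE_LEVEL):
    """Return IDs of variants with p < alpha, ordered by ascending p-value."""
    hits = [(p, variant_id) for variant_id, p in p_values.items() if p < alpha]
    return [variant_id for p, variant_id in sorted(hits)]


# Hypothetical per-variant p-values from the association analysis.
p_values = {"snp_1": 3.2e-9, "snp_2": 0.04, "snp_3": 5e-8, "snp_4": 1.1e-12}
candidates = select_variants(p_values)
```

Because the text says "smaller than", the comparison is strict, so a variant whose p-value equals the significance level exactly (`snp_3` here) is not selected.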

これにより、図６に示すような、血圧の計測値の傾きａに関連する一塩基多型の情報、例えば、ＨＤＰに関連する一塩基多型の情報を抽出することができる。 This makes it possible to extract single nucleotide polymorphism information related to the slope a of the blood pressure measurement value, for example, single nucleotide polymorphism information related to HDP, as shown in FIG.

本実施形態によれば、例えば、ＨＤＰのように疾患の症状と血圧等の量的形質が関わっている疾患において、量的形質を用いた関連解析を行う際に、多時点の情報を扱うことで、関連解析（検定）の精度を高め、所定の疾患に関連する一塩基多型を容易に抽出（絞込）することができる。 According to this embodiment, for example, in a disease such as HDP in which symptoms of the disease and quantitative traits such as blood pressure are involved, when performing association analysis using quantitative traits, multi-time information can be handled. , it is possible to improve the accuracy of association analysis (test) and easily extract (narrow down) single nucleotide polymorphisms associated with a predetermined disease.

また、本実施形態では、例えば、図６に示すように、量的形質の連続的な計測値から抽出した、特徴量を用いて関連解析を行う。これにより、互いに異なる期間に計測された複数の被験者の計測値を、同様に処理することができる。また、計測期間とは異なる時点における量的形質の計測値を推定して、関連解析を行うこと（例えば、妊娠１０週～１５週の血圧の計測値に基づいて、妊娠初日の血圧値を推定して、関連解析を行う等）もできる。 In this embodiment, as shown for example in FIG. 6, the association analysis is performed using feature quantities extracted from continuous measurement values of the quantitative trait. This allows measurement values of multiple subjects that were recorded over different periods to be processed in the same way. It is also possible to estimate the value of the quantitative trait at a time point outside the measurement period and use it in the association analysis (for example, estimating the blood pressure value on the first day of pregnancy from blood pressure measurements taken during weeks 10 to 15 of pregnancy).

以上、本実施形態によれば、量的形質を用いた関連解析を行う際に、所定の疾患に関連するＤＮＡ変異を容易に絞り込むことができるＤＮＡ変異絞込装置、ＤＮＡ変異絞込システム、及びＤＮＡ変異絞込方法を提供することができる。 As described above, according to the present embodiment, a DNA mutation narrowing-down device, a DNA mutation narrowing-down system, and a A method for narrowing down DNA mutations can be provided.

［第２の具体例］ [Second specific example]
第１の具体例では、関連解析部１１４が、特徴量を目的変数、ＤＮＡ変異を説明変数として関連解析を行い、特徴量と各ＤＮＡ変異との関連度を示す統計的指標の値（例えば、ｐ値等）を算出する場合の処理の例について説明した。 The first specific example described processing in which the association analysis unit 114 performs an association analysis with the feature quantity as the objective variable and the DNA mutation as the explanatory variable, and calculates the value of a statistical index (for example, a p-value) indicating the degree of association between the feature quantity and each DNA mutation.

第２の具体例では、関連解析部１１４が、特徴量を、予め定めた基準値を用いて、罹患群と非罹患群とラベル付けし、各ＤＮＡ変異に対して、罹患群、及び非罹患群における保有数に差があるかを示す統計的指標の値を算出する場合の処理の例について説明する。 The second specific example describes processing in which the association analysis unit 114 uses a predetermined reference value to label the feature quantities as belonging to an affected group or an unaffected group, and calculates, for each DNA mutation, the value of a statistical index indicating whether the number of carriers differs between the affected group and the unaffected group.

図９は、一実施形態に係る第２の具体例のＤＮＡ変異の絞込処理の例を示すフローチャートである。なお、ここでは、第１の具体例と同様の処理に対する詳細な説明は省略する。 FIG. 9 is a flow chart showing an example of narrowing down processing for DNA mutations in the second specific example according to one embodiment. A detailed description of the same processing as in the first specific example is omitted here.

ステップＳ９０１において、ＤＮＡ変異絞込装置１１０の入力受付部１１１は、複数の被験者の連続的な量的形質の計測値を取得し、取得した連続的な量的形質の計測値を記憶部１１６に保存（記憶）する。 In step S901, the input reception unit 111 of the DNA mutation narrowing-down device 110 acquires continuous quantitative trait measurement values of a plurality of subjects and saves (stores) them in the storage unit 116.

ステップＳ９０２において、ＤＮＡ変異絞込装置１１０は、量的形質の経時的な変化から所定の表現型に関連する特徴量を抽出し、抽出した特徴量を予め定めた基準値を用いて罹患群と非罹患群とにラベル付けを行う。例えば、ＤＮＡ変異絞込装置１１０は、図１０に示すような罹患群、非罹患群のラベル付け処理を実行する。 In step S902, the DNA mutation narrowing-down device 110 extracts a feature quantity related to the predetermined phenotype from the temporal changes in the quantitative trait, and labels the extracted feature quantities as belonging to the affected group or the unaffected group using a predetermined reference value. For example, the DNA mutation narrowing-down device 110 executes the labeling process for the affected and unaffected groups shown in FIG. 10.

図１０は、一実施形態に係る第２の具体例の罹患群、非罹患群のラベル付け処理の例を示すフローチャートである。この処理は、図９のステップＳ９０２において、ＤＮＡ変異絞込装置１１０が実行する罹患群、非罹患群のラベル付け処理の一例を示している。 FIG. 10 is a flow chart showing an example of the process of labeling the diseased group and the non-diseased group in the second specific example according to one embodiment. This process shows an example of the diseased group and non-diseased group labeling process executed by the DNA mutation screening device 110 in step S902 of FIG.

ステップＳ１００１において、ＤＮＡ変異絞込装置１１０の特徴量抽出部１１２は、入力受付部１１１が取得した連続的な量的形質の計測値を、多項式にフィッティングする。 In step S1001, the feature extraction unit 112 of the DNA mutation narrowing down apparatus 110 fits the continuous quantitative trait measurement values acquired by the input reception unit 111 to a polynomial.

ステップＳ１００２において、特徴量抽出部１１２は、フィッティングした多項式の係数、切片等を算出する。 In step S1002, the feature quantity extraction unit 112 calculates coefficients, intercepts, etc. of the fitted polynomial.

ステップＳ１００３において、特徴量抽出部１１２は、算出した係数、切片、又は係数と切片を用いて、正規化された特徴量を抽出する。 In step S1003, the feature quantity extraction unit 112 extracts a normalized feature quantity using the calculated coefficient, intercept, or coefficient and intercept.

具体的な一例として、所定の表現型が、ＨＤＰである場合、特徴量抽出部１１２は、図６で前述したように、連続的な血圧（量的形質の一例）の計測値を１次式にフィッティングし、１次式の傾きを特徴量として抽出する。 As a specific example, when the predetermined phenotype is HDP, the feature quantity extraction unit 112 fits the continuous blood pressure measurement values (an example of a quantitative trait) to a linear expression, as described above with reference to FIG. 6, and extracts the slope of the linear expression as the feature quantity.

ステップＳ１００４において、ＤＮＡ変異絞込装置１１０の関連解析部１１４は、特徴量抽出部１１２が抽出した特徴量を、予め定めた基準値を用いて罹患群と非罹患群とに分類（ラベル付け）する。 In step S1004, the association analysis unit 114 of the DNA mutation narrowing down device 110 classifies (labels) the feature amount extracted by the feature amount extraction unit 112 into a diseased group and a non-diseased group using a predetermined reference value. do.

例えば、前述したように、ＨＤＰに罹患している妊婦における血圧の計測値の傾きは、ＨＤＰに罹患していない妊婦における血圧の計測値の傾きより大きい傾向があることから、予め定められた傾き（基準値）を用いて、罹患群と非罹患群とを分類することができる。具体的な一例として、関連解析部１１４は、予め定められた傾きより大きい特徴量を罹患群としてラベル付けし、予め定められた傾き以下の特徴量を非罹患群としてラベル付けすることができる。 For example, as described above, the slope of blood pressure measurements in pregnant women with HDP tends to be greater than the slope of blood pressure measurements in pregnant women without HDP. (reference value) can be used to classify diseased and non-diseased groups. As a specific example, the association analysis unit 114 can label a feature quantity larger than a predetermined slope as a diseased group, and label a feature quantity less than or equal to a predetermined slope as a non-diseased group.
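The labeling in step S1004 can be sketched as a simple threshold rule on the slope. The reference slope value, the subject identifiers, and the function name `label_groups` below are hypothetical; the rule itself (greater than the reference slope is affected, less than or equal is unaffected) follows the text.

```python
def label_groups(slopes, reference_slope):
    """Label each subject 'affected' if slope > reference_slope, else 'unaffected'."""
    return {
        subject_id: "affected" if slope > reference_slope else "unaffected"
        for subject_id, slope in slopes.items()
    }


# Hypothetical blood pressure slopes (mmHg per day) for three subjects.
slopes = {"A": 0.29, "B": 0.12, "C": 0.20}
labels = label_groups(slopes, reference_slope=0.20)
```

Subject C sits exactly on the reference slope and is therefore labeled unaffected, which mirrors the "less than or equal to" wording.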

ここで、図９に戻り、ＤＮＡ変異の絞込処理の例を示すフローチャートの説明を続ける。 Here, returning to FIG. 9, the description of the flowchart showing an example of narrowing down processing for DNA mutations will be continued.

ステップＳ９０３、Ｓ９０４において、ＤＮＡ変異絞込装置１１０は、例えば、ステップＳ９０１、Ｓ９０２の処理と並行して、複数の被験者におけるＤＮＡ変異の情報を取得する。 In steps S903 and S904, the DNA mutation narrowing down apparatus 110 acquires information on DNA mutations in a plurality of subjects, for example, in parallel with the processing of steps S901 and S902.

ステップＳ９０５において、ＤＮＡ変異絞込装置１１０の関連解析部１１４は、ＤＮＡ変異情報取得部１１３が取得したＤＮＡ変異の情報に含まれる各ＤＮＡ変異に対して、罹患群、及び非罹患群における保有数に差があるかを示す統計的指標の値を算出する。 In step S905, the association analysis unit 114 of the DNA mutation narrowing-down device 110 calculates, for each DNA mutation included in the DNA mutation information acquired by the DNA mutation information acquisition unit 113, the value of a statistical index indicating whether the number of carriers differs between the affected group and the unaffected group.

As a specific example, the association analysis unit 114 performs Fisher's exact test or a chi-square test to calculate a p-value.
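As an illustration of step S905, the sketch below computes a two-sided Fisher's exact test p-value for one DNA mutation from a 2x2 table of carrier counts. It is written in pure Python under stated assumptions: the table layout (rows are diseased and non-diseased groups, columns are carriers and non-carriers), the function name `fisher_exact_two_sided`, and the counts are all hypothetical, and the two-sided definition used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Assumed layout: a = diseased carriers, b = diseased non-carriers,
    c = non-diseased carriers, d = non-diseased non-carriers.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers in the diseased group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (small relative tolerance guards against rounding).
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Made-up counts for illustration only.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 6))
```

An exact test of this kind is often preferred over the chi-square approximation when cell counts are small, which fits the small-sample setting the description targets.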

In step S906, the DNA mutation extraction unit 115 of the DNA mutation narrowing-down device 110 extracts DNA mutations for which the value of the statistical index calculated by the association analysis unit 114 is equal to or greater than a reference value, or equal to or less than a reference value.

As a specific example, the DNA mutation extraction unit 115 extracts DNA mutations whose p-value is less than or equal to a predetermined reference value (for example, 0.05).
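The extraction rule of step S906 amounts to a threshold filter over per-mutation p-values. The following is a minimal sketch; the function name `extract_mutations`, the mutation identifiers, and the p-values are hypothetical, and 0.05 is used only because the text gives it as an example.

```python
from typing import Dict, List


def extract_mutations(p_values: Dict[str, float], reference: float = 0.05) -> List[str]:
    """Return the mutations whose p-value is less than or equal to the reference."""
    return [mutation for mutation, p in p_values.items() if p <= reference]


# Made-up p-values for illustration only.
p_values = {"snp_example_1": 0.003, "snp_example_2": 0.20, "snp_example_3": 0.05}
selected = extract_mutations(p_values)
print(selected)
```

A mutation whose p-value equals the reference value is kept, in line with the "less than or equal to" condition.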

In the above processing, for example, by taking HDP as the predetermined phenotype, blood pressure measurements of pregnant women as the measured values of the quantitative trait, and the slope of the blood pressure measurements as the feature quantity, DNA mutations (single nucleotide polymorphisms) associated with HDP can be extracted, as in the first specific example.
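To make the slope feature concrete, the sketch below fits a first-degree polynomial to one subject's blood pressure series by ordinary least squares and returns the slope and intercept. It assumes a straight-line fit as the "predetermined polynomial"; the function name `fit_line`, the measurement times, and the blood pressure values are made up for illustration.

```python
from typing import Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values = slope * time + intercept."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs of equal length")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all measurement times are identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Made-up series: gestational week and systolic blood pressure (mmHg).
weeks = [12, 16, 20, 24, 28]
systolic = [110, 112, 114, 116, 118]
slope, intercept = fit_line(weeks, systolic)
print(slope, intercept)
```

The resulting slope (or the intercept, or both) would then serve as the per-subject feature quantity passed to the labeling and association steps.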

The second specific example likewise provides a DNA mutation narrowing-down device, a DNA mutation narrowing-down system, and a DNA mutation narrowing-down method that can easily narrow down DNA mutations associated with a predetermined disease, even when the number of samples is small or the definition of the disease is ambiguous.

As described above, according to the embodiments of the present invention, significant DNA mutations can be narrowed down even when the number of samples is small or the definition of the disease is ambiguous, that is, in cases where ordinary genome-wide association analysis does not yield useful results. This is expected to make it possible to identify disease-associated DNA mutations that could not be identified with conventional techniques, and to use them for early detection and prevention of disease.

<Supplement>
The configuration diagrams of FIGS. 1 and 2 show blocks in functional units. These functional blocks are realized by any combination of hardware and/or software. The means for realizing each functional block is not particularly limited. That is, each functional block may be realized by one physically and/or logically coupled device, or by two or more physically and/or logically separate devices that are connected directly and/or indirectly (for example, by wire and/or wirelessly).

The hardware configuration of the DNA mutation narrowing-down device 110 shown in FIG. 3 may include one or more of each device shown in the figure, or may omit some of the devices. The DNA mutation narrowing-down device 110 may also include hardware such as a microprocessor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a programmable logic device (PLD), or a field programmable gate array (FPGA), and some or all of the functional blocks may be realized by that hardware. For example, the processor 301 may be implemented with at least one of these pieces of hardware.

The order of the processing procedures, sequences, flowcharts, and the like of each aspect/embodiment described in this specification may be changed as long as no contradiction arises. For example, the methods described in this specification present the elements of the various steps in an exemplary order and are not limited to the specific order presented.

Input and output information and the like may be stored in a specific location (for example, a memory) or may be managed in a management table. Input and output information and the like may be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.

A determination may be made by a value represented by one bit (0 or 1), by a Boolean value (true or false), or by a numerical comparison (for example, comparison with a predetermined value).

Each aspect/embodiment described in this specification may be used alone, in combination, or switched during execution. Notification of predetermined information (for example, notification of "being X") is not limited to explicit notification and may be performed implicitly (for example, by not notifying the predetermined information).

Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or by another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.

Software, instructions, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, server, or other remote source using wired technologies such as coaxial cable, fiber optic cable, twisted pair, and digital subscriber line (DSL) and/or wireless technologies such as infrared, radio, and microwave, these wired and/or wireless technologies are included within the definition of a transmission medium.

The information, signals, and the like described in this specification may be represented using any of a variety of different technologies. For example, data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.

Terms described in this specification and/or terms necessary for understanding this specification may be replaced with terms having the same or similar meanings.

The information, parameters, and the like described in this specification may be represented by absolute values, by values relative to a predetermined value, or by other corresponding information. For example, a radio resource may be indicated by an index.

As used in this specification, the phrase "based on" does not mean "based only on" unless expressly specified otherwise. In other words, the phrase "based on" means both "based only on" and "based at least on."

To the extent that "including," "comprising," and variations thereof are used in this specification or the claims, these terms are intended to be inclusive, in the same way as the term "provided with." Furthermore, the term "or" as used in this specification or the claims is intended not to be an exclusive OR.

Throughout this disclosure, where articles such as a, an, and the in English have been added by translation, these articles include the plural unless the context clearly indicates otherwise.

Although the present invention has been described in detail above, it will be apparent to those skilled in the art that the present invention is not limited to the embodiments described in this specification. The present invention can be implemented with modifications and variations without departing from the spirit and scope of the invention as defined by the claims. Accordingly, the description in this specification is for illustrative purposes and has no restrictive meaning with respect to the present invention.

100 DNA mutation narrowing-down system
110 DNA mutation narrowing-down device
112 Feature quantity extraction unit
113 DNA mutation information acquisition unit (acquisition unit)
114 Association analysis unit (analysis unit)
115 DNA mutation extraction unit

Claims

A DNA mutation screening device comprising:
a feature quantity extraction unit that fits a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracts a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition unit that acquires information on DNA mutations in the plurality of subjects;
an analysis unit that analyzes the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction unit that extracts a DNA mutation associated with the predetermined disease based on the analysis result of the analysis unit,
wherein the analysis unit performs association analysis using the feature quantity as an objective variable and the DNA mutations as explanatory variables, and calculates a statistical index value indicating the degree of association between the feature quantity and each DNA mutation, and
the DNA mutation extraction unit extracts DNA mutations for which the value of the statistical index calculated by the analysis unit is equal to or greater than a predetermined reference value, or equal to or less than the reference value.

A DNA mutation screening device comprising:
a feature quantity extraction unit that fits a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracts a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition unit that acquires information on DNA mutations in the plurality of subjects;
an analysis unit that analyzes the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction unit that extracts a DNA mutation associated with the predetermined disease based on the analysis result of the analysis unit,
wherein the analysis unit
labels the feature quantities extracted by the feature quantity extraction unit, using a predetermined reference value, into a diseased group affected by the predetermined disease and a non-diseased group not affected by the predetermined disease, and
calculates, for each DNA mutation included in the DNA mutation information acquired by the acquisition unit, a statistical index value indicating whether the number of carriers differs between the diseased group and the non-diseased group, and
the DNA mutation extraction unit extracts DNA mutations for which the value of the statistical index calculated by the analysis unit is equal to or greater than a predetermined reference value, or equal to or less than the reference value.

3. The DNA mutation screening device according to claim 1, wherein
the DNA mutation includes a single nucleotide polymorphism,
the statistical index includes a p-value representing a significance probability, and
the DNA mutation extraction unit extracts the single nucleotide polymorphisms whose p-value is smaller than a predetermined significance level.

4. The DNA mutation screening device according to any one of claims 1 to 3, wherein
the predetermined disease includes gestational hypertension, and
the continuous measurements of the quantitative trait associated with the predetermined disease include blood pressure values measured at predetermined time intervals.

A DNA mutation screening system comprising:
a feature quantity extraction unit that fits a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracts a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition unit that acquires information on DNA mutations in the plurality of subjects;
an analysis unit that analyzes the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction unit that extracts a DNA mutation associated with the predetermined disease based on the analysis result of the analysis unit,
wherein the analysis unit performs association analysis using the feature quantity as an objective variable and the DNA mutations as explanatory variables, and calculates a statistical index value indicating the degree of association between the feature quantity and each DNA mutation, and
the DNA mutation extraction unit extracts DNA mutations for which the value of the statistical index calculated by the analysis unit is equal to or greater than a predetermined reference value, or equal to or less than the reference value.

A DNA mutation screening system comprising:
a feature quantity extraction unit that fits a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracts a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition unit that acquires information on DNA mutations in the plurality of subjects;
an analysis unit that analyzes the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction unit that extracts a DNA mutation associated with the predetermined disease based on the analysis result of the analysis unit,
wherein the analysis unit
labels the feature quantities extracted by the feature quantity extraction unit, using a predetermined reference value, into a diseased group affected by the predetermined disease and a non-diseased group not affected by the predetermined disease, and
calculates, for each DNA mutation included in the DNA mutation information acquired by the acquisition unit, a statistical index value indicating whether the number of carriers differs between the diseased group and the non-diseased group, and
the DNA mutation extraction unit extracts DNA mutations for which the value of the statistical index calculated by the analysis unit is equal to or greater than a predetermined reference value, or equal to or less than the reference value.

A DNA mutation screening method in which a computer executes:
a feature quantity extraction process of fitting a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracting a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition process of acquiring information on DNA mutations in the plurality of subjects;
an analysis process of analyzing the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction process of extracting a DNA mutation associated with the predetermined disease based on the analysis result of the analysis process,
wherein, in the analysis process, association analysis is performed using the feature quantity as an objective variable and the DNA mutations as explanatory variables, and a statistical index value indicating the degree of association between the feature quantity and each DNA mutation is calculated, and
in the DNA mutation extraction process, DNA mutations for which the value of the statistical index calculated in the analysis process is equal to or greater than a predetermined reference value, or equal to or less than the reference value, are extracted.

A DNA mutation screening method in which a computer executes:
a feature quantity extraction process of fitting a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracting a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition process of acquiring information on DNA mutations in the plurality of subjects;
an analysis process of analyzing the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction process of extracting a DNA mutation associated with the predetermined disease based on the analysis result of the analysis process,
wherein the analysis process includes
labeling the feature quantities extracted in the feature quantity extraction process, using a predetermined reference value, into a diseased group affected by the predetermined disease and a non-diseased group not affected by the predetermined disease, and
calculating, for each DNA mutation included in the DNA mutation information acquired in the acquisition process, a statistical index value indicating whether the number of carriers differs between the diseased group and the non-diseased group, and
in the DNA mutation extraction process, DNA mutations for which the value of the statistical index calculated in the analysis process is equal to or greater than a predetermined reference value, or equal to or less than the reference value, are extracted.

A program causing a computer to execute:
a feature quantity extraction process of fitting a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracting a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition process of acquiring information on DNA mutations in the plurality of subjects;
an analysis process of analyzing the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction process of extracting a DNA mutation associated with the predetermined disease based on the analysis result of the analysis process,
wherein, in the analysis process, association analysis is performed using the feature quantity as an objective variable and the DNA mutations as explanatory variables, and a statistical index value indicating the degree of association between the feature quantity and each DNA mutation is calculated, and
in the DNA mutation extraction process, DNA mutations for which the value of the statistical index calculated in the analysis process is equal to or greater than a predetermined reference value, or equal to or less than the reference value, are extracted.

A program causing a computer to execute:
a feature quantity extraction process of fitting a predetermined polynomial to continuous measurements of a quantitative trait associated with a predetermined disease, collected from a plurality of subjects including subjects suffering from the predetermined disease, and extracting a feature quantity for each subject using the coefficient of the polynomial, the intercept of the polynomial, or the coefficient and intercept of the polynomial;
an acquisition process of acquiring information on DNA mutations in the plurality of subjects;
an analysis process of analyzing the relationship between the feature quantity and the DNA mutations; and
a DNA mutation extraction process of extracting a DNA mutation associated with the predetermined disease based on the analysis result of the analysis process,
wherein the analysis process includes
labeling the feature quantities extracted in the feature quantity extraction process, using a predetermined reference value, into a diseased group affected by the predetermined disease and a non-diseased group not affected by the predetermined disease, and
calculating, for each DNA mutation included in the DNA mutation information acquired in the acquisition process, a statistical index value indicating whether the number of carriers differs between the diseased group and the non-diseased group, and
in the DNA mutation extraction process, DNA mutations for which the value of the statistical index calculated in the analysis process is equal to or greater than a predetermined reference value, or equal to or less than the reference value, are extracted.