JP2004113661A

JP2004113661A - Program, data base, system, and method for anticipating effectiveness of therapeutic method

Info

Publication number: JP2004113661A
Application number: JP2002284455A
Authority: JP
Inventors: Toshiharu Mishiro; 三代　俊治; Hirohiko Ota; 太田　裕彦; Satoshi Ito; 伊藤　聡; Yoshiko Hiraoka; 平岡　佳子; Michie Hashimoto; 橋本　みちえ; Noriko Matsuyama; 松山　徳子
Original assignee: Toshiba Corp; Genecare Research Institute Co Ltd
Current assignee: Toshiba Corp; Genecare Research Institute Co Ltd
Priority date: 2002-09-27
Filing date: 2002-09-27
Publication date: 2004-04-15
Anticipated expiration: 2022-09-27
Also published as: JP4284050B2

Abstract

PROBLEM TO BE SOLVED: To predict the effectiveness of a treatment according to genes on both the virus side and the patient side. SOLUTION: An effectiveness determination means 22 reads a human weight coefficient indicating the correlation between a human genotype and the effectiveness of the treatment and a virus weight coefficient indicating the correlation between a virus genotype and the effectiveness of the treatment; computes individual human effectiveness prediction values by multiplying, for each genotype, the subject's human genotype information by the human weight coefficient; computes individual virus effectiveness prediction values by multiplying, for each genotype, the subject's virus genotype information by the virus weight coefficient; computes a summed effectiveness prediction value by adding all of the individual human and virus values; and outputs the summed value. COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、遺伝子の存在状態に基づき治療法の有効性を予測するためのプログラム、データベース、システム及び方法に関する。
【０００２】
【従来の技術】
患者への薬剤投与などの治療法の有効性は、患者の状態によって変わってくる場合が多い。そのため、患者の臨床データに基づき、これを解析して患者側にフィードバックする仕組みが考案されている（例えば特許文献１参照）。これによれば、患者毎に蓄積された臨床データを解析及び評価し、臨床的危険度をスコア化し、医療者に適切な医療処置を表示する。
【０００３】
【特許文献１】
特開２００２−９５６５０号公報（第１頁、左上欄１行〜１８行）
【０００４】
【発明が解決しようとする課題】
前述した通り、臨床データをスコア化し、患者の臨床的危険度を表示することにより、医療者に適切な医療処置を施すことが高まる。
【０００５】
しかしながら、例えば患者がウィルスに感染している場合、そのウィルスに対する治療の有効性は臨床データのみにより定まるものではない。この場合、患者側の遺伝子や、ウィルス側の遺伝子などにより治療の有効性が変わるものと考えられている。しかしながら、これらウィルス側や患者側の遺伝子に基づき、治療の有効性を決定する仕組みは存在しなかった。
【０００６】
本発明は上記課題を解決するためになされたもので、その目的とするところは、ウィルス側や患者側の遺伝子に応じた治療の有効性を予測するためのプログラム、データベース、システム及び方法を提供することにある。
【０００７】
【課題を解決するための手段】
この発明の一の観点によれば、コンピュータに、治療の有効性を判定するための有効性判定データベースを生成させる処理を実行させる有効性判定データベース生成プログラムであって、この有効性判定データベース生成プログラムは前記コンピュータに、検体毎に、検体の感染宿主遺伝子型を示す感染宿主遺伝子型情報と、検体が感染した感染病原体の遺伝子型を示す感染病原体遺伝子型情報と、前記感染病原体に対する治療の有効性を示す有効性情報が与えられた検体データを複数の検体について読み出させ、前記複数の検体についての検体データに基づき、前記感染宿主遺伝子型情報及び前記感染病原体遺伝子型情報の少なくとも一方と前記治療の有効性との相関度を示す重み係数を、遺伝子型毎に算出させ、前記重み係数を、遺伝子型毎に記憶装置に格納させることを特徴とする有効性判定データベース生成プログラムが提供される。
【０００８】
また、本発明の別の観点によれば、１あるいは複数の感染宿主遺伝子型と、治療の有効性との相関度を示す感染宿主重み係数とを記録した第１のフィールドと、１あるいは複数の感染病原体遺伝子型と、前記治療の有効性との相関度を示す感染病原体重み係数とを記録した第２のフィールドとを有することを特徴とする有効性判定データのデータ構造が提供される。
【０００９】
また、本発明のさらに別の観点によれば、コンピュータに、治療の有効性を判定する処理を実行させる有効性判定プログラムであって、この有効性判定プログラムは前記コンピュータに、１あるいは複数の感染宿主遺伝子型と治療の有効性との相関度を示す感染宿主重み係数と、１あるいは複数の感染病原体遺伝子型と前記治療の有効性との相関度を示す感染病原体重み係数とを読み出させ、被検体の感染宿主遺伝子型情報と、前記感染宿主重み係数を遺伝子型毎に乗算して感染宿主有効性予測個別値を算出させ、被検体の感染病原体遺伝子型情報と、前記感染病原体重み係数を遺伝子型毎に乗算して感染病原体有効性予測個別値を算出させ、前記感染宿主有効性予測個別値の各々及び感染病原体有効性予測個別値の各々を加算して有効性予測加算値を算出させ、前記有効性予測加算値あるいは前記有効性予測加算値に所定の数値を加算した値に基づき治療の有効性を判定し、前記判定結果を出力させることを特徴とする有効性判定プログラムが提供される。
【００１０】
また、プログラムに係る本発明は、そのプログラムを実行するためのコンピュータにより構成される装置、そのプログラムによりコンピュータで実行される手順からなる方法、そのプログラムを記録した記録媒体の発明としても成立する。
【００１１】
【発明の実施の形態】
以下、図面を参照しながら本発明の実施形態を説明する。
【００１２】
なお、本実施形態では、本発明をＣ型肝炎ウィルスに対するインターフェロンを用いた治療法の有効性の予測に適用する例として説明する。
【００１３】
（第１実施形態）
図１は本発明の第１実施形態に係る治療法有効性予測装置の全体構成を示す図である。
【００１４】
図１に示すように、治療法有効性予測装置１００は、コンピュータ１０と、このコンピュータ１０と通信ネットワーク１１を介して接続された検体統計データベース１２から構成される。
【００１５】
コンピュータ１０は、入力装置１と、この入力装置１に接続された処理装置２と、この処理装置２に接続された有効性判定データベース３と、処理装置２に接続された出力装置４及び記憶媒体読取装置５から構成される。このコンピュータ１０は例えばパーソナルコンピュータにより実現される。コンピュータ１０は通信インタフェース（不図示）を介して通信インタフェースに接続される。コンピュータ１０は、この通信インタフェースを介して通信ネットワーク１１との間でデータを送受信することができる。
【００１６】
入力装置１は、例えばキーボードやマウスなどにより実現される。処理装置２は、ＣＰＵなど、一般的なコンピュータの演算処理を実現するハードウェアにより実現される。有効性判定データベース３は、磁気ディスク、光学式ディスクなどにより実現される。出力装置４は、例えばディスプレイやプリンタなどにより実現される。記憶媒体読取装置５は、例えば磁気ディスク読取装置、ＣＤ−ＲＯＭ読取装置、ＤＶＤ読取装置などにより実現される。
【００１７】
入力装置１は、治療が施される個体である患者の遺伝子型に関する情報と、その患者が感染しているウィルスの遺伝子型に関する情報や、その他処理装置２における処理に必要な各種データを入力するための装置である。
【００１８】
処理装置２は、検体統計データベース１２に基づく有効性判定データベース３の生成処理や、入力装置１から入力されたデータ及び有効性判定データベース３に基づく有効性判定処理を実行する。
【００１９】
処理装置２は、例えば処理装置に接続された記憶媒体読取装置５からＣＤ−ＲＯＭやＤＶＤ、磁気ディスクなどの記憶媒体６に記憶されたデータベース生成プログラムや有効性判定プログラムを読み取り、それらプログラムを実行する。これにより、処理装置２が前述した生成処理を行うデータベース生成手段２１や有効性判定処理を行う有効性判定手段２２として機能する。もちろん、処理装置２内蔵のメモリや記憶装置などからこれらプログラムを読み取り実行されるようにしてもよい。また、処理装置２は、これらデータベース生成手段２１や有効性判定手段２２による処理以外の他の処理も、前述のプログラムに基づき実行する。
【００２０】
データベース生成手段２１は、著効率算出処理、オッズ算出処理、χ^２値算出処理、重み係数算出処理などを実行する。有効性判定手段２２は、遺伝子型情報入力処理、有効性判定処理などを実行する。
【００２１】
出力装置４は、処理装置２で処理された結果を表示し、あるいは処理結果を印字出力する。
【００２２】
検体統計データベース（ＤＢ）１２の構成の一例を図２に示す。図２に示すように、検体番号、複数のヒト遺伝子型識別情報及び有効性情報の各々がデータベースの各フィールドに検体毎に関連づけて格納されている。図２に示す検体統計データベース１２は、ウィルス遺伝子型識別情報で識別されるウィルス遺伝子型毎に与えられている。具体的には、図２の例では、インターフェロン治療に関連するＣ型肝炎ウィルスの遺伝子型を識別するウィルス遺伝子型として、ＨＣＶ−１ｂ、ＨＣＶ−２ａ、ＨＣＶ−２ｂなどが挙げられる。
【００２３】
検体番号は、検体としての患者を識別するための検体識別情報として用いられる番号である。ヒト遺伝子型識別情報は、患者の遺伝子型を識別する情報である。有効性情報は、その患者への治療の有効性（薬剤の投与であれば薬剤の有効性）を示す情報である。以下の実施形態では、既にヒト遺伝子型情報、ウィルス遺伝子型情報、薬剤投与などの治療を施した結果としての著効／非著効の別が分かっている対象（患者）を検体と呼び、以下の実施形態の発明による有効性判定の対象（患者）を被検体と呼ぶ。
【００２４】
図２には、Ｃ型肝炎のインターフェロンを用いた治療に関連するヒト遺伝子型識別情報として、ＭｘＡ−８８、ＭｘＡ−１２３、ＭＢＬ、ＬＭＰ７、ＩＦＮＡＲ１（ＧＴ）_ｎ、ＩＦＮＡＲ１　Ｃ／ＴなどのＳＮＰｓに関する情報が挙げられている。ＩＦＮＡＲ１（ＧＴ）_ｎ及びＩＦＮＡＲ１　Ｃ／Ｔは、インターフェロン受容体遺伝子の遺伝子型を識別する情報である。有効性情報は、「著効」あるいは「非著効」のいずれかで示される。より具体的には、図２の例では、インターフェロン療法、すなわちＣ型肝炎の患者にインターフェロンを投与する治療の有効性が「著効」あるいは「非著効」で示されている。
【００２５】
ヒト遺伝子型情報は、それぞれ２つあるいはそれ以上の情報で区別される。ウィルス遺伝子型情報も同様に、２つあるいはそれ以上の情報で区別される。例えばヒト遺伝子型のＭｘＡ−８８は、Ｇ／Ｔ、Ｔ／Ｔ及びＧ／Ｇの３種類の対立遺伝子型のいずれかに分類される。ＭｘＡ−１２３及びＬＭＰ７は、Ｃ／Ａ、Ａ／Ａ及びＣ／Ｃの３種類の対立遺伝子型のいずれかに分類される。ＭＢＬは、ＹＡ及びＸＢの２種類の対立遺伝子型のいずれかに分類される。ＩＦＮＡＲ１（ＧＴ）_ｎは、５／５、５／１４及びそれ以外（ｏｔｈｅｒｓ）の３種類の対立遺伝子型のいずれかに分類される。ＩＦＮＡＲ１　Ｃ／Ｔは、Ｃ／Ｔ、Ｔ／Ｔ及びＣ／Ｃの３種類の対立遺伝子型のいずれかに分類される。なお、対立遺伝子型による分類の他に対立ハプロタイプにより分類してもよい。
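[0026]
"Genotype" refers to the state of presence of all alleles, or of the gene at the locus of interest. "Locus" refers to the position of a gene on a chromosome or on a genetic map. For example, genes that are alleles of one another lie at the same locus. "Allele" refers to genes that are located at homologous positions on homologous chromosomes and are also functionally homologous.
[0027]
FIG. 3 shows an example of the configuration of the effectiveness determination database 3. As shown in FIG. 3, each of a plurality of human genotype weight coefficients is stored in a field of the database. The human genotype weight coefficients shown in FIG. 3 are an example given for the virus genotype HCV-1b. Human genotype weight coefficients are given in the same way for other virus genotypes such as HCV-2a/b. In the following embodiments, a weight coefficient is a numerical expression of the influence that a given genotype has on the effectiveness of a treatment, that is, of its degree of correlation with the effectiveness of the treatment. In the example of FIG. 3, each value is shown as the weight coefficient multiplied by a sign.
[0028]
Next, the method by which the database generation means 21 creates the effectiveness determination database 3 is described along the flowchart of FIG. 4.
[0029]
First, a marked-response rate (SR rate) calculation process is executed, which calculates the rate of marked response to the treatment for each human genotype in the sample statistics database 12 (s1). Specifically, one human genotype is selected, and the SR rate is calculated for each allelic type or allelic haplotype of that genotype.
[0030]
For example, the genotype MxA-88 has the allelic types G/T, T/T and G/G. When these three allelic types are grouped into a first type consisting of G/T and T/T and a second type consisting of G/G, the proportion of marked-response sample data among all sample data belonging to the first type is calculated. If, among the samples belonging to the first type, X_a showed a marked response (SR) and Y_a showed non-response (NR), the SR rate of the first type (SRrate_1) is
SRrate_1 = 100 × X_a / (X_a + Y_a) (%)
Likewise, if, among the samples belonging to the second type, X_b showed a marked response (SR) and Y_b showed non-response (NR), the SR rate of the second type (SRrate_2) is
SRrate_2 = 100 × X_b / (X_b + Y_b) (%)
In this way, when the allelic types in the sample statistics database 12 fall into two or more categories, they are grouped into two types, 1 and 2, and the SR rates SRrate_1 and SRrate_2 are calculated for each type. The grouping into two types is preferably chosen so that the two types differ in their degree of correlation with the effectiveness of the treatment; more specifically, so that the difference between the SR rates of the two types is large. These two types are referred to below as the "first type" and the "second type".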
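The SR rate formula above can be sketched in a few lines of Python. The function name `sr_rate` and the sample counts are hypothetical and are not taken from the patent's figures; the empty-type check is an added assumption, since the text does not say how a type with no samples is handled.

```python
def sr_rate(sr_count: int, nr_count: int) -> float:
    """Return the SR rate in percent: 100 * X / (X + Y)."""
    total = sr_count + nr_count
    if total == 0:
        # Assumption: a type with no samples has no defined SR rate.
        raise ValueError("type contains no samples")
    return 100.0 * sr_count / total


# Hypothetical counts for the two types of one genotype.
rate_1 = sr_rate(sr_count=30, nr_count=20)  # first type, e.g. G/T and T/T
rate_2 = sr_rate(sr_count=10, nr_count=40)  # second type, e.g. G/G
print(rate_1, rate_2)
```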
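[0031]
The grouping into two types may also be automated. In that case, the SR rates that would result from every possible way of grouping the allelic types into two types are calculated, and the grouping for which the difference between the two SR rates is largest is selected. For example, when four genotypes A, B, C and D are to be grouped into two types, seven groupings are possible: (A, B) and (C, D); (A, C) and (B, D); (A, D) and (B, C); (A) and (B, C, D); (B) and (A, C, D); (C) and (A, B, D); and (D) and (A, B, C). For each of these seven groupings, the SR rates of the two types and their difference are calculated, and the grouping that maximizes the difference is selected.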
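As a minimal sketch of this automated grouping, the following Python enumerates every split of a set of categories into two non-empty groups and keeps the split with the largest SR rate difference. The function names and the counts are hypothetical, and the code assumes every category has at least one sample.

```python
from itertools import combinations


def two_way_groupings(categories):
    """Yield each split of `categories` into two non-empty groups exactly once."""
    first, rest = categories[0], categories[1:]
    # The group holding `first` never takes all of `rest`, so the other
    # group is never empty and no split is produced twice.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            group_1 = (first, *extra)
            group_2 = tuple(c for c in rest if c not in extra)
            yield group_1, group_2


def group_sr_rate(counts, group):
    """SR rate (%) of a group; counts[c] is (SR count, NR count)."""
    sr = sum(counts[c][0] for c in group)
    nr = sum(counts[c][1] for c in group)
    return 100.0 * sr / (sr + nr)


def best_grouping(counts):
    """Return the split whose two SR rates differ the most."""
    return max(
        two_way_groupings(sorted(counts)),
        key=lambda g: abs(group_sr_rate(counts, g[0]) - group_sr_rate(counts, g[1])),
    )


# Hypothetical (SR, NR) counts for four categories.
counts = {"A": (8, 2), "B": (7, 3), "C": (2, 8), "D": (1, 9)}
print(len(list(two_way_groupings(sorted(counts)))))  # 7
print(best_grouping(counts))
```

For n categories this enumerates 2^(n-1) - 1 splits, which matches the seven groupings listed for four categories.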
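[0032]
The description above used MxA-88 as an example, but the same grouping into two types and calculation of the SR rate for each type are carried out for the other genotypes, namely MxA-123, MBL, LMP7, IFNAR1 (GT)_n, IFNAR1 C/T and so on.
[0033]
MxA-123 is grouped into a type consisting of C/A and A/A and a type consisting of C/C. MBL is grouped into the YA type and the XB type. LMP7 is grouped into a type consisting of C/A and A/A and a type consisting of C/C. IFNAR1 (GT)_n is grouped into a type consisting of 5/5 and 5/14 and a type consisting of all others. IFNAR1 C/T is grouped into a type consisting of C/T and T/T and a type consisting of C/C.
[0034]
The SR rates obtained are stored, type by type, in a database not shown (which may be the effectiveness determination database 3).
[0035]
Next, an odds calculation process is executed (s2). The odds O_k is the larger of the two types' SR rates divided by the smaller, and is given by the following expressions, where k is an identification number assigned to each genotype.
[0036]
O_k = SRrate_1 / SRrate_2 (when SRrate_1 > SRrate_2)
O_k = SRrate_2 / SRrate_1 (when SRrate_1 < SRrate_2)
When SRrate_1 = SRrate_2, either of the two expressions may be used.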
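The odds definition can be illustrated as follows. The function name `odds` is hypothetical, and raising an error when the smaller SR rate is zero is an assumption: the text does not say how that case is handled.

```python
def odds(rate_1: float, rate_2: float) -> float:
    """Return O_k: the larger SR rate divided by the smaller one."""
    larger, smaller = max(rate_1, rate_2), min(rate_1, rate_2)
    if smaller == 0:
        # Assumption: the ratio is undefined when one type has no marked responses.
        raise ZeroDivisionError("smaller SR rate is zero")
    return larger / smaller


print(odds(60.0, 20.0))  # 3.0
print(odds(50.0, 50.0))  # 1.0
```

Because the larger rate is always the numerator, O_k is at least 1, so the factor (O_k − 1) used later for the weight coefficient is never negative.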
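[0037]
Using these expressions, the odds O_k is calculated for every genotype, and the resulting odds O_k are stored in a database not shown (which may be the effectiveness determination database 3).
[0038]
Next, a χ² value calculation process is executed (s3). In what follows, c_k = χ². The χ² value c_k is calculated by the following expression.
[0039]
c_k = χ² = (n − 1) s² / σ²
Here the random variables are X_a, Y_a, X_b and Y_b, n is the sample size, σ² is the variance, and s² is the sample variance. More concretely, the chi-square value χ² is obtained by entering the values X_a, Y_a, X_b and Y_b into the 2×2 contingency table of a chi-square test.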
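One way to carry out the 2×2 contingency-table calculation is the standard Pearson chi-square statistic, sketched below. This is an assumption about the intended computation: the text does not state whether a continuity correction is applied, and none is used here. The function name and the counts are hypothetical.

```python
def chi_square_2x2(x_a: int, y_a: int, x_b: int, y_b: int) -> float:
    """Pearson chi-square for the table [[X_a, Y_a], [X_b, Y_b]], no continuity correction."""
    n = x_a + y_a + x_b + y_b
    row_1, row_2 = x_a + y_a, x_b + y_b    # samples in type 1 and type 2
    col_sr, col_nr = x_a + x_b, y_a + y_b  # marked responses and non-responses
    denominator = row_1 * row_2 * col_sr * col_nr
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (x_a * y_b - y_a * x_b) ** 2 / denominator


# Hypothetical counts: type 1 has 30 SR / 20 NR, type 2 has 10 SR / 40 NR.
c_k = chi_square_2x2(30, 20, 10, 40)
print(round(c_k, 2))  # 16.67
```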
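[0040]
Next, a weight coefficient calculation process is executed, which calculates the weight coefficient S_k from the odds O_k and the χ² value (s4). The weight coefficient S_k is calculated by the following expression.
[0041]
S_k = (O_k − 1) c_k / 20
Next, a sign assignment process is executed, which assigns signs to the two types for the weight coefficient S_k obtained (s5). The weight coefficient S_k itself takes a positive value. The positive sign (+) is assigned to the type with the higher SR rate, and the negative sign to the type with the lower SR rate. Specifically, the SR rates of the two types of each genotype are compared, the type with the larger SR rate is given the positive sign, and the type with the smaller SR rate is given the negative sign.
[0042]
The sign assignment process is executed in this way for each genotype, and the resulting weight coefficients S_k and −S_k are stored in the effectiveness determination database 3.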
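Steps (s4) and (s5) can be combined into one small function, sketched here under stated assumptions. The function name and input values are hypothetical. Giving the positive sign to the first type when the two SR rates are equal is an assumption; the text does not cover that case (S_k is then zero in any event).

```python
def signed_weights(rate_1: float, rate_2: float, chi_square: float):
    """Return the signed weights (type 1, type 2) for one genotype.

    S_k = (O_k - 1) * c_k / 20, positive for the type with the higher SR rate.
    """
    smaller = min(rate_1, rate_2)
    if smaller == 0:
        raise ZeroDivisionError("smaller SR rate is zero")  # assumption, as for the odds
    o_k = max(rate_1, rate_2) / smaller
    s_k = (o_k - 1.0) * chi_square / 20.0
    if rate_1 >= rate_2:
        return +s_k, -s_k
    return -s_k, +s_k


# Hypothetical inputs: SR rates of 60% and 20%, chi-square value 10.
print(signed_weights(60.0, 20.0, 10.0))  # (1.0, -1.0)
```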
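[0043]
This completes the creation of the effectiveness determination database 3.
[0044]
FIG. 5 tabulates the SR rates, odds, χ² values c_k, weight coefficients S_k and other quantities calculated in creating the effectiveness determination database 3. For reference, FIG. 5 also shows the numbers of samples with a marked response and with non-response. The weight coefficient S_k column shows the weight coefficient S_k multiplied by the sign assigned to each type.
[0045]
Next, the effectiveness determination process performed by the effectiveness determination means 22 using the effectiveness determination database 3 is described along the flowchart of FIG. 6.
[0046]
First, human genotype information and virus genotype information are obtained from the patient who is the subject (s61). Specifically, a sample derived from the individual is obtained and injected into a cell of a DNA chip, where it undergoes a hybridization reaction with DNA probes immobilized in the cell. After the hybridization reaction, the cell is filled with a buffer and an intercalating agent, and an electrochemical signal from electrodes fixed in the cell is detected. Processing this electrochemical signal yields the human genotype information and the virus genotype information. Other techniques may of course be used to obtain this information from the subject.
[0047]
Next, the subject data, consisting of the subject's human genotype information and virus genotype information, is entered using the input device 1 (s62). Instead of using the input device 1, the DNA chip may be connected by a signal line, and the human genotype information and virus genotype information may be input from the DNA chip via the signal line.
[0048]
The processing device 2 determines the effectiveness based on the entered human genotype information and virus genotype information (s63).
[0049]
Specifically, the effectiveness determination program is read from the storage medium 6 by the storage medium reading device 5, whereby the processing device 2 functions as the effectiveness determination means 22. The effectiveness determination means 22 first reads from the effectiveness determination database 3 the data table corresponding to the entered virus genotype information (s63a). It then obtains an individual human effectiveness prediction value for each genotype by multiplying the weight coefficient by the sign associated with the entered genotype (s63b). Finally, it adds the individual effectiveness prediction values obtained for the genotypes to calculate the summed effectiveness prediction value (s63c).
[0050]
FIG. 7 shows conceptually how the summed effectiveness prediction value is calculated. As shown in FIG. 7, when the entered subject data is G/T, C/C, YA, A/A, 5/5 and T/T for MxA-88, MxA-123, MBL, LMP7, IFNAR1 (GT)_n and IFNAR1 C/T respectively, then MxA-88, MBL, LMP7, IFNAR1 (GT)_n and IFNAR1 C/T are each assigned the weight coefficient S_k multiplied by +1, and MxA-123 is assigned the weight coefficient S_k multiplied by −1. The resulting weight coefficients are as shown in FIG. 7. Adding these weight coefficients (the individual effectiveness prediction values) gives a summed effectiveness prediction value of (+0.41) + (−0.40) + (+0.28) + (+0.02) + (+0.28) + (+0.00) = 0.59.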
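The lookup-and-sum of steps (s63a) to (s63c) can be sketched as follows. The magnitudes are the six values quoted in the worked example; which allelic types carry the positive sign is inferred from the signs in that example and from the two-type groupings described earlier, so the table is an illustrative reconstruction rather than a copy of FIG. 7. The names `WEIGHTS` and `individual_values` are hypothetical.

```python
# Genotype -> (magnitude S_k, allelic types that receive the positive sign).
WEIGHTS = {
    "MxA-88": (0.41, {"G/T", "T/T"}),
    "MxA-123": (0.40, {"C/A", "A/A"}),
    "MBL": (0.28, {"YA"}),
    "LMP7": (0.02, {"C/A", "A/A"}),
    "IFNAR1 (GT)_n": (0.28, {"5/5", "5/14"}),
    "IFNAR1 C/T": (0.00, {"C/T", "T/T"}),
}


def individual_values(subject):
    """Return the signed weight (individual prediction value) per genotype."""
    values = {}
    for genotype, allelic_type in subject.items():
        s_k, positive_types = WEIGHTS[genotype]
        sign = 1 if allelic_type in positive_types else -1
        values[genotype] = sign * s_k
    return values


# Subject data from the worked example.
subject = {
    "MxA-88": "G/T",
    "MxA-123": "C/C",
    "MBL": "YA",
    "LMP7": "A/A",
    "IFNAR1 (GT)_n": "5/5",
    "IFNAR1 C/T": "T/T",
}
values = individual_values(subject)
total = sum(values.values())
print(f"{total:.2f}")  # 0.59
```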
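[0051]
Next, the processing device 2 outputs the summed effectiveness prediction value to the output device 4 as the effectiveness determination result (s64). The attending physician or the patient can use the output value as a reference when predicting the effectiveness of the treatment and deciding whether to choose it. In general, the higher the summed effectiveness prediction value, the more effective the treatment.
[0052]
The effectiveness determination is not limited to calculating the summed effectiveness prediction value. For example, the SR rate corresponding to the summed value may be calculated.
[0053]
FIG. 8 shows an example of the output of an effectiveness determination result. As shown in FIG. 8, the output presents the identification ID of the patient being assessed, the attending physician, information on the patient's genotypes, the summed effectiveness prediction value, the virus genotype, and the SR rate corresponding to the summed value for each virus genotype. The summed values are divided into predetermined ranges, and an SR rate is calculated for each range. An effectiveness prediction boundary, which indicates roughly whether the treatment is effective, is set by applying a predetermined threshold to the SR rate (in FIG. 8, 50% or more versus less than 50%), and the boundary line is drawn between two of the ranges.
[0054]
The SR rate for each range of the summed effectiveness prediction value is easy to calculate. Specifically, the summed effectiveness prediction value is calculated for every item of sample data. The values obtained are then sorted into predetermined ranges; in the example of FIG. 8 these are 0 to 1.00, 1.00 to 2.00, 2.00 to 3.00 and so on. This sorting is straightforward using the effectiveness information of the sample data stored in the sample statistics database 12. Then, for each range, the ratio of the number of marked-response samples to the total number of samples in the range is calculated, which gives the SR rate for that range. By outputting this per-range SR rate together with the summed effectiveness prediction value, the effectiveness of a treatment in light of the genes of the virus and of the human patient who is its host can be grasped easily as a probability. The host is not limited to humans and may be another organism.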
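The per-range SR rate can be sketched as a simple binning step. The function name, the sample values and the half-open ranges [low, high) are assumptions; the text gives the ranges only as 0 to 1.00, 1.00 to 2.00 and so on, without saying which range a boundary value belongs to.

```python
import math
from collections import defaultdict


def sr_rate_by_range(samples, width=1.0):
    """Return {(low, high): SR rate in %} for samples of (summed value, marked_response)."""
    counts = defaultdict(lambda: [0, 0])  # range index -> [SR count, total count]
    for summed_value, marked_response in samples:
        index = math.floor(summed_value / width)
        counts[index][1] += 1
        if marked_response:
            counts[index][0] += 1
    return {
        (index * width, (index + 1) * width): 100.0 * sr / total
        for index, (sr, total) in sorted(counts.items())
    }


# Hypothetical samples: (summed effectiveness prediction value, marked response?).
samples = [(0.2, False), (0.8, True), (1.5, True), (1.7, True), (2.4, True)]
rates = sr_rate_by_range(samples)
print(rates)
```

A subject's own summed value can then be matched to its range to read off the corresponding SR rate.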
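[0055]
This completes the generation of the effectiveness determination database 3 and the effectiveness determination using it.
[0056]
The effectiveness determination database 3 can be updated by performing the generation process of (s1) to (s5) described above each time the sample statistics database 12 is updated.
[0057]
The effectiveness determination database 3 can also be updated by additionally entering sample data from sources other than the sample statistics database 12.
[0058]
By updating the effectiveness determination database 3 in this way each time the number of samples grows, treatment effectiveness can be determined on the basis of the latest statistics, and the accuracy of the determination improves as the population grows.
[0059]
As described above, according to this embodiment the effectiveness of a treatment can be judged easily, as a probability, according to the genotype of the virus to be treated and the genotype of the human.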
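[0060]
(Second Embodiment)
FIG. 9 is a diagram showing the overall configuration of a therapy effectiveness prediction device according to a second embodiment of the present invention.
[0061]
As shown in FIG. 9, the basic configuration of the therapy effectiveness prediction device 200 of this embodiment is the same as that of the therapy effectiveness prediction device 100 shown in FIG. 1. Common components are given the same reference numerals, and their detailed description is omitted.
[0062]
The therapy effectiveness prediction device 200 differs from the device 100 of FIG. 1 in the processing performed by the database generation means 210 and the effectiveness determination means 220, the programs stored on the storage medium 60, the configuration of the effectiveness determination database stored in the effectiveness determination database 300, and the configuration of the sample statistics database 120. The storage medium 60 stores, among other things, a database generation program and an effectiveness determination program that cause the processing device 2 to function as the database generation means 210 and the effectiveness determination means 220. The effectiveness determination database 300 stores the effectiveness determination database generated by the database generation means 210.
[0063]
The database generation means 210 executes a data classification process, a variance-covariance matrix calculation process, a mean difference matrix calculation process, a discriminant coefficient calculation process, a discriminant point calculation process, a discriminant function calculation process, an SR rate and misclassification rate calculation process, and so on. The effectiveness determination means 220 executes a subject genetic information input process, an effectiveness determination data selection process, a discriminant function value calculation process, a process that determines the sign of the discriminant function value, a determination result output process, and so on.
[0064]
FIG. 10 shows an example of the configuration of the sample statistics database 120. As shown in FIG. 10, a sample number, a plurality of items of human genotype identification information, a plurality of items of virus genotype identification information, and effectiveness information are stored in the fields of the database in association with each sample.
[0065]
The sample number is a number used as a sample identification number to identify the patient who is the sample. The human genotype identification information identifies the patient's genotype. The virus genotype identification information identifies the genotype of the virus with which the patient is infected. The effectiveness information indicates the effectiveness of the treatment for that patient.
[0066]
The genes of viruses and humans are determined by four bases, usually abbreviated A, T, G and C, and the sample data is likewise given as sequences of these bases. These gene sequences differ in part from individual to individual, and these differences are involved in biological characteristics. From the standpoint of medical application, such slight differences in gene sequence are known to be related to a person's drug responsiveness. FIG. 10 shows these sequence differences in quantified form.
[0067]
In the example of FIG. 10, information on SNPs such as MxA-88, MxA-123, MBL and LMP7 is given as human genotype identification information related to interferon treatment of hepatitis C. The effectiveness information is "2" for a marked response and "1" for non-response. More specifically, in the example of FIG. 10, the effectiveness of interferon therapy, that is, treatment in which interferon is administered to a hepatitis C patient, is expressed as a binary value: marked response or non-response.
[0068]
Each item of human genotype information is distinguished by two or more values. For example, for the human genotype MxA-88, the three allelic types G/G, G/T and T/T are associated with the values "1", "2" and "3". These three categories can also be reduced to two on the basis of medical knowledge, for example by associating G/G and G/T with "1" and T/T with "2". For MxA-123, the three allelic types C/C, C/A and A/A are associated with "1", "2" and "3". For MBL, the two allelic types XB and YA are associated with "1" and "2". For LMP7, the three allelic types C/C, C/A and A/A are associated with "1", "2" and "3".
[0069]
In the example of FIG. 10, the genotype and the ISDR are given as genotype information of the hepatitis C virus. The genotype is "1" or "2", corresponding to HCV-1b and HCV-2a/2b respectively. Because the ISDR cannot be expressed as a simple base difference, the number of amino acid mutations in the ISDR of the hepatitis C virus is used. As with MxA-88 and the others, it may of course be grouped into two or more categories on medical grounds, for example by assigning "1" to three or fewer mutations and "2" to four or more.
[0070]
The quantification shown in FIG. 10 is only one example. Other quantification methods may be applied, such as assigning other values to the categories of a genotype, or using other categories or values that indicate the state of presence of a gene. Preferably, the assignment of values that minimizes the misclassification rate is chosen. A statistical database holding sample data may not be stored in the format shown for the sample statistics database 120. In that case, the sample statistics database 120 can be generated from that statistical database using the categorization and quantification methods described above.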
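[0071]
FIG. 11 shows an example of the configuration of the effectiveness determination database 300. As shown in FIG. 11, a linear discriminant function, its variables, the effective-group SR rate, the effective-group misclassification rate, the ineffective-group SR rate and the ineffective-group misclassification rate are stored in association with the fields of the database.
[0072]
The linear discriminant function is a function for discriminating the effectiveness of the therapy for the genetic information of a sample given as input. It is expressed as a function of a plurality of variables a, b, c and d. Its generalized form is given by expression (1) below.
[0073]
f(a, b, c, d) = p×a + q×b + r×c + s×d + t … (1)
In expression (1), p, q, r and s are coefficients determined by the generation process of the effectiveness determination database 300, and t is a constant. The function f need not involve all four variables a, b, c and d; it is enough that it involves at least one of them. The constant t is a value chosen so that whether the therapy is effective or ineffective can be discriminated by the sign of the function value. A positive value of the discriminant function indicates that the therapy is effective, and a negative value indicates that it is not. The coefficients p, q, r and s and the constant t are determined by ordinary statistical methods (see, for example, Koji Kurihara, "データの科学" (The Science of Data), 放送大学教育振興会).
[0074]
The effective-group SR rate is the proportion, among the sample data in the region discriminated as effective by the linear discriminant function, of sample data associated with a marked response.
[0075]
The ineffective-group SR rate is the proportion, among the sample data in the region discriminated as ineffective by the linear discriminant function, of sample data associated with a marked response.
[0076]
The effective-group misclassification rate is the probability that a case that is in fact a marked response is wrongly discriminated as ineffective by the discriminant function. The ineffective-group misclassification rate is the probability that a case that is in fact a non-response is wrongly discriminated as effective.
[0077]
FIGS. 11(a) to 11(c) all show effectiveness determination data for discriminating the effectiveness of interferon treatment of hepatitis C virus. FIG. 11(b) shows effectiveness determination data generated from sample data whose hepatitis C virus genotype is HCV-1b. FIG. 11(c) shows effectiveness determination data generated from sample data corresponding to HCV-2. FIG. 11(a) shows effectiveness determination data generated from all of the sample data, HCV-1b and HCV-2 combined.
[0078]
One discriminant function is given for each combination of virus genotypes and human genotypes. Accordingly, when there are several combinations of genotypes, a discriminant function is given for each combination. As shown in FIG. 11(a), one discriminant function is given for each of the cases where only geno-type (the virus genotype) is considered, where only MxA-123 is considered, and where only MxA-88 is considered; one is given for the combination of geno-type, MxA-123 and MxA-88; and one is given for the combination of MxA-123 and MxA-88. Similarly, in FIG. 11(b) a discriminant function is given for each of eight combinations of genotypes, and in FIG. 11(c) for each of five combinations.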
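[0079]
Next, the method of creating the effectiveness determination database 300 is described along the flowchart of FIG. 12.
[0080]
First, sample data is read from the sample statistics database 120, and the samples are divided into two data groups according to whether they showed a marked response or non-response (s111). The resulting data groups are shown in FIG. 13. FIG. 13(a) shows data group [A], the sample data with a marked response, and FIG. 13(b) shows data group [B], the sample data with non-response. The genotype information shows each sample's human genotype information and virus genotype information as the values defined in FIG. 10. When, for example, the subject's virus genotype is known in advance, only the sample data for that virus genotype may be extracted, and data groups [A] and [B] may be formed from it.
[0081]
Next, the variance-covariance matrix V_A is calculated for data group [A], and the variance-covariance matrix V_B is calculated for data group [B] (s112).
[0082]
The variance-covariance matrix V_A is the matrix obtained when the items of human genotype information or virus genotype information in data group [A] are taken as variables and the covariance, which indicates the degree of linear tendency between two of the variables, is computed for every pair of variables. The variance-covariance matrix V_B is the corresponding matrix for data group [B].
[0083]
The variance-covariance matrix V_A is defined by expression (2) below, and V_B by expression (3).
[0084]
[Math. 1]

[0085]
In this case, the elements V^ij_A and V^ij_B of the variance-covariance matrices are calculated by expressions (4) and (5) below.
[0086]
[Math. 2]

[0087]
Next, the variance-covariance matrix V_tot of data groups [A] and [B] as a whole is calculated (s113). Its matrix elements V^ij_tot are given by expression (6) below.
[0088]
[Math. 3]

[0089]
Next, the mean difference matrix d is calculated (s114). It is given by expression (7) below.
[0090]
[Math. 4]

[0091]
Next, the discriminant coefficients A are calculated (s115). They are given by expression (8) below and are expressed by the matrix of expression (9).
[0092]
[Math. 5]

[0093]
A = V_tot^−1 × d … (9)
Next, the discriminant point Y_0 is calculated (s116). It is given by expression (10) below.
[0094]
[Math. 6]

[0095]
The discriminant coefficients A and the discriminant point Y_0 obtained in (s115) and (s116) determine the discriminant function Z (s117), which is given by expression (11) below.
[0096]
Z = a_1 X_1 + a_2 X_2 + … + a_p X_p − Y_0 … (11)
Next, the sample data is applied to the discriminant function Z, and the effective-group SR rate, effective-group misclassification rate, ineffective-group SR rate and ineffective-group misclassification rate are calculated (s118).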
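Expressions (2) to (8) and (10) are not reproduced in this text, so the following sketch of steps (s112) to (s117) uses the textbook two-group linear discriminant instead: a pooled covariance matrix weighted by degrees of freedom, d taken as the mean of group [A] minus the mean of group [B], and Y_0 taken as the value of the linear part at the midpoint of the two group means. These normalizations are assumptions and may differ from the patent's own expressions; only A = V_tot^−1 × d and the form of Z come from the text. The code is limited to two variables, and all names and data are hypothetical.

```python
from statistics import mean


def covariance_matrix(rows):
    """Return (2x2 unbiased variance-covariance matrix, column means) for rows of (x1, x2)."""
    n = len(rows)
    m = [mean(column) for column in zip(*rows)]

    def cov(i, j):
        return sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)

    return [[cov(0, 0), cov(0, 1)], [cov(1, 0), cov(1, 1)]], m


def fit_discriminant(group_a, group_b):
    """Return (coefficients A, discriminant point Y_0) for two variables."""
    v_a, mean_a = covariance_matrix(group_a)
    v_b, mean_b = covariance_matrix(group_b)
    n_a, n_b = len(group_a), len(group_b)
    # Pooled covariance matrix (assumed weighting by degrees of freedom).
    v_tot = [
        [((n_a - 1) * v_a[i][j] + (n_b - 1) * v_b[i][j]) / (n_a + n_b - 2) for j in range(2)]
        for i in range(2)
    ]
    d = [mean_a[i] - mean_b[i] for i in range(2)]
    det = v_tot[0][0] * v_tot[1][1] - v_tot[0][1] * v_tot[1][0]
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    # A = V_tot^-1 x d, using the closed-form inverse of a 2x2 matrix.
    a = [
        (v_tot[1][1] * d[0] - v_tot[0][1] * d[1]) / det,
        (-v_tot[1][0] * d[0] + v_tot[0][0] * d[1]) / det,
    ]
    y_0 = sum(a[i] * (mean_a[i] + mean_b[i]) / 2 for i in range(2))
    return a, y_0


def discriminant_value(a, y_0, x):
    """Z = a_1 X_1 + a_2 X_2 - Y_0; Z > 0 is read as 'effective'."""
    return sum(a_i * x_i for a_i, x_i in zip(a, x)) - y_0


# Hypothetical quantified genotype data: group [A] marked response, group [B] non-response.
group_a = [(6, 2), (5, 3), (7, 2), (6, 3)]
group_b = [(1, 1), (2, 2), (1, 2), (2, 1)]
a, y_0 = fit_discriminant(group_a, group_b)
print(discriminant_value(a, y_0, (6, 2.5)) > 0)    # True
print(discriminant_value(a, y_0, (1.5, 1.5)) < 0)  # True
```

With more than two variables, the closed-form inverse would be replaced by a general linear solver.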
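[0097]
Specifically, among the samples discriminated as effective by the discriminant function Z (Z > 0), let SR_+ be the number that in fact showed a marked response and NR_+ the number that in fact showed non-response. The effective-group SR rate is then SR_+ × 100 / (SR_+ + NR_+). Likewise, among the samples discriminated as ineffective (Z < 0), let NR_− be the number that in fact showed non-response and SR_− the number that, contrary to the prediction, showed a marked response. The ineffective-group SR rate is then SR_− × 100 / (SR_− + NR_−). Ideally, the effective-group SR rate would be 100 and the ineffective-group SR rate would be 0.
[0098]
The effective-group misclassification rate is calculated as SR_− × 100 / (SR_+ + SR_−). The ineffective-group misclassification rate is calculated as NR_+ × 100 / (NR_+ + NR_−).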
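The four rates follow directly from the four counts. A minimal sketch, with a hypothetical function name and hypothetical counts, assuming every denominator is non-zero:

```python
def group_rates(sr_pos: int, nr_pos: int, sr_neg: int, nr_neg: int) -> dict:
    """Rates in percent from the counts SR_+, NR_+, SR_- and NR_-."""
    return {
        "effective_group_sr_rate": 100.0 * sr_pos / (sr_pos + nr_pos),
        "ineffective_group_sr_rate": 100.0 * sr_neg / (sr_neg + nr_neg),
        "effective_group_misclassification_rate": 100.0 * sr_neg / (sr_pos + sr_neg),
        "ineffective_group_misclassification_rate": 100.0 * nr_pos / (nr_pos + nr_neg),
    }


# Hypothetical counts: 50 samples with Z > 0 (40 SR, 10 NR), 50 with Z < 0 (10 SR, 40 NR).
rates = group_rates(sr_pos=40, nr_pos=10, sr_neg=10, nr_neg=40)
print(rates)
```

Note that the two SR rates are taken over the samples on one side of the discriminant, while the two misclassification rates are taken over the samples with one actual outcome.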
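[0099]
The discriminant function Z, effective-group SR rate, effective-group misclassification rate, ineffective-group SR rate and ineffective-group misclassification rate obtained are associated with one another and stored in the effectiveness determination database 300 as effectiveness determination data (s119).
[0100]
The processing of (s112) to (s118) above is repeated for each combination of genotypes. That is, the variance-covariance matrices V_A and V_B generated in (s112) are set up for a plurality of genotype combinations, and the processing of (s113) to (s118) is executed for each of them, which yields effectiveness determination data for each combination of genotypes.
[0101]
Next, the effectiveness determination process performed by the effectiveness determination means 220 using the effectiveness determination database 300 is described along the flowchart of FIG. 14.
[0102]
First, human genotype information and virus genotype information are obtained from the patient who is the subject (s61). The method is as described in the first embodiment, so a detailed description is omitted.
[0103]
Next, the human genotype information and virus genotype information obtained are entered using the input device 1 (s62).
[0104]
The processing device 2 determines the effectiveness based on the entered human genotype information and virus genotype information (s131).
[0105]
Specifically, the effectiveness determination program is read from the storage medium 60 by the storage medium reading device 5, whereby the processing device 2 functions as the effectiveness determination means 220. The effectiveness determination means 220 searches for effectiveness determination data whose combination of variables corresponds to the entered human genotype information and virus genotype information, and selects, for example, a plurality of items of effectiveness determination data (s131a).
[0106]
In (s131a), the generation process of the effectiveness determination database 300 shown in (s111) to (s119) may instead be executed each time subject data is entered. In that case, the effectiveness determination database 300 that is generated can be restricted to a particular virus genotype according to the virus genotype in the subject data.
[0107]
Next, the subject's genotype information corresponding to the genotypes defined as variables of the selected effectiveness determination data is substituted into the variables of the linear discriminant function to obtain the discriminant function value Z_1 (s131b).
[0108]
Next, the sign of the discriminant function value Z_1 is determined (s131c).
[0109]
If Z_1 > 0, the effective-group SR rate and effective-group misclassification rate associated with the discriminant function Z are read out and output to the output device 4 together with the determination result "effective" (s132).
[0110]
If Z_1 < 0, the ineffective-group SR rate and ineffective-group misclassification rate associated with the discriminant function Z are read out and output to the output device 4 together with the determination result "ineffective" (s132).
[0111]
As a concrete example, consider the case where the subject data ISDR = 6, MBL = 2, MxA-123 = 3 and MxA-88 = 2 is given for a patient infected with HCV-1b. Take discriminant function No. 9 of FIG. 11, Z = 0.76a + 1.71b − 0.04c + 0.92d − 6.34. Substituting the subject data into Z in (s131b) gives, as the summed effectiveness prediction value, the discriminant function value Z_1 = (0.76×6) + (1.71×2) − (0.04×3) + (0.92×2) − 6.34 = 3.36. Here 0.76×6 is the individual virus effectiveness prediction value, and 1.71×2, −(0.04×3) and 0.92×2 are the individual human effectiveness prediction values. Since Z_1 = 3.36 > 0, the determination result "effective" is output to the output device 4 together with the effective-group SR rate of 52%, effective-group misclassification rate of 18%, ineffective-group SR rate of 4% and ineffective-group misclassification rate of 16% associated with discriminant function No. 9.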
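The worked example can be checked with a few lines of Python. The coefficients and subject values are those quoted above; mapping the variables a, b, c and d to ISDR, MBL, MxA-123 and MxA-88 follows the order of the substitution in the text. Treating Z_1 = 0 as "ineffective" is an assumption, since the text only covers Z_1 > 0 and Z_1 < 0.

```python
# Discriminant function No. 9 as quoted: Z = 0.76a + 1.71b - 0.04c + 0.92d - 6.34
COEFFICIENTS = {"ISDR": 0.76, "MBL": 1.71, "MxA-123": -0.04, "MxA-88": 0.92}
CONSTANT = -6.34

subject = {"ISDR": 6, "MBL": 2, "MxA-123": 3, "MxA-88": 2}

# Individual prediction value for each variable, then their sum plus the constant.
individual = {name: COEFFICIENTS[name] * subject[name] for name in COEFFICIENTS}
z_1 = sum(individual.values()) + CONSTANT
verdict = "effective" if z_1 > 0 else "ineffective"
print(f"Z_1 = {z_1:.2f} -> {verdict}")  # Z_1 = 3.36 -> effective
```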
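[0112]
This result indicates that the treatment was determined to be effective for the subject. Looking at the data previously discriminated as effective by this discriminant, 52% in fact showed a marked response as predicted, and 18% of the cases that in fact showed a marked response were wrongly discriminated as ineffective. Looking at the data previously discriminated as ineffective, 4% showed a marked response contrary to the prediction, and 16% of the cases that were in fact non-responses were wrongly discriminated as effective. With this result, the physician can choose an appropriate treatment.
[0113]
Before the result is output, it may be determined whether the effective-group or ineffective-group misclassification rate read out is at or below a predetermined threshold (for example, 20%), and only the results at or below the threshold and their associated data may be output. The data may also be output in ascending order of the misclassification rate read out. This reduces the risk of judging effectiveness from data with a high misclassification rate and low reliability, and enables effectiveness prediction based on reliable data with a low misclassification rate.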
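The threshold filter and the ordering by misclassification rate can be sketched together. The text presents them as two separate options, so combining them in one function is a design choice made here for brevity. The record layout and all entries except the 18% rate of function No. 9 are hypothetical.

```python
def select_results(results, threshold=20.0):
    """Keep results whose misclassification rate is at or below `threshold`, lowest first."""
    kept = [r for r in results if r["misclassification_rate"] <= threshold]
    return sorted(kept, key=lambda r: r["misclassification_rate"])


# One entry per discriminant function evaluated for the subject.
results = [
    {"function": 9, "verdict": "effective", "misclassification_rate": 18.0},
    {"function": 3, "verdict": "effective", "misclassification_rate": 27.0},  # hypothetical
    {"function": 5, "verdict": "ineffective", "misclassification_rate": 12.0},  # hypothetical
]
selected = select_results(results)
print([r["function"] for r in selected])  # [5, 9]
```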
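[0114]
The effectiveness determination database 300 can be updated in the same way as the effectiveness determination database 3 of the first embodiment.
[0115]
As described above, according to this embodiment the effectiveness of a treatment can be judged easily, according to the genotype of the virus to be treated and the genotype of the human, from the effective/ineffective determination and the misclassification rate of that determination.
[0116]
This mechanism also further clarifies the relationship between drug effectiveness and the genetic level. As the sample population grows, the database is updated, which makes it possible to build a more accurate and more reliable prediction system.
[0117]
In this second embodiment, the coefficients of the discriminant function Z were determined by ordinary statistical methods, on the assumption that there is no a priori correlation among items of human genotype information, among items of virus genotype information, or between human genotype information and virus genotype information. That is, the coefficients were determined from the correlation between a single item of human genotype information and the effectiveness of the treatment, or between a single item of virus genotype information and the effectiveness of the treatment. This yields a linear discriminant function that requires little computation. It presupposes that, when the correlations in the data are not known in advance, each item is normally distributed.
[0118]
However, the present invention is not limited to this. Where correlation between data factors is evident, for example when a correlation between one item of human genotype information and another, between an item of human genotype information and an item of virus genotype information, or between one item of virus genotype information and another is known in advance, the factors may be reduced to independent data factors, or the determination may be made with a nonlinear discriminant function. More concretely, coefficients for those correlations are determined anew, and terms consisting of each coefficient multiplied by the variables corresponding to the correlation are added to the discriminant function Z. Accordingly, not only the linear discriminant function shown in the second embodiment but also a discriminant function using a quadratic function, or cluster analysis, which is a multidimensional analysis technique, may be adopted. As long as the data-analysis technique adopted here is a known algorithm, it naturally falls within the scope of the present invention. In this effectiveness prediction system, the discriminant functions are given in the system in advance. The inventors varied the size of the population to some extent and examined how much this affected the coefficients of the discriminant function, and confirmed that the effect was only a few percent. Accordingly, a general-purpose discriminant function can be constructed if a population of reasonable size is used. However, the results depend on the population of sample data, and differences due to ethnicity, region and lifestyle can naturally arise. The prediction system can be implemented in a clinical genetic testing device. Such a device can also be equipped with communication means, so that a set of discriminant functions determined using new sample data as the population is sent to the genetic testing device via the communication means.
[0119]
The discriminant functions described in this embodiment are based on one particular set of sample data, and their values are not universal. Nevertheless, the method described here can provide discriminant functions that are suited to each diagnosis and are always based on the latest information. A database of such discriminant functions and misclassification rates can also be distributed on a recording medium such as a magnetic medium rather than by communication.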
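[0120]
The present invention is not limited to the first and second embodiments described above.
[0121]
The sample statistics database 12 has been shown connected to the computer 10 via the communication network 11, but it is not limited to this and may instead be stored in a storage device provided in the computer 10, without going through the communication network 11.
[0122]
The above embodiments showed examples in which the present invention is applied to predicting whether treatment of hepatitis C by administering interferon is effective, but the invention is not limited to this. The therapy may be not only interferon administration but any other therapy, including therapies that do not involve administering a drug. The virus to be treated is not limited to hepatitis C virus; any other virus can be the object of prediction. Beyond viruses, the invention can readily be applied to the treatment of infection by any infectious pathogen that has genes, such as bacteria and fungi. The infected host has been described as human, but the invention is not limited to this and is equally applicable when the infected host is an animal or another organism.
[0123]
The first embodiment showed an example in which the effectiveness determination database 3 is generated from the sample statistics database 12, but the invention is not limited to this. For example, the effectiveness determination database 3 may be generated from a database such as the sample statistics database 120 of the second embodiment. In that case, the SR rate, odds, χ² value, weight coefficient and so on are also calculated for the ISDR and the virus genotype shown in the sample statistics database 120, in the same way as for the human genotypes such as MxA-88 in FIG. 5. The effectiveness determination databases of the first and second embodiments can then be generated from the same sample statistics database.
[0124]
FIG. 15 shows a sample statistics database 12a in a modification of the first embodiment in which the virus is categorized. As shown in FIG. 15, each sample is associated not only with genotypes relating to human genotype information, such as MxA-88, MxA-123, MBL, LMP7, IFNAR1 (GT)_n and IFNAR1 C/T, but also with information distinguishing HCV-1b, HCV-2a and HCV-2b as virus genotype information. FIG. 16 shows an example in which this virus genotype information is also categorized, for example with HCV-1b as the first type and HCV-2a/2b as the second type, and weight coefficients are calculated including the virus genotype. By multiplying the weight coefficients shown in FIG. 16 by the human genotype information and virus genotype information obtained from the subject and adding the products, a summed effectiveness prediction value that includes the virus genotype is obtained. Although the examples of FIGS. 15 and 16 take up only one virus genotype in calculating the weight coefficients, weight coefficients may of course be calculated in the same way for other items of virus genotype information.
[0125]
The generation of the effectiveness determination databases 3 and 300 described above can be executed automatically and repeatedly each time the sample statistics database 12 is updated, which improves the accuracy of the effectiveness determination. Not only when the sample statistics database 12 is updated but also when sample data is obtained by another route, that sample data may be added to the sample data obtained from the sample statistics database 12, and the effectiveness determination databases 3 and 300 generated from the combined data.
[0126]
These embodiments showed the case where the effectiveness determination is executed on the computer 10 that holds the effectiveness determination database, but the invention is not limited to this. For example, a terminal connected to the communication network 11 may send the subject's genotype data to the computer 10 with a determination request, and the computer 10 may respond by executing the effectiveness determination process described above and sending the determination result to that terminal.
[0127]
The first embodiment showed an example in which the items of human genotype information are grouped into two types and the SR rate is calculated for each type, but the invention is not limited to this. For example, they may be grouped into three or more types, and the SR rate SRrate, odds O_k and weight coefficient S_k calculated for each type as in the first embodiment. Likewise in the second embodiment, variance-covariance matrices V_A, V_B, V_C, … may be calculated in (s112) for three or more data groups, and the overall variance-covariance matrix V_tot calculated from them.
[0128]
The effective/ineffective result in the first and second embodiments may be displayed as a numerical value, or by a scheme in which results can be told apart visually by color.
[0129]
[Effects of the invention]
As described in detail above, the present invention makes it possible to predict the effectiveness of a treatment according to the genes of the virus and of the patient.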
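[Brief description of the drawings]
[FIG. 1] A diagram showing the overall configuration of the therapy effectiveness prediction device according to the first embodiment of the present invention.
[FIG. 2] A diagram showing an example of the configuration of the sample statistics database (DB) according to the same embodiment.
[FIG. 3] A diagram showing an example of the configuration of the effectiveness determination database according to the same embodiment.
[FIG. 4] A flowchart of the method of creating the effectiveness determination database according to the same embodiment.
[FIG. 5] A table of the SR rates, odds, χ² values, weight coefficients and other quantities calculated in creating the effectiveness determination database according to the same embodiment.
[FIG. 6] A flowchart of the effectiveness determination process according to the same embodiment.
[FIG. 7] A conceptual diagram of the calculation of the summed effectiveness prediction value according to the same embodiment.
[FIG. 8] A diagram showing an example of the output of an effectiveness determination result according to the same embodiment.
[FIG. 9] A diagram showing the overall configuration of the therapy effectiveness prediction device according to the second embodiment of the present invention.
[FIG. 10] A diagram showing an example of the configuration of the sample statistics database according to the same embodiment.
[FIG. 11] A diagram showing an example of the configuration of the effectiveness determination database according to the same embodiment.
[FIG. 12] A flowchart of the method of generating the effectiveness determination database according to the same embodiment.
[FIG. 13] A diagram showing an example of the data in each classified data group according to the same embodiment.
[FIG. 14] A flowchart of the effectiveness determination process using the effectiveness determination database according to the same embodiment.
[FIG. 15] A diagram showing an example of the configuration of the sample statistics database in a modification of the first embodiment.
[FIG. 16] A diagram showing an example of the configuration of the effectiveness determination database in a modification of the first embodiment.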
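[Explanation of reference numerals]
1 … input device
2 … processing device
3 … effectiveness determination database
4 … output device
5 … storage medium reading device
6 … storage medium
10 … computer
11 … communication network
12 … sample statistics database
[0001]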
[Technical field of the invention]
The present invention relates to a program, a database, a system, and a method for predicting the effectiveness of a therapy based on the presence state of a gene.
[0002]
[Prior art]
The effectiveness of a therapy, such as administering a drug to a patient, often depends on the patient's condition. Mechanisms have therefore been devised that analyze a patient's clinical data and feed the results back to the patient (see, for example, Patent Document 1). In that approach, the clinical data accumulated for each patient is analyzed and evaluated, the clinical risk is scored, and an appropriate medical treatment is displayed to the medical professional.
[0003]
[Patent Document 1]
JP-A-2002-95650 (page 1, upper left column, lines 1 to 18)
[0004]
[Problems to be solved by the invention]
As described above, scoring clinical data and displaying the patient's clinical risk makes it more likely that a medical professional will administer appropriate treatment.
[0005]
However, if a patient is infected with a virus, for example, the efficacy of treatment for that virus is not determined solely by clinical data. In this case, it is considered that the efficacy of the treatment changes depending on the gene on the patient side, the gene on the virus side, and the like. However, there is no mechanism for determining the efficacy of treatment based on these virus and patient genes.
[0006]
The present invention has been made to solve the above problem, and its object is to provide a program, a database, a system, and a method for predicting the effectiveness of a treatment according to genes on the virus side and the patient side.
[0007]
[Means for Solving the Problems]
According to one aspect of the present invention, there is provided an effectiveness determination database generation program that causes a computer to execute processing for generating an effectiveness determination database used to determine the effectiveness of a treatment. The program causes the computer to: read, for a plurality of samples, sample data in which each sample is given infected host genotype information indicating the genotype of the infected host, infectious pathogen genotype information indicating the genotype of the infectious pathogen that has infected the sample, and effectiveness information indicating the effectiveness of a treatment against that pathogen; calculate, for each genotype and based on the sample data of the plurality of samples, a weight coefficient indicating the degree of correlation between the effectiveness of the treatment and at least one of the infected host genotype information and the infectious pathogen genotype information; and store the weight coefficient in a storage device for each genotype.
[0008]
According to another aspect of the present invention, there is provided a data structure of effectiveness determination data comprising a first field that records one or more infected host genotypes together with an infected host weight coefficient indicating the degree of correlation with the effectiveness of a treatment, and a second field that records one or more infectious pathogen genotypes together with an infectious pathogen weight coefficient indicating the degree of correlation with the effectiveness of the treatment.
[0009]
According to still another aspect of the present invention, there is provided an effectiveness determination program that causes a computer to execute processing for determining the effectiveness of a treatment. The program causes the computer to: read an infected host weight coefficient indicating the degree of correlation between one or more infected host genotypes and the effectiveness of the treatment, and an infectious pathogen weight coefficient indicating the degree of correlation between one or more infectious pathogen genotypes and the effectiveness of the treatment; calculate individual infected host effectiveness prediction values by multiplying, for each genotype, the subject's infected host genotype information by the infected host weight coefficient; calculate individual infectious pathogen effectiveness prediction values by multiplying, for each genotype, the subject's infectious pathogen genotype information by the infectious pathogen weight coefficient; calculate a summed effectiveness prediction value by adding all of the individual infected host and infectious pathogen values; determine the effectiveness of the treatment based on the summed value, or on the summed value plus a predetermined number; and output the determination result.
[0010]
The invention as a program also holds as an apparatus comprising a computer that executes the program, as a method comprising the procedures that the program causes the computer to execute, and as a recording medium on which the program is recorded.
[0011]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0012]
This embodiment describes an example in which the present invention is applied to predicting the effectiveness of interferon therapy for hepatitis C virus.
[0013]
(First Embodiment)
FIG. 1 is a diagram showing the overall configuration of a therapy effectiveness prediction device according to a first embodiment of the present invention.
[0014]
As shown in FIG. 1, the therapy effectiveness prediction device 100 comprises a computer 10 and a sample statistics database 12 connected to the computer 10 via a communication network 11.
[0015]
The computer 10 comprises an input device 1, a processing device 2 connected to the input device 1, an effectiveness determination database 3 connected to the processing device 2, and an output device 4 and a storage medium reading device 5 connected to the processing device 2. The computer 10 is realized by, for example, a personal computer. The computer 10 is connected to the communication network 11 via a communication interface (not shown), through which it can send data to and receive data from the communication network 11.
[0016]
The input device 1 is realized by, for example, a keyboard and a mouse. The processing device 2 is realized by hardware, such as a CPU, that performs the arithmetic processing of a general computer. The effectiveness determination database 3 is realized by a magnetic disk, an optical disk, or the like. The output device 4 is realized by, for example, a display or a printer. The storage medium reading device 5 is realized by, for example, a magnetic disk reading device, a CD-ROM reading device, or a DVD reading device.
[0017]
The input device 1 is a device for entering information on the genotype of the patient, who is the individual to be treated, information on the genotype of the virus with which the patient is infected, and the various other data needed for processing in the processing device 2.
[0018]
The processing device 2 executes the process of generating the effectiveness determination database 3 from the sample statistics database 12, and the effectiveness determination process based on the data entered from the input device 1 and on the effectiveness determination database 3.
[0019]
The processing device 2 reads, for example, the database generation program and the effectiveness determination program stored on a storage medium 6 such as a CD-ROM, DVD or magnetic disk through the storage medium reading device 5 connected to it, and executes those programs. The processing device 2 thereby functions as the database generation means 21, which performs the generation process described above, and as the effectiveness determination means 22, which performs the effectiveness determination process. The programs may of course instead be read from a memory or storage device built into the processing device 2 and executed. The processing device 2 also executes processes other than those of the database generation means 21 and the effectiveness determination means 22 on the basis of these programs.
[0020]
The database generation means 21 executes a marked-response rate (SR rate) calculation process, an odds calculation process, a χ² value calculation process, a weight coefficient calculation process, and so on. The effectiveness determination means 22 executes a genotype information input process, an effectiveness determination process, and so on.
[0021]
The output device 4 displays the result processed by the processing device 2 or prints out the processed result.
[0022]
FIG. 2 shows an example of the configuration of the sample statistical database (DB) 12. As shown in FIG. 2, each of the specimen number, the plurality of human genotype identification information, and the validity information is stored in each field of the database in association with each specimen. The sample statistical database 12 shown in FIG. 2 is provided for each virus genotype identified by the virus genotype identification information. Specifically, in the example of FIG. 2, HCV-1b, HCV-2a, HCV-2b and the like are listed as virus genotypes for identifying the genotype of hepatitis C virus related to interferon treatment.
[0023]
The sample number is a number used as sample identification information for identifying a patient as a sample. The human genotype identification information is information for identifying a patient's genotype. The efficacy information is information indicating the efficacy of the treatment for the patient (if the drug is administered, the efficacy of the drug). In the following embodiments, a subject (patient) whose human genotype information, virus genotype information, and the effect / non-effect as a result of performing treatment such as drug administration are known is referred to as a specimen. The target (patient) for which the effectiveness is determined according to the invention of the embodiment is referred to as a subject.
[0024]
FIG. 2 shows MxA-88, MxA-123, MBL, LMP7, IFNAR1 (GT)n, IFNAR1 C/T, and other SNP information as human genotype identification information related to treatment of hepatitis C with interferon. IFNAR1 (GT)n and IFNAR1 C/T are information for identifying the genotype of the interferon receptor gene. The validity information is indicated by either “significant effect” or “non-significant effect”. More specifically, in the example of FIG. 2, the efficacy of interferon therapy, that is, the treatment of administering interferon to a patient with hepatitis C, is indicated as “significant effect” or “non-significant effect”.
[0025]
Each item of human genotype information is distinguished by two or more values, as is each item of virus genotype information. For example, the human genotype MxA-88 is classified into one of three allele types, G/T, T/T and G/G. MxA-123 and LMP7 are each classified into one of three allele types, C/A, A/A and C/C. MBL is classified into one of two allele types, YA and XB. IFNAR1 (GT)n is classified into one of three allele types: 5/5, 5/14 and other. IFNAR1 C/T is classified into one of three allele types, C/T, T/T and C/C. Classification by allele haplotype may be used instead of classification by allele type.
[0026]
“Genotype” refers to the presence of all alleles or the gene at the locus of interest. “Locus” refers to the location of a gene on a chromosome or genetic map. For example, genes that are alleles to each other are at the same locus. "Allele" refers to a gene that is located at a homologous location on a homologous chromosome and is functionally homologous.
[0027]
FIG. 3 shows an example of the configuration of the validity determination database 3. As shown in FIG. 3, each of a plurality of human genotype weighting factors is stored in each field of the database. The human genotype weighting factor shown in FIG. 3 shows an example given to HCV-1b as a virus genotype. Such a human genotype weighting factor is similarly given to other virus genotypes such as HCV-2a / b. In the following embodiments, the weighting factor is a value obtained by quantifying the effect of a certain genotype on the effectiveness of treatment, that is, the degree of correlation with the effectiveness of treatment. Further, in the example of FIG. 3, it is shown as a value obtained by multiplying the sign by the weight coefficient.
[0028]
Next, a method of creating the validity determination database 3 by the database generation unit 21 will be described with reference to the flowchart of FIG.
[0029]
First, a significant efficiency calculation process for calculating the significant efficiency of treatment for each human genotype in the sample statistical database 12 is executed (s1). Specifically, focusing on a certain human genotype, the remarkable efficiency is calculated for each allele or allele haplotype of that genotype.
[0030]
For example, the genotype MxA-88 has the allele types G/T, T/T and G/G. When these three allele types are grouped into a first type consisting of G/T and T/T and a second type consisting of G/G, the significant efficiency is calculated over all specimen data belonging to the first type. If X_a specimens of the first type showed a significant effect (SR) and Y_a specimens showed a non-significant effect (NR), the significant efficiency of the first type (SRrate₁) is
SRrate₁ = 100 × X_a / (X_a + Y_a) (%).
Similarly, if X_b specimens of the second type showed a significant effect (SR) and Y_b specimens showed a non-significant effect (NR), the significant efficiency of the second type (SRrate₂) is obtained by
SRrate₂ = 100 × X_b / (X_b + Y_b) (%).
As described above, when a genotype in the sample statistical database 12 has two or more allele types, the allele types are classified into two types, 1 and 2, and the significant efficiencies SRrate₁ and SRrate₂ are calculated for the respective types. The classification into the two types is desirably performed such that the two types differ in their degree of correlation with the effectiveness of the treatment. More specifically, it is desirable to classify the two types so that the difference between their significant efficiencies is large. Hereinafter, these two types are referred to as the “first type” and the “second type”, respectively.
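The significant efficiency calculation (s1) can be illustrated with a minimal sketch. The function name and the specimen counts below are hypothetical and serve only to show the formula SRrate = 100 × X / (X + Y).

```python
def sr_rate(significant: int, non_significant: int) -> float:
    """Significant efficiency in percent: 100 * X / (X + Y)."""
    total = significant + non_significant
    if total == 0:
        raise ValueError("the type contains no specimens")
    return 100.0 * significant / total


# Hypothetical counts for one genotype split into two types.
sr_rate_1 = sr_rate(significant=30, non_significant=20)  # first type
sr_rate_2 = sr_rate(significant=10, non_significant=40)  # second type
```

With these illustrative counts the first type has the higher significant efficiency, so it would later receive the positive sign.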
[0031]
Note that the classification into these two types may be automated. In that case, the significant efficiencies obtained when the allele types are classified into two types are calculated for every possible classification, and the classification that maximizes the difference between the two significant efficiencies is selected. For example, four allele types A, B, C and D can be classified into two types in seven ways: type (A, B) and type (C, D); type (A, C) and type (B, D); type (A, D) and type (B, C); type (A) and type (B, C, D); type (B) and type (A, C, D); type (C) and type (A, B, D); and type (D) and type (A, B, C). For each of these seven classifications, the significant efficiencies of the two types and their difference are calculated, and the classification that maximizes the difference is selected.
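The automated classification described above can be sketched as follows. The helper names and the specimen counts are hypothetical, and every allele type is assumed to contain at least one specimen so that no division by zero occurs.

```python
from itertools import combinations


def two_way_splits(labels):
    """Yield every split of the labels into two non-empty groups, each split once."""
    labels = list(labels)
    first, rest = labels[0], labels[1:]
    # Keeping the first label in group one avoids listing mirror-image splits twice.
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            group_one = (first,) + extra
            group_two = tuple(x for x in rest if x not in extra)
            if group_two:
                yield group_one, group_two


def group_sr_rate(counts, group):
    """Significant efficiency (%) of a group; counts maps label -> (SR, NR)."""
    sr = sum(counts[g][0] for g in group)
    nr = sum(counts[g][1] for g in group)
    return 100.0 * sr / (sr + nr)


def best_split(counts):
    """Select the split whose two significant efficiencies differ the most."""
    return max(
        two_way_splits(counts),
        key=lambda s: abs(group_sr_rate(counts, s[0]) - group_sr_rate(counts, s[1])),
    )


# Hypothetical (significant, non-significant) counts per allele type.
counts = {"A": (8, 2), "B": (7, 3), "C": (2, 8), "D": (1, 9)}
selected = best_split(counts)
```

For four allele types the generator yields the seven classifications listed in the text, and the hypothetical counts select type (A, B) and type (C, D).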
[0032]
Although the above description uses MxA-88 as an example, the other genotypes, that is, MxA-123, MBL, LMP7, IFNAR1 (GT)n, IFNAR1 C/T and so on, are similarly classified into two types, and the significant efficiency is calculated for each type.
[0033]
MxA-123 is classified into two types: a type consisting of C/A and A/A, and a type consisting of C/C. MBL is classified into two types, a YA type and an XB type. LMP7 is classified into two types: a type consisting of C/A and A/A, and a type consisting of C/C. IFNAR1 (GT)n is classified into two types: a type consisting of 5/5 and 5/14, and a type consisting of all others. IFNAR1 C/T is classified into two types: a type consisting of C/T and T/T, and a type consisting of C/C.
[0034]
The obtained efficiency is stored in a database (not shown) (or the validity determination database 3) for each type.
[0035]
Next, odds calculation processing is executed (s2). The odds O_k is the value obtained by dividing the larger of the two types' significant efficiencies by the smaller, and is given by the following equations. Here, k is an identification number assigned to each genotype.
[0036]
O_k = SRrate₁ / SRrate₂ (when SRrate₁ > SRrate₂)
O_k = SRrate₂ / SRrate₁ (when SRrate₁ < SRrate₂)
When SRrate₁ = SRrate₂, either of the two equations above may be used.
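The odds calculation (s2) can be sketched as below. The function name is hypothetical, and raising an error when the smaller significant efficiency is zero is an assumption, since the text does not address that case.

```python
def odds(sr_rate_1: float, sr_rate_2: float) -> float:
    """Odds O_k: the larger significant efficiency divided by the smaller."""
    larger = max(sr_rate_1, sr_rate_2)
    smaller = min(sr_rate_1, sr_rate_2)
    if smaller == 0:
        # Assumption: the source does not say how a zero rate is handled.
        raise ValueError("odds are undefined when one significant efficiency is 0")
    return larger / smaller
```

Because the larger value is always the numerator, the result is at least 1 and does not depend on which type is called the first type.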
[0037]
Using the above equations, the odds O_k is calculated for every genotype, and the calculated odds O_k are stored in a database (not shown), which may be the validity determination database 3.
[0038]
Next, χ² value calculation processing is executed (s3). Hereinafter, c_k = χ² is defined. The χ² value c_k is calculated by the following equation.
[0039]
c_k = χ² = (N − 1) s² / σ²
Here, the random variables are X_a, Y_a, X_b and Y_b, N is the sample size, σ² is the variance, and s² is the sample variance. More specifically, the chi-square value χ² is obtained by substituting the values of X_a, Y_a, X_b and Y_b into the 2 × 2 contingency table of the chi-square test.
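As an illustration of substituting X_a, Y_a, X_b and Y_b into a 2 × 2 contingency table, the sketch below uses the standard Pearson chi-square statistic without continuity correction. The text does not state which variant of the test is used, so this choice, the function name, and the counts are assumptions.

```python
def chi_square_2x2(x_a: int, y_a: int, x_b: int, y_b: int) -> float:
    """Pearson chi-square for the table [[x_a, y_a], [x_b, y_b]] (no correction)."""
    n = x_a + y_a + x_b + y_b
    row_1, row_2 = x_a + y_a, x_b + y_b      # specimens per type
    col_1, col_2 = x_a + x_b, y_a + y_b      # significant / non-significant totals
    denominator = row_1 * row_2 * col_1 * col_2
    if denominator == 0:
        raise ValueError("a row or column of the contingency table is empty")
    return n * (x_a * y_b - y_a * x_b) ** 2 / denominator


# Hypothetical counts: first type 30 SR / 20 NR, second type 10 SR / 40 NR.
c_k = chi_square_2x2(30, 20, 10, 40)
```

A larger value indicates a stronger association between the type and the treatment outcome.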
[0040]
Next, weight coefficient calculation processing, which calculates the weight coefficient S_k from the odds O_k and the χ² value, is executed (s4). The weight coefficient S_k is calculated by the following equation.
[0041]
S_k = (O_k − 1) c_k / 20
Next, sign assignment processing is performed to assign the obtained weight coefficient S_k to the two types (s5). The weight coefficient S_k itself takes a positive value. The significant efficiencies of the two types of each genotype are compared, and a positive sign (+) is assigned to the type with the higher SRrate and a negative sign (−) to the type with the lower SRrate.
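Steps (s4) and (s5) can be sketched together. The function names are hypothetical, and giving the positive sign to the first type when the two significant efficiencies are equal is an assumption, since the text does not cover a tie.

```python
def weight_coefficient(odds_k: float, chi_square_k: float) -> float:
    """Weight coefficient S_k = (O_k - 1) * c_k / 20."""
    return (odds_k - 1.0) * chi_square_k / 20.0


def signed_weights(sr_rate_1: float, sr_rate_2: float, s_k: float):
    """Return (weight for first type, weight for second type).

    The type with the higher significant efficiency receives +S_k and the
    other receives -S_k. A tie is resolved in favour of the first type,
    which is an assumption made for this sketch.
    """
    if sr_rate_1 >= sr_rate_2:
        return (+s_k, -s_k)
    return (-s_k, +s_k)


# Hypothetical inputs: odds 3.0 and chi-square value 10.0.
s_k = weight_coefficient(3.0, 10.0)
weights = signed_weights(60.0, 20.0, s_k)
```

When the two significant efficiencies are equal, O_k = 1 and S_k = 0, so the tie-breaking choice has no numerical effect.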
[0042]
The sign assignment process is executed for each genotype in this manner, and the obtained signed weight coefficients S_k and −S_k are stored in the validity determination database 3.
[0043]
Thus, the creation of the validity determination database 3 is completed.
[0044]
FIG. 5 shows the significant efficiency, the odds, the χ² value c_k, the weight coefficient S_k and so on calculated when the validity determination database 3 was created. For reference, FIG. 5 also shows the numbers of significant and non-significant samples. The weight coefficient S_k column shows the weight coefficient S_k with the sign assigned to each type.
[0045]
Next, the validity determining process performed by the validity determining unit 22 using the validity determination database 3 will be described with reference to the flowchart of FIG.
[0046]
First, human genotype information and virus genotype information are obtained from a patient as a subject (s61). Specifically, a sample derived from an individual is obtained, the obtained sample is injected into a cell of a DNA chip, and a hybridization reaction with a DNA probe immobilized in the cell is performed. After the hybridization reaction, a buffer or an intercalating agent is filled in the cell, and an electrochemical signal from an electrode fixed in the cell is detected. By subjecting this electrochemical signal to signal processing, human genotype information and virus genotype information can be obtained. Of course, information from the subject may be obtained by another method.
[0047]
Next, subject data including the obtained human genotype information and virus genotype information of the subject is input using the input device 1 (s62). Instead of the input device 1, a DNA chip may be connected by a signal line, and human genotype information and virus genotype information may be input from the DNA chip via the signal line.
[0048]
The processing device 2 determines validity based on the input genotype information and virus genotype information (s63).
[0049]
Specifically, the validity determination program is read from the storage medium 6 by the storage medium reading device 5, whereby the processing device 2 functions as the validity determination unit 22. The validity judging means 22 first reads out a data table corresponding to the inputted virus genotype information from the validity judging database 3 (s63a). Then, by multiplying the sign corresponding to the input genotype by the weighting coefficient, a human efficacy prediction individual value is obtained for each genotype (s63b). Then, the obtained individual validity prediction values for each genotype are added to calculate a predicted validity added value (s63c).
[0050]
FIG. 7 shows a conceptual diagram of the calculation of the effectiveness prediction added value. As shown in FIG. 7, when the input subject data is G/T, C/C, YA, A/A, 5/5 and T/T for MxA-88, MxA-123, MBL, LMP7, IFNAR1 (GT)n and IFNAR1 C/T, respectively, the weight coefficient S_k multiplied by +1 is assigned to each of MxA-88, MBL, LMP7, IFNAR1 (GT)n and IFNAR1 C/T. For MxA-123, the weight coefficient S_k multiplied by −1 is assigned. The resulting signed weight coefficients are as shown in the figure. Adding these signed weight coefficients (effectiveness prediction individual values) gives the effectiveness prediction added value: (+0.41) + (−0.40) + (+0.28) + (+0.02) + (+0.28) + (+0.00) = 0.59.
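The calculation of steps (s63a) to (s63c) can be sketched as a table lookup followed by a sum. The table below is reconstructed only from the worked example and the two-type classifications given earlier; its layout and the names `WEIGHT_TABLE` and `added_value` are hypothetical and do not reproduce FIG. 3.

```python
# gene -> (S_k, allele types that receive the + sign), taken from the worked example.
WEIGHT_TABLE = {
    "MxA-88": (0.41, {"G/T", "T/T"}),
    "MxA-123": (0.40, {"C/A", "A/A"}),
    "MBL": (0.28, {"YA"}),
    "LMP7": (0.02, {"C/A", "A/A"}),
    "IFNAR1 (GT)n": (0.28, {"5/5", "5/14"}),
    "IFNAR1 C/T": (0.00, {"C/T", "T/T"}),
}


def added_value(subject: dict) -> float:
    """Sum the signed weight coefficients for the subject's allele types."""
    total = 0.0
    for gene, allele in subject.items():
        s_k, positive_types = WEIGHT_TABLE[gene]
        total += s_k if allele in positive_types else -s_k
    return total


subject = {
    "MxA-88": "G/T",
    "MxA-123": "C/C",
    "MBL": "YA",
    "LMP7": "A/A",
    "IFNAR1 (GT)n": "5/5",
    "IFNAR1 C/T": "T/T",
}
prediction = added_value(subject)  # about 0.59, up to floating-point rounding
```

In the described system a separate table would be selected for each virus genotype before this lookup is performed.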
[0051]
Next, the processing device 2 outputs a validity prediction addition value as a validity determination result to the output device 4 (s64). The attending physician or patient can predict the effectiveness of the treatment based on the output effectiveness prediction addition value, and use it as a reference when determining whether or not to select the treatment method. In general, the higher the effectiveness prediction addition value, the higher the effectiveness of the treatment.
[0052]
The determination of the validity is not limited to the calculation of the predicted effectiveness addition value. For example, a significant efficiency corresponding to the effectiveness prediction addition value may be calculated.
[0053]
FIG. 8 is a diagram illustrating an output example of the validity determination result. As shown in FIG. 8, the output includes the identification ID of the patient, the attending physician, information on the patient's genotype, the effectiveness prediction added value, the virus genotype, and, for each virus genotype, the significant efficiency corresponding to the effectiveness prediction added value. The effectiveness prediction added values are classified into predetermined ranges, and the significant efficiency is calculated for each range. Then, using a predetermined threshold on the significant efficiency (for example, 50% or more versus less than 50% in FIG. 8), an efficacy prediction boundary indicating whether the treatment is likely to be effective is determined, and a boundary line is drawn between the corresponding ranges.
[0054]
The significant efficiency for each range of the effectiveness prediction added value can be calculated easily. Specifically, the effectiveness prediction added value is calculated for all the sample data, and the obtained values are classified into predetermined ranges. In the example of FIG. 8, the ranges are 0 to 1.00, 1.00 to 2.00, 2.00 to 3.00, and so on. Whether each sample in a range was significant is known from the validity information of the sample data stored in the sample statistical database 12. The significant efficiency for each range is then calculated as the ratio of the number of significant samples to the total number of samples in that range. By outputting the significant efficiency for each range together with the effectiveness prediction added value, the effectiveness of the treatment for the virus genes or the human genes of the patient who is the host can easily be grasped as a probability. The host is not limited to humans and may be another organism.
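The per-range calculation can be sketched as follows. The sample values are hypothetical, and treating each range as half-open (lower bound included, upper bound excluded) is an assumption, since the text does not say to which range a boundary value belongs.

```python
import math
from collections import defaultdict


def sr_rate_by_range(samples, width=1.0):
    """Significant efficiency (%) per range of the added value.

    samples: iterable of (added_value, is_significant) pairs.
    Returns {(lower, upper): rate}, with lower <= value < upper (assumed).
    """
    counts = defaultdict(lambda: [0, 0])  # range index -> [significant, total]
    for value, is_significant in samples:
        index = math.floor(value / width)
        counts[index][1] += 1
        if is_significant:
            counts[index][0] += 1
    return {
        (i * width, (i + 1) * width): 100.0 * sr / total
        for i, (sr, total) in sorted(counts.items())
    }


# Hypothetical sample data: (effectiveness prediction added value, significant?).
samples = [
    (0.2, False), (0.7, False), (0.9, True),
    (1.3, True), (1.8, False),
    (2.5, True), (2.9, True),
]
rates = sr_rate_by_range(samples)
```

A subject's added value can then be looked up in the resulting table to report the significant efficiency of the range it falls in.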
[0055]
Thus, the generation of the validity determination database 3 and the validity determination using the same are completed.
[0056]
Each time the sample statistical database 12 is updated, the validity determination database 3 can be updated by performing the generation processing of the validity determination database 3 shown in (s1) to (s5) described above.
[0057]
Further, the validity determination database 3 can be updated by further inputting sample data other than the sample statistical database 12.
[0058]
In this way, by updating the validity determination database 3 each time the number of specimens increases, the treatment efficacy can be determined based on the latest statistical information, and the accuracy of the determination improves as the population grows.
[0059]
As described above, according to the present embodiment, the effectiveness of the treatment can be easily determined by probability according to the genotype of the virus to be treated or the genotype of the human.
[0060]
(2nd Embodiment)
FIG. 9 is a diagram showing an overall configuration of a therapy effectiveness predicting apparatus according to the second embodiment of the present invention.
[0061]
As shown in FIG. 9, the basic configuration of the therapy effectiveness prediction device 200 of the present embodiment is common to the configuration of the therapy effectiveness prediction device 100 shown in FIG. 1. The same components are denoted by the same reference numerals, and detailed description is omitted.
[0062]
In the therapy effectiveness prediction device 200, the processing performed by the database generation unit 210 and the validity determination unit 220, the programs stored in the storage medium 60, the configuration of the validity determination database 300, and the configuration of the sample statistical database 120 differ from those of the therapy effectiveness prediction device 100 of FIG. 1. The storage medium 60 stores a database generation program and a validity determination program for causing the processing device 2 to function as the database generation unit 210 and the validity determination unit 220. The validity determination database 300 stores the validity determination data generated by the database generation unit 210.
[0063]
The database generation unit 210 executes data classification processing, variance-covariance matrix calculation processing, average difference matrix calculation processing, discrimination coefficient calculation processing, discrimination point calculation processing, discriminant function calculation processing, significant efficiency and misclassification rate calculation processing, and the like. The validity determining unit 220 executes subject gene information input processing, validity determination data selection processing, discriminant function value calculation processing, discriminant function value positive / negative determination processing, determination result output processing, and the like.
[0064]
FIG. 10 shows an example of the configuration of the sample statistical database 120. As shown in FIG. 10, a specimen number, a plurality of human genotype identification information, a plurality of virus genotype identification information, and efficacy information are stored in each field of the database in association with each specimen.
[0065]
The sample number is a number used as a sample identification number for identifying a patient as a sample. The human genotype identification information is information for identifying a patient's genotype. The virus genotype identification information is information for identifying the genotype of the virus infecting the patient. The effectiveness information is information indicating the effectiveness of the treatment for the patient.
[0066]
Virus and human genes are normally described by four bases abbreviated A, T, G and C, and the sequence of these bases is also given for the sample data. Gene sequences differ in part from individual to individual, and these differences contribute to biological characteristics. From the standpoint of medical application, such slight differences in gene sequence are known to be related to a person's responsiveness to drugs. FIG. 10 shows these sequence differences in quantified form.
[0067]
In the example of FIG. 10, information on SNPs such as MxA-88, MxA-123, MBL, and LMP7 is given as human genotype identification information related to interferon treatment of hepatitis C. In the effectiveness information, a significant effect is indicated by “2” and a non-significant effect by “1”. More specifically, in the example of FIG. 10, the efficacy of interferon therapy, that is, the treatment of administering interferon to a patient with hepatitis C, is indicated at two levels, significant or non-significant.
[0068]
Human genotype information is distinguished by two or more pieces of information. For example, in the human genotype MxA-88, numerical values “1”, “2”, and “3” are associated with each of three types of alleles, G / G, G / T, and T / T. In addition, it is also possible to classify these three types into two based on medical knowledge. For example, G / G and G / T are associated with “1”, and T / T is associated with “2”. In MxA-123, “1”, “2”, and “3” are associated with each of the three types of alleles, C / C, C / A, and A / A. In the MBL, “1” and “2” are associated with each of two types of alleles, XB and YA. In LMP7, "1", "2", and "3" are associated with each of the three allele types, C / C, C / A, and A / A.
[0069]
In the example of FIG. 10, the genotype and ISDR are given as the genotype information of the hepatitis C virus. The genotype is indicated by “1” or “2”, corresponding to HCV-1b and HCV-2a/2b, respectively. Since ISDR cannot be expressed as a simple base difference, the number of amino acid mutations in the ISDR of the hepatitis C virus is used. Of course, as with MxA-88 and the like, it may be classified into two or more types from a medical viewpoint. For example, “1” may be assigned to three or fewer mutations, and “2” to four or more mutations.
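The quantification described for FIG. 10 can be sketched as a set of lookup tables. The numerical codes for the human genotypes follow the text; the dictionary layout, the function name, the field names, and the reading that virus genotype code 1 is HCV-1b and code 2 is HCV-2a/2b are assumptions made for illustration.

```python
# Allele type -> numerical value, as listed in the text for each human genotype.
HUMAN_CODES = {
    "MxA-88": {"G/G": 1, "G/T": 2, "T/T": 3},
    "MxA-123": {"C/C": 1, "C/A": 2, "A/A": 3},
    "MBL": {"XB": 1, "YA": 2},
    "LMP7": {"C/C": 1, "C/A": 2, "A/A": 3},
}
# Assumed reading of the virus genotype codes "1" and "2".
VIRUS_GENOTYPE_CODES = {"HCV-1b": 1, "HCV-2a/2b": 2}


def encode_specimen(human: dict, virus_genotype: str, isdr_mutations: int) -> dict:
    """Convert one specimen's genotype information into numerical variables."""
    row = {gene: HUMAN_CODES[gene][allele] for gene, allele in human.items()}
    row["genotype"] = VIRUS_GENOTYPE_CODES[virus_genotype]
    row["ISDR"] = isdr_mutations  # number of amino acid mutations in the ISDR
    return row


row = encode_specimen({"MxA-88": "G/T", "MBL": "YA"}, "HCV-1b", 6)
```

An unknown allele type raises a `KeyError`, which makes unexpected input visible rather than silently assigning a code.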
[0070]
Note that the quantification method shown in FIG. 10 is merely an example. Other numerical values may be assigned to each genotype type, or other types or numerical values indicating the state of the gene may be used. Desirably, the assignment of numerical values that minimizes the misclassification rate is selected. The statistical database storing the sample data need not be stored in the format of the sample statistical database 120. In that case, the sample statistical database 120 may be generated from the statistical database by the above-described categorization and quantification method.
[0071]
FIG. 11 shows an example of the configuration of the validity determination database 300. As shown in FIG. 11, the linear discriminant function, its variables, the effective group significant efficiency, the effective group misclassification rate, the invalid group significant efficiency, and the invalid group misclassification rate are stored in the fields of the database in association with one another.
[0072]
The linear discriminant function is a function for discriminating the validity of the therapy with respect to the genetic information of the specimen as an input. This linear discriminant function is expressed as a function of a plurality of variables a, b, c, d. Equation (1) that generalizes this discriminant function is shown below.
[0073]
f (a, b, c, d) = p × a + q × b + r × c + s × d + t (1)
In equation (1), p, q, r, and s are coefficients determined by the generation processing of the validity determination database 300, and t is a constant. Note that the function f does not need to be represented by all four variables a, b, c, and d, but may be represented by at least one of them. The constant t is a numerical value given to determine whether the treatment method is effective or ineffective based on whether the function value is positive or negative. A positive value of the discriminant function indicates that the therapy is effective, and a negative value indicates that the therapy is not effective. The coefficients p, q, r, s and the constant t of these discriminant functions are determined in accordance with ordinary statistical techniques (for example, see Koji Kurihara, "Data Science", Open University of Japan Education Promotion Association).
[0074]
The effective group significant efficiency indicates the ratio of sample data associated with significant effect to sample data belonging to an area determined to be valid by the linear discriminant function.
[0075]
The invalid group significant efficiency indicates the ratio of sample data associated with a significant effect to the sample data belonging to the region determined to be invalid by the linear discriminant function.
[0076]
The effective group misclassification rate is a numerical value indicating the probability that a case that is actually significant is incorrectly determined to be invalid by the discriminant function, and the invalid group misclassification rate is a numerical value indicating the probability that a case that is actually non-significant is incorrectly determined to be valid by the discriminant function.
[0077]
FIGS. 11(a) to 11(c) all show effectiveness determination data for determining the efficacy of interferon treatment for hepatitis C virus. FIG. 11(b) shows effectiveness determination data generated for the sample data in which the genotype of the hepatitis C virus corresponds to HCV-1b. FIG. 11(c) shows effectiveness determination data generated for the sample data corresponding to HCV-2. FIG. 11(a) shows effectiveness determination data generated for all of the sample data corresponding to HCV-1b and the sample data corresponding to HCV-2.
[0078]
One discriminant function is provided for each combination of virus genotype and human genotype. Therefore, when there are a plurality of genotype combinations, a discriminant function is given for each combination. As shown in FIG. 11(a), one discriminant function each is given for the case where only geno-type (the virus genotype) is considered, the case where only MxA-123 is considered, and the case where only MxA-88 is considered. In addition, one discriminant function is given for the combination of geno-type, MxA-123 and MxA-88, and one for the combination of MxA-123 and MxA-88. Similarly, in FIG. 11(b), a discriminant function is given for each of eight genotype combinations, and in FIG. 11(c), for each of five genotype combinations.
[0079]
Next, a method for creating the validity determination database 300 will be described with reference to the flowchart in FIG.
[0080]
First, the sample data is read from the sample statistical database 120, and each sample is classified into one of two data groups according to whether its effect was significant or non-significant (s111). The classified data groups are shown in FIG. 13. FIG. 13(a) shows the data group [A] of sample data with a significant effect, and FIG. 13(b) shows the data group [B] of sample data with a non-significant effect. The genotype information consists of the numerical values of the human genotype information and virus genotype information of each sample, quantified as in the sample statistical database 120. When the virus genotype of the subject is known in advance, only the sample data for that virus genotype may be extracted, for example, and the data groups [A] and [B] may be formed from the extracted data.
[0081]
Next, the variance-covariance matrix V_A is calculated for the classified data group [A], and the variance-covariance matrix V_B is calculated for the classified data group [B] (s112).
[0082]
The variance-covariance matrix V_A is the matrix obtained by treating the human genotype information and virus genotype information in the data group [A] as variables and calculating, for every pair of variables, the covariance, which indicates the degree of linear tendency between the two variables. Likewise, the variance-covariance matrix V_B is the matrix obtained by treating the human genotype information and virus genotype information in the data group [B] as variables and calculating the covariance for every pair of variables.
[0083]
The variance-covariance matrix V_A is defined by the following equation (2), and the variance-covariance matrix V_B by the following equation (3).
[0084]
(Equation 1)

[0085]
In this case, the matrix elements V^ij_A and V^ij_B of the variance-covariance matrices are calculated by the following equations (4) and (5).
[0086]
(Equation 2)

[0087]
Next, the variance-covariance matrix V_tot for the data groups [A] and [B] as a whole is calculated (s113). The matrix element V^ij_tot of the overall variance-covariance matrix is given by the following equation (6).
[0088]
(Equation 3)

[0089]
Next, an average difference matrix d is calculated (s114). The average difference matrix d is represented by the following equation (7).
[0090]
(Equation 4)

[0091]
Next, the discrimination coefficient A is calculated (s115). The discrimination coefficient A is given by the following equation (8), and is written in matrix form as equation (9).
[0092]
(Equation 5)

[0093]
A = V_tot⁻¹ × d … (9)
Next, the discrimination point Y₀ is calculated (s116). The discrimination point Y₀ is given by the following equation (10).
[0094]
(Equation 6)

[0095]
The discriminant function Z is determined from the discrimination coefficient A and the discrimination point Y₀ obtained in (s115) and (s116) (s117). The discriminant function Z is given by the following equation (11).
[0096]
Z = a₁X₁ + a₂X₂ + ... + a_pX_p − Y₀ … (11)
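Steps (s112) to (s117) can be illustrated with a minimal sketch of textbook linear discriminant analysis. Because equations (2) to (8) and (10) are not reproduced in the text, the pooling divisor (n_A + n_B − 2) and the discrimination point taken at the midpoint of the two group means are assumptions of this sketch; only A = V_tot⁻¹ × d and the form of Z follow equations (9) and (11). All function names and sample values are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def scatter_matrix(rows, mean):
    """Sum of products of deviations from the mean for every pair of variables."""
    p = len(mean)
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows)
             for j in range(p)] for i in range(p)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("pooled variance-covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_discriminant(group_a, group_b):
    """Return (coefficients a_1..a_p, discrimination point Y0)."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    s_a, s_b = scatter_matrix(group_a, mean_a), scatter_matrix(group_b, mean_b)
    dof = len(group_a) + len(group_b) - 2  # assumed pooling divisor
    p = len(mean_a)
    v_tot = [[(s_a[i][j] + s_b[i][j]) / dof for j in range(p)] for i in range(p)]
    d = [ma - mb for ma, mb in zip(mean_a, mean_b)]  # average difference
    coeffs = solve(v_tot, d)  # A = V_tot^-1 * d, as in equation (9)
    # Assumed discrimination point: Z is zero midway between the group means.
    y0 = sum(c * (ma + mb) / 2 for c, ma, mb in zip(coeffs, mean_a, mean_b))
    return coeffs, y0


def discriminant_value(coeffs, y0, x):
    """Z = a_1*X_1 + ... + a_p*X_p - Y0, as in equation (11)."""
    return sum(c * v for c, v in zip(coeffs, x)) - y0


# Hypothetical numerically coded specimens with two variables each.
group_a = [[2, 3], [3, 3], [3, 2], [2, 2]]  # data group [A], significant effect
group_b = [[1, 1], [2, 1], [1, 2], [2, 2]]  # data group [B], non-significant effect
coeffs, y0 = fit_discriminant(group_a, group_b)
```

With data group [A] taken as the significant group, a positive Z places a specimen on the effective side, which matches the sign convention used for the discriminant function.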
Next, the sample data is applied to the discriminant function Z, and the effective group significant efficiency, the effective group misclassification rate, the invalid group significant efficiency, and the invalid group misclassification rate are calculated (s118).
[0097]
More specifically, among the samples determined to be effective by the discriminant function Z, that is, the samples with Z > 0, let SR₊ be the number that actually showed a significant effect and NR₊ the number that actually showed a non-significant effect. The effective group significant efficiency is then calculated as SR₊ × 100 / (SR₊ + NR₊). Likewise, among the samples determined to be invalid by the discriminant function Z, that is, the samples with Z < 0, let NR₋ be the number that actually showed a non-significant effect and SR₋ the number that actually showed a significant effect. The invalid group significant efficiency is then calculated as SR₋ × 100 / (SR₋ + NR₋). Ideally, the effective group significant efficiency is 100 and the invalid group significant efficiency is 0.
[0098]
The effective group misclassification rate is SR₋ × 100 / (SR₊ + SR₋). The invalid group misclassification rate is NR₊ × 100 / (NR₊ + NR₋).
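The four statistics of step (s118) can be computed directly from the four counts. The function name, the dictionary keys, and the counts below are hypothetical.

```python
def group_rates(sr_pos: int, nr_pos: int, sr_neg: int, nr_neg: int) -> dict:
    """Significant efficiencies and misclassification rates in percent.

    sr_pos / nr_pos: significant / non-significant samples with Z > 0.
    sr_neg / nr_neg: significant / non-significant samples with Z < 0.
    """
    return {
        "effective_group_sr_rate": 100.0 * sr_pos / (sr_pos + nr_pos),
        "invalid_group_sr_rate": 100.0 * sr_neg / (sr_neg + nr_neg),
        "effective_group_misclassification": 100.0 * sr_neg / (sr_pos + sr_neg),
        "invalid_group_misclassification": 100.0 * nr_pos / (nr_pos + nr_neg),
    }


# Hypothetical counts obtained by applying a discriminant function to sample data.
rates = group_rates(sr_pos=40, nr_pos=10, sr_neg=10, nr_neg=40)
```

The significant efficiencies are computed within each predicted group, whereas the misclassification rates are computed within each actual outcome, so the four values need not sum to any fixed total.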
[0099]
The obtained discriminant function Z, effective group significant efficiency, effective group misclassification rate, invalid group significant efficiency, and invalid group misclassification rate are stored in the validity determination database 300 as validity determination data in association with one another (s119).
[0100]
The processes shown in (s112) to (s118) are repeated for each combination of genotypes. That is, the variance-covariance matrices V_A and V_B generated in (s112) are set for each of a plurality of genotype combinations. Then, by executing the processing shown in (s113) to (s118) on the set variance-covariance matrices V_A and V_B, effectiveness determination data is obtained for each genotype combination.
[0101]
Next, the validity determining process by the validity determining unit 220 using the validity determination database 300 will be described with reference to the flowchart of FIG.
[0102]
First, human genotype information and virus genotype information are obtained from a patient as a subject (s61). The method of acquisition is as described in the first embodiment, and a detailed description thereof is omitted.
[0103]
Next, the obtained human genotype information and virus genotype information are input using the input device 1 (s62).
[0104]
The processing device 2 determines the validity based on the input genotype information and virus genotype information (s131).
[0105]
Specifically, the validity determination program is read from the storage medium 60 by the storage medium reading device 5, whereby the processing device 2 functions as the validity determination unit 220. The validity determining means 220 searches for validity determination data having a combination of variables corresponding to the input human genotype information and virus genotype information, and selects, for example, a plurality of pieces of validity determination data (s131a).
[0106]
In this (s131a), the validity determination database 300 generation processing shown in (s111) to (s119) described above may be executed every time the subject data is input. In this case, the effectiveness determination database 300 generated according to the virus genotype of the subject data may be limited to a specific virus genotype.
[0107]
Next, the human genotype information corresponding to the genotype determined as the variable of the selected validity determination data is substituted for each variable of the linear discriminant function, and the discriminant function value Z₁ is obtained (s131b).
[0108]
Next, the sign of the discriminant function value Z₁ is evaluated (s131c).
[0109]
If the discriminant function value Z₁ > 0, the effective group significant efficiency and the effective group misclassification rate associated with the discriminant function Z are read out and output to the output device 4 together with the discrimination result “valid” (s132).
[0110]
If the discriminant function value Z₁ < 0, the invalid group significant efficiency and the invalid group misclassification rate associated with the discriminant function Z are read out and output to the output device 4 together with the discrimination result “invalid” (s132).
[0111]
A specific example of the determination method is given for the case where ISDR = 6, MBL = 2, MxA-123 = 3, and MxA-88 = 2 are input as subject data of a patient infected with HCV-1b. Attention is paid to the ninth discriminant function Z = 0.76a + 1.71b - 0.04c + 0.92d - 6.34 in FIG. In (s131b), substituting the subject data into the discriminant function Z gives the discriminant function value Z₁ = (0.76 × 6) + (1.71 × 2) - (0.04 × 3) + (0.92 × 2) - 6.34 = 3.36. Here, 0.76 × 6 is the individual value for predicting virus efficacy, and 1.71 × 2, -(0.04 × 3), and 0.92 × 2 are the individual values for predicting human efficacy. Since Z₁ = 3.36 > 0, the judgment result “valid” is output to the output device 4 together with the effective group significant efficiency of 52%, the effective group misclassification rate of 18%, the invalid group significant efficiency of 4%, and the invalid group misclassification rate of 16% associated with the ninth discriminant function.
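The arithmetic of this worked example can be reproduced with a short sketch. The helper name `discriminant_value` and the dictionary layout are assumptions made for illustration; only the coefficients, the constant term, and the subject data are taken from the text.

```python
# Coefficients of the ninth discriminant function quoted in the text:
#   Z = 0.76a + 1.71b - 0.04c + 0.92d - 6.34
# The mapping of a..d to ISDR, MBL, MxA-123, MxA-88 follows the worked example.
COEFFICIENTS = {"ISDR": 0.76, "MBL": 1.71, "MxA-123": -0.04, "MxA-88": 0.92}
CONSTANT = -6.34


def discriminant_value(subject, coefficients=COEFFICIENTS, constant=CONSTANT):
    """Return (Z1, individual values) for one subject (hypothetical helper)."""
    individual = {name: coef * subject[name] for name, coef in coefficients.items()}
    return sum(individual.values()) + constant, individual


subject = {"ISDR": 6, "MBL": 2, "MxA-123": 3, "MxA-88": 2}
z1, individual = discriminant_value(subject)

# The text covers only Z1 > 0 and Z1 < 0; treating Z1 == 0 as "invalid"
# is an assumption of this sketch.
result = "valid" if z1 > 0 else "invalid"
print(f"Z1 = {z1:.2f} -> {result}")
```

The per-variable products are returned separately so that the virus-side term (ISDR) and the human-side terms can be inspected as individual prediction values.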
[0112]
This determination result indicates that the treatment has been determined to be effective for the subject. Furthermore, for the data group previously determined to be valid by this discriminant function, it shows that the proportion of data that actually had a significant effect, as determined, was 52%, and that the proportion of data that was erroneously determined to be invalid although the effect was actually significant was 18%. In addition, for the data group previously determined to be invalid by this discriminant function, it shows that the proportion of data that had a significant effect contrary to the discrimination result was 4%, and that the misclassification rate was 16%. Based on this determination result, the doctor can select an appropriate treatment.
[0113]
Before the determination result is output, it may be determined whether the read effective group misclassification rate or invalid group misclassification rate is equal to or less than a predetermined threshold value (for example, 20%), and the determination result and the data related to it may be output only when it is. Further, the data may be output in order of the read effective group misclassification rate or invalid group misclassification rate. This reduces the risk of determining validity based on unreliable data with a high misclassification rate, and makes it possible to predict validity based on highly reliable data with a low misclassification rate.
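The threshold check and ordering described here could be sketched as follows. The class and field names are hypothetical, the ascending sort order is an assumption (the text says only that data may be output in order of the rate), and apart from the 18% quoted for the ninth function, the rates are illustrative values rather than data from the source.

```python
from dataclasses import dataclass


@dataclass
class Determination:
    # Field names are hypothetical; the text only names the quantities.
    function_id: int
    result: str                    # "valid" or "invalid"
    misclassification_rate: float  # rate read for the matching group, in percent


def select_reliable(determinations, threshold=20.0):
    """Keep results at or below the threshold, lowest misclassification rate first."""
    kept = [d for d in determinations if d.misclassification_rate <= threshold]
    return sorted(kept, key=lambda d: d.misclassification_rate)


candidates = [
    Determination(9, "valid", 18.0),    # rate quoted for the ninth function
    Determination(3, "valid", 31.0),    # illustrative, not from the source
    Determination(5, "invalid", 12.0),  # illustrative, not from the source
]
for d in select_reliable(candidates):
    print(d.function_id, d.result, d.misclassification_rate)
```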
[0114]
The update of the validity determination database 300 can be performed in the same manner as the update processing of the validity determination database 3 of the first embodiment.
[0115]
As described above, according to the present embodiment, the effectiveness of the treatment can be judged easily from the determination of whether the treatment is effective or ineffective according to the genotype of the virus to be treated or the genotype of the human, together with the misclassification rate of that determination.
[0116]
In addition, such a mechanism will further clarify the relationship between drug efficacy and the gene level, and because the database is updated as the sample population increases, a more accurate and reliable prediction system can be built.
[0117]
In the second embodiment, the coefficients of the discriminant function Z are determined in accordance with a general statistical technique, on the assumption that there is no a priori significant correlation among the human genotype information, among the virus genotype information, or between the human genotype information and the virus genotype information. That is, each coefficient was determined based on the correlation between one item of human genotype information and the effectiveness of the treatment, or the correlation between one item of virus genotype information and the effectiveness of the treatment. As a result, a linear discriminant function that requires only a small amount of calculation is obtained. This rests on the premise that, when the correlation of the data groups is not known in advance, each data group is normally distributed.
[0118]
However, the present invention is not limited to this. For example, when a clear correlation between data factors is known in advance, such as a correlation between one item of human genotype information and another item of human genotype information, between one item of human genotype information and virus genotype information, or between one item of virus genotype information and another item of virus genotype information, a reduction operation to independent data factors or a determination operation using a nonlinear discriminant function may be performed. More specifically, coefficients for those correlations may be obtained anew, and a term obtained by multiplying each coefficient by the variable corresponding to the correlation may be added to the discriminant function Z. Therefore, not only the linear discriminant function shown in the second embodiment but also a discriminant function using a quadratic function, or cluster analysis, which is a multidimensional analysis technique, may be adopted. Any data analysis technique employed here that is known as an algorithm is, of course, within the scope of the present invention. In this effectiveness prediction system, the discriminant function is given to the system in advance. The present inventors varied the size of the population to some extent and examined how much this affected the coefficients of the discriminant function, and confirmed that the effect was only a few percent. Therefore, if a population of a certain size is used, a general-purpose discriminant function can be constructed. However, the results depend on the population of the sample data, and differences due to race, region, and lifestyle can naturally arise.
Although the present prediction system can be implemented in a clinical genetic testing device, such a device may also be provided with a communication means, so that a set of discriminant functions determined using new sample data as the population can be received via the communication means and passed on to the testing device.
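One way to read the additional correlation term mentioned in paragraph [0118] is as a pairwise product term added to the linear discriminant function, which makes it quadratic in the variables. The sketch below assumes that reading; the function name, the choice of the MxA-123 / MxA-88 pair, and the coefficient 0.10 are all hypothetical, while the linear coefficients and subject data are those of the ninth discriminant function quoted earlier.

```python
def discriminant_with_interactions(subject, linear, constant, interactions=None):
    """Linear discriminant value plus optional pairwise correlation terms.

    `interactions` maps a pair of variable names to a coefficient, and each
    pair contributes coefficient * x_i * x_j. This form is an assumption.
    """
    z = constant + sum(coef * subject[name] for name, coef in linear.items())
    for (first, second), coef in (interactions or {}).items():
        z += coef * subject[first] * subject[second]
    return z


linear = {"ISDR": 0.76, "MBL": 1.71, "MxA-123": -0.04, "MxA-88": 0.92}
subject = {"ISDR": 6, "MBL": 2, "MxA-123": 3, "MxA-88": 2}

# Without interaction terms this reduces to the linear discriminant function.
z_linear = discriminant_with_interactions(subject, linear, -6.34)

# Hypothetical coefficient for a correlation assumed to be known in advance.
z_quadratic = discriminant_with_interactions(
    subject, linear, -6.34, {("MxA-123", "MxA-88"): 0.10}
)
print(round(z_linear, 2), round(z_quadratic, 2))
```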
[0119]
The discriminant function described in the present embodiment is merely one derived from particular sample data, and its values are not universal. However, according to the method described here, a discriminant function that is suited to each diagnosis and always reflects the latest information can be provided. Further, such a database of discriminant functions and misclassification rates can also be distributed on a recording medium such as a magnetic medium, without using communication.
[0120]
The present invention is not limited to the first and second embodiments.
[0121]
Although the sample statistical database 12 has been shown as connected to the computer 10 via the communication network 11, the present invention is not limited to this, and the sample statistical database 12 may instead be stored in a storage device provided in the computer 10, without going through the communication network 11.
[0122]
Further, in the above embodiments, an example has been shown in which the present invention is applied to predicting whether treatment of hepatitis C by administration of interferon is effective, but the present invention is not limited to this. Treatment includes not only administration of interferon but also other treatments, including treatment without administration of a drug. The virus targeted by the treatment is not limited to the hepatitis C virus, and any other virus can be a target of the present invention. Further, the present invention can readily be applied to the treatment of infection not only by viruses but by any infectious agent having genes, such as bacteria and fungi. In addition, although the case where the infected host is a human has been described, the present invention is not limited to this.
[0123]
Further, in the first embodiment, an example has been described in which the validity determination database 3 is generated based on the sample statistical database 12, but the present invention is not limited to this. For example, the validity determination database 3 may be generated based on a database such as the sample statistical database 120 of the second embodiment. In this case, for the ISDR and the virus genotypes shown in the sample statistical database 120, the χ² values, weighting factors, and the like are calculated in the same manner as for human genotypes such as MxA-88 in FIG. In this case, the first embodiment and the second embodiment can generate the same sample statistical database.
[0124]
FIG. 15 shows a sample statistical database 12a according to a modification of the first embodiment in which viruses are categorized. As shown in FIG. 15, MxA-88, MxA-123, MBL, LMP7, IFNAR1 (GT)_n, IFNAR1 C/T, and the like relating to human genotype information, as well as information on HCV-1b, HCV-2a, and HCV-2b relating to virus genotype information, are associated with each sample. FIG. 16 shows an example in which HCV-1b is categorized as the first type and HCV-2a/2b as the second type, and weight coefficients including the virus genotype are calculated. The weight coefficients shown in FIG. 16 are multiplied by the human genotype information and the virus genotype information obtained from the subject, and the products are added together to obtain an effectiveness prediction addition value that includes the virus genotype. Although FIGS. 15 and 16 show an example in which the weight coefficient is calculated for only one item of virus genotype information, the weight coefficient may of course be calculated in the same way for other items of virus genotype information.
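The calculation described for FIG. 16 amounts to a weighted sum over human and virus genotype values. The following is a minimal sketch under stated assumptions: the function name, the numeric encoding of the genotype information, and every weight value are hypothetical, since the actual coefficients of FIG. 16 are not reproduced in the text.

```python
def effectiveness_prediction_sum(host_info, agent_info, host_weights, agent_weights):
    """Multiply each genotype value by its weight coefficient and add the products.

    Returns the addition value together with the per-genotype individual values
    for the host (human) side and the infectious agent (virus) side.
    """
    host_values = {g: host_weights[g] * v for g, v in host_info.items()}
    agent_values = {g: agent_weights[g] * v for g, v in agent_info.items()}
    total = sum(host_values.values()) + sum(agent_values.values())
    return total, host_values, agent_values


# Hypothetical weight coefficients (not the values of FIG. 16).
host_weights = {"MxA-88": 1.5, "MBL": -0.5}
agent_weights = {"HCV category": 2.0}

# Hypothetical encoding: genotype information as small integers, with the
# virus category set to 1 for the first type (HCV-1b).
host_info = {"MxA-88": 1, "MBL": 2}
agent_info = {"HCV category": 1}

total, host_values, agent_values = effectiveness_prediction_sum(
    host_info, agent_info, host_weights, agent_weights
)
print(total, host_values, agent_values)
```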
[0125]
Further, the above-described generation of the validity determination databases 3 and 300 may be performed automatically and repeatedly every time the sample statistical database 12 is updated, so that the accuracy of the validity determination can be improved. Further, not only when the sample statistical database 12 is updated but also when sample data is obtained by another route, that sample data may be added to the sample data obtained from the sample statistical database 12, and the validity determination database 3 or 300 may be generated.
[0126]
Further, in this embodiment, the case where the validity determination is performed by the computer 10 having the validity determination database has been described, but the present invention is not limited to this. For example, the genotype data of the subject may be transmitted from a terminal connected to the communication network 11 as a determination request to the computer 10, and in response to this request the computer 10 may execute the above-described validity determination processing and transmit the determination result to the terminal.
[0127]
Further, in the first embodiment, an example in which a plurality of human genotype information is classified into two types and a variance-covariance matrix is calculated for each of the types has been described. However, the present invention is not limited to this. For example, the information may be classified into three or more types, and the significant efficiency SRrate, the odds O_k, and the weight coefficient S_k may be calculated for each of the types. Similarly, in the second embodiment, variance-covariance matrices V_A, V_B, V_C, ... may be calculated in (s112), and the overall variance-covariance matrix V_tot may be calculated based on these.
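As an illustration of combining three or more group matrices into V_tot, the sketch below uses the pooled within-group estimate that is common in linear discriminant analysis. This formula is an assumption: the passage does not restate the formula used in (s112), and the matrices and group sizes shown are made-up values.

```python
def pooled_covariance(group_matrices, group_sizes):
    """Combine per-group variance-covariance matrices into one overall matrix.

    Uses the pooled within-group estimate
        V_tot = sum_k (n_k - 1) * V_k / (sum_k n_k - K),
    where K is the number of groups. This is an assumed formula.
    """
    dof = sum(group_sizes) - len(group_sizes)
    dim = len(group_matrices[0])
    v_tot = [[0.0] * dim for _ in range(dim)]
    for matrix, n in zip(group_matrices, group_sizes):
        for i in range(dim):
            for j in range(dim):
                v_tot[i][j] += (n - 1) * matrix[i][j] / dof
    return v_tot


# Illustrative 2x2 matrices for three groups A, B, and C (values are made up).
v_a = [[2.0, 0.5], [0.5, 1.0]]
v_b = [[4.0, 1.0], [1.0, 3.0]]
v_c = [[1.0, 0.0], [0.0, 2.0]]
v_tot = pooled_covariance([v_a, v_b, v_c], [11, 6, 6])
print(v_tot)
```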
[0128]
In addition, the display of the valid / invalid result in the first and second embodiments may be given by a numerical value, or may be configured to be visually distinguished by color.
[0129]
[Effect of the invention]
As described above in detail, according to the present invention, it is possible to predict the effectiveness of a treatment according to the gene on the virus side or the patient side.
[Brief description of the drawings]
FIG. 1 is a diagram showing an overall configuration of a treatment method effectiveness prediction device according to a first embodiment of the present invention.
FIG. 2 is an exemplary view showing an example of the configuration of a sample statistical database (DB) according to the embodiment.
FIG. 3 is an exemplary view showing an example of the configuration of a validity determination database according to the embodiment.
FIG. 4 is an exemplary view showing a flowchart of a method for creating a validity determination database according to the embodiment.
FIG. 5 is a table showing the efficiency, odds, χ² value, weight coefficient, and the like calculated when creating the validity determination database according to the embodiment.
FIG. 6 is an exemplary flowchart showing validity determination processing according to the embodiment.
FIG. 7 is a conceptual diagram of calculation of an effectiveness prediction addition value according to the embodiment.
FIG. 8 is an exemplary view showing an output example of a validity determination result according to the embodiment.
FIG. 9 is a diagram showing an entire configuration of a treatment method effectiveness prediction device according to a second embodiment of the present invention.
FIG. 10 is an exemplary view showing an example of the configuration of a sample statistical database according to the embodiment.
FIG. 11 is an exemplary view showing an example of the configuration of a validity determination database according to the embodiment.
FIG. 12 is a view showing a flowchart of a method for generating a validity determination database according to the embodiment.
FIG. 13 is an exemplary view showing an example of data of each classified data group according to the embodiment.
FIG. 14 is a view showing a flowchart of a validity judgment process using the validity judgment database according to the embodiment.
FIG. 15 is a diagram showing an example of the configuration of a sample statistical database according to a modification of the first embodiment.
FIG. 16 is a diagram showing an example of a configuration of a validity determination database according to a modification of the first embodiment.
[Explanation of symbols]
1 ... Input device
2 ... Processing device
3 ... Validity determination database
4 ... Output device
5 ... Storage medium reading device
6 ... Storage medium
10 ... Computer
11 ... Communication network
12 ... Sample statistical database

Claims

An effectiveness determination database generation program for causing a computer to execute a process of generating an effectiveness determination database for determining the effectiveness of treatment, the program causing the computer to:
read, for a plurality of samples, sample data in which each sample is given infected host genotype information indicating the genotype of the infected host of the sample, infectious agent genotype information indicating the genotype of the infectious agent infecting the sample, and effectiveness information indicating the effectiveness of treatment for the infectious agent;
calculate, for each genotype and based on the sample data for the plurality of samples, a weighting factor indicating the degree of correlation between the effectiveness of the treatment and at least one of the infected host genotype information and the infectious agent genotype information; and
store the weighting factor in a storage device for each genotype.

A data structure of effectiveness determination data, comprising:
a first field in which one or more infected host genotypes and an infected host weighting factor indicating a degree of correlation with the effectiveness of a treatment are recorded; and
a second field in which one or more infectious agent genotypes and an infectious agent weighting factor indicating a degree of correlation with the effectiveness of the treatment are recorded.

A first field in which are recorded an infected host weighting factor indicating the degree of correlation between one or more infected host genotypes and the effectiveness of a treatment, and an infectious agent weighting factor indicating the degree of correlation between one or more infectious agent genotypes and the effectiveness of the treatment; and
a second field in which effectiveness information indicating the effectiveness of the treatment is recorded.

An effectiveness determination program for causing a computer to execute a process of determining the effectiveness of treatment, the program causing the computer to:
read out an infected host weighting factor indicating the degree of correlation between one or more infected host genotypes and the effectiveness of the treatment, and an infectious agent weighting factor indicating the degree of correlation between one or more infectious agent genotypes and the effectiveness of the treatment;
multiply the infected host genotype information of a subject by the infected host weighting factor for each genotype to calculate infected host effectiveness prediction individual values;
multiply the infectious agent genotype information of the subject by the infectious agent weighting factor for each genotype to calculate infectious agent effectiveness prediction individual values;
add each of the infected host effectiveness prediction individual values and each of the infectious agent effectiveness prediction individual values to calculate an effectiveness prediction addition value;
determine the effectiveness of the treatment based on the effectiveness prediction addition value, or on a value obtained by adding a predetermined numerical value to the effectiveness prediction addition value; and
output the determination result.

An effectiveness determination database generation system that generates, using a computer, an effectiveness determination database for determining the effectiveness of treatment, the system comprising:
database generating means for reading, for a plurality of samples, sample data in which each sample is given infected host genotype information indicating the genotype of the infected host of the sample, infectious agent genotype information indicating the genotype of the infectious agent infecting the sample, and effectiveness information indicating the effectiveness of treatment for the infectious agent, and for calculating, for each genotype and based on the sample data for the plurality of samples, a weighting factor indicating the degree of correlation between the effectiveness of the treatment and at least one of the infected host genotype information and the infectious agent genotype information; and
a storage unit for storing, for each genotype, the weighting factor calculated by the database generating means.

A therapeutic method effectiveness determination system that determines the effectiveness of treatment using a computer, the system comprising:
an effectiveness determination database storing an infected host weighting factor indicating the degree of correlation between one or more infected host genotypes and the effectiveness of the treatment, and an infectious agent weighting factor indicating the degree of correlation between one or more infectious agent genotypes and the effectiveness of the treatment;
effectiveness determination means for multiplying the infected host genotype information of a subject by the infected host weighting factor for each genotype to calculate infected host effectiveness prediction individual values, multiplying the infectious agent genotype information of the subject by the infectious agent weighting factor for each genotype to calculate infectious agent effectiveness prediction individual values, adding each of the infected host effectiveness prediction individual values and each of the infectious agent effectiveness prediction individual values to calculate an effectiveness prediction addition value, and determining the effectiveness of the treatment based on the effectiveness prediction addition value or on a value obtained by adding a predetermined numerical value to the effectiveness prediction addition value; and
an output device that outputs the effectiveness prediction addition value.

An effectiveness determination database generation method for generating, using a computer, an effectiveness determination database for determining the effectiveness of treatment, the method comprising:
reading from a first storage device, for a plurality of samples, sample data in which each sample is given infected host genotype information indicating the genotype of the infected host of the sample, infectious agent genotype information indicating the genotype of the infectious agent infecting the sample, and effectiveness information indicating the effectiveness of treatment for the infectious agent;
calculating, for each genotype and based on the sample data for the plurality of samples, a weighting factor indicating the degree of correlation between the effectiveness of the treatment and at least one of the infected host genotype information and the infectious agent genotype information; and
storing the weighting factor in a second storage device for each genotype.

A method for determining the effectiveness of a treatment using a computer, the method comprising:
reading out an infected host weighting factor indicating the degree of correlation between one or more infected host genotypes and the effectiveness of the treatment, and an infectious agent weighting factor indicating the degree of correlation between one or more infectious agent genotypes and the effectiveness of the treatment;
multiplying the infected host genotype information of a subject by the infected host weighting factor for each genotype to calculate infected host effectiveness prediction individual values;
multiplying the infectious agent genotype information of the subject by the infectious agent weighting factor for each genotype to calculate infectious agent effectiveness prediction individual values;
adding each of the infected host effectiveness prediction individual values and each of the infectious agent effectiveness prediction individual values to calculate an effectiveness prediction addition value;
determining the effectiveness of the treatment based on the effectiveness prediction addition value, or on a value obtained by adding a predetermined numerical value to the effectiveness prediction addition value; and
outputting the determination result.